Manipulate Content in Word Files

The following examples show how to use the GemBox.Document component to insert, retrieve, and delete content in Word documents using C# and VB.NET.

Insert Content

The following example shows how to insert plain text, a hyperlink, HTML text, and an image at specific document positions.

using GemBox.Document;
using System.Linq;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = DocumentModel.Load("%#ManipulateContent.docx%");
        var section = document.Sections[0];

        // Set content of 1st paragraph using plain text.
        section.Blocks[0].Content.LoadText("Inserted plain text to first paragraph.");

        // Set content of 2nd paragraph using hyperlink.
        var hyperlink = new Hyperlink(document, "https://www.gemboxsoftware.com/", "Inserted hyperlink.");
        section.Blocks[1].Content.Set(hyperlink.Content);

        // Insert HTML text at the end of 3rd paragraph.
        section.Blocks[2].Content.End
            .LoadText("<p style='color:orange'>Inserted HTML text with orange color.</p>",
                new HtmlLoadOptions() { InheritCharacterFormat = true, InheritParagraphFormat = true });

        // Insert picture at the beginning of last paragraph.
        var picture = new Picture(document, "%#Dices.png%", 40, 30);
        section.Blocks.Last().Content.Start.InsertRange(picture.Content);

        document.Save("InsertContent.%OutputFileType%");
    }
}
Imports GemBox.Document
Imports System.Linq

Module Program

    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Dim document = DocumentModel.Load("%#ManipulateContent.docx%")
        Dim section = document.Sections(0)

        ' Set content of 1st paragraph using plain text.
        section.Blocks(0).Content.LoadText("Inserted plain text to first paragraph.")

        ' Set content of 2nd paragraph using hyperlink.
        Dim hyperlink As New Hyperlink(document, "https://www.gemboxsoftware.com/", "Inserted hyperlink.")
        section.Blocks(1).Content.Set(hyperlink.Content)

        ' Insert HTML text at the end of 3rd paragraph.
        section.Blocks(2).Content.End _
            .LoadText("<p style='color:orange'>Inserted HTML text with orange color.</p>",
                New HtmlLoadOptions() With {.InheritCharacterFormat = True, .InheritParagraphFormat = True})

        ' Insert picture at the beginning of last paragraph.
        Dim picture As New Picture(document, "%#Dices.png%", 40, 30)
        section.Blocks.Last().Content.Start.InsertRange(picture.Content)

        document.Save("InsertContent.%OutputFileType%")

    End Sub
End Module
Screenshot: Word document with inserted plain text, HTML text, and image.

The inserted content can be plain text, with optional formatting, or rich formatted text such as HTML and RTF. You can insert text content using the ContentPosition.LoadText or ContentRange.LoadText methods.
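
As a minimal sketch of these LoadText variants, the following snippet builds a small document in memory instead of loading the sample file. The class name, paragraph texts, and output file name are made up for illustration, and the CharacterFormat overload is assumed to be the "optional formatting" form mentioned above.

```csharp
using GemBox.Document;

class LoadTextSketch
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // Hypothetical three-paragraph document, built in memory for this sketch.
        var document = new DocumentModel();
        var section = new Section(document,
            new Paragraph(document, "First."),
            new Paragraph(document, "Second."),
            new Paragraph(document, "Third."));
        document.Sections.Add(section);

        // ContentRange.LoadText: replace the 1st paragraph's content with bold plain text.
        section.Blocks[0].Content.LoadText("Bold plain text.",
            new CharacterFormat() { Bold = true });

        // ContentPosition.LoadText: insert HTML text at the end of the 2nd paragraph.
        section.Blocks[1].Content.End.LoadText("<b>Inserted HTML text.</b>",
            new HtmlLoadOptions());

        // ContentPosition.LoadText: insert RTF text at the start of the 3rd paragraph.
        section.Blocks[2].Content.Start.LoadText(@"{\rtf1\ansi Inserted RTF text.}",
            new RtfLoadOptions());

        document.Save("LoadTextSketch.docx");
    }
}
```

Calling LoadText on a range replaces that range, while calling it on a position (Start or End) inserts at that point and leaves the existing content in place.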

You can also insert arbitrary document content using the ContentPosition.InsertRange or ContentRange.Set method.
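
The snippet below is a rough sketch of the difference between the two methods when the inserted content comes from another part of the same document. The in-memory document, class name, and output file name are assumptions for illustration; the exact paragraph boundaries in the saved file are not asserted here.

```csharp
using GemBox.Document;

class CopyContentSketch
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // Hypothetical document, built in memory for this sketch.
        var document = new DocumentModel();
        var section = new Section(document,
            new Paragraph(document, "Source paragraph."),
            new Paragraph(document, "Target to overwrite."),
            new Paragraph(document, "Target to extend."));
        document.Sections.Add(section);

        ContentRange source = section.Blocks[0].Content;

        // ContentRange.Set: replace the 2nd paragraph's content with the source content.
        section.Blocks[1].Content.Set(source);

        // ContentPosition.InsertRange: add the source content before the 3rd paragraph's text.
        section.Blocks[2].Content.Start.InsertRange(source);

        document.Save("CopyContentSketch.docx");
    }
}
```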

Get Content

The following example shows how you can retrieve the plain text representation of document elements by using the ContentRange.ToString method.

using GemBox.Document;
using System;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = DocumentModel.Load("%#ManipulateContent.docx%");
        var section = document.Sections[0];

        // Get content from 1st paragraph.
        ContentRange firstParagraphContent = section.Blocks[0].Content;
        Console.WriteLine(firstParagraphContent.ToString());

        // Get content from 2nd and 3rd paragraphs.
        ContentRange multipleParagraphsContent = new ContentRange(
            section.Blocks[1].Content.Start,
            section.Blocks[2].Content.End);
        Console.WriteLine(multipleParagraphsContent.ToString());
    }
}
Imports GemBox.Document
Imports System

Module Program

    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Dim document = DocumentModel.Load("%#ManipulateContent.docx%")
        Dim section = document.Sections(0)

        ' Get content from 1st paragraph.
        Dim firstParagraphContent As ContentRange = section.Blocks(0).Content
        Console.WriteLine(firstParagraphContent.ToString())

        ' Get content from 2nd and 3rd paragraphs.
        Dim multipleParagraphsContent As New ContentRange(
            section.Blocks(1).Content.Start,
            section.Blocks(2).Content.End)
        Console.WriteLine(multipleParagraphsContent.ToString())

    End Sub
End Module
Screenshot: Plain text representation of elements retrieved from the Word file.

The ContentRange class exposes, among others, the members used on this page: the Start and End positions and the ToString, LoadText, Set, and Delete methods.

Delete Content

The following example shows various ways you can delete content from a Word document.

using GemBox.Document;
using System.Linq;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = DocumentModel.Load("%#ManipulateContent.docx%");
        var section = document.Sections[0];

        // Delete content from 1st and 2nd paragraph.
        ContentRange multipleParagraphsContent = new ContentRange(
            section.Blocks[0].Content.Start,
            section.Blocks[1].Content.End);
        multipleParagraphsContent.Delete();

        // Delete content from last (4th) paragraph.
        ContentRange lastParagraphContent = section.Blocks.Last().Content;
        lastParagraphContent.Delete();

        document.Save("DeleteContent.%OutputFileType%");
    }
}
Imports GemBox.Document
Imports System.Linq

Module Program

    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Dim document = DocumentModel.Load("%#ManipulateContent.docx%")
        Dim section = document.Sections(0)

        ' Delete content from 1st and 2nd paragraph.
        Dim multipleParagraphsContent As New ContentRange(
            section.Blocks(0).Content.Start,
            section.Blocks(1).Content.End)
        multipleParagraphsContent.Delete()

        ' Delete content from last (4th) paragraph.
        Dim lastParagraphContent As ContentRange = section.Blocks.Last().Content
        lastParagraphContent.Delete()

        document.Save("DeleteContent.%OutputFileType%")
    End Sub
End Module
Screenshot: Output Word document with deleted elements and text.

You can remove any element from the document by calling the ElementCollection.RemoveAt method on the element's Element.ParentCollection.
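
A minimal sketch of removing a whole element through its collection is shown below. It assumes an in-memory document with made-up paragraph texts; section.Blocks is the collection that contains both paragraphs, so RemoveAt is called on it directly.

```csharp
using System;
using GemBox.Document;

class RemoveElementSketch
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // Hypothetical two-paragraph document, built in memory for this sketch.
        var document = new DocumentModel();
        var section = new Section(document,
            new Paragraph(document, "Paragraph to keep."),
            new Paragraph(document, "Paragraph to remove."));
        document.Sections.Add(section);

        // Remove the 2nd paragraph (index 1) from the collection that contains it.
        section.Blocks.RemoveAt(1);

        // Print the number of blocks left in the section.
        Console.WriteLine(section.Blocks.Count);

        document.Save("RemoveElementSketch.docx");
    }
}
```

Unlike ContentRange.Delete, this removes the element itself from the document tree rather than a span of content.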

You can also delete arbitrary document content, such as part of an element or one or more whole elements, by using the ContentRange.Delete method.
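
To illustrate deleting only part of an element, the sketch below locates a piece of text and deletes just that range. It relies on the ContentRange.Find method, which is not shown elsewhere on this page and is used here as an assumption about how to obtain a partial range; the sample text and class name are made up.

```csharp
using System;
using System.Linq;
using GemBox.Document;

class DeleteTextSketch
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // Hypothetical one-paragraph document, built in memory for this sketch.
        var document = new DocumentModel();
        document.Sections.Add(new Section(document,
            new Paragraph(document, "Keep this text, but delete THIS PART of the paragraph.")));

        // Find yields one ContentRange per match. Reverse() buffers the matches
        // and processes them back to front, so a deletion does not affect
        // matches that are still waiting to be processed.
        foreach (ContentRange match in document.Content.Find("delete THIS PART ").Reverse())
            match.Delete();

        // Print the remaining plain text of the document.
        Console.WriteLine(document.Content.ToString());
    }
}
```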

Next steps

GemBox.Document is a .NET component that enables you to read, write, edit, convert, and print document files from your .NET applications using one simple API. How about testing it today?
