Extract individual pages from a document

GemBox.Document provides APIs for working with individual pages and the elements on them. With this component you can extract one or more pages from a document and determine where each page starts and ends.

Extract a page from a document

The following example shows how to save the second page of a document to any supported file format in C# and VB.NET.

using GemBox.Document;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = DocumentModel.Load("%InputFileName%");

        var paginator = document.GetPaginator();

        // Pages are zero-indexed, so index 1 is the second page.
        var secondPage = paginator.Pages[1];

        secondPage.Save("SecondPage.%OutputFileType%");
    }
}

Imports GemBox.Document

Module Program

    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Dim document = DocumentModel.Load("%InputFileName%")

        Dim paginator = document.GetPaginator()

        ' Pages are zero-indexed, so index 1 is the second page.
        Dim secondPage = paginator.Pages(1)

        secondPage.Save("SecondPage.%OutputFileType%")
    End Sub

End Module

Screenshot: Word document with an extracted page

You can also convert a page to a DocumentModel and iterate through its content. The following snippet reuses the paginator from the example above.

var secondPageDocument = paginator.Pages[1].ConvertToDocument();

foreach (Paragraph paragraph in secondPageDocument.GetChildElements(true, ElementType.Paragraph))
    Console.WriteLine(paragraph.Content.ToString());

Dim secondPageDocument = paginator.Pages(1).ConvertToDocument()

For Each paragraph As Paragraph In secondPageDocument.GetChildElements(True, ElementType.Paragraph)
    Console.WriteLine(paragraph.Content.ToString())
Next

Both operations also work on a range of pages:

// Save pages to a new document.
paginator.GetRange(2, 5).Save("Output.docx");

// Extract content of the pages.
var extractedDocument = paginator.GetRange(2, 5).ConvertToDocument();

' Save pages to a new document.
paginator.GetRange(2, 5).Save("Output.docx")

' Extract content of the pages.
Dim extractedDocument = paginator.GetRange(2, 5).ConvertToDocument()
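
The two operations can also be combined in a loop over all pages. The following is a minimal sketch that uses only the calls shown above (Pages, Save, ConvertToDocument, GetChildElements); the input path "Input.docx" and the output names "Page1.docx", "Page2.docx", and so on are placeholders chosen for illustration, not names from the original example.

```csharp
using System;
using System.Linq;
using GemBox.Document;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // "Input.docx" is a placeholder path for this sketch.
        var document = DocumentModel.Load("Input.docx");
        var paginator = document.GetPaginator();

        for (var i = 0; i < paginator.Pages.Count; i++)
        {
            var page = paginator.Pages[i];

            // Save the page as its own file; file names use 1-based page numbers.
            page.Save($"Page{i + 1}.docx");

            // Count the paragraphs in the document converted from this page.
            var paragraphCount = page.ConvertToDocument()
                .GetChildElements(true, ElementType.Paragraph)
                .Count();

            Console.WriteLine($"Page {i + 1}: {paragraphCount} paragraph(s)");
        }
    }
}
```

Each iteration converts the page to a separate document, so for large files it may be worth calling ConvertToDocument only on the pages whose content you actually need.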

Determine the start and end of a page inside a document

GemBox.Document gives you information about the start and the end of every page in the document. The following example shows how you can use this information to insert content at the start of every page.

using GemBox.Document;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = DocumentModel.Load("%InputFileName%");

        var paginator = document.GetPaginator();

        for (var i = 0; i < paginator.Pages.Count; i++)
        {
            var page = paginator.Pages[i];
            var pageRange = page.Range;
            var start = pageRange.Start[0];

            var textBox = CreateTextBox(document, i);
            start.InsertRange(textBox.Content);
        }

        document.Save("Output.%OutputFileType%");
    }

    // A floating textbox that will be inserted at the start of every page.
    private static TextBox CreateTextBox(DocumentModel document, int page)
    {
        var run = new Run(document, "Inserted textbox on page " + (page + 1));
        run.CharacterFormat.Size = 25;
        run.CharacterFormat.FontColor = Color.White;

        var textBox = new TextBox(document, new FloatingLayout(
            new HorizontalPosition(-340, LengthUnit.Point, HorizontalPositionAnchor.RightMargin),
            new VerticalPosition(0, LengthUnit.Point, VerticalPositionAnchor.Margin),
            new Size(340, 45, LengthUnit.Point))
        { WrappingStyle = TextWrappingStyle.InFrontOfText });
        textBox.Fill.SetSolid(new Color(0x4472C4));
        textBox.Blocks.Add(new Paragraph(document, run));

        return textBox;
    }
}

Imports GemBox.Document

Module Program

    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Dim document = DocumentModel.Load("%InputFileName%")

        Dim paginator = document.GetPaginator()

        For i = 0 To paginator.Pages.Count - 1
            Dim page = paginator.Pages(i)
            Dim pageRange = page.Range
            Dim start = pageRange.Start(0)

            Dim textBox = CreateTextBox(document, i)
            start.InsertRange(textBox.Content)
        Next

        document.Save("Output.%OutputFileType%")
    End Sub

    ' A floating textbox that will be inserted at the start of every page.
    Function CreateTextBox(document As DocumentModel, page As Integer) As TextBox
        Dim run = New Run(document, "Inserted textbox on page " & (page + 1))
        run.CharacterFormat.Size = 25
        run.CharacterFormat.FontColor = Color.White

        Dim textBox = New TextBox(document, New FloatingLayout(
            New HorizontalPosition(-340, LengthUnit.Point, HorizontalPositionAnchor.RightMargin),
            New VerticalPosition(0, LengthUnit.Point, VerticalPositionAnchor.Margin),
            New Size(340, 45, LengthUnit.Point)) With
        {.WrappingStyle = TextWrappingStyle.InFrontOfText})
        textBox.Fill.SetSolid(New Color(&H4472C4))
        textBox.Blocks.Add(New Paragraph(document, run))

        Return textBox
    End Function

End Module

Screenshot: Word document with content inserted at the start of every page

Next steps

GemBox.Document is a .NET component that enables you to read, write, edit, convert, and print document files from your .NET applications using one simple API. How about testing it today?
