Manipulate PDF content streams and resources in C# and VB.NET

A content stream in PDF is a stream object that describes how a PDF application renders a page. A page can have one or more content streams. Content streams also describe form XObjects and tiling patterns. Internally, each one contains a sequence of instructions that specify how to draw elements (vectors, images, text) on a page.

Some instructions reference content stream resources, such as fonts, images, form XObjects, patterns, shadings, color spaces, or marked-content properties. These resources are stored in the accompanying resource dictionary, and the content stream refers to each one by a unique name.

GemBox.Pdf converts a content stream and the accompanying resource dictionary into a tree of PdfContentElements, starting with the PdfContent as the root, for easier inspection and manipulation.
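
As a rough illustration of that tree, the sketch below loads an existing file and counts the content elements on its first page by type. It assumes that `Elements.All()` enumerates nested elements as well as top-level ones, and `Input.pdf` is a placeholder file name that is not part of the original example.

```csharp
using System;
using System.Collections.Generic;
using GemBox.Pdf;
using GemBox.Pdf.Content;

class InspectContentSketch
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // "Input.pdf" is a hypothetical path; point it at any existing PDF file.
        using (var document = PdfDocument.Load("Input.pdf"))
        {
            var page = document.Pages[0];
            var counts = new Dictionary<PdfContentElementType, int>();

            // Assumption: All() yields every element under the PdfContent root,
            // including elements nested inside groups.
            foreach (var element in page.Content.Elements.All())
            {
                // n stays 0 when the type has not been seen yet.
                counts.TryGetValue(element.ElementType, out int n);
                counts[element.ElementType] = n + 1;
            }

            // Print one line per element type found on the page.
            foreach (var pair in counts)
                Console.WriteLine($"{pair.Key}: {pair.Value}");
        }
    }
}
```

Counting by `ElementType` is only one way to look at the tree; the same loop could filter for a single type, such as text, and inspect those elements further.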

Manipulating PDF content streams in C# and VB

The following example shows how to improve performance when writing text to multiple pages with the same font. Because editing ends only after every page has been written, the embedded subset of the font is calculated just once rather than after each page is edited.

Screenshot of multiple PDF pages edited with GemBox.Pdf library
using GemBox.Pdf;
using GemBox.Pdf.Content;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        using (var document = new PdfDocument())
        {
            using (var formattedText = new PdfFormattedText())
            {
                // Set the font to a TrueType font that will be subset and embedded in the document.
                formattedText.Font = new PdfFont("Calibri", 96);

                // Draw a single letter on each page.
                for (int i = 0; i < 2; ++i)
                {
                    formattedText.Append(((char)('A' + i)).ToString());

                    var page = document.Pages.Add();

                    // Begin editing the page content, but don't end it until all pages are edited.
                    page.Content.BeginEdit();

                    page.Content.DrawText(formattedText, new PdfPoint(100, 500));

                    formattedText.Clear();
                }
            }

            // End editing of all pages.
            // This will convert the content of each page back to the underlying content stream and the accompanying resource dictionary.
            // The subset of the 'Calibri' font, containing only the glyphs for characters 'A' and 'B', is calculated just once,
            // before being embedded in the document.
            foreach (var page in document.Pages)
                page.Content.EndEdit();

            document.Save("Content Streams And Resources.pdf");
        }
    }
}
Imports GemBox.Pdf
Imports GemBox.Pdf.Content

Module Program

    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Using document = New PdfDocument()

            Using formattedText = New PdfFormattedText()

                ' Set the font to a TrueType font that will be subset and embedded in the document.
                formattedText.Font = New PdfFont("Calibri", 96)

                ' Draw a single letter on each page.
                For i As Integer = 0 To 1

                    formattedText.Append(ChrW(AscW("A"c) + i).ToString())

                    Dim page = document.Pages.Add()

                    ' Begin editing the page content, but don't end it until all pages are edited.
                    page.Content.BeginEdit()

                    page.Content.DrawText(formattedText, New PdfPoint(100, 500))

                    formattedText.Clear()
                Next
            End Using

            ' End editing of all pages.
            ' This will convert the content of each page back to the underlying content stream and the accompanying resource dictionary.
            ' The subset of the 'Calibri' font, containing only the glyphs for characters 'A' and 'B', is calculated just once,
            ' before being embedded in the document.
            For Each page In document.Pages
                page.Content.EndEdit()
            Next

            document.Save("Content Streams And Resources.pdf")
        End Using
    End Sub
End Module

Note that this example creates just two pages because of the free version's limitations. The performance gain from explicitly beginning and ending page edits is more noticeable when writing different text to many pages that each use one or more TrueType/OpenType fonts.

Technical notes

  • When editing the PdfForm.Content or the PdfTilingPattern.Content, you must call PdfContent.BeginEdit() before and PdfContent.EndEdit() after editing.
  • Calling PdfContent.BeginEdit() and PdfContent.EndEdit() is optional when editing the PdfPage.Content, but it might improve performance in some situations.
  • When PdfContent.EndEdit() is called, GemBox.Pdf converts the PdfContent and all PdfContentElements underneath it back to a content stream and the accompanying resource dictionary.
  • For more information about PDF content streams and resources in GemBox.Pdf, see the Content Streams and Resources help page.
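
The first note, about form content, can be sketched roughly as follows. This is a minimal sketch rather than a verified implementation: the `PdfForm(document, size)` constructor and the `Elements.AddForm(form)` call are assumptions about the GemBox.Pdf API and should be checked against the API reference, and the output file name is a placeholder.

```csharp
using GemBox.Pdf;
using GemBox.Pdf.Content;

class FormEditSketch
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        using (var document = new PdfDocument())
        {
            // Assumption: this constructor creates an empty form XObject of the given size.
            var form = new PdfForm(document, new PdfSize(200, 100));

            // Mandatory for form content: begin editing before drawing anything.
            form.Content.BeginEdit();

            using (var formattedText = new PdfFormattedText())
            {
                formattedText.Font = new PdfFont("Helvetica", 24);
                formattedText.Append("Form text");

                // Draw the text relative to the form's own coordinate system.
                form.Content.DrawText(formattedText, new PdfPoint(10, 40));
            }

            // Mandatory: convert the edited element tree back to the form's
            // content stream and resource dictionary.
            form.Content.EndEdit();

            // Assumption: AddForm places the finished form on the page.
            var page = document.Pages.Add();
            page.Content.Elements.AddForm(form);

            // Hypothetical output file name.
            document.Save("Form XObject Sketch.pdf");
        }
    }
}
```

The page content here is edited without an explicit BeginEdit()/EndEdit() pair, which the second note describes as optional for PdfPage.Content.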

Next steps

GemBox.Pdf is a .NET component that enables developers to read, merge and split PDF files or execute low-level object manipulations from .NET applications in a simple and efficient way.
