
Content Streams and Resources

A PDF content stream contains a sequence of instructions composed of Objects (operands) and keywords (operators) describing the appearance of a page or other graphical entity.
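
To make the operand/operator structure concrete, here is a minimal sketch that splits content stream text into operations. It uses only the .NET base class library and is not part of GemBox.Pdf; the names ContentStreamSketch, Tokenize, IsOperand, and Parse are hypothetical. It assumes a simplified syntax of numbers, names, and literal strings, and ignores arrays, dictionaries, hexadecimal strings, comments, and inline images.

```csharp
using System;
using System.Collections.Generic;

public static class ContentStreamSketch
{
    // Splits content stream text into tokens. A literal string "(...)" is kept
    // whole; every other token ends at white space or at an opening parenthesis.
    public static List<string> Tokenize(string content)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < content.Length)
        {
            if (char.IsWhiteSpace(content[i])) { i++; continue; }

            int start = i;
            if (content[i] == '(')
            {
                // Track nested parentheses and skip characters escaped with a backslash.
                int depth = 0;
                do
                {
                    if (content[i] == '\\') i++;
                    else if (content[i] == '(') depth++;
                    else if (content[i] == ')') depth--;
                    i++;
                }
                while (depth > 0 && i < content.Length);
            }
            else
            {
                while (i < content.Length && !char.IsWhiteSpace(content[i]) && content[i] != '(') i++;
            }

            i = Math.Min(i, content.Length); // a trailing backslash may step past the end
            tokens.Add(content.Substring(start, i - start));
        }
        return tokens;
    }

    // In this simplified syntax an operand is a name (/F1), a literal string or a number;
    // any other token is treated as an operator keyword.
    public static bool IsOperand(string token) =>
        "/(+-.".IndexOf(token[0]) >= 0 || char.IsDigit(token[0]);

    // Operands precede their operator, so they are collected until an operator arrives.
    public static List<(string Operator, List<string> Operands)> Parse(string content)
    {
        var operations = new List<(string Operator, List<string> Operands)>();
        var operands = new List<string>();
        foreach (var token in Tokenize(content))
        {
            if (IsOperand(token))
            {
                operands.Add(token);
            }
            else
            {
                operations.Add((token, operands));
                operands = new List<string>();
            }
        }
        return operations;
    }

    public static void Main()
    {
        var content = "BT\n/F1 12 Tf\n70 760 Td\n(Hello world!) Tj\nET\n";
        foreach (var (op, operands) in Parse(content))
            Console.WriteLine($"{op}: [{string.Join(", ", operands)}]");
    }
}
```

For the sample input, the sketch prints five operations: BT and ET with no operands, Tf with /F1 and 12, Td with 70 and 760, and Tj with (Hello world!).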

GemBox.Pdf compiles content stream operations (and the associated resources) into a group of content elements, such as text, paths, and external objects (images and forms), so that they are easy to inspect and manipulate.

These content elements and other types related to PDF content are implemented in the GemBox.Pdf.Content namespace.

The following sections describe the essential properties of the PDF content implemented in the GemBox.Pdf assembly:

Content Elements

The GemBox.Pdf.Content namespace includes seven types of PDF content element:

  • PdfTextContent: a sequence of glyphs from a single face of a single font.

  • Path: a geometrical item composed of lines and curves.

  • Image: a rectangular array of sample values, each representing a color.

  • Form: a self-contained description of any sequence of content elements.

  • Shading: a smooth transition between colors.

  • PdfContentGroup: a group of content elements that are independent of the rest of the surrounding elements.

  • Mark: a mark used to distinguish a section of the PDF content.

Note

GemBox.Pdf currently exposes only the PdfTextContent and PdfContentGroup types.

Other content element types will be exposed in the next version of GemBox.Pdf.

The base class for all content elements is PdfContentElement.

All content element types except Mark inherit PdfVisualContentElement, which enables them to transform the coordinate system and to apply formatting properties (graphics state).

Formatting (graphics state)

GemBox.Pdf abstracts PDF graphics state by grouping its entries into the following types:

  • PdfFillFormat: formatting properties that affect the filling of PDF textual or geometrical content.

  • PdfStrokeFormat: formatting properties that affect the stroking of PDF textual or geometrical content.

  • PdfClipFormat: formatting properties that affect the clipping area.

  • PdfTextFormat: formatting properties that affect only the PDF textual content.

  • PdfContentFormat: groups all previously defined formatting properties.

Formatting properties are accessible for any content element except Mark via the PdfVisualContentElement.Format property.

Usage in GemBox.Pdf

Content elements contained in a PDF page can be obtained via the PdfPage.Content property.

The most important content in a PDF document is text and GemBox.Pdf provides the following functionalities related to text:

Note

GemBox.Pdf currently exposes PDF content in read-only mode since the main goal was to extract the Unicode representation of text.

Modification of PDF content (creating and adding new content elements, removing existing content elements, and modifying and formatting existing content elements) will be implemented in the next version of GemBox.Pdf.

In the meantime, content can be modified via Objects, as described in the following section.

Content Streams and Resources via Objects

Content streams and resources can also be created directly in GemBox.Pdf by using PDF Objects, as in the following example.

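// Namespaces this snippet relies on.
using System.Text;
using GemBox.Pdf;
using GemBox.Pdf.Objects;
using GemBox.Pdf.Text;

// Specify content stream's content as a sequence of content stream operands and operators.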
var content = new StringBuilder();
content.AppendLine("BT");                // Begin a text object.
// Set the font and font size to use, installing them as parameters in the text state.
// In this case, the font resource identified by the name F1 specifies the font externally known as Helvetica.
content.AppendLine("/F1 12 Tf");
content.AppendLine("70 760 Td");         // Specify a starting position on the page, setting parameters in the text object.
content.AppendLine("(Hello world!) Tj"); // Paint the glyphs for a string of characters at that position.
content.AppendLine("ET");                // End the text object.

// Create content stream and write content to it.
var contentStream = PdfStream.Create();
using (var stream = contentStream.Open(PdfStreamDataMode.Write, PdfStreamDataState.Decoded))
{
    var contentBytes = PdfEncoding.Byte.GetBytes(content.ToString());
    stream.Write(contentBytes, 0, contentBytes.Length);
}

// Create font dictionary for Standard Type 1 'Helvetica' font.
var font = PdfDictionary.Create();
font[PdfName.Create("Type")] = PdfName.Create("Font");
font[PdfName.Create("Subtype")] = PdfName.Create("Type1");
font[PdfName.Create("BaseFont")] = PdfName.Create("Helvetica");

// Add font dictionary to resources.
var fontResources = PdfDictionary.Create();
fontResources[PdfName.Create("F1")] = PdfIndirectObject.Create(font);
var resources = PdfDictionary.Create();
resources[PdfName.Create("Font")] = fontResources;

// Create document.
using (var document = new PdfDocument())
{
    // Create new empty A4 page.
    var page = document.Pages.Add();

    // Set contents and resources of a page.
    var pageDictionary = page.GetDictionary();
    pageDictionary[PdfName.Create("Contents")] = PdfIndirectObject.Create(contentStream);
    pageDictionary[PdfName.Create("Resources")] = resources;

    // Save the document as a PDF file ("Output.pdf" is a placeholder file name).
    document.Save("Output.pdf");
}
// The PDF file is closed after the 'using' block.
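
The example above passes (Hello world!) to the Tj operator as a literal string. When the text comes from user input, parentheses and backslashes inside it can end the string early or start an escape sequence, so they are usually escaped with a backslash first. The following is a minimal sketch of such a helper, using only the .NET base class library; PdfLiteralString and Escape are hypothetical names, not part of GemBox.Pdf. It assumes every character in the text can be represented in the font's single-byte encoding.

```csharp
using System;
using System.Text;

public static class PdfLiteralString
{
    // Wraps the text in parentheses and puts a backslash before each character
    // that has a special meaning inside a PDF literal string.
    public static string Escape(string text)
    {
        var result = new StringBuilder("(");
        foreach (char c in text)
        {
            if (c == '(' || c == ')' || c == '\\')
                result.Append('\\');
            result.Append(c);
        }
        return result.Append(')').ToString();
    }

    public static void Main()
    {
        // Prints: (Hello \(world\)!) Tj
        Console.WriteLine(Escape("Hello (world)!") + " Tj");
    }
}
```

The escaped result can be appended to the content StringBuilder in place of the hard-coded "(Hello world!)" operand. Balanced parentheses do not strictly need escaping, but escaping all of them keeps the helper simple.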
See Also