Iterating Over Word Documents in C# and VB.NET

GemBox.Document represents a Word document as an in-memory content model with a tree-like structure, in which a DocumentModel is the root element.

You can traverse the document's elements by following their parent/child relationships.
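
As a rough illustration of parent/child traversal, the sketch below walks the tree recursively and prints each element's type, indented by depth. It assumes that calling GetChildElements with false returns only an element's direct children; the PrintTree helper and the small in-memory document are hypothetical and exist only for this illustration.

```csharp
using System;
using GemBox.Document;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // Build a small document in memory so the sketch needs no input file.
        var document = new DocumentModel();
        document.Sections.Add(
            new Section(document,
                new Paragraph(document, "First paragraph."),
                new Paragraph(document,
                    new Run(document, "Second "),
                    new Run(document, "paragraph."))));

        PrintTree(document, 0);
    }

    // Hypothetical helper: prints an element's type, indented by its depth in the tree.
    static void PrintTree(Element element, int depth)
    {
        Console.WriteLine(new string(' ', depth * 2) + element.ElementType);

        // Assumption: passing false yields direct children only,
        // so the recursion itself performs the descent.
        foreach (Element child in element.GetChildElements(false))
            PrintTree(child, depth + 1);
    }
}
```

Recursing by hand like this keeps the nesting depth available at every step, which a flattened sequence does not expose directly.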

Alternatively, you can use one of the Element.GetChildElements methods, which provide a way to flatten the document representation so it can be iterated.

The following example shows how you can iterate over a document’s content and filter specific types of elements.

Iterating through a Word file's content model
using System;
using System.Linq;
using GemBox.Document;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = DocumentModel.Load("%InputFileName%");

        int numberOfSections = document.Sections.Count;
        int numberOfParagraphs = document.GetChildElements(true, ElementType.Paragraph).Count();
        int numberOfRunsAndFields = document.GetChildElements(true, ElementType.Run, ElementType.Field).Count();

        var section = document.Sections[0];

        int numberOfElements = section.GetChildElements(true).Count();
        int numberOfBlocks = section.GetChildElements(true).OfType<Block>().Count();
        int numberOfInlines = section.GetChildElements(true).OfType<Inline>().Count();

        Console.WriteLine("File has:");
        Console.WriteLine($" - {numberOfSections} sections.");
        Console.WriteLine($" - {numberOfParagraphs} paragraphs.");
        Console.WriteLine($" - {numberOfRunsAndFields} runs and fields.");

        Console.WriteLine();

        Console.WriteLine("First section has:");
        Console.WriteLine($" - {numberOfElements} elements.");
        Console.WriteLine($" - {numberOfBlocks} blocks.");
        Console.WriteLine($" - {numberOfInlines} inlines.");
    }
}

Imports System
Imports System.Linq
Imports GemBox.Document

Module Program

    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Dim document As DocumentModel = DocumentModel.Load("%InputFileName%")

        Dim numberOfSections As Integer = document.Sections.Count
        Dim numberOfParagraphs As Integer = document.GetChildElements(True, ElementType.Paragraph).Count()
        Dim numberOfRunsAndFields As Integer = document.GetChildElements(True, ElementType.Run, ElementType.Field).Count()

        Dim section = document.Sections(0)

        Dim numberOfElements As Integer = section.GetChildElements(True).Count()
        Dim numberOfBlocks As Integer = section.GetChildElements(True).OfType(Of Block)().Count()
        Dim numberOfInlines As Integer = section.GetChildElements(True).OfType(Of Inline)().Count()

        Console.WriteLine("File has:")
        Console.WriteLine($" - {numberOfSections} sections.")
        Console.WriteLine($" - {numberOfParagraphs} paragraphs.")
        Console.WriteLine($" - {numberOfRunsAndFields} runs and fields.")

        Console.WriteLine()

        Console.WriteLine("First section has:")
        Console.WriteLine($" - {numberOfElements} elements.")
        Console.WriteLine($" - {numberOfBlocks} blocks.")
        Console.WriteLine($" - {numberOfInlines} inlines.")

    End Sub
End Module
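
The example above only counts the filtered elements. A minimal sketch of actually visiting them might look like the following; it assumes that the sequence returned for ElementType.Paragraph contains only Paragraph instances (so the cast is safe) and that Content.ToString() returns an element's text. The two-paragraph document is made up for the illustration.

```csharp
using System;
using System.Linq;
using GemBox.Document;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // Hypothetical in-memory document, used instead of a loaded file.
        var document = new DocumentModel();
        document.Sections.Add(
            new Section(document,
                new Paragraph(document, "Alpha"),
                new Paragraph(document, "Beta")));

        // Flatten the tree, keep only paragraphs, and cast them to their concrete type.
        var paragraphs = document
            .GetChildElements(true, ElementType.Paragraph)
            .Cast<Paragraph>();

        foreach (Paragraph paragraph in paragraphs)
        {
            // Trim the text in case it carries a trailing paragraph break.
            Console.WriteLine(paragraph.Content.ToString().TrimEnd());
        }
    }
}
```

Filtering by ElementType in the GetChildElements call and then casting gives typed access to members such as Inlines, whereas OfType<T>() is the more convenient choice when filtering by a base class like Block or Inline.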
