Performance metrics with large Word files in C# and VB.NET

GemBox.Document is a Word component that follows .NET design guidelines and best practices. It represents Word files in memory through a rich content model that contains sections, blocks, inlines, drawings, and so on. Its memory consumption and allocation are optimized without sacrificing execution speed or efficiency.

You can create stress tests to measure the component's processing capabilities using BenchmarkDotNet.

The following example shows how you can track the performance of GemBox.Document using the provided input Word file, which contains 15 sections of varied content. The file is meant to represent typical Word usage: it includes different kinds of elements (such as images, shapes, and tables) and Word features (such as bookmarks, comments, and footnotes).

Measuring performance of reading, writing, and iterating through Word files in C# and VB.NET
Screenshot of GemBox.Document performance measurements
using System;
using System.Collections.Generic;
using System.IO;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Engines;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;
using GemBox.Document;

[SimpleJob(RuntimeMoniker.Net48)]
[SimpleJob(RuntimeMoniker.NetCoreApp31)]
public class Program
{
    private DocumentModel document;
    private readonly Consumer consumer = new Consumer();

    public static void Main()
    {
        BenchmarkRunner.Run<Program>();
    }

    [GlobalSetup]
    public void SetLicense()
    {
        // If using Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // If using Free version and example exceeds its limitations, use Trial or Time Limited version:
        // https://www.gemboxsoftware.com/document/examples/c-sharp-free-professional-word-library/1301

        // Load the document once, so the Writing and Iterating benchmarks don't include loading time.
        this.document = DocumentModel.Load("%#RandomSections.docx%");
    }

    [Benchmark]
    public DocumentModel Reading()
    {
        return DocumentModel.Load("%#RandomSections.docx%");
    }

    [Benchmark]
    public void Writing()
    {
        using (var stream = new MemoryStream())
            this.document.Save(stream, new DocxSaveOptions());
    }

    [Benchmark]
    public void Iterating()
    {
        this.LoopThroughAllElements().Consume(this.consumer);
    }

    public IEnumerable<Element> LoopThroughAllElements()
    {
        return this.document.GetChildElements(true);
    }
}
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports BenchmarkDotNet.Attributes
Imports BenchmarkDotNet.Engines
Imports BenchmarkDotNet.Jobs
Imports BenchmarkDotNet.Running
Imports GemBox.Document

<SimpleJob(RuntimeMoniker.Net48)>
<SimpleJob(RuntimeMoniker.NetCoreApp31)>
Public Class Program

    Private document As DocumentModel
    Private ReadOnly consumer As Consumer = New Consumer()

    Public Shared Sub Main()
        BenchmarkRunner.Run(Of Program)()
    End Sub

    <GlobalSetup>
    Public Sub SetLicense()
        ' If using Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        ' If using Free version and example exceeds its limitations, use Trial or Time Limited version:
        ' https://www.gemboxsoftware.com/document/examples/c-sharp-free-professional-word-library/1301

        ' Load the document once, so the Writing and Iterating benchmarks don't include loading time.
        Me.document = DocumentModel.Load("%#RandomSections.docx%")
    End Sub

    <Benchmark>
    Public Function Reading() As DocumentModel
        Return DocumentModel.Load("%#RandomSections.docx%")
    End Function

    <Benchmark>
    Public Sub Writing()
        Using stream = New MemoryStream()
            Me.document.Save(stream, New DocxSaveOptions())
        End Using
    End Sub

    <Benchmark>
    Public Sub Iterating()
        Me.LoopThroughAllElements().Consume(Me.consumer)
    End Sub

    Public Function LoopThroughAllElements() As IEnumerable(Of Element)
        Return Me.document.GetChildElements(True)
    End Function

End Class

Benchmarks for 10,000 Word pages

The more content you have, the more memory you'll need. The amount of content you can handle depends on a few factors, like the machine's available memory, the application's architecture (32-bit or 64-bit), the targeted .NET platform (.NET Framework or .NET Core), etc.
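Because these factors affect the numbers you get, it can help to record them next to your measurements. The following is a minimal sketch that uses only standard .NET APIs; the `EnvironmentReport` class and its `Describe` method are hypothetical names introduced for illustration and are not part of GemBox.Document. BenchmarkDotNet also prints similar environment information in its own summary.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of GemBox.Document or BenchmarkDotNet.
public static class EnvironmentReport
{
    // Builds a one-line description of the factors that influence how much content can be handled.
    public static string Describe()
    {
        return string.Format(
            "Process: {0}, Runtime: {1}, OS: {2}, Managed heap: {3:N0} bytes",
            // 32-bit processes have a much smaller address space than 64-bit ones.
            Environment.Is64BitProcess ? "64-bit" : "32-bit",
            // For example, a .NET Framework or .NET Core description string.
            RuntimeInformation.FrameworkDescription,
            RuntimeInformation.OSDescription,
            // Approximate number of bytes currently allocated on the managed heap.
            GC.GetTotalMemory(false));
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

Calling `EnvironmentReport.Describe()` before and after loading a large document gives a rough view of how the managed heap grows on a given machine.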

The following benchmark charts provide the results of working with Word files with up to 10 thousand pages. They show a steady and linear increase in both time and memory with an increased number of pages. For more information, see the resulting performance measurements in the 10_Thousand_Pages_Performance.xlsx file.
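To observe how time and memory scale with page count on your own machine, one option is to parameterize a benchmark by the number of pages. The sketch below is an assumption about how such a test could be set up, not the code behind the published results: the `PageCountBenchmark` class, the page counts, and the way each page is generated (one short paragraph followed by a page break) are all hypothetical. It uses BenchmarkDotNet's `[Params]` and `[MemoryDiagnoser]` attributes so that allocated memory is reported alongside elapsed time.

```csharp
using System.IO;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using GemBox.Document;

// Hypothetical benchmark class; reports allocated memory in addition to time.
[MemoryDiagnoser]
public class PageCountBenchmark
{
    // Illustrative page counts only; adjust them to the sizes you care about.
    [Params(100, 1000, 10000)]
    public int PageCount { get; set; }

    private DocumentModel document;

    [GlobalSetup]
    public void Setup()
    {
        // These sizes exceed the Free version's limitations,
        // so replace the key below with a Trial, Time Limited, or Professional key.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // Create the document once, so Writing measures only saving.
        this.document = CreateDocument(this.PageCount);
    }

    [Benchmark]
    public DocumentModel Creating()
    {
        return CreateDocument(this.PageCount);
    }

    [Benchmark]
    public void Writing()
    {
        using (var stream = new MemoryStream())
            this.document.Save(stream, new DocxSaveOptions());
    }

    // Assumption: a "page" is a single paragraph that ends with a page break.
    private static DocumentModel CreateDocument(int pageCount)
    {
        var document = new DocumentModel();
        var section = new Section(document);
        document.Sections.Add(section);

        for (int i = 0; i < pageCount; i++)
        {
            section.Blocks.Add(new Paragraph(document,
                new Run(document, "Page " + (i + 1)),
                new SpecialCharacter(document, SpecialCharacterType.PageBreak)));
        }

        return document;
    }

    public static void Main()
    {
        BenchmarkRunner.Run<PageCountBenchmark>();
    }
}
```

BenchmarkDotNet runs each benchmark once per `PageCount` value, so the resulting table can be compared row by row to see whether time and allocations grow roughly in proportion to the number of pages. Real documents with images, tables, and other elements will behave differently from this simplified content.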

Benchmark chart of time that's required for reading and writing Word files with up to 10 thousand pages
Benchmark chart of elapsed time for 10 thousand pages
Benchmark chart of memory that's required for creating Word files with up to 10 thousand pages
Benchmark chart of allocated memory for 10 thousand pages

Tips for improving performance

The following are some recommendations for improving performance while developing with GemBox.Document:
