Mail Merge - GemBox.Document Help

Mail merge is a process of merging or importing data from a .NET object, also known as data source, to a DocumentModel instance, also known as template document.

Binding between the data source and the template document is provided by the Field class. A Field whose FieldType property is MergeField is usually called a merge field, and its GetInstructionText method returns text that refers to the name of a property or column in the data source (usually called the merge field name). During the mail merge process, that Field instance is replaced by the actual data returned from the data source for the given property or column name.

Introductory example

The following example goes straight to the mail merge code, so you can immediately see what mail merge is and how to use it.

// Create a new empty document. 
var doc = new DocumentModel();

// Add document content.
doc.Sections.Add(new Section(doc, new Paragraph(doc, new Field(doc, FieldType.MergeField, "FullName"))));

// Save the document to a file.
doc.Save("TemplateDocument.docx");

// Initialize mail merge data source. 
var dataSource = new { FullName = "John Doe" };

// Execute mail merge.
doc.MailMerge.Execute(dataSource);

// Save the document to a file.
doc.Save("Document.docx");
Tip

A template document with merge fields is usually created with the Microsoft Word application. Here are the instructions on how to insert a merge field named FullName into a document with Microsoft Word:

  1. Navigate to Insert tab of Microsoft Word ribbon.

  2. Click on Quick Parts ribbon button to open a drop-down list.

  3. Click on Field... button from the drop-down list to open the Field dialog.

  4. Select Mail Merge from Categories combo-box drop-down list.

  5. Select MergeField from Field names drop-down list.

  6. Insert FullName text into the Field name text-box.

  7. Press OK button to close the Field dialog.

The following screenshot describes the procedure visually.

Insert merge field with Microsoft Word

Here are the screenshots from TemplateDocument.docx and Document.docx files.

TemplateDocument.docx
Document.docx
Tip

If TemplateDocument.docx is empty when you open it with Microsoft Word, press Alt + F9 to toggle field codes.

As you can see, the Field with field type MergeField and instruction text FullName has been replaced by the value of the data source property named FullName.

In this example the data source was an instance of an anonymous type, but GemBox.Document accepts almost any .NET object as a mail merge data source. More details about mail merge data sources are presented in the next section.

Mail merge data sources

Mail merge supports the following data source types:

A single name and value pair, or a sequence of name and value pairs, is used for the simplest mail merge. Names represent merge field names and values represent replacements for merge fields.

Whenever possible, the mail merge engine accesses data source property or column values without using reflection (for example, DataRow column values are accessed through the DataRow[string columnName] indexer). Reflection is used only if there is no other mechanism available to retrieve property / column values based on their names.

If you are uncomfortable with reflection usage (security and performance issues), you can wrap your data source in your own IMailMergeDataSource interface implementation and use it as the mail merge data source instead of the original data source. The mail merge engine will then use only the members of your IMailMergeDataSource implementation, which can be implemented in a secure and efficient way, without using reflection.

Implementing the IMailMergeDataSource interface is necessary when your original data source is not a standard .NET object (for example, the object is implemented as a property bag, without standard .NET properties) or is not a standard .NET sequence (it is implemented without the IEnumerable interface).
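The wrapping approach described above could look roughly like the following sketch. It assumes that IMailMergeDataSource lives in the GemBox.Document.MailMerging namespace and declares a Name property, a MoveNext() method and a TryGetValue(string, out object) method; check these against the API reference before relying on them. PropertyBagDataSource and its constructor parameters are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using GemBox.Document;
using GemBox.Document.MailMerging; // Assumed namespace of IMailMergeDataSource.

// Build a template with a single merge field and merge one property-bag record into it.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");
var document = new DocumentModel();
document.Sections.Add(new Section(document,
    new Paragraph(document, new Field(document, FieldType.MergeField, "FullName"))));

var records = new List<IDictionary<string, object>>
{
    // Case-insensitive keys, chosen here to match the engine's case-insensitive name comparison.
    new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase) { ["FullName"] = "John Doe" }
};

// Range name is explicitly null, so the whole document is the mail merge range.
document.MailMerge.Execute(new PropertyBagDataSource("People", records), null);
document.Save("Document.docx");

// Hypothetical wrapper that exposes a list of property bags as a mail merge data source.
public sealed class PropertyBagDataSource : IMailMergeDataSource
{
    private readonly IReadOnlyList<IDictionary<string, object>> records;
    private int index = -1; // Positioned before the first record.

    public PropertyBagDataSource(string name, IReadOnlyList<IDictionary<string, object>> records)
    {
        this.Name = name;
        this.records = records;
    }

    // Name of this data source (assumed to be used by the engine to resolve the range name).
    public string Name { get; }

    // Advances to the next record; returns false when no records are left.
    public bool MoveNext() => ++this.index < this.records.Count;

    // Looks up a value by name in the current record, without any reflection.
    public bool TryGetValue(string valueName, out object value)
    {
        value = null;
        return this.index >= 0
            && this.index < this.records.Count
            && this.records[this.index].TryGetValue(valueName, out value);
    }
}
```

Whether the engine normalizes the casing of names before calling TryGetValue is not stated here, which is why the sketch uses a case-insensitive dictionary.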

Mail merge process

Before the mail merge process can start, the mail merge range has to be found.

The mail merge range is the part of the document in which the mail merge algorithm searches for merge field instances and replaces them with actual data from the data source.

If the data source contains more than one item, the original mail merge range (the one that contains merge field instances) is cloned and appended to the document just after the previously processed mail merge range, and the mail merge algorithm then merges the cloned range with data from the next item in the data source. This process is repeated for every item in the data source.

The mail merge process can be initiated with the following method overloads: Execute(Object) and Execute(Object, String).

A mail merge range is identified by its range name. If the Execute(Object, String) method overload is used, the range name is explicitly specified in the rangeName parameter; otherwise, the range name is resolved as described in the following note.

Note

When using the Execute(Object) method overload, the range name is determined from the data source in the following way:

Range name is null or empty

If the range name is resolved to a null or String.Empty value, then the mail merge range is the whole document content - all sections in the document.

This is best illustrated with the following example.

Let the range name be explicitly set to a null value by using the Execute(Object, String) method overload and let the data source be the following DataTable (this data source will also be used in the subsequent examples):

TableName: People

Name    Surname
John    Doe
Fred    Nurk
Hans    Meier

The following image shows the structure, in an XML-like format, of the template document and the structure of the document resulting from mail merging the data source into the template document when the range name is resolved to a null or String.Empty value:

Structure of template document and mail merged document when range name is null or empty.

The rounded, black-bordered rectangle shows the mail merge range in the template document and, in the mail merged document, how that range was expanded by cloning it, appending the clones and filling them with data.
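As a rough code counterpart to the image, the following sketch builds a small template in memory instead of loading one, creates the People DataTable shown above and executes the mail merge with a null range name. The template layout (one paragraph with the Name and Surname merge fields) and the output file name are assumptions made for this illustration, not the exact template from the image.

```csharp
using System.Data;
using GemBox.Document;

// Free limited mode of GemBox.Document.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");

// Template: one section with one paragraph that holds two merge fields.
var document = new DocumentModel();
document.Sections.Add(new Section(document,
    new Paragraph(document,
        new Field(document, FieldType.MergeField, "Name"),
        new Run(document, " "),
        new Field(document, FieldType.MergeField, "Surname"))));

// Data source: the 'People' table with three rows.
var people = new DataTable("People");
people.Columns.Add("Name");
people.Columns.Add("Surname");
people.Rows.Add("John", "Doe");
people.Rows.Add("Fred", "Nurk");
people.Rows.Add("Hans", "Meier");

// Range name is explicitly null, so the whole document content is the mail merge range
// and it is cloned and filled once per row.
document.MailMerge.Execute(people, null);

document.Save("People.docx");
```

Because the mail merge range is all sections of the document, the merged document is expected to contain one copy of the template section for each of the three rows.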

Range name neither null nor empty

If the range name is neither null nor a String.Empty value, then the mail merge range is determined by the merge field instances with names 'RangeStart:rangeName' and 'RangeEnd:rangeName'.

Merge fields which represent the beginning and end of the mail merge range are removed from the resulting mail merge range, but the mail merge range does not necessarily start and end where these fields were positioned, as the following example shows.

The following image shows the structure, in an XML-like format, of the template document and the structure of the documents resulting from mail merging the data source into the template document when the range name is neither null nor String.Empty, with different positions of the range end field:

Structure of template document and mail merged documents when range name is neither null nor empty, and with different range end field positioning.

The rounded, black-bordered rectangle shows the mail merge range in the template document in two cases: with the range end field positioned in the next paragraph, and with it positioned in the same paragraph as the rest of the merge fields. The mail merged documents show how the range was expanded by cloning it, appending the clones and filling them with data.

Notice that the merge fields which represent the beginning and end of the mail merge range have a rangeName value of People, which equals the name of the data source introduced in the previous section and reused in this example.

The mail merge range depends on the positioning of the merge fields which represent its beginning and end.

If the range start field and the range end field are contained in the same paragraph, then the mail merge range is the collection of Inline elements that are contained between these two fields.

If the range start field and the range end field are contained in different paragraphs, then the mail merge range is the collection of Block elements that are contained between the parent paragraphs of these two fields. A parent paragraph is included in the mail merge range if it contains any Inline element other than the range start field or range end field; otherwise it is removed from the mail merge range.
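To make the same-paragraph case concrete, here is a minimal sketch that places the range start and range end fields in a single paragraph, so that only the Inline elements between them form the mail merge range. The template is built in memory and the "; " separator and output file name are assumptions chosen for this illustration.

```csharp
using System.Data;
using GemBox.Document;

// Free limited mode of GemBox.Document.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");

// Template: range start and range end fields share one paragraph,
// so the range is the sequence of inlines between them.
var document = new DocumentModel();
document.Sections.Add(new Section(document,
    new Paragraph(document,
        new Field(document, FieldType.MergeField, "RangeStart:People"),
        new Field(document, FieldType.MergeField, "Name"),
        new Run(document, " "),
        new Field(document, FieldType.MergeField, "Surname"),
        new Run(document, "; "),
        new Field(document, FieldType.MergeField, "RangeEnd:People"))));

// Data source: the 'People' table used throughout this article.
var people = new DataTable("People");
people.Columns.Add("Name");
people.Columns.Add("Surname");
people.Rows.Add("John", "Doe");
people.Rows.Add("Fred", "Nurk");
people.Rows.Add("Hans", "Meier");

// Range name is given explicitly and matches the RangeStart:/RangeEnd: field names.
document.MailMerge.Execute(people, "People");

document.Save("PeopleInline.docx");
```

With this layout the merged names should end up one after another inside the same paragraph. Moving the range end field into a following paragraph would instead repeat whole Block elements, as described above.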

Nested mail merge

Nested mail merge is a powerful feature that enables you to import a relational or hierarchical data source into the template document in a single statement.

A relational data source is, for example, a DataTable that has a DataRelation defined to some other DataTable. Rows from the DataRelation.ParentTable are called parent rows, and for each parent row there exist zero or more child rows from the DataRelation.ChildTable that are related to the parent row as specified in the DataRelation.

A hierarchical data source is any .NET object which has at least one property that contains other objects. Objects contained in the property value are called child objects and the object which has the property is called a parent object.

The following example shows how nested mail merge works with both a relational and a hierarchical data source.

Let the template document used in nested mail merge have the content shown in the following image:

TemplateDocument.docx
Tip

TemplateDocument.docx shows the default merge field results (surrounded by « and ») that Microsoft Word assigned to the merge fields when they were created. Press Alt + F9 to toggle field codes.

Nested mail merge with relational data source

The following code shows how to load a template document, create a relational data source, execute nested mail merge with it and save the resulting document to a file:

// Load a template document from the file. 
var document = DocumentModel.Load("TemplateDocument.docx", LoadOptions.DocxDefault);

// Create DataSet with two DataTables and one DataRelation. 
// DataTable 'Companies' has columns 'Id' and 'Name'. 
// DataTable 'Employees' has columns 'CompanyId', 'Name' and 'Surname'. 
// DataRelation 'CompanyEmployees' has parent column 'Id' from 'Companies' table and child column 'CompanyId' from 'Employees' table.
DataColumn parentColumn = new DataColumn("Id", typeof(int)), childColumn = new DataColumn("CompanyId", typeof(int));

var companies = new DataTable("Companies");
companies.Columns.Add(parentColumn);
companies.Columns.Add(new DataColumn("Name"));
companies.Rows.Add(0, "GemBox Software");
companies.Rows.Add(1, "ACME");

var employees = new DataTable("Employees");
employees.Columns.Add(childColumn);
employees.Columns.Add(new DataColumn("Name"));
employees.Columns.Add(new DataColumn("Surname"));
employees.Rows.Add(0, "John", "Doe");
employees.Rows.Add(0, "Fred", "Nurk");
employees.Rows.Add(1, "Hans", "Meier");

var dataSet = new DataSet("CompaniesEmployees");
dataSet.Tables.Add(companies);
dataSet.Tables.Add(employees);
dataSet.Relations.Add(new DataRelation("CompanyEmployees", parentColumn, childColumn));

// Execute mail merge. We have to explicitly set range name to null because DataSet.DataSetName cannot be null or empty. 
// Child 'Employee' rows will be automatically imported below the appropriate parent 'Company' row 
// because range name 'CompanyEmployees' is defined as a DataRelation between these two sets of rows.
document.MailMerge.Execute(dataSet, null);
// The following statement can also be used. 
// document.MailMerge.Execute(dataSet.Tables["Companies"]); 

// Save the resulting mail merged document to a file.
document.Save("Document.docx");

Nested mail merge with hierarchical data source

The following code shows the type definitions used in nested mail merge with a hierarchical data source:

// Types used to define a hierarchical data source. 
// Type 'Company' has a property 'CompanyEmployees' that contains a sequence of 'Employee' objects. 
public class Company
{
    public string Name { get; set; }
    public IList<Employee> CompanyEmployees { get; set; }
}

public class Employee
{
    public string Name { get; set; }
    public string Surname { get; set; }
}

The following code shows how to load a template document, create a hierarchical data source, execute nested mail merge with it and save the resulting document to a file:

// Load a template document from the file. 
var document = DocumentModel.Load("TemplateDocument.docx", LoadOptions.DocxDefault);

// Create an array of Company objects. 
// Each Company object contains a sequence of Employee objects in its 'CompanyEmployees' property. 
var companies = new Company[]
{
    new Company()
    {
        Name = "GemBox Software",
        CompanyEmployees = new List<Employee>()
        {
            new Employee() { Name = "John", Surname = "Doe" },
            new Employee() { Name = "Fred", Surname = "Nurk" }
        }
    },
    new Company()
    {
        Name = "ACME",
        CompanyEmployees = new List<Employee>()
        {
            new Employee() { Name = "Hans", Surname = "Meier" }
        }
    }
};

// Execute mail merge. We have to explicitly set range name to 'Companies' because range name cannot be specified using the array. 
// Child 'Employee' objects will be automatically imported below the appropriate parent 'Company' object 
// because range name 'CompanyEmployees' is defined as a property in the 'Company' type.
document.MailMerge.Execute(companies, "Companies");

// Save the resulting mail merged document to a file.
document.Save("Document.docx");

The resulting mail merged document is the same for both the relational and the hierarchical data source and is shown in the following image:

Document.docx

Nested mail merge also works with a custom implementation of the IMailMergeDataSource interface. When a nested pair of RangeStart:nestedRangeName and RangeEnd:nestedRangeName fields is encountered, child records for that nested range are requested from the IMailMergeDataSource by calling the IMailMergeDataSource.TryGetValue(String, Object) method and passing nestedRangeName as the valueName parameter value.

Mail merge formatting

Mail merge respects the CharacterFormat.Language that is specified on, or resolved from, Field.CharacterFormat. This language is used for formatting the values of fields which have the date/time formatting field switch \@ or the numeric formatting field switch \# in their instruction text.

Tip

To view or add field formatting switches with Microsoft Word, press Alt + F9 to toggle field codes.

For example, the following Word document field code { MERGEFIELD Date \@ "yyyy-MM-dd" } represents a merge field with name Date and the date/time formatting switch \@ with argument yyyy-MM-dd.

Tip

The date/time formatting field switch \@ supports all Standard Date and Time Format Strings and Custom Date and Time Format Strings.

The numeric formatting field switch \# supports all Standard Numeric Format Strings and Custom Numeric Format Strings.

If the date/time formatting field switch \@ or the numeric formatting field switch \# is present in the field's instruction text, the mail merge formatting process uses the IFormattable interface to format the value; otherwise, the IConvertible interface is used.
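The rule above can be illustrated with plain .NET code, independent of GemBox.Document. The FormatMergeValue function below is a hypothetical helper written only for this illustration; it is not the library's actual implementation. It formats a value through IFormattable when a switch argument is present and falls back to IConvertible otherwise, using a culture that stands in for the language resolved from the field.

```csharp
using System;
using System.Globalization;

// Hypothetical helper mirroring the described rule:
// switch argument present -> IFormattable, otherwise -> IConvertible.
static string FormatMergeValue(object value, string switchArgument, CultureInfo culture)
{
    if (switchArgument != null && value is IFormattable formattable)
        return formattable.ToString(switchArgument, culture);

    if (value is IConvertible convertible)
        return convertible.ToString(culture);

    return value?.ToString() ?? string.Empty;
}

var invariant = CultureInfo.InvariantCulture;

// Date/time switch argument, as in { MERGEFIELD Date \@ "yyyy-MM-dd" }.
Console.WriteLine(FormatMergeValue(new DateTime(2024, 3, 5), "yyyy-MM-dd", invariant)); // 2024-03-05

// Numeric switch argument using a custom numeric format string.
Console.WriteLine(FormatMergeValue(1234.5, "#,##0.00", invariant)); // 1,234.50

// No switch argument: the value is converted through IConvertible.
Console.WriteLine(FormatMergeValue(42, null, invariant)); // 42
```

Passing a different culture, for example CultureInfo.GetCultureInfo("de-DE"), changes culture-sensitive output such as decimal and group separators, which is the role CharacterFormat.Language plays for merge fields.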

Mail merge options

Mail merge functionality is exposed through the MailMerge type, which can be customized further by changing or using the following members:

  • ClearOptions - specifies whether the mail merge process should remove merge fields for which no data was found in the mail merge data source, as well as ranges, paragraphs and table rows that contained merge fields of which none was merged. See mail merge clear options sample.

  • FieldMerging - event that allows you to intercept every mail merge operation and customize the data importation process.

  • FieldMappings - used when a merge field name and a data source property / column name are different, but they should still be merged.

    Note

    Comparisons of merge field names and data source property / column names in mail merging are case-insensitive, regardless of whether you use FieldMappings or not.

    So, for example, a merge field named fULLnAME will be successfully replaced with the value of a data source property / column named FullName.

  • RangeStartPrefix and RangeEndPrefix - by default, GemBox.Document expects the merge fields that represent the beginning and end of a mail merge range to start with RangeStart: and RangeEnd:. If you have already used another document processing component for mail merge and that component required different prefixes, you can continue using them. You don't need to change your template document; just inform the GemBox.Document mail merge engine about the prefixes by changing these properties.

  • RemoveMergeFields - removes all merge fields or mail merge related fields from the document.

    Note

    Besides the MergeField field, the GemBox.Document mail merge engine also supports the following mail merge related fields:

    • Next - used to move to the next record in the data source.

    • MergeRec - used to print the number of the corresponding merged data record.

    • MergeSeq - used to print the number of data records which have been successfully merged.

  • GetMergeFieldNames - used for diagnostics, to retrieve all field names in the document.
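The following sketch shows how several of the members listed above might be combined. It assumes that the MailMergeClearOptions enumeration has a RemoveUnusedFields member, that the FieldMerging event arguments expose IsValueFound, FieldName, Value, Inline and Document, and that these types live in the GemBox.Document.MailMerging namespace; treat those details as assumptions to verify against the API reference. The Nickname field and the "TableStart:" / "TableEnd:" prefixes are made up for this illustration.

```csharp
using System;
using GemBox.Document;
using GemBox.Document.MailMerging; // Assumed namespace of MailMergeClearOptions.

// Free limited mode of GemBox.Document.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");

// Template with one oddly cased merge field and one field that has no matching data.
var document = new DocumentModel();
document.Sections.Add(new Section(document,
    new Paragraph(document,
        new Field(document, FieldType.MergeField, "fULLnAME"),
        new Run(document, " "),
        new Field(document, FieldType.MergeField, "Nickname"))));

// Diagnostics: list the merge field names found in the document.
foreach (var name in document.MailMerge.GetMergeFieldNames())
    Console.WriteLine(name);

// Remove merge fields for which the data source has no value (here: Nickname).
document.MailMerge.ClearOptions = MailMergeClearOptions.RemoveUnusedFields;

// Templates made for another component can keep their own range prefixes.
document.MailMerge.RangeStartPrefix = "TableStart:";
document.MailMerge.RangeEndPrefix = "TableEnd:";

// Intercept each merge operation and customize the imported content.
document.MailMerge.FieldMerging += (sender, e) =>
{
    if (e.IsValueFound && string.Equals(e.FieldName, "FullName", StringComparison.OrdinalIgnoreCase))
        e.Inline = new Run(e.Document, Convert.ToString(e.Value).ToUpperInvariant());
};

// Name comparison is case-insensitive, so fULLnAME is merged with FullName.
document.MailMerge.Execute(new { FullName = "John Doe" });

document.Save("Options.docx");
```

The handler replaces the merged value with an upper-cased Run only when a value was found, so fields without data are left to ClearOptions.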