New GemBox.Document 2.5 Release with PDF read support
We have just released new setup, help, and examples for GemBox.Document 2.5, which adds support for reading PDF files.
GemBox.Document currently supports reading PDF files that contain text in paragraphs and/or tables by trying to recognize the logical structure of the document from the content of PDF pages. The output is not high-fidelity, and its quality depends on the complexity of the PDF page content, but this approach has the following advantages:
- The logical structure of the document (sections, paragraphs, tables) is available.
- Text search is fully supported.
- Editing a document is fully supported.
Note that we plan to progressively improve recognition of logical structure and add support for new features, such as pictures, form fields, etc.
For more information about PDF reading with GemBox.Document, see our help page: Support level for reading PDF format.
For an example showing the results obtained from reading a PDF file and extracting text from it, see the Read and Extract PDF Text in C# and VB.NET example.
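As a rough illustration of what reading a PDF file looks like in code, here is a minimal C# sketch. It is not the linked example itself: the file names "Input.pdf" and "Output.docx" are placeholders, and it assumes that DocumentModel.Load picks the PDF reader from the file extension and that the free-mode license key is acceptable for your document size.

```csharp
using System;
using System.Linq;
using GemBox.Document;

class Program
{
    static void Main()
    {
        // Free-mode key (assumption); replace it with your own license key if you have one.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // "Input.pdf" is a placeholder path. The PDF reader is assumed to be
        // selected from the ".pdf" extension.
        DocumentModel document = DocumentModel.Load("Input.pdf");

        // Count the paragraphs and tables that were recognized on the PDF pages.
        int paragraphCount = document.GetChildElements(true, ElementType.Paragraph).Count();
        int tableCount = document.GetChildElements(true, ElementType.Table).Count();
        Console.WriteLine("Paragraphs: " + paragraphCount + ", tables: " + tableCount);

        // Extract the text of the whole document.
        Console.WriteLine(document.Content.ToString());

        // Because the content is loaded into the regular document model,
        // it can be edited and saved to another format, for example DOCX.
        document.Save("Output.docx");
    }
}
```

How many paragraphs and tables are found depends on how well the logical structure of a given PDF file can be recognized, so the counts may differ from what the PDF looks like visually.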
Other notable improvements included in this release are:
- Added support for updating the TOC, as well as table, column, and cell widths, while paginating the document by using the DocumentModel.GetPaginator method with PaginatorOptions.
- Added support for text box formatting via the TextBox.TextBoxFormat property.
- Added support for default tab stops via the DocumentSettings.DefaultTabStop property.
- Added support for specifying DPI resolution when saving to image formats via the ImageSaveOptions.DpiX and ImageSaveOptions.DpiY properties.
- Added support for specifying a document name when printing a document via the PrintOptions.DocumentName property.
- Various other enhancements and fixes in the GemBox.Document API and file format readers/writers. For more details, see the GemBox.Document bug fixes page.
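To give an idea of how one of these additions might be used, the following C# sketch saves a document as a PNG image at a chosen DPI resolution. Treat it as an illustration under assumptions rather than a reference: the file names and the value of 300 DPI are placeholders, and the ImageSaveOptions constructor that takes an ImageSaveFormat value is assumed to be available in this version.

```csharp
using GemBox.Document;

class Program
{
    static void Main()
    {
        // Free-mode key (assumption); replace it with your own license key if you have one.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // "Input.docx" is a placeholder path.
        DocumentModel document = DocumentModel.Load("Input.docx");

        // Request a 300 DPI image in both directions; the value is only an example.
        ImageSaveOptions options = new ImageSaveOptions(ImageSaveFormat.Png)
        {
            DpiX = 300,
            DpiY = 300
        };

        // "Output.png" is a placeholder path.
        document.Save("Output.png", options);
    }
}
```

A higher DPI value produces a larger image with more detail, which is mostly useful when the image is intended for printing.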