PdfProcessing: TextFormatProvider: Text exported with wrong encoding when Simple Font with Differences array

Completed

Last Updated: 27 May 2021 11:28 by ADMIN

Release LIB 2021.2.531 (31/05/2021)

Manu

Created on: 03 Apr 2020 09:25

Category: PdfProcessing

Type: Bug Report

When the document contains a Simple Font with a predefined encoding and no ToUnicode mapping, the text should be extracted using the following algorithm:

1. Map the character code to a character name according to the font’s Differences array.
2. Look up the character name in the Adobe Glyph List to obtain the corresponding Unicode value.
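
The two steps can be illustrated with a minimal Python sketch. This is not the PdfProcessing API: the names `AGL_EXCERPT`, `parse_differences`, and `decode_simple_font_text` are hypothetical, the glyph table is only a tiny excerpt of the Adobe Glyph List, and the base encoding is passed in as a plain dictionary for brevity. It assumes the Differences array follows the PDF convention in which an integer code is followed by one or more glyph names assigned to consecutive codes.

```python
# Tiny excerpt of the Adobe Glyph List (glyph name -> Unicode code point).
# A real implementation would load the complete list.
AGL_EXCERPT = {
    "A": 0x0041,
    "B": 0x0042,
    "space": 0x0020,
    "eacute": 0x00E9,
    "Euro": 0x20AC,
    "fi": 0xFB01,
}


def parse_differences(differences):
    """Expand a Differences array into a {code: glyph_name} dictionary.

    An integer starts a run; each glyph name that follows is assigned
    to the next consecutive character code.
    """
    mapping = {}
    code = None
    for item in differences:
        if isinstance(item, int):
            code = item
        else:
            if code is None:
                raise ValueError("Differences array must start with a code")
            mapping[code] = item
            code += 1
    return mapping


def decode_simple_font_text(char_codes, base_encoding, differences):
    """Decode character codes to text when no ToUnicode mapping exists."""
    # Step 1: code -> glyph name; Differences entries override the base encoding.
    names = dict(base_encoding)
    names.update(parse_differences(differences))

    chars = []
    for code in char_codes:
        name = names.get(code)
        # Step 2: glyph name -> Unicode value via the glyph list.
        code_point = AGL_EXCERPT.get(name)
        # Unknown codes or names become U+FFFD (replacement character).
        chars.append(chr(code_point) if code_point is not None else "\ufffd")
    return "".join(chars)


# Hypothetical base encoding and Differences array, for illustration only.
base = {65: "A", 66: "B", 32: "space"}
diffs = [1, "eacute", "Euro", 65, "fi"]
text = decode_simple_font_text([1, 2, 65, 66, 32], base, diffs)
```

In this example code 65 resolves to the `fi` ligature rather than `A`, because the Differences array overrides the base encoding for that code. Skipping the override is the kind of mistake that would produce wrongly encoded text.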

Currently, the PdfProcessing library doesn't map the character code properly, which leads to wrongly encoded text content.
