Completed
Last Updated: 27 May 2021 11:28 by ADMIN
Release LIB 2021.2.531 (31/05/2021)
Manu
Created on: 03 Apr 2020 09:25
Category: PdfProcessing
Type: Bug Report
PdfProcessing: TextFormatProvider: Text exported with wrong encoding when Simple Font with Differences array

When the document contains a simple font with a predefined encoding and a Differences array, but no ToUnicode mapping, the text should be extracted using the following algorithm:

  • Map the character code to a character name according to the font’s Differences array.
  • Look up the character name in the Adobe Glyph List to obtain the corresponding Unicode value. 
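
The two steps above can be illustrated with a minimal Python sketch. It is not the PdfProcessing implementation: the function names (`apply_differences`, `decode`), the tiny `AGL_SUBSET` table, and the sample base encoding are all assumptions made for illustration. It only assumes the standard layout of a Differences array, where an integer sets the next character code and each following glyph name is assigned to consecutive codes.

```python
# Hypothetical excerpt of the Adobe Glyph List (glyph name -> Unicode value).
# The real list contains thousands of entries.
AGL_SUBSET = {
    "A": 0x0041,
    "B": 0x0042,
    "space": 0x0020,
    "eacute": 0x00E9,
    "Euro": 0x20AC,
    "bullet": 0x2022,
}


def apply_differences(base_encoding, differences):
    """Return a code -> glyph name map: the base encoding with Differences applied.

    An integer in `differences` sets the next character code; each glyph name
    that follows is assigned to that code, and the code is then incremented.
    """
    encoding = dict(base_encoding)
    code = None
    for item in differences:
        if isinstance(item, int):
            code = item
        else:
            if code is None:
                raise ValueError("Differences array must start with a code")
            encoding[code] = item
            code += 1
    return encoding


def decode(data, encoding, glyph_list):
    """Map each byte to a glyph name, then to Unicode via the glyph list."""
    chars = []
    for code in data:
        name = encoding.get(code)          # step 1: code -> glyph name
        value = glyph_list.get(name)       # step 2: glyph name -> Unicode
        # Unknown codes or names become U+FFFD instead of a guessed character.
        chars.append(chr(value) if value is not None else "\ufffd")
    return "".join(chars)


# Assumed sample data: a small base encoding and a Differences array
# that remaps codes 1, 2 and 128.
base = {65: "A", 66: "B", 32: "space"}
differences = [1, "eacute", "Euro", 128, "bullet"]

encoding = apply_differences(base, differences)
text = decode(bytes([65, 1, 2, 32, 128, 66]), encoding, AGL_SUBSET)
```

In this sketch, codes 1 and 2 resolve to `eacute` and `Euro` through the Differences array instead of being read as raw byte values, which is the kind of mapping the report describes as missing.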

Currently, the PdfProcessing library does not map the character code correctly, which results in incorrectly encoded text in the exported content.
