When merging or splitting files that contain the "159 '\u009f'" char, ArgumentException("The encoding is not supported.") is thrown.
This issue can also result in ArgumentException in client applications with the message: currentIndirectObject should be null.
Workaround:
Use PdfFormatProvider or reflection to register the char (check the comments).
I was working on some acroforms and wherever I needed a space I kept on getting a different letter (Ê). On debugging the problem seems to be in the TryGetCharCode function of OpenTypeFontSource
Version: 2019.3.917
Below is copy of function
public override bool TryGetCharCode(int unicode, out int charCode)
{
bool result = false;
ushort glyphId;
charCode = CMap.MISSING_GLYPH_ID;
if (this.TryGetGlyphId(unicode, out glyphId))
{
ushort uCharCode = CMap.MISSING_GLYPH_ID;
CMapTable table = this.CMap.GetCMapTable(3, 0);
if (table != null)
{
result = table.TryGetCharId(glyphId, out uCharCode);
}
table = this.CMap.GetCMapTable(1, 0);
if (table != null)
{
result = table.TryGetCharId(glyphId, out uCharCode);
}
charCode = uCharCode;
return result;
}
return false;
}
The font used has 2 cmap tables: one with platformid of 3 and encodingid of 1, the other with platformid 1 and encodingid 0. According to https://docs.microsoft.com/en-us/typography/opentype/spec/cmap platform id 3 and encoding 1 is correct for windows so not sure why the first call to getcmaptable looks for encodingid 0.
Second of all even if I change it to the following
CMapTable table = this.CMap.GetCMapTable(3, 1);
if there is a second cmap with platformid of 1 whatever the result of the call with regards to platformid 3, the result will be overridden
I can say that if I add some checks so that if the first call succeeds it doesn't attempt the 2nd I do get the expected behaviour in the pdf and get spaces
As a note the FontFamily property of the OpenTypeFontSource is "Arial"
When a TrueType font is defined, the mapping of character codes to glyph indices depends on the built-in cmap table mappings defined in the font and the Encoding property defined in the PDF dictionary.
However, the current implementation maps all characters with cmap tables for Microsoft Symbolic and Macintosh Roman, which causes incorrect mapping results, e.g. space characters are mapped to an Ê glyph.
The issue is also described in the following public item: TryGetCharCode for OpenTypeFont uses wrong cmap and returns wrong charcode.
Workaround: Change the font of the TextBoxField's widget appearance:
foreach (var widget in field.Widgets)
{
widget.TextProperties.Font = FontsRepository.Helvetica;
}