Completed
Last Updated: 25 Mar 2020 15:08 by ADMIN
Release LIB 2020.1.316 (03/16/2020)

I am using the trial version of Telerik for Xamarin for .NET Core, which was released last year, to convert PDF to text. Our service is hosted in Azure. In certain cases the spaces are missing from the text I get back (for example, instead of 'I [am] here', it returns 'I[am]here'). This happens randomly.

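The code we use is as follows:

using System;
using Telerik.Windows.Documents.Fixed.FormatProviders.Pdf;
using Telerik.Windows.Documents.Fixed.FormatProviders.Text;
using Telerik.Windows.Documents.Fixed.Model;

byte[] pdfBinary = Convert.FromBase64String(inputString);

// Lines are separated by "\r\n" and pages by a single space.
TextFormatProviderSettings textFormatProviderSettings = new TextFormatProviderSettings("\r\n", " ");

var textFormatProvider = new TextFormatProvider();
var pdfFormatProvider = new PdfFormatProvider();

RadFixedDocument document = pdfFormatProvider.Import(pdfBinary);
string result = textFormatProvider.Export(document, textFormatProviderSettings);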
Completed
Last Updated: 18 Mar 2020 06:48 by ADMIN
Release LIB 2020.1.316 (03/16/2020)
By specification, the destination string can be up to 512 bytes long. However, the current implementation supports strings of up to four bytes only, which leads to: ArgumentOutOfRangeException: 'bytes should be less or equal than 4. Parameter name: bytes'.
Unplanned
Last Updated: 02 Mar 2020 13:11 by ADMIN
When importing a document with predefined ToUnicode CMaps (e.g. Identity-H), an InvalidCastException is thrown with the message:

Unable to cast object of type 'Telerik.Windows.Documents.Fixed.FormatProviders.Pdf.Model.Types.PdfName' to type 'Telerik.Windows.Documents.Fixed.FormatProviders.Pdf.Model.Elements.CMaps.ToUnicodeCMap'.
Unplanned
Last Updated: 28 Feb 2020 09:23 by ADMIN
The Form object does not inherit the graphics state of the page
Unplanned
Last Updated: 26 Feb 2020 11:08 by ADMIN
Unplanned
Last Updated: 13 Feb 2020 14:47 by ADMIN
A KeyNotFoundException is thrown when trying to open a PDF containing a specific Type1 font.
Unplanned
Last Updated: 06 Feb 2020 15:14 by ADMIN
The values of rotated widgets on a rotated page are invisible after exporting them. The value can be seen only while editing a field.
Completed
Last Updated: 06 Feb 2020 11:00 by ADMIN
Release LIB 2020.1.210 (02/10/2020)
RadPdfProcessing: Exception when parsing a Tiling pattern with a non-RGB color.
Completed
Last Updated: 06 Feb 2020 10:21 by ADMIN
Release LIB 2020.1.210 (02/10/2020)
If ImageInline's size is set through its Size property and not through Height or Width, the given size is not respected on export.
Completed
Last Updated: 06 Feb 2020 05:29 by ADMIN
Release LIB 2020.1.210 (02/10/2020)
When a destination object in a RadFixedDocument does not have a page set (it is null), an exception is thrown. However, this is supported by Adobe and is not forbidden by the PDF specification.
Completed
Last Updated: 21 Jan 2020 14:56 by ADMIN
Release LIB 2020.1.127 (01/27/2020)
WinAnsiEncoding is imported as StandardEncoding because WinAnsiEncoding is not yet implemented in RadPdfProcessing.
Unplanned
Last Updated: 16 Jan 2020 12:14 by ADMIN
Add support for documents with an invalid stream cross-reference table
Unplanned
Last Updated: 13 Dec 2019 06:50 by ADMIN
An exception is thrown when the endstream keyword is not on a new line.
Unplanned
Last Updated: 09 Dec 2019 11:52 by ADMIN

When merging files that contain the character with code 159 ('\u009f'), an ArgumentException("The encoding is not supported.") is thrown.

Workaround:

Use PdfFormatProvider.

Unplanned
Last Updated: 28 Nov 2019 08:07 by ADMIN
When bookmarks are declared as objects instead of strings, they are not visible.
Completed
Last Updated: 25 Nov 2019 07:52 by ADMIN
Release 2019.3.1125 (11/25/2019)
An exception is thrown with the message "Password is not correct" even when the user password is correct. This issue occurs only for specific encryption algorithm parameters.
Unplanned
Last Updated: 22 Nov 2019 08:08 by ADMIN
By specification, InteractiveForm fields can have the same name if they are descendants of a common ancestor. Such fields are different representations of the same underlying field; they should differ only in properties that specify their visual appearance. When such a document is imported, an ArgumentException: 'An item with the same key has already been added.' is thrown.
Completed
Last Updated: 05 Nov 2019 12:03 by ADMIN
Release LIB 2019.3.1111 (11/11/2019)
If the font family name is defined in a language other than English, the font is not applied to the content. This can also affect performance, as the font is read but not registered in the FontsRepository.
Completed
Last Updated: 17 Oct 2019 05:57 by ADMIN
Release R3 2019 SP1
By specification, the widget annotation and its content can be merged into the field dictionary. When a merged widget with a Kids property is imported, an InvalidCastException: 'Unable to cast object of type 'Telerik.Windows.Documents.Fixed.FormatProviders.Pdf.Model.Elements.Annotations.WidgetObject' to type 'Telerik.Windows.Documents.Fixed.FormatProviders.Pdf.Model.Elements.Forms.FormFieldNode'.' is thrown.
Declined
Last Updated: 15 Oct 2019 17:38 by ADMIN

I was working on some AcroForms, and wherever I needed a space I kept getting a different letter (Ê). On debugging, the problem seems to be in the TryGetCharCode function of OpenTypeFontSource.

Version: 2019.3.917

Below is a copy of the function:

public override bool TryGetCharCode(int unicode, out int charCode)
{
    bool result = false;
    ushort glyphId;
    charCode = CMap.MISSING_GLYPH_ID;
    if (this.TryGetGlyphId(unicode, out glyphId))
    {
        ushort uCharCode = CMap.MISSING_GLYPH_ID;
        CMapTable table = this.CMap.GetCMapTable(3, 0);
        if (table != null)
        {
            result = table.TryGetCharId(glyphId, out uCharCode);
        }

        table = this.CMap.GetCMapTable(1, 0);
        if (table != null)
        {
            result = table.TryGetCharId(glyphId, out uCharCode);
        }

        charCode = uCharCode;

        return result;
    }

    return false;
}

The font used has two cmap tables: one with platform ID 3 and encoding ID 1, the other with platform ID 1 and encoding ID 0. According to https://docs.microsoft.com/en-us/typography/opentype/spec/cmap, platform ID 3 with encoding ID 1 is correct for Windows, so I am not sure why the first call to GetCMapTable looks for encoding ID 0.

Second, even if I change it to the following:

CMapTable table = this.CMap.GetCMapTable(3, 1);

the result of the lookup for platform ID 3 is still overwritten whenever a second cmap table with platform ID 1 exists.

If I add checks so that the second lookup is not attempted when the first one succeeds, I get the expected behaviour in the PDF and the spaces appear.

 

As a note, the FontFamily property of the OpenTypeFontSource is "Arial".
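
The "stop at the first successful lookup" check described above could look roughly like the sketch below. It is only an illustration under stated assumptions: CMapTableSketch and CMapLookupSketch are hypothetical stand-ins written for this post, not the Telerik CMapTable or OpenTypeFontSource types, and the glyph and character values in the usage note are made up.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a cmap subtable (not the Telerik CMapTable type).
// It only models the reverse lookup from glyph id to character code.
public sealed class CMapTableSketch
{
    private readonly Dictionary<ushort, ushort> glyphToChar;

    public CMapTableSketch(Dictionary<ushort, ushort> glyphToChar)
    {
        this.glyphToChar = glyphToChar;
    }

    public bool TryGetCharId(ushort glyphId, out ushort charId)
    {
        return this.glyphToChar.TryGetValue(glyphId, out charId);
    }
}

public static class CMapLookupSketch
{
    // Assumed placeholder for "no character found".
    public const ushort MissingGlyphId = 0;

    // Tries the tables in the order given and returns on the first hit,
    // so a later table can never overwrite an earlier successful result.
    // Null entries model tables that the font does not contain.
    public static bool TryGetCharCode(
        ushort glyphId,
        out int charCode,
        params CMapTableSketch[] tablesInPriorityOrder)
    {
        charCode = MissingGlyphId;

        foreach (CMapTableSketch table in tablesInPriorityOrder)
        {
            if (table != null && table.TryGetCharId(glyphId, out ushort candidate))
            {
                charCode = candidate;
                return true;
            }
        }

        return false;
    }
}
```

With an illustrative Windows table mapping glyph 3 to 32 (space) and an illustrative Macintosh table mapping the same glyph to 202, passing the Windows table first yields 32, and the Macintosh table is consulted only when the Windows table is missing or has no entry for the glyph.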