When a document uses a two-byte, little-endian encoding, the CsQuery HTML parser returns the entire document as the FirstChild of the <body> tag, which leads to an incorrect import. This appears to occur with MS Outlook messages saved as HTML. Workaround: convert the file to big-endian encoding before importing.
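The workaround can be scripted. The following is a minimal sketch, assuming that the "two-byte encoding" in question is UTF-16 (the note does not name the encoding); the function name `utf16_le_to_be` is hypothetical and not part of the importer.

```python
import codecs


def utf16_le_to_be(data: bytes) -> bytes:
    """Re-encode UTF-16 little-endian bytes as UTF-16 big-endian.

    Assumption: the input is UTF-16 LE, with or without a byte order mark.
    """
    # Drop the little-endian BOM (FF FE) if the file starts with one.
    if data.startswith(codecs.BOM_UTF16_LE):
        data = data[len(codecs.BOM_UTF16_LE):]
    text = data.decode("utf-16-le")
    # Write a big-endian BOM (FE FF) so that readers can detect the byte order.
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")
```

For a file on disk, read it with `open(path, "rb")`, pass the bytes through the function, and write the result back in binary mode before importing.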
Make it possible to link files (read: Excel tables) into Word documents, so that the editing experience in, e.g., Word allows navigating to the linked file. MS Word and OOXML allow files to be inserted into a docx document. This results in an OLEObject that refers to the embedded file.
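To see how Word records such objects, the relationships of the main document part can be inspected. The sketch below is illustrative only and is not the requested feature: it assumes the standard package layout (`word/_rels/document.xml.rels`) and looks only for relationships of the `oleObject` type, so objects stored under a different relationship type will not be listed. The function name `list_ole_objects` is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Relationships part of the main document (standard docx layout, assumed here).
RELS_PATH = "word/_rels/document.xml.rels"
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
OLE_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/"
    "relationships/oleObject"
)


def list_ole_objects(docx):
    """Return (id, target, is_linked) for each OLE object relationship.

    `docx` is a path or a binary file-like object. A relationship with
    TargetMode="External" points outside the package, i.e. a linked file.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read(RELS_PATH))
    found = []
    for rel in root.iter(REL_NS + "Relationship"):
        if rel.get("Type") == OLE_TYPE:
            is_linked = rel.get("TargetMode") == "External"
            found.append((rel.get("Id"), rel.get("Target"), is_linked))
    return found
```

A linked file shows up with `is_linked` set to `True` and its location in `Target`; an object stored inside the package has a relative target instead.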