If an HTML document containing an image with an invalid URL is imported, the image is stored in the document model with that URL. On subsequent export to DOCX, the library tries to download the image data, which throws a WebException. Instead, the image should be replaced with a generic 'error' image.
Workaround: Manually test each image URL during HTML import and, if it is invalid, supply replacement image data:
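using System.IO;
using System.Net;
// Assumed namespace of HtmlFormatProvider; adjust it to match your library version.
using Telerik.Windows.Documents.Flow.FormatProviders.Html;

class Program
{
    static void Main(string[] args)
    {
        HtmlFormatProvider htmlFormatProvider = new HtmlFormatProvider();

        // Raised for each resource URI found during import.
        htmlFormatProvider.ImportSettings.LoadFromUri += (sender, e) =>
        {
            if (!IsValid(e.Uri))
            {
                // Substitute a local placeholder so later exports do not hit the broken URL.
                e.SetData(File.ReadAllBytes("no-image.png"));
            }
        };

        // Import the HTML with htmlFormatProvider here.
    }

    // Returns false if downloading the resource throws a WebException.
    private static bool IsValid(string uri)
    {
        try
        {
            using (WebClient client = new WebClient())
            {
                client.DownloadData(uri);
            }
        }
        catch (WebException)
        {
            return false;
        }

        return true;
    }
}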
The same issue appears when exporting to PDF.
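
The IsValid helper in the workaround downloads the whole image just to test the URL. As a minimal sketch, assuming the images are referenced by absolute http or https URLs, a purely syntactic pre-check can reject obviously malformed values without any network access. The ImageUriCheck class and the LooksLikeWebUri method are hypothetical names used for illustration and are not part of the library; relative paths, local files, and data: URIs would be rejected by this check and need separate handling.

```csharp
using System;

static class ImageUriCheck
{
    // Hypothetical helper, not a library API: returns true only for
    // absolute http/https URIs. It does not contact the server, so a
    // well-formed URL that points to a missing image still passes.
    public static bool LooksLikeWebUri(string uri)
    {
        Uri parsed;
        return Uri.TryCreate(uri, UriKind.Absolute, out parsed)
            && (parsed.Scheme == Uri.UriSchemeHttp
                || parsed.Scheme == Uri.UriSchemeHttps);
    }
}
```

One way to use it would be to call LooksLikeWebUri(e.Uri) first in the LoadFromUri handler and fall back to the download test only when it returns true, so malformed URLs are replaced immediately.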