--encoding is not working when exporting to HTML #155

Closed
@brrd

First of all, thank you very much for this very useful program.

Describe the bug
When exporting a document to HTML with the --encoding option, the output file is always encoded in windows-1252, whatever code page is requested.

This looks similar to the following Stack Overflow question (someone suggested an answer there, but I don't know whether it is relevant here): https://stackoverflow.com/q/34026716

To Reproduce
Here with UTF-8:

docto.exe -F input.rtf -T wdFormatHTML -O test.html -E 65001

The same behavior is encountered when running the command from Node.js (https://github.com/brrd/msoconvert).

Expected behavior
I would expect the HTML file to be encoded in UTF-8, and its header to contain this meta tag:

<meta http-equiv=Content-Type content="text/html; charset=utf-8">

Instead, the file is encoded in windows-1252 and the header contains the following:

<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
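For anyone wanting to confirm the behavior on their own output, here is a minimal Python sketch that reads the charset declared in the meta tag and checks whether the bytes are valid UTF-8. The helper names (`declared_charset`, `is_valid_utf8`) are made up for this illustration and are not part of DocTo; the regular expression only handles the simple meta form shown above.

```python
import re

# Hypothetical helper pattern: finds "charset=..." inside a <meta> tag,
# with or without quotes around the value.
_META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE
)


def declared_charset(html_bytes):
    """Return the charset named in the first matching <meta> tag, or None."""
    match = _META_CHARSET.search(html_bytes)
    return match.group(1).decode("ascii").lower() if match else None


def is_valid_utf8(data):
    """Return True if the raw bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Usage would be along the lines of `data = open("test.html", "rb").read()` followed by `declared_charset(data)` and `is_valid_utf8(data)`. Note that a file containing only ASCII characters is valid in both encodings, so the input document should include at least one accented character for the second check to be meaningful.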

Additional context

  • Please run the command with -L 10 to provide verbose logging and paste that into your bug report.
docto.exe -F input.rtf -T wdFormatHTML -O test.html -E 65001 -L 10
[20210528 19:21:26 -]: [DEBUG]  Log Level Set To:10
Loading ChooseConverter...
Parameter Count is 10
Converter:MS Word
[DEBUG]  Log Level Set To:10
[INFO]   Loading Configuration...
[DEBUG]  Parameter Count is 10
[DEBUG]  Input File is: C:\Users\Thomas\Desktop\input.rtf
[DEBUG]  Type Integer is: 8
[INFO]   Output file: C:\Users\Thomas\Desktop\test.html
[INFO]   Log Level Set To:10
[DEBUG]  Current Directory: C:\Users\Thomas\Desktop
[DEBUG]  Ready to Execute
[DEBUG]  Executing Conversion ...
[INFO]   ExecuteConversion:C:\Users\Thomas\Desktop\input.rtf
[DEBUG]  Version >= 14 Using Saveas2 Function
[INFO]   File Converted: C:\Users\Thomas\Desktop\test.html
  • Please also run docto.exe -v so I can see what version of Docto and Word you are running.
docto.exe -v
DocTo Version:1.03.30.54
OfficeApp Version:16
Source: https://github.com/tobya/DocTo/
  • What OS: [e.g. Windows Server 2012]

Windows 10 Pro 20H2
