Skip to content

Latest commit

 

History

History
21 lines (14 loc) · 907 Bytes

fixing-column-encoding-with-ftfy-and-sqlite-transform.md

File metadata and controls

21 lines (14 loc) · 907 Bytes

Fixing broken text encodings with sqlite-transform and ftfy

I was working with a database table that included values that were clearly in the wrong character encoding - values like this:

Rue Léopold I

I used my sqlite-transform tool with the ftfy Python library to fix that by running the following:

sqlite-transform lambda chiens.db espaces-pour-chiens-et-espaces-interdits-aux-chiens namefr \
  --code 'ftfy.fix_encoding(value)' \
  --import ftfy

That's the database file, the table and the column, then --code and --import to specify the transformation.

Since I had installed sqlite-transform using pipx install sqlite-transform I needed to first install the ftfy library into the correct virtual environment. The recipe for doing that is:

pipx inject sqlite-transform ftfy