Description
I am in the process of migrating my fastpages blog to nbdev and Quarto. I followed the migration guide:
https://nbdev.fast.ai/tutorials/blogging.html
When running nbdev_migrate --path posts, the code raises an exception for Unicode characters. I temporarily got around the issue by substituting my em dash with -. I wanted to document this in case anyone else encounters the issue, and to potentially identify the function that is causing the exception.
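To apply the workaround across a whole blog, it helps to know which posts contain non-ASCII characters before running the migration. The following is a minimal sketch using only the standard library; the helper names `non_ascii_lines` and `scan_posts` are hypothetical and are not part of nbdev.

```python
from pathlib import Path

def non_ascii_lines(text):
    "Return (line_number, line) pairs for lines that contain non-ASCII characters."
    return [(i, line) for i, line in enumerate(text.splitlines(), start=1)
            if not line.isascii()]

def scan_posts(folder):
    "Map each markdown file under `folder` to its non-ASCII lines, skipping clean files."
    hits = {}
    for path in sorted(Path(folder).rglob("*.md")):
        found = non_ascii_lines(path.read_text(encoding="utf-8"))
        if found:
            hits[path] = found
    return hits

# Small in-memory check: only the title line contains an em dash (U+2014).
sample = "title: Investigating RPiPlay \u2014 Apple Airplay\ncomments: true\n"
flagged = non_ascii_lines(sample)
```

Calling `scan_posts("posts")` would list the files that may need attention. Note that this flags every non-ASCII character, not only the em dash.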
how to reproduce
If you would like to replicate this, here's the source markdown file:
https://github.com/progressEdd/blog/blob/2e040a0a3fa86268555f00b525f33f2b86331491/_posts/2020-12-23-Investigating-RPiPlay-Apple-Airplay-on-Fedora-Linux.md
Here is a snippet of the front matter. The migration fails when the Unicode character for an em dash (U+2014) is in the text:
---
keywords: fastai
description: As a graduate student during COVID-19 some of my classes were online. I found it difficult to share my iPad screen over web conferences. At that time, I did not want to install Zoom because of the numerous <a href='https://thehackernews.com/2020/08/zoom-software-vulnerabilities.html'>security vulnerabilities</a>. Furthermore, the feature of casting my iPad screen was exclusive to Zoom. If I wanted to share my screen in other applications such as Microsoft Teams or Google Meet — I needed an alternative. Over the course of a couple months, I researched and tested multiple methods to cast my iPad screen. This blog post is the fruits of my labor.
title: Investigating RPiPlay — Apple Airplay on Fedora Linux
comments: true
nb_path: _notebooks/2020-12-23-Investigating-RPiPlay-Apple-Airplay-on-Fedora-Linux.ipynb
layout: notebook
---
...
When I run uv run nbdev_migrate --path posts, I get the following traceback:
Traceback (most recent call last):
File "/mnt/sda1/Documents/development_projects/progressEdd_projects/Python-Notebooks/personal-project/blog-migration/.venv/lib64/python3.13/site-packages/nbdev/migrate.py", line 180, in nbdev_migrate
if f.name.endswith('.md'): migrate_md(f)
~~~~~~~~~~^^^
File "/mnt/sda1/Documents/development_projects/progressEdd_projects/Python-Notebooks/personal-project/blog-migration/.venv/lib64/python3.13/site-packages/nbdev/migrate.py", line 164, in migrate_md
txt = fp_md_fm(path)
File "/mnt/sda1/Documents/development_projects/progressEdd_projects/Python-Notebooks/personal-project/blog-migration/.venv/lib64/python3.13/site-packages/nbdev/migrate.py", line 100, in fp_md_fm
return _re_fm_md.sub(_dict2fm(fm), md)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.13/re/__init__.py", line 377, in _compile_template
return _sre.template(pattern, _parser.parse_template(repl, pattern))
~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
File "/usr/lib64/python3.13/re/_parser.py", line 1076, in parse_template
raise s.error('bad escape %s' % this, len(this)) from None
re.PatternError: bad escape \u at position 618 (line 10, column 26)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/sda1/Documents/development_projects/progressEdd_projects/Python-Notebooks/personal-project/blog-migration/.venv/bin/nbdev_migrate", line 12, in <module>
sys.exit(nbdev_migrate())
~~~~~~~~~~~~~^^
File "/mnt/sda1/Documents/development_projects/progressEdd_projects/Python-Notebooks/personal-project/blog-migration/.venv/lib64/python3.13/site-packages/fastcore/script.py", line 125, in _f
return tfunc(**merge(args, args_from_prog(func, xtra)))
File "/mnt/sda1/Documents/development_projects/progressEdd_projects/Python-Notebooks/personal-project/blog-migration/.venv/lib64/python3.13/site-packages/nbdev/migrate.py", line 181, in nbdev_migrate
except Exception as e: raise Exception(f'Error in migrating file: {f}') from e
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The exact exception
return _re_fm_md.sub(_dict2fm(fm), md)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.13/re/__init__.py", line 377, in _compile_template
return _sre.template(pattern, _parser.parse_template(repl, pattern))
~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
File "/usr/lib64/python3.13/re/_parser.py", line 1076, in parse_template
raise s.error('bad escape %s' % this, len(this)) from None
re.PatternError: bad escape \u at position 618 (line 10, column 26)
The above exception was the direct cause of the following exception:
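The issue appears to be in the function below. My first thought was that the md variable needs some processing to substitute the escape sequence \u with \\u. However, the traceback fails inside parse_template(repl, pattern), which suggests the offending \u is in the replacement string produced by _dict2fm(fm) rather than in md.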
# %% ../nbs/api/16_migrate.ipynb
def fp_md_fm(path):
    "Make fastpages front matter in markdown files quarto compliant."
    p = Path(path)
    md = p.read_text()
    fm = _fm2dict(md, nb=False)
    if fm:
        fm = _fp_convert(fm, path)
        return _re_fm_md.sub(_dict2fm(fm), md)
    else: return md
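The failure can be reproduced outside nbdev. The sketch below is a minimal reproduction under stated assumptions: `FRONT_MATTER` is a hypothetical stand-in for nbdev's `_re_fm_md`, and `new_fm` assumes the converted front matter contains the em dash as the literal text `\u2014`, which is what the `bad escape \u` message implies. It also shows one possible fix: passing a callable as the replacement, which `re` inserts verbatim without parsing escapes.

```python
import re

# Hypothetical stand-in for nbdev's _re_fm_md: matches a leading front matter block.
FRONT_MATTER = re.compile(r"^---\n.*?\n---\n", flags=re.DOTALL)

md = "---\ntitle: Investigating RPiPlay\n---\nBody text\n"
# Assumed shape of the converted front matter: a literal backslash followed by u2014.
new_fm = '---\ntitle: "Investigating RPiPlay \\u2014 Apple Airplay"\n---\n'

try:
    # A string replacement is parsed as a template, so the backslash-u is rejected.
    FRONT_MATTER.sub(new_fm, md)
    failed = False
except re.error as exc:  # re.PatternError is the Python 3.13 name for re.error
    failed = True
    print(f"string replacement failed: {exc}")

# A callable replacement skips template parsing; its return value is used as-is.
fixed = FRONT_MATTER.sub(lambda m: new_fm, md)
print(fixed)
```

If this matches what happens inside `fp_md_fm`, changing the call to `_re_fm_md.sub(lambda m: _dict2fm(fm), md)` would be one candidate fix, though I have not tested that against nbdev itself.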
ChatGPT says there are other cases as well:
Unicode escapes:
- \uXXXX: Represents a Unicode character with 4 hex digits.
- \UXXXXXXXX: Represents a Unicode character with 8 hex digits.
- \N{name}: Represents a Unicode character by name.
Hexadecimal escapes:
- \xXX: Represents a character with 2 hex digits.
Common whitespace and control escapes:
- \n: Newline.
- \t: Tab.
- \r: Carriage return.
- \b: Backspace (or word boundary in regex contexts).
- \f: Form feed.
- \v: Vertical tab.
Other special sequences in regex:
- \1, \2, etc.: References to captured groups.
- \g<name> or \g<number>: Named or numbered backreferences.
I asked it to suggest a substitution that captures all of the sequences above:
md = re.sub(r'(?<!\\)\\([uUNx])', r'\\\1', md)
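A quick check of the suggested substitution, as a sketch with a made-up `repl` string standing in for the front matter: the template `r'\\\1'` writes back a single backslash plus the captured letter, so the string appears to come out unchanged. Doubling every backslash with `str.replace` is a simpler alternative that makes a string safe to pass as the replacement argument of `re.sub`.

```python
import re

# Hypothetical replacement text containing a literal backslash followed by u2014.
repl = 'title: "RPiPlay \\u2014 Airplay"'

# The suggested substitution: the template emits one backslash plus group 1,
# which is exactly the text that was matched.
suggested = re.sub(r'(?<!\\)\\([uUNx])', r'\\\1', repl)
print(suggested == repl)

# Doubling each backslash turns every one into an escaped literal backslash
# when re parses the replacement template.
escaped = repl.replace('\\', '\\\\')
result = re.sub(r'^title:.*$', escaped, 'title: old')
print(result == repl)
```

Doubling all backslashes also disables intentional group references such as `\1`, which should be fine here because the front matter is meant to be inserted literally.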
my environment
- IDE: VScodium
- Python: 3.13
- uv
- nbdev>=2.3.35
uv run quarto check
Quarto 1.6.42
[✓] Checking environment information...
Quarto cache location: /home/progressedd/.cache/quarto
[✓] Checking versions of quarto binary dependencies...
Pandoc version 3.4.0: OK
Dart Sass version 1.70.0: OK
Deno version 1.46.3: OK
Typst version 0.11.0: OK
[✓] Checking versions of quarto dependencies......OK
[✓] Checking Quarto installation......OK
Version: 1.6.42
Path: /home/progressedd/opt/quarto-1.6.42/bin
[✓] Checking tools....................OK
TinyTeX: (external install)
Chromium: (not installed)
[✓] Checking LaTeX....................OK
Using: TinyTex
Path: /home/progressedd/.TinyTeX/bin/x86_64-linux
Version: 2021
[✓] Checking basic markdown render....OK
[✓] Checking Python 3 installation....OK
Version: 3.13.2
Path: /mnt/sda1/Documents/development_projects/progressEdd_projects/Python-Notebooks/personal-project/blog-migration/.venv/bin/python3
Jupyter: (None)
Jupyter is not available in this Python installation.
Install with python3 -m pip install jupyter
[✓] Checking R installation...........OK
Version: 4.4.3
Path: /usr/lib64/R
LibPaths:
- /usr/lib64/R/library
- /usr/share/R/library
knitr: 1.33
rmarkdown: (None)
The rmarkdown package is not available in this R installation.
Install with install.packages("rmarkdown")
progressEdd@codium /m/s/D/d/p/P/p/blog-migration > quarto check
Quarto 1.6.42
[✓] Checking environment information...
Quarto cache location: /home/progressedd/.cache/quarto
[✓] Checking versions of quarto binary dependencies...
Pandoc version 3.4.0: OK
Dart Sass version 1.70.0: OK
Deno version 1.46.3: OK
Typst version 0.11.0: OK
[✓] Checking versions of quarto dependencies......OK
[✓] Checking Quarto installation......OK
Version: 1.6.42
Path: /home/progressedd/opt/quarto-1.6.42/bin
[✓] Checking tools....................OK
TinyTeX: (external install)
Chromium: (not installed)
[✓] Checking LaTeX....................OK
Using: TinyTex
Path: /home/progressedd/.TinyTeX/bin/x86_64-linux
Version: 2021
[✓] Checking basic markdown render....OK
[✓] Checking Python 3 installation....OK
Version: 3.13.2
Path: /usr/bin/python3
Jupyter: (None)
Jupyter is not available in this Python installation.
Install with python3 -m pip install jupyter
There is an unactivated Python environment in .venv. Did you forget to activate it?
[✓] Checking R installation...........OK
Version: 4.4.3
Path: /usr/lib64/R
LibPaths:
- /usr/lib64/R/library
- /usr/share/R/library
knitr: 1.33
rmarkdown: (None)
The rmarkdown package is not available in this R installation.
Install with install.packages("rmarkdown")