4 files changed: +41 -4 lines changed (file tree includes tests/test_syntax/extensions)
@@ -81,6 +81,13 @@ The following new features have been included in the 3.3 release:
   maintain the current behavior in the rebuilt Markdown in HTML extension. A few random
   edge-case bugs (see the included tests) were resolved in the process (#803).

+* An alternate function `markdown.extensions.toc.slugify_unicode` has been included
+  with the [Table of Contents](../extensions/toc.md) extension which supports Unicode
+  characters in table of contents slugs. The old `markdown.extensions.toc.slugify`
+  function, which removes non-ASCII characters, remains the default. Import and pass
+  `markdown.extensions.toc.slugify_unicode` to the `slugify` configuration option
+  to use the new behavior.
+
 ## Bug fixes

 The following bug fixes are included in the 3.3 release:
@@ -202,6 +202,9 @@ The following options are provided to configure the output:

     The callable must return a string appropriate for use in HTML `id` attributes.

+    An alternate version of the default callable supporting Unicode strings is also
+    provided as `markdown.extensions.toc.slugify_unicode`.
+
 * **`separator`**:
     Word separator. Character which replaces white space in id. Defaults to "`-`".

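The documentation text in this hunk says the `slugify` option takes a callable that returns a string suitable for an HTML `id`. As a rough illustration of that contract, here is a minimal standard-library sketch of a user-supplied callable. The name `slugify_nfkc` and the choice of NFKC normalization are assumptions made for this example and are not part of this change.

```python
import re
import unicodedata


def slugify_nfkc(value, separator):
    """Hypothetical custom callable for the `slugify` option (not part of the library)."""
    # NFKC recomposes a base letter plus combining mark into a single
    # code point wherever a precomposed form exists.
    value = unicodedata.normalize('NFKC', value)
    # Drop anything that is not a word character, whitespace, or a hyphen.
    value = re.sub(r'[^\w\s-]', '', value).strip().lower()
    # Collapse runs of whitespace and separator characters into one separator.
    return re.sub(r'[{}\s]+'.format(re.escape(separator)), separator, value)


print(slugify_nfkc('Hello,  World!', '-'))  # hello-world
```

Assuming Python-Markdown is installed, a callable like this would be passed as `TocExtension(slugify=slugify_nfkc)`, in the same way the tests in this change pass `slugify_unicode`.
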
@@ -23,11 +23,16 @@
 import xml.etree.ElementTree as etree


-def slugify(value, separator):
+def slugify(value, separator, encoding='ascii'):
     """ Slugify a string, to make it URL friendly. """
-    value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore')
-    value = re.sub(r'[^\w\s-]', '', value.decode('ascii')).strip().lower()
-    return re.sub(r'[%s\s]+' % separator, separator, value)
+    value = unicodedata.normalize('NFKD', value).encode(encoding, 'ignore')
+    value = re.sub(r'[^\w\s-]', '', value.decode(encoding)).strip().lower()
+    return re.sub(r'[{}\s]+'.format(separator), separator, value)
+
+
+def slugify_unicode(value, separator):
+    """ Slugify a string, to make it URL friendly while preserving Unicode characters. """
+    return slugify(value, separator, 'utf-8')


 IDCOUNT_RE = re.compile(r'^(.*)_([0-9]+)$')
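To see what the new `encoding` parameter changes in practice, the two functions from this hunk can be exercised on their own. The sketch below copies their bodies as shown in the diff, adding only the standard-library imports they rely on, and compares the ASCII and UTF-8 paths. The sample strings are illustrative.

```python
import re
import unicodedata


def slugify(value, separator, encoding='ascii'):
    """ Slugify a string, to make it URL friendly. """
    # Decompose characters, then drop whatever the target encoding cannot represent.
    value = unicodedata.normalize('NFKD', value).encode(encoding, 'ignore')
    # Remove characters that are not word characters, whitespace, or hyphens.
    value = re.sub(r'[^\w\s-]', '', value.decode(encoding)).strip().lower()
    # Collapse whitespace and separator runs into a single separator.
    return re.sub(r'[{}\s]+'.format(separator), separator, value)


def slugify_unicode(value, separator):
    """ Slugify a string, to make it URL friendly while preserving Unicode characters. """
    return slugify(value, separator, 'utf-8')


print(slugify('Unicode ヘッダー', '-'))          # unicode
print(slugify_unicode('Unicode ヘッダー', '-'))  # unicode-ヘッター
print(slugify_unicode('Café au lait', '-'))     # cafe-au-lait
```

Because NFKD splits accented and voiced characters into a base character plus a combining mark, and the regular expression then drops the combining mark, `ダ` comes out as `タ` on the UTF-8 path. That matches the `id` values expected by the tests in this change.
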
@@ -141,3 +141,25 @@ def testPermalinkWithEmptyTitle(self):
             '</h1>',  # noqa
             extensions=[TocExtension(permalink=True, permalink_title="")]
         )
+
+    def testPermalinkWithUnicodeInID(self):
+        from markdown.extensions.toc import slugify_unicode
+        self.assertMarkdownRenders(
+            '# Unicode ヘッダー',
+            '<h1 id="unicode-ヘッター">'  # noqa
+            'Unicode ヘッダー'  # noqa
+            '<a class="headerlink" href="#unicode-ヘッター" title="Permanent link">¶</a>'  # noqa
+            '</h1>',  # noqa
+            extensions=[TocExtension(permalink=True, slugify=slugify_unicode)]
+        )
+
+    def testPermalinkWithUnicodeTitle(self):
+        from markdown.extensions.toc import slugify_unicode
+        self.assertMarkdownRenders(
+            '# Unicode ヘッダー',
+            '<h1 id="unicode-ヘッター">'  # noqa
+            'Unicode ヘッダー'  # noqa
+            '<a class="headerlink" href="#unicode-ヘッター" title="パーマリンク">¶</a>'  # noqa
+            '</h1>',  # noqa
+            extensions=[TocExtension(permalink=True, permalink_title="パーマリンク", slugify=slugify_unicode)]
+        )