tdiam/html2phpbbcode

HTML2PHPBBCode

HTML2PHPBBCode is a Python 3 package that can be used to parse HTML code and convert it to phpBB-compatible BBCode.

Usage

>>> from html2phpbbcode.parser import HTML2PHPBBCode
>>> parser = HTML2PHPBBCode()
>>> parser.feed('<ul><li>Hello</li><li>World</li></ul>')
'[list][*]Hello[*]World[/list]'
>>> parser.feed('<ol><li>one<li>two</ol>')
'[list=1][*]one[*]two[/list]'
>>> parser.feed('<a href="https://water.org">Water.org</a>')
'[url=https://water.org]Water.org[/url]'
>>> parser.feed('<a href="mailto:info@water.org">Mail Water.org</a>')
'[email=info@water.org]Mail Water.org[/email]'
>>> parser.feed('<strong>Hello <i>World</i>. It&#39;s a wonderful world</strong>')
"[b]Hello [i]World[/i]. It's a wonderful world[/b]"

Acknowledgements

HTML2PHPBBCode is based on the html2bbcode package by Vladimir Korsun, which is available under the BSD License.

The regex package by Matthew Barnett is also used, available under the Python Software Foundation License.

The code also includes some regular expressions from the phpBB bulletin board software, with minor changes for Python compatibility. phpBB code is available under the GNU GPL v2.0.

Differences from html2bbcode

This package differs from html2bbcode in the following ways:

  • The generated BBCode follows the syntax described in phpBB's BBCode guide.
  • The HTML tags <b>, <i>, <u>, <s> and <ol> are also supported.
  • <font>'s size attribute handling has been changed so that it maps to reasonable BBCode size values.
  • If the href attribute of an <a> link uses the mailto: protocol, then the [email] BBCode tag is used.
  • If the href attribute of an <a> link is neither an email address nor a valid http/https URL, the link is converted to plain text in BBCode.
  • The parser removes excess whitespace, such as newlines between tags, as in <p>Hello</p>\n<p>World</p>. (TODO: Use the W3C spec rules)
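
The link rules in the list above can be sketched roughly as follows. This is a minimal illustration and not the package's actual code: `link_to_bbcode` is a hypothetical helper, and it checks URLs with the standard library's `urllib.parse` as a simplifying assumption, whereas the package itself relies on regular expressions adapted from phpBB.

```python
from urllib.parse import urlsplit


def link_to_bbcode(href: str, text: str) -> str:
    """Hypothetical helper illustrating the link rules; not the package's code."""
    href = href.strip()
    parts = urlsplit(href)
    scheme = parts.scheme.lower()

    # mailto: links map to the [email] tag; the address is the URL path.
    if scheme == "mailto" and parts.path:
        return f"[email={parts.path}]{text}[/email]"

    # http/https links that name a host map to the [url] tag.
    if scheme in ("http", "https") and parts.netloc:
        return f"[url={href}]{text}[/url]"

    # Anything else (other schemes, relative paths) degrades to plain text.
    return text


print(link_to_bbcode("https://water.org", "Water.org"))
print(link_to_bbcode("mailto:info@water.org", "Mail Water.org"))
print(link_to_bbcode("javascript:alert(1)", "click"))
```

Under these assumptions, the first two calls print the same BBCode as the corresponding Usage examples, and the third prints only the link text.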

Installing

The package is available on PyPI and can be installed with the following command:

pip install html2phpbbcode

Installing from source is also an option:

python3 setup.py install

Testing

The tests use pytest. Run pytest in the project directory to execute them.
