This repository was archived by the owner on Feb 24, 2025. It is now read-only.

Formalize extensions support; add inline HTML support #44

Merged
merged 1 commit into from
Sep 22, 2015
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,10 @@
## 0.9.0

* Formalize an API for Markdown extensions (#43).
* **Breaking:** Fenced code blocks are now considered an extension, as
they are not part of Markdown.pl.
* Inline HTML syntax supported. This is also considered an extension (#18).

## 0.8.0

* **Breaking:** Remove (probably unused) fields: `LinkSyntax.resolved`,
35 changes: 33 additions & 2 deletions README.md
@@ -1,5 +1,5 @@
A portable markdown library written in Dart. It can parse markdown into
html on both the client and server.
A portable Markdown library written in Dart. It can parse Markdown into
HTML on both the client and server.

Usage
-----
@@ -13,6 +13,37 @@ void main() {
}
```

Syntax extensions
-----------------

A few Markdown extensions are supported. They are all disabled by default, and
can be enabled by specifying a List of extension syntaxes in the
`blockSyntaxes` or `inlineSyntaxes` argument of `markdownToHtml`.

The currently supported inline extension syntaxes are:

* `new InlineHtmlSyntax()` - approximately CommonMark's
[definition](http://spec.commonmark.org/0.22/#raw-html) of "Raw HTML".

The currently supported block extension syntaxes are:

* `const FencedCodeBlockSyntax()` - Code blocks familiar to Pandoc and PHP
Markdown Extra users.

For example:

```dart
import 'package:markdown/markdown.dart';

void main() {
print(markdownToHtml('Hello <span class="green">Markdown</span>',
inlineSyntaxes: [new InlineHtmlSyntax()]));
//=> <p>Hello <span class="green">Markdown</span></p>
}
```

### Custom syntax extensions

You can create and use your own syntaxes.

```dart
42 changes: 24 additions & 18 deletions lib/src/block_parser.dart
@@ -55,10 +55,31 @@ class BlockParser {
/// The markdown document this parser is parsing.
final Document document;

/// The enabled block syntaxes. To turn a series of lines into blocks, each of
/// these will be tried in turn. Order matters here.
final List<BlockSyntax> blockSyntaxes = [];

/// Index of the current line.
int _pos;
int _pos = 0;

BlockParser(this.lines, this.document) : _pos = 0;
/// The collection of built-in block parsers.
final List<BlockSyntax> standardBlockSyntaxes = const [
const EmptyBlockSyntax(),
const BlockHtmlSyntax(),
const SetextHeaderSyntax(),
const HeaderSyntax(),
const CodeBlockSyntax(),
const BlockquoteSyntax(),
const HorizontalRuleSyntax(),
const UnorderedListSyntax(),
const OrderedListSyntax(),
const ParagraphSyntax()
];

BlockParser(this.lines, this.document) {
blockSyntaxes.addAll(document.blockSyntaxes);
blockSyntaxes.addAll(standardBlockSyntaxes);
Contributor

The way "extensions" or "plug-ins" work for InlineSyntaxes is that the parser pulls them from the document, which the public API can mutate. Outside code can just add in new objects that implement InlineSyntax.

I'm not sure if that's a great API or not—I lean towards not since it exposes an awful lot about how the parser is implemented—but I think it would be good for block syntax to be consistent with it.

You could either also add a blockSyntaxes field to Document, or remove inlineSyntaxes and use extensions for it.

Collaborator Author

I agree, I'm not crazy about exposing the innards. It's also unfortunate that the order of the syntaxes matters (maybe not so much with the block-level ones), because then the user can't insert their syntax in the middle.

I'll stick with how they work for InlineSyntaxes and add a blockSyntaxes field to Document. I can change the API back later if we need to.
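
The ordering concern raised in this thread can be illustrated with a small, self-contained sketch. It is not the package's code: the `Syntax` class, the predicate helpers, and the two-entry `standardSyntaxes` list are hypothetical stand-ins written in current Dart syntax. Only the "user syntaxes first, then built-ins, first match wins" rule is taken from the constructor and parse loop in this change.

```dart
// Hypothetical stand-in for a block syntax: a name plus a line predicate.
class Syntax {
  final String name;
  final bool Function(String line) canParse;
  const Syntax(this.name, this.canParse);
}

bool _isFence(String line) => line.startsWith('~~~');
bool _isIndented(String line) => line.startsWith('    ');
bool _matchesAnything(String line) => true;

// A shortened, illustrative version of the built-in list.
const standardSyntaxes = [
  Syntax('code', _isIndented),
  Syntax('paragraph', _matchesAnything),
];

// Same order as the BlockParser constructor in this change:
// user-supplied syntaxes are added first, then the standard ones.
List<Syntax> enabledSyntaxes(List<Syntax> userSyntaxes) =>
    [...userSyntaxes, ...standardSyntaxes];

// The first syntax that accepts the line wins, so list order decides.
String firstMatch(List<Syntax> syntaxes, String line) =>
    syntaxes.firstWhere((s) => s.canParse(line)).name;

void main() {
  const fenced = Syntax('fenced', _isFence);
  print(firstMatch(enabledSyntaxes([]), '~~~dart')); // paragraph
  print(firstMatch(enabledSyntaxes([fenced]), '~~~dart')); // fenced
}
```

Because extensions are always prepended, a user syntax can take priority over every built-in but cannot be placed between two of them, which is the limitation noted above.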

}

/// Gets the current line.
String get current => lines[_pos];
@@ -90,21 +111,6 @@ class BlockParser {
}

abstract class BlockSyntax {
/// Gets the collection of built-in block parsers. To turn a series of lines
/// into blocks, each of these will be tried in turn. Order matters here.
static const List<BlockSyntax> syntaxes = const [
const EmptyBlockSyntax(),
const BlockHtmlSyntax(),
const SetextHeaderSyntax(),
const HeaderSyntax(),
const CodeBlockSyntax(),
const FencedCodeBlockSyntax(),
const BlockquoteSyntax(),
const HorizontalRuleSyntax(),
const UnorderedListSyntax(),
const OrderedListSyntax(),
const ParagraphSyntax()
];

const BlockSyntax();

@@ -136,7 +142,7 @@ abstract class BlockSyntax {
/// Gets whether or not [parser]'s current line should end the previous block.
static bool isAtBlockEnd(BlockParser parser) {
if (parser.isDone) return true;
return syntaxes.any((s) => s.canParse(parser) && s.canEndBlock);
return parser.blockSyntaxes.any((s) => s.canParse(parser) && s.canEndBlock);
}
}

11 changes: 8 additions & 3 deletions lib/src/document.dart
@@ -4,14 +4,19 @@ import 'ast.dart';
import 'block_parser.dart';
import 'inline_parser.dart';

/// Maintains the context needed to parse a markdown document.
/// Maintains the context needed to parse a Markdown document.
class Document {
final Map<String, Link> refLinks;
List<BlockSyntax> blockSyntaxes;
List<InlineSyntax> inlineSyntaxes;
Resolver linkResolver;
Resolver imageLinkResolver;

Document({this.inlineSyntaxes, this.linkResolver, this.imageLinkResolver})
Document(
{this.blockSyntaxes: const [],
this.inlineSyntaxes: const [],
this.linkResolver,
this.imageLinkResolver})
: refLinks = <String, Link>{};

parseRefLinks(List<String> lines) {
@@ -61,7 +66,7 @@ class Document {

var blocks = <Node>[];
while (!parser.isDone) {
for (var syntax in BlockSyntax.syntaxes) {
for (var syntax in parser.blockSyntaxes) {
if (syntax.canParse(parser)) {
var block = syntax.parse(parser);
if (block != null) blocks.add(block);
4 changes: 3 additions & 1 deletion lib/src/html_renderer.dart
@@ -10,11 +10,13 @@ import 'inline_parser.dart';

/// Converts the given string of markdown to HTML.
String markdownToHtml(String markdown,
{List<InlineSyntax> inlineSyntaxes,
{List<BlockSyntax> blockSyntaxes: const [],
List<InlineSyntax> inlineSyntaxes: const [],
Resolver linkResolver,
Resolver imageLinkResolver,
bool inlineOnly: false}) {
var document = new Document(
blockSyntaxes: blockSyntaxes,
inlineSyntaxes: inlineSyntaxes,
imageLinkResolver: imageLinkResolver,
linkResolver: linkResolver);
20 changes: 15 additions & 5 deletions lib/src/inline_parser.dart
@@ -70,13 +70,10 @@ class InlineParser {

InlineParser(this.source, this.document) : _stack = <TagState>[] {
// User specified syntaxes are the first syntaxes to be evaluated.
if (document.inlineSyntaxes != null) {
syntaxes.addAll(document.inlineSyntaxes);
}

syntaxes.addAll(document.inlineSyntaxes);
syntaxes.addAll(_defaultSyntaxes);

// Custom link resolvers goes after the generic text syntax.
// Custom link resolvers go after the generic text syntax.
syntaxes.insertAll(1, [
new LinkSyntax(linkResolver: document.linkResolver),
new ImageLinkSyntax(linkResolver: document.imageLinkResolver)
@@ -202,6 +199,19 @@ class TextSyntax extends InlineSyntax {
}
}

/// Leave inline HTML tags alone, from
/// [CommonMark 0.22](http://spec.commonmark.org/0.22/#raw-html).
///
/// This is not actually a good definition (nor CommonMark's) of an HTML tag,
/// but it is fast. It will leave text like <a href='hi"> alone, which is
/// incorrect.
///
/// TODO(srawlins): improve accuracy while ensuring performance, once
/// Markdown benchmarking is more mature.
class InlineHtmlSyntax extends TextSyntax {
InlineHtmlSyntax() : super(r'</?[A-Za-z][^>]*>');
}

/// Matches autolinks like `<http://foo.com>`.
class AutolinkSyntax extends InlineSyntax {
AutolinkSyntax() : super(r'<((http|https|ftp)://[^>]*)>');
2 changes: 1 addition & 1 deletion pubspec.yaml
@@ -1,5 +1,5 @@
name: markdown
version: 0.8.0
version: 0.9.0-dev
author: Dart Team <misc@dartlang.org>
description: A library for converting markdown to HTML.
homepage: https://github.com/dart-lang/markdown
17 changes: 17 additions & 0 deletions test/extensions/inline_html.unit
@@ -0,0 +1,17 @@
>>> within a paragraph
Within a <em class="x">paragraph</EM>.

<<<
<p>Within a <em class="x">paragraph</EM>.</p>
>>> not HTML
Obviously, 3 < 5 and 7 > 2.
Not HTML: <3>, <_a>, <>

<<<
<p>Obviously, 3 &lt; 5 and 7 > 2.
Not HTML: &lt;3>, &lt;_a>, &lt;></p>
>>> "markdown" within a tag is not parsed
Text <a href="_foo_">And "_foo_"</a>.

<<<
<p>Text <a href="_foo_">And "<em>foo</em>"</a>.</p>
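
The cases in this test file follow from the pattern passed to `InlineHtmlSyntax`. As a rough check, the sketch below runs the same pattern string through a plain Dart `RegExp`. The helper name `looksLikeTag` is hypothetical, and the way the inline parser applies the pattern at each position is not modeled here.

```dart
// Pattern string copied from InlineHtmlSyntax in this change.
final inlineHtml = RegExp(r'</?[A-Za-z][^>]*>');

// Hypothetical helper: true if the text contains something the pattern
// would treat as a tag.
bool looksLikeTag(String text) => inlineHtml.hasMatch(text);

void main() {
  print(looksLikeTag('<em class="x">')); // true
  print(looksLikeTag('</EM>')); // true
  print(looksLikeTag('<3>')); // false: tag name must start with a letter
  print(looksLikeTag('<_a>')); // false
  print(looksLikeTag('<>')); // false
  print(looksLikeTag('3 < 5 and 7 > 2')); // false
  print(looksLikeTag("<a href='hi\">")); // true, despite mismatched quotes
}
```

The last case is the inaccuracy that the doc comment on `InlineHtmlSyntax` calls out: the pattern does not validate attribute quoting, so malformed tags are passed through as HTML.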
6 changes: 6 additions & 0 deletions test/markdown_test.dart
@@ -99,4 +99,10 @@ nyan''', '''<p>~=[,,_,,]:3</p>
1. This will not be an &lt;ol>.
''', inlineOnly: true);
});

testFile('extensions/fenced_code_blocks.unit',
blockSyntaxes: [const FencedCodeBlockSyntax()]);

testFile('extensions/inline_html.unit',
inlineSyntaxes: [new InlineHtmlSyntax()]);
}
118 changes: 74 additions & 44 deletions test/util.dart
@@ -15,60 +15,90 @@ final _indentPattern = new RegExp(r"^\(indent (\d+)\)\s*");

/// Run tests defined in "*.unit" files inside directory [name].
void testDirectory(String name) {
// Locate the "test" directory. Use mirrors so that this works with the test
// package, which loads this suite into an isolate.
var testDir = p.dirname(currentMirrorSystem()
var dir = p.join(_testDir, name);
var entries =
new Directory(dir).listSync().where((e) => e.path.endsWith('.unit'));

for (var entry in entries) {
testUnitFile(name, entry);
}
}

// Locate the "test" directory. Use mirrors so that this works with the test
// package, which loads this suite into an isolate.
String get _testDir => p.dirname(currentMirrorSystem()
.findLibrary(#markdown.test.util)
.uri
.path);

var dir = p.join(testDir, name);
var entries =
new Directory(dir).listSync().where((e) => e.path.endsWith('.unit'));
void testFile(String file,
{List<BlockSyntax> blockSyntaxes: const [],
List<InlineSyntax> inlineSyntaxes: const []}) =>
testUnitFile(
file,
new File(p.join(_testDir, file)),
blockSyntaxes: blockSyntaxes,
inlineSyntaxes: inlineSyntaxes);

for (var entry in entries) {
group("$name ${p.basename(entry.path)}", () {
var lines = (entry as File).readAsLinesSync();

var i = 0;
while (i < lines.length) {
var description = lines[i++].replaceAll(">>>", "").trim();

// Let the test specify a leading indentation. This is handy for
// regression tests which often come from a chunk of nested code.
var indentMatch = _indentPattern.firstMatch(description);
if (indentMatch != null) {
// The test specifies it in spaces, but the formatter expects levels.
description = description.substring(indentMatch.end);
}

if (description == "") {
description = "line ${i + 1}";
} else {
description = "line ${i + 1}: $description";
}

var input = "";
while (!lines[i].startsWith("<<<")) {
input += lines[i++] + "\n";
}

var expectedOutput = "";
while (++i < lines.length && !lines[i].startsWith(">>>")) {
expectedOutput += lines[i] + "\n";
}

validateCore(description, input, expectedOutput);
void testUnitFile(
String directory,
File entry,
{List<BlockSyntax> blockSyntaxes: const [],
List<InlineSyntax> inlineSyntaxes: const []}) {
group('$directory ${p.basename(entry.path)}', () {
var lines = entry.readAsLinesSync();

var i = 0;
while (i < lines.length) {
var description = lines[i++].replaceAll(">>>", "").trim();

// Let the test specify a leading indentation. This is handy for
// regression tests which often come from a chunk of nested code.
var indentMatch = _indentPattern.firstMatch(description);
if (indentMatch != null) {
// The test specifies it in spaces, but the formatter expects levels.
description = description.substring(indentMatch.end);
}
});
}

if (description == "") {
description = "line ${i + 1}";
} else {
description = "line ${i + 1}: $description";
}

var input = "";
while (!lines[i].startsWith("<<<")) {
input += lines[i++] + "\n";
}

var expectedOutput = "";
while (++i < lines.length && !lines[i].startsWith(">>>")) {
expectedOutput += lines[i] + "\n";
}

validateCore(
description,
input,
expectedOutput,
blockSyntaxes: blockSyntaxes,
inlineSyntaxes: inlineSyntaxes
);
}
});
}

void validateCore(String description, String markdown, String html,
{List<InlineSyntax> inlineSyntaxes, Resolver linkResolver,
Resolver imageLinkResolver, bool inlineOnly: false}) {
void validateCore(
String description,
String markdown,
String html,
{List<BlockSyntax> blockSyntaxes: const [],
List<InlineSyntax> inlineSyntaxes: const [],
Resolver linkResolver,
Resolver imageLinkResolver,
bool inlineOnly: false}) {
test(description, () {
var result = markdownToHtml(markdown,
blockSyntaxes: blockSyntaxes,
inlineSyntaxes: inlineSyntaxes,
linkResolver: linkResolver,
imageLinkResolver: imageLinkResolver,