Commit 2f1e520

Merge commit from fork
GHSA-c2pc-g5qf-rfrf
2 parents cb026a5 + d777db8 commit 2f1e520

42 files changed: 1655 additions & 131 deletions

.github/workflows/tests.yml

Lines changed: 18 additions & 0 deletions
@@ -109,6 +109,24 @@ jobs:
 
       - run: vendor/bin/psalm --no-progress --stats --threads=$(nproc) --output-format=github --shepherd
 
+  pathological:
+    name: Pathological Tests
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v3
+
+      - uses: shivammathur/setup-php@v2
+        with:
+          php-version: 8.1
+          extensions: curl, mbstring, yaml
+          coverage: none
+          tools: composer:v2
+
+      - run: composer update --no-progress
+
+      - run: php tests/pathological/test.php
+
   docs-lint:
     permissions:
       contents: read # for actions/checkout to fetch code

.phpstorm.meta.php

Lines changed: 2 additions & 0 deletions
@@ -31,6 +31,7 @@
         'html_input',
         'allow_unsafe_links',
         'max_nesting_level',
+        'max_delimiters_per_line',
         'renderer',
         'renderer/block_separator',
         'renderer/inner_separator',
@@ -89,6 +90,7 @@
         'table/alignment_attributes/left',
         'table/alignment_attributes/center',
         'table/alignment_attributes/right',
+        'table/max_autocompleted_cells',
         'table_of_contents',
         'table_of_contents/html_class',
         'table_of_contents/max_heading_level',

CHANGELOG.md

Lines changed: 44 additions & 0 deletions
@@ -6,6 +6,31 @@ Updates should follow the [Keep a CHANGELOG](https://keepachangelog.com/) princi
 
 ## [Unreleased][unreleased]
 
+This is a **security release** to address potential denial of service attacks when parsing specially crafted,
+malicious input from untrusted sources (like user input).
+
+### Added
+
+- Added `max_delimiters_per_line` config option to prevent denial of service attacks when parsing malicious input
+- Added `table/max_autocompleted_cells` config option to prevent denial of service attacks when parsing large tables
+- The `AttributesExtension` now supports attributes without values (#985, #986)
+- The `AutolinkExtension` exposes two new configuration options to override the default behavior (#969, #987):
+    - `autolink/allowed_protocols` - an array of protocols to allow autolinking for
+    - `autolink/default_protocol` - the default protocol to use when none is specified
+- Added `RegexHelper::isWhitespace()` method to check if a given character is an ASCII whitespace character
+- Added `CacheableDelimiterProcessorInterface` to ensure linear complexity for dynamic delimiter processing
+- Added `Bracket` delimiter type to optimize bracket parsing
+
+### Changed
+
+- `[` and `]` are no longer added as `Delimiter` objects on the stack; a new `Bracket` type with its own stack is used instead
+- `UrlAutolinkParser` no longer parses URLs with more than 127 subdomains
+- Expanded reference links can no longer exceed 100kb, or the size of the input document (whichever is greater)
+- Delimiters should always provide a non-null value via `DelimiterInterface::getIndex()`
+    - We'll attempt to infer the index based on surrounding delimiters where possible
+- The `DelimiterStack` now accepts integer positions for any `$stackBottom` argument
+- Several small performance optimizations
+
 ## [2.5.3] - 2024-08-16
 
 ### Changed
@@ -77,6 +102,25 @@ Updates should follow the [Keep a CHANGELOG](https://keepachangelog.com/) princi
 - Fixed declaration parser being too strict
 - `FencedCodeRenderer`: don't add `language-` to class if already prefixed
 
+### Deprecated
+
+- Returning dynamic values from `DelimiterProcessorInterface::getDelimiterUse()` is deprecated
+    - You should instead implement `CacheableDelimiterProcessorInterface` to help the engine perform caching to avoid performance issues.
+- Failing to set a delimiter's index (or returning `null` from `DelimiterInterface::getIndex()`) is deprecated and will not be supported in 3.0
+- Deprecated `DelimiterInterface::isActive()` and `DelimiterInterface::setActive()`, as these are no longer used by the engine
+- Deprecated `DelimiterStack::removeEarlierMatches()` and `DelimiterStack::searchByCharacter()`, as these are no longer used by the engine
+- Passing a `DelimiterInterface` as the `$stackBottom` argument to `DelimiterStack::processDelimiters()` or `::removeAll()` is deprecated and will not be supported in 3.0; pass the integer position instead.
+
+### Fixed
+
+- Fixed NUL characters not being replaced in the input
+- Fixed quadratic complexity parsing unclosed inline links
+- Fixed quadratic complexity parsing emphasis and strikethrough delimiters
+- Fixed issue where having 500,000+ delimiters could trigger a [known segmentation fault issue in PHP's garbage collection](https://bugs.php.net/bug.php?id=68606)
+- Fixed quadratic complexity deactivating link openers
+- Fixed quadratic complexity parsing long backtick code spans with no matching closers
+- Fixed catastrophic backtracking when parsing link labels/titles
+
 ## [2.4.1] - 2023-08-30
 
 ### Fixed

composer.json

Lines changed: 6 additions & 3 deletions
@@ -42,8 +42,9 @@
         "phpstan/phpstan": "^1.8.2",
         "phpunit/phpunit": "^9.5.21 || ^10.5.9 || ^11.0.0",
         "scrutinizer/ocular": "^1.8.1",
-        "symfony/finder": "^5.3 | ^6.0 || ^7.0",
-        "symfony/yaml": "^2.3 | ^3.0 | ^4.0 | ^5.0 | ^6.0 || ^7.0",
+        "symfony/finder": "^5.3 | ^6.0 | ^7.0",
+        "symfony/process": "^5.4 | ^6.0 | ^7.0",
+        "symfony/yaml": "^2.3 | ^3.0 | ^4.0 | ^5.0 | ^6.0 | ^7.0",
         "unleashedtech/php-coding-standard": "^3.1.1",
         "vimeo/psalm": "^4.24.0 || ^5.0.0"
     },
@@ -103,11 +104,13 @@
         "phpstan": "phpstan analyse",
         "phpunit": "phpunit --no-coverage",
         "psalm": "psalm --stats",
+        "pathological": "tests/pathological/test.php",
         "test": [
             "@phpcs",
             "@phpstan",
             "@psalm",
-            "@phpunit"
+            "@phpunit",
+            "@pathological"
         ]
     },
     "extra": {

docs/2.5/configuration.md

Lines changed: 2 additions & 0 deletions
@@ -27,6 +27,7 @@ $config = [
     'html_input' => 'escape',
     'allow_unsafe_links' => false,
     'max_nesting_level' => PHP_INT_MAX,
+    'max_delimiters_per_line' => PHP_INT_MAX,
     'slug_normalizer' => [
         'max_length' => 255,
     ],
@@ -73,6 +74,7 @@ Here's a list of the core configuration options available:
     - `escape` - Escape all HTML
 - `allow_unsafe_links` - Remove risky link and image URLs by setting this to `false` (default: `true`)
 - `max_nesting_level` - The maximum nesting level for blocks (default: `PHP_INT_MAX`). Setting this to a positive integer can help protect against long parse times and/or segfaults if blocks are too deeply-nested.
+- `max_delimiters_per_line` - The maximum number of delimiters (e.g. `*` or `_`) allowed in a single line (default: `PHP_INT_MAX`). Setting this to a positive integer can help protect against long parse times and/or segfaults if lines are too long.
 - `slug_normalizer` - Array of options for configuring how URL-safe slugs are created; see [the slug normalizer docs](/2.5/customization/slug-normalizer/#configuration) for more details
     - `instance` - An alternative normalizer to use (defaults to the included `SlugNormalizer`)
     - `max_length` - Limits the size of generated slugs (defaults to 255 characters)
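
For readers applying the options touched in this file, here is a minimal sketch of a configuration for untrusted input. The option names come from the diff above; the numeric limits are illustrative assumptions (the docs in this commit only suggest something in the 100-1000 range for `max_delimiters_per_line`), not library defaults.

```php
<?php
declare(strict_types=1);

// Illustrative limits only: these values are assumptions, not library defaults.
$config = [
    'html_input' => 'escape',          // escape raw HTML instead of passing it through
    'allow_unsafe_links' => false,     // drop risky link and image URLs
    'max_nesting_level' => 100,        // assumed cap on block nesting
    'max_delimiters_per_line' => 500,  // assumed cap, within the suggested 100-1000 range
];

// With league/commonmark installed, the array is passed to a converter, e.g.:
// $converter = new \League\CommonMark\CommonMarkConverter($config);
// echo $converter->convert($untrustedMarkdown);
```

Leaving either `max_*` option at `PHP_INT_MAX` keeps the pre-existing, unlimited behaviour.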

docs/2.5/customization/delimiter-processing.md

Lines changed: 2 additions & 0 deletions
@@ -48,6 +48,8 @@ public function getDelimiterUse(DelimiterInterface $opener, DelimiterInterface $
 
 This method is used to tell the engine how many characters from the matching delimiters should be consumed. For simple processors you'll likely return `1` (or whatever your minimum length is). In more advanced cases, you can examine the opening and closing delimiters and perform additional logic to determine whether they should be fully or partially consumed. You can also return `0` if you'd like.
 
+**Note:** Unless you're returning a hard-coded value, you should probably implement `CacheableDelimiterProcessorInterface` instead of `DelimiterProcessorInterface` - this will allow the engine to perform additional caching for better performance.
+
 ### `process()`
 
 ```php

docs/2.5/extensions/tables.md

Lines changed: 9 additions & 0 deletions
@@ -44,6 +44,7 @@ $config = [
             'center' => ['align' => 'center'],
             'right' => ['align' => 'right'],
         ],
+        'max_autocompleted_cells' => 10_000,
     ],
 ];
 
@@ -159,6 +160,14 @@ $config = [
 
 Or any other HTML attributes you'd like!
 
+### Limiting Auto-Completed Cells
+
+The GFM specification says that the:
+
+> table’s rows may vary in the number of cells. If there are a number of cells fewer than the number of cells in the header row, empty cells are inserted.
+
+This feature could be abused to create very large tables. To prevent this, you can configure the `max_autocompleted_cells` option to limit the number of empty cells that will be autocompleted. If the limit is reached, further parsing of the table will be aborted.
+
 ## Credits
 
 The Table functionality was originally built by [Martin Hasoň](https://github.com/hason) and [Webuni s.r.o.](https://www.webuni.cz) before it was merged into the core parser.
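
To make the abuse case behind `max_autocompleted_cells` concrete, the sketch below counts the empty cells that GFM-style auto-completion would add to a table. `countAutocompletedCells()` is a hypothetical helper written for illustration; it is not part of league/commonmark and does not claim to mirror how the parser enforces the limit (in particular, the exact boundary behaviour is an assumption).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of league/commonmark): counts the empty cells
 * that GFM-style auto-completion would add, giving up once $limit is exceeded.
 *
 * @param int   $headerCells   number of cells in the header row
 * @param int[] $rowCellCounts number of cells actually present in each body row
 *
 * @return int|null total auto-completed cells, or null if $limit was exceeded
 */
function countAutocompletedCells(int $headerCells, array $rowCellCounts, int $limit): ?int
{
    $total = 0;
    foreach ($rowCellCounts as $cells) {
        // Rows shorter than the header are padded with empty cells.
        $total += max(0, $headerCells - $cells);
        if ($total > $limit) {
            return null;
        }
    }

    return $total;
}

// 1,000 header columns followed by 1,000 one-cell rows would need 999,000 padded cells.
var_dump(countAutocompletedCells(1000, array_fill(0, 1000, 1), 10_000)); // NULL
var_dump(countAutocompletedCells(3, [3, 2, 1], 10_000)); // int(3)
```

The first call shows why a cap matters: the input grows linearly with rows plus columns, while the padded output grows with their product.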

docs/2.5/security.md

Lines changed: 21 additions & 1 deletion
@@ -11,7 +11,8 @@ In order to be fully compliant with the CommonMark spec, certain security settin
 
 - `html_input`: How to handle raw HTML
 - `allow_unsafe_links`: Whether unsafe links are permitted
-- `max_nesting_level`: Protected against long render times or segfaults
+- `max_nesting_level`: Protect against long render times or segfaults
+- `max_delimiters_per_line`: Protect against long parse times or rendering segfaults
 
 Further information about each option can be found below.
 
@@ -88,6 +89,25 @@ echo $converter->convert($markdown);
 
 See the [configuration](/2.5/configuration/) section for more information.
 
+## Max Delimiters Per Line
+
+Similarly to the maximum nesting level, **no maximum number of delimiters per line is enforced by default.** Delimiters can be nested (like `*a **b** c*`) or un-nested (like `*a* *b* *c*`) - in either case, having too many in a single line can result in long parse times. We therefore have a separate option to limit the number of delimiters per line.
+
+If you need to parse untrusted input, consider setting a reasonable `max_delimiters_per_line` (perhaps 100-1000) depending on your needs. Once this level is hit, any subsequent delimiters on that line will be rendered as plain text.
+
+### Example - Prevent too many delimiters
+
+```php
+use League\CommonMark\CommonMarkConverter;
+
+$markdown = '*a* **b *c **d** c* b**'; // 8 delimiters (* and **)
+
+$converter = new CommonMarkConverter(['max_delimiters_per_line' => 6]);
+echo $converter->convert($markdown);
+
+// <p><em>a</em> **b *c <strong>d</strong> c* b**</p>
+```
+
 ## Additional Filtering
 
 Although this library does offer these security features out-of-the-box, some users may opt to also run the HTML output through additional filtering layers (like HTMLPurifier). If you do this, make sure you **thoroughly** test your additional post-processing steps and configure them to work properly with the types of HTML elements and attributes that converted Markdown might produce, otherwise, you may end up with weird behavior like missing images, broken links, mismatched HTML tags, etc.
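
When choosing a value for `max_delimiters_per_line`, it can help to measure how many delimiter runs legitimate content actually uses. The following is a rough sketch under stated assumptions: `maxDelimiterRunsPerLine()` is a hypothetical helper, it only looks at runs of `*` and `_`, and it approximates rather than reproduces the parser's own counting.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of league/commonmark): returns the largest
 * number of `*` / `_` delimiter runs found on any single line of $markdown.
 * This only approximates the parser's count; it ignores escapes, code spans,
 * and delimiter characters registered by extensions.
 */
function maxDelimiterRunsPerLine(string $markdown): int
{
    $max = 0;
    foreach (preg_split('/\R/', $markdown) as $line) {
        // Each match is one run of consecutive `*` or `_` characters.
        $max = max($max, (int) preg_match_all('/\*+|_+/', $line));
    }

    return $max;
}

// The sample line from the security docs example contains 8 runs.
echo maxDelimiterRunsPerLine('*a* **b *c **d** c* b**'), "\n"; // 8
```

Running a helper like this over existing trusted documents gives a baseline, so the configured limit can sit comfortably above normal usage.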

docs/2.6/upgrading.md

Lines changed: 24 additions & 0 deletions
@@ -6,3 +6,27 @@ redirect_from: /upgrading/
 ---
 
 # Upgrading from 2.5 to 2.6
+
+## `max_delimiters_per_line` Configuration Option
+
+The `max_delimiters_per_line` configuration option was added in 2.6 to help protect against malicious input that could
+cause excessive memory usage or denial of service attacks. It defaults to `PHP_INT_MAX` (no limit) for backwards
+compatibility, which is safe when parsing trusted input. However, if you're parsing untrusted input from users, you
+should probably set this to a reasonable value (somewhere between `100` and `1000`) to protect against malicious inputs.
+
+## Custom Delimiter Processors
+
+If you're implementing a custom delimiter processor, and `getDelimiterUse()` has more logic than just a
+simple `return` statement, you should implement `CacheableDelimiterProcessorInterface` instead of
+`DelimiterProcessorInterface` to improve performance and avoid possible quadratic performance issues.
+
+`DelimiterProcessorInterface` has a `getDelimiterUse()` method that tells the engine how many characters from the
+matching delimiters should be consumed. Simple processors usually always return a hard-coded integer like `1` or `2`.
+However, some more advanced processors may need to examine the opening and closing delimiters and perform additional
+logic to determine whether they should be fully or partially consumed. Previously, these results could not be safely
+cached, resulting in possible quadratic performance issues.
+
+In 2.6, the `CacheableDelimiterProcessorInterface` was introduced to allow these "dynamic" processors to be safely
+cached. It requires a new `getCacheKey()` method that returns a string that uniquely identifies the combination of
+factors considered when determining the delimiter use. This key is then used to cache the results of the search for
+a matching delimiter.
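
The cache-key idea described above can be sketched in plain PHP. Everything here is an assumption made for illustration: `delimiterCacheKey()` is a hypothetical stand-alone function, and the rule "consume 2 characters when the closer run has at least 2, otherwise 1" is an invented example of dynamic `getDelimiterUse()` logic. In a real processor this logic would live in `getCacheKey()`, whose exact signature should be checked against the library source.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds a cache key from the factors a dynamic
 * getDelimiterUse() might depend on. The assumed rule is "use 2 characters
 * when the closer run has at least 2, otherwise 1", so only the delimiter
 * character and that yes/no fact need to appear in the key.
 */
function delimiterCacheKey(string $char, int $closerLength): string
{
    return $char . '-' . ($closerLength >= 2 ? 'double' : 'single');
}

// Closers that lead to the same decision share a key...
var_dump(delimiterCacheKey('~', 2) === delimiterCacheKey('~', 5)); // bool(true)
// ...while closers that lead to different decisions do not.
var_dump(delimiterCacheKey('~', 1) === delimiterCacheKey('~', 2)); // bool(false)
```

The design point is that the key should contain every factor that changes the result and nothing else: too few factors would let the engine reuse a cached search result that no longer applies, while too many would reduce how often the cache helps.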

phpstan.neon.dist

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,8 @@ parameters:
           message: '#Parameter .+ of class .+Reference constructor expects string, string\|null given#'
         - path: src/Util/RegexHelper.php
           message: '#Method .+RegexHelper::unescape\(\) should return string but returns string\|null#'
+        - path: src/Delimiter/DelimiterStack.php
+          message: '#unknown class WeakMap#'
     exceptions:
         uncheckedExceptionClasses:
             # Exceptions caused by bad developer logic that should always bubble up:
