Commit e693b67

phdavis1027 authored and Mingun committed
change: allow unescape_with to override standard replacements
This swaps the order in which the main loop of that function consults the built-in `named_entity` lookup and the user-provided `resolve_entity` closure: the user-provided closure is now checked first. This shouldn't be a breaking change unless existing client code already supplied replacements for the predefined entities; those replacements were no-ops under the previous implementation, so any fix should be trivial.
1 parent 99a64a8 commit e693b67
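
The reordering described in the commit message can be pictured with a small stand-alone model. This is only a sketch, not the crate's code: `predefined`, `Order`, and `resolve` are hypothetical names introduced here, with `predefined` standing in for the crate's internal `named_entity` lookup, and character references are left out.

```rust
/// The five predefined XML entities.
/// Hypothetical stand-in for the crate's internal `named_entity` function.
fn predefined(name: &str) -> Option<&'static str> {
    match name {
        "lt" => Some("<"),
        "gt" => Some(">"),
        "amp" => Some("&"),
        "apos" => Some("'"),
        "quot" => Some("\""),
        _ => None,
    }
}

/// Which lookup is consulted first for a named entity (hypothetical helper type).
#[derive(Clone, Copy)]
enum Order {
    /// Models the behavior before this commit.
    PredefinedFirst,
    /// Models the behavior after this commit.
    ResolverFirst,
}

/// Resolves one entity name (the text between `&` and `;`).
/// Returns `None` when neither lookup knows the name.
fn resolve<'e>(
    name: &str,
    order: Order,
    resolver: impl Fn(&str) -> Option<&'e str>,
) -> Option<&'e str> {
    match order {
        Order::ResolverFirst => match resolver(name) {
            Some(value) => Some(value),
            None => predefined(name),
        },
        Order::PredefinedFirst => match predefined(name) {
            Some(value) => Some(value),
            None => resolver(name),
        },
    }
}
```

In this model, a resolver that maps `lt` to `FOO` wins under `Order::ResolverFirst`, while under `Order::PredefinedFirst` the same mapping is never consulted and `lt` still yields `<`. That matches the commit message's remark that such overrides were no-ops before the change.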

2 files changed, +36 -2 lines changed

Changelog.md

Lines changed: 3 additions & 0 deletions
@@ -34,6 +34,8 @@ to get an offset of the error position. For `SyntaxError`s the range
   it can handle every attribute that does not match existing cases within an enum variant.
 - [#722]: Allow to pass owned strings to `Writer::create_element`. This is breaking change!
 - [#275]: Added `ElementWriter::new_line()` which enables pretty printing elements with multiple attributes.
+- [#734]: Allow to override resolution of predefined entities (`lt`, `gt`, `apos`, `quot`, `amp`)
+  in `unescape_with` family of methods.
 
 ### Bug Fixes
 
@@ -81,6 +83,7 @@ to get an offset of the error position. For `SyntaxError`s the range
 [#704]: https://github.com/tafia/quick-xml/pull/704
 [#705]: https://github.com/tafia/quick-xml/pull/705
 [#722]: https://github.com/tafia/quick-xml/pull/722
+[#734]: https://github.com/tafia/quick-xml/pull/734
 [#738]: https://github.com/tafia/quick-xml/pull/738
 
 
src/escape.rs

Lines changed: 33 additions & 2 deletions
@@ -192,8 +192,39 @@ pub fn unescape(raw: &str) -> Result<Cow<str>, EscapeError> {
 ///
 /// If feature [`escape-html`] is enabled, then recognizes all [HTML5 escapes].
 ///
+/// Predefined entities will be resolved _after_ trying to resolve with `resolve_entity`,
+/// which allows you to override default behavior, which is required in some XML dialects.
+///
+/// Character references (`&#hh;`) cannot be overridden; they are resolved before
+/// calling `resolve_entity`.
+///
+/// Note that entities will not be resolved recursively. In order to satisfy the
+/// XML [requirements] you should unescape nested entities by yourself.
+///
+/// # Example
+///
+/// ```
+/// # use quick_xml::escape::unescape_with;
+/// # use pretty_assertions::assert_eq;
+/// let override_named_entities = |entity: &str| match entity {
+///     // Override standard entities
+///     "lt" => Some("FOO"),
+///     "gt" => Some("BAR"),
+///     // Resolve custom entities
+///     "baz" => Some("&lt;"),
+///     // All other entities produce EscapeError::UnrecognizedSymbol error
+///     _ => None,
+/// };
+///
+/// assert_eq!(
+///     unescape_with("&amp;&lt;test&gt;&baz;", override_named_entities).unwrap(),
+///     "&FOOtestBAR&lt;"
+/// );
+/// ```
+///
 /// [`escape-html`]: ../index.html#escape-html
 /// [HTML5 escapes]: https://dev.w3.org/html5/html-author/charref
+/// [requirements]: https://www.w3.org/TR/xml11/#intern-replacement
 pub fn unescape_with<'input, 'entity, F>(
     raw: &'input str,
     mut resolve_entity: F,
@@ -221,10 +252,10 @@ where
     if let Some(entity) = pat.strip_prefix('#') {
         let codepoint = parse_number(entity, start..end)?;
         unescaped.push_str(codepoint.encode_utf8(&mut [0u8; 4]));
-    } else if let Some(value) = named_entity(pat) {
-        unescaped.push_str(value);
     } else if let Some(value) = resolve_entity(pat) {
         unescaped.push_str(value);
+    } else if let Some(value) = named_entity(pat) {
+        unescaped.push_str(value);
     } else {
         return Err(EscapeError::UnrecognizedSymbol(
             start + 1..end,
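
The new doc comment notes that entities are not resolved recursively and that nested entities have to be unescaped by the caller. One possible way to do that is to expand the custom entity definitions against each other up front, so that a non-recursive resolver already returns fully expanded text. The sketch below is a simplified illustration using only the standard library. `expand_once` and `flatten` are hypothetical helpers, not part of quick-xml. It assumes values reference only other custom entities, and it does not handle character references or predefined entities.

```rust
use std::collections::HashMap;

/// Expands every `&name;` reference in `text` exactly once using `table`.
/// Returns `None` for an unknown entity or a reference without a closing `;`.
fn expand_once(text: &str, table: &HashMap<String, String>) -> Option<String> {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(start) = rest.find('&') {
        // Copy the literal text before the reference.
        out.push_str(&rest[..start]);
        let after = &rest[start + 1..];
        let end = after.find(';')?;
        // Substitute the replacement text as-is (no recursion here).
        out.push_str(table.get(&after[..end])?);
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Some(out)
}

/// Repeatedly expands the definitions against each other until no value
/// contains a reference any more. Returns `None` if that does not happen
/// within `max_depth` passes (for example, because of a cycle) or if a
/// value refers to an undefined entity.
fn flatten(
    defs: &HashMap<String, String>,
    max_depth: usize,
) -> Option<HashMap<String, String>> {
    let mut current = defs.clone();
    for _ in 0..=max_depth {
        if current.values().all(|value| !value.contains('&')) {
            return Some(current);
        }
        let mut next = HashMap::with_capacity(current.len());
        for (name, value) in &current {
            next.insert(name.clone(), expand_once(value, &current)?);
        }
        current = next;
    }
    None
}
```

Given definitions `inner = "world"` and `outer = "hello &inner;"`, `flatten` yields `outer = "hello world"`. A closure such as `|name| flat.get(name).map(String::as_str)` over the flattened table should then have the shape `unescape_with` expects from `resolve_entity`, though that pairing is an assumption and is not exercised here.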
