-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix sanitizer config - multiple rules #11133
Conversation
Signed-off-by: Alexander Scheel <alexander.m.scheel@gmail.com>
Resolves: 9888 Signed-off-by: Alexander Scheel <alexander.m.scheel@gmail.com>
b216247
to
4042adb
Compare
Codecov Report
@@ Coverage Diff @@
## master #11133 +/- ##
=======================================
Coverage 43.47% 43.47%
=======================================
Files 600 600
Lines 85108 85104 -4
=======================================
Hits 37002 37002
+ Misses 43537 43534 -3
+ Partials 4569 4568 -1
Continue to review full report at Codecov.
|
first thought: since #9075 got into v1.11 -> we will break existing config based on that app.ini settings |
Maybe it could be possible to also support old syntax for backward compatibility |
I think the point is that the old config doesn't work well -- it is best we migrate completely rather than support what doesn't work. In particular, this won't work: [markup.sanitizer]
ELEMENT = a
ALLOW_ATTR = target
REGEXP = $1
ELEMENT = a
ALLOW_ATTR = rel
REGEXP = $2 Because [markup.sanitizer]
ELEMENT = a
ALLOW_ATTR = target
REGEXP = $1
ELEMENT = a
ALLOW_ATTR = rel
REGEXP = $2
ELEMENT = img
ALLOW_ATTR = src
REGEXP = $3 Our data is:
So we don't know if Rewriting the ini file (with There's then two working scenarios:
With this PR currently, we support number two just fine. That leaves number one -- since it borders closely on what doesn't work, I'm inclined to require manual migration. Thoughts? |
Yeah the old system just doesn't work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments as above otherwise approve
Signed-off-by: Alexander Scheel <alexander.m.scheel@gmail.com>
Thanks @zeripath for the docs review. I wrote the docs first (and originally was thinking numbered sections) but then I looked at the code and figured numbering was a needless restriction. I've added an example of what was broken with the old config format to the cheat sheet as well. |
custom/conf/app.ini.sample
Outdated
; The following keys can be used multiple times to define sanitation policy rules. | ||
[markup.sanitizer.1] | ||
; The following keys can appear once to define a sanitation policy rule. | ||
; This section can appear with an incremenented number to define multiple rules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
; This section can appear with an incremenented number to define multiple rules. | |
; This section can appear with an incrementing numbers or any distinct alphanumeric string to define multiple rules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think "this section can appear with an incrementing numbers to define multiple rules" is correct English either.
I've made it just "This section can appear again with a unique alphanumeric string to define multiple rules".
This was changed because the implementation with the ini parser used was flawed; the following configs were indistinguishable after parsing: | ||
|
||
```ini | ||
[markup.sanitizer] | ||
ELEMENT = a | ||
ALLOW_ATTR = target | ||
REGEXP = $1 | ||
ELEMENT = a | ||
ALLOW_ATTR = rel | ||
REGEXP = $2 | ||
ELEMENT = img | ||
ALLOW_ATTR = src | ||
REGEXP = $3 | ||
``` | ||
|
||
and | ||
|
||
```ini | ||
[markup.sanitizer] | ||
ELEMENT = a | ||
ALLOW_ATTR = target | ||
REGEXP = $1 | ||
ELEMENT = img | ||
ALLOW_ATTR = rel | ||
REGEXP = $2 | ||
ELEMENT = img | ||
ALLOW_ATTR = src | ||
REGEXP = $3 | ||
``` | ||
|
||
Because of limitations in the ini library, we are unable to automatically migrate configurations. | ||
|
||
We will still parse the first rule from a `[markup.sanitizer]` section if present, but multiple rules must be manually migrated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was changed because the implementation with the ini parser used was flawed; the following configs were indistinguishable after parsing: | |
```ini | |
[markup.sanitizer] | |
ELEMENT = a | |
ALLOW_ATTR = target | |
REGEXP = $1 | |
ELEMENT = a | |
ALLOW_ATTR = rel | |
REGEXP = $2 | |
ELEMENT = img | |
ALLOW_ATTR = src | |
REGEXP = $3 | |
``` | |
and | |
```ini | |
[markup.sanitizer] | |
ELEMENT = a | |
ALLOW_ATTR = target | |
REGEXP = $1 | |
ELEMENT = img | |
ALLOW_ATTR = rel | |
REGEXP = $2 | |
ELEMENT = img | |
ALLOW_ATTR = src | |
REGEXP = $3 | |
``` | |
Because of limitations in the ini library, we are unable to automatically migrate configurations. | |
We will still parse the first rule from a `[markup.sanitizer]` section if present, but multiple rules must be manually migrated. |
I'd rather remove this section. If you feel like an explanation is needed, please reference the version where it was introduced and link the relevant issue or this PR (e.g. "this was changed from the original implementation in 1.11 due to severe limitations; see #1234
"). But we tend to avoid making explicit version references in the docs, so maybe just remove it and let the Blog speak. 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a blog entry is fine with @zeripath I could write such, but I must confess I've never looked at the blog entries and mostly read documentation. :-) On the breaking section of release notes, there's a link to the PR, but not to the blog entry -- do none of those have blog entries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... So this bit could be dropped and instead placed in the PR description - during release we can then precis this and put it into the blog post.
|
||
**Note**: The above section numbering policy is new; previously the section was `[markup.sanitizer]` and keys could be redefined. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
**Note**: The above section numbering policy is new; previously the section was `[markup.sanitizer]` and keys could be redefined. |
Same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had specifically requested some information as this is a breaking change requiring change to config (although the previous config simply does not work.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid duplicating too much info, I've added a reference to the cheat sheet here but explicitly left this line. Its either that or add a reference to this pull request (in both places) if we don't want it in the documentation -- but to someone wanting to migrate, it isn't clear what the problem is just by looking at our extended discussion here. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about:
**Note**: The above section numbering policy is new; previously the section was `[markup.sanitizer]` and keys could be redefined. | |
**Note**: Prior to Gitea 1.12 there was a single `markup.sanitiser` section and keys could be redefined, however, there were significant problems with this method of configuration necessitating this change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had specifically requested some information as this is a breaking change requiring change to config (although the previous config simply does not work.)
Sorry, @zeripath. I missed your comment, which was already resolved.
Signed-off-by: Alexander Scheel <alexander.m.scheel@gmail.com>
custom/conf/app.ini.sample
Outdated
@@ -965,8 +965,8 @@ SHOW_FOOTER_TEMPLATE_LOAD_TIME = true | |||
|
|||
[markup.sanitizer.1] | |||
; The following keys can appear once to define a sanitation policy rule. | |||
; This section can appear with an incremenented number to define multiple rules. | |||
; e.g., [markup.sanitizer.1] -> [markup.sanitizer.2] | |||
; This section can appear again with a unique alphanmuric string to define multiple rules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
; This section can appear again with a unique alphanmuric string to define multiple rules. | |
; This section can appear multiple times by adding a unique alphanumeric suffix to define multiple rules. |
|
||
**Note**: The above section numbering policy is new; previously the section was `[markup.sanitizer]` and keys could be redefined. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had specifically requested some information as this is a breaking change requiring change to config (although the previous config simply does not work.)
Sorry, @zeripath. I missed your comment, which was already resolved.
Signed-off-by: Andrew Thornton <art27@cantab.net>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one typo. Otherwise LG-TM.
Co-Authored-By: guillep2k <18600385+guillep2k@users.noreply.github.com>
make lg-tm work |
In go-gitea#9888, it was reported that my earlier pull request go-gitea#9075 didn't quite function as expected. I was quite hopeful the `ValuesWithShadow()` worked as expected (and, I thought my testing showed it did) but I guess not. @zeripath proposed an alternative syntax which I like: ```ini [markup.sanitizer.1] ELEMENT=a ALLOW_ATTR=target REGEXP=something [markup.sanitizer.2] ELEMENT=a ALLOW_ATTR=target REGEXP=something ``` This was quite easy to adopt into the existing code. I've done so in a semi-backwards-compatible manner: - The value from `.Value()` is used for each element. - We parse `[markup.sanitizer]` and all `[markup.sanitizer.*]` sections and add them as rules. This means that existing configs will load one rule (not all rules). It also means people can use string identifiers (`[markup.sanitiser.KaTeX]`) if they prefer, instead of numbered ones. Co-authored-by: Andrew Thornton <art27@cantab.net> Co-authored-by: guillep2k <18600385+guillep2k@users.noreply.github.com>
In #9888, it was reported that my earlier pull request #9075 didn't quite function as expected. I was quite hopeful the
ValuesWithShadow()
worked as expected (and, I thought my testing showed it did) but I guess not. @zeripath proposed an alternative syntax which I like:This was quite easy to adopt into the existing code. I've done so in a semi-backwards-compatible manner:
.Value()
is used for each element.[markup.sanitizer]
and all[markup.sanitizer.*]
sections and add them as rules.This means that existing configs will load one rule (not all rules). It also means people can use string identifiers (
[markup.sanitiser.KaTeX]
) if they prefer, instead of numbered ones.