Open
Labels
bug, component: bots, component: core, good first issue, help wanted
Description
As a continuation of #2377, we have a regression in parsing invalid URLs. Previously, urllib was much more liberal in processing URLs; now it rejects many more cases.
We use it to sanitize URLs, and html_parser is an example of a bot that relies on the liberal behavior in its tests:

    EXAMPLE_EVENT2['source.url'] = "http://[D] lingvaworld.ru/media/system/css/messg.jpg"

    def test_event_without_split(self):
        self.sysconfig = {"columns": ["time.source", "source.url", "malware.hash.md5",
                                      "source.ip", "__IGNORE__"],
                          "skip_head": True,
                          "default_url_protocol": "http://",
                          "type": "malware-distribution"}
        self.run_bot()
        self.assertMessageEqual(0, EXAMPLE_EVENT2)

In patched Python versions (e.g. 3.11.4), this URL is rejected. We need to either decide against allowing such URLs, or redesign our sanitization.
Temporarily, the test is skipped to unblock other work.
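
To check how a given interpreter treats the URL from the test, a minimal sketch like the following may help. It assumes the sanitization ultimately goes through `urllib.parse.urlsplit` (an assumption, not confirmed here), and `try_urlsplit` is a hypothetical helper name used only for illustration:

```python
from typing import Optional
from urllib.parse import SplitResult, urlsplit


def try_urlsplit(url: str) -> Optional[SplitResult]:
    """Hypothetical helper: return the parsed URL, or None if urllib rejects it."""
    try:
        return urlsplit(url)
    except ValueError:
        # Stricter (patched) Python versions raise ValueError for more inputs.
        return None


# URL taken from EXAMPLE_EVENT2 in the test above.
url = "http://[D] lingvaworld.ru/media/system/css/messg.jpg"
result = try_urlsplit(url)
if result is None:
    print("rejected by this Python version")
else:
    print(f"accepted, netloc={result.netloc!r}")
```

Running this on an unpatched and a patched interpreter should show whether the difference in behavior comes from urllib itself; it does not settle which behavior the sanitization should adopt.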