-
Notifications
You must be signed in to change notification settings - Fork 0
Review TargetExtractor class #10
Comments
Yes, needs TODOs.
Indeed, that's a bug. Need tests for that. (That's the good thing of having this as a separate class.)
Unfortunately, yes; see solid/specification#31
Such as? (I'm not entirely following here.)
Erring on the safe side for now. Meant to be extended and tested accordingly. Needs a note.
For which reasons? DDOS attacks? Good to have a length validation before a regex. |
Trellis uses a query-based URL for ACL resources. I.e. given an LDP resource at
There can be subtle bugs and unexpected assumptions hidden in regular expressions, especially if they are complex. (I have seen a lot of problems in the wild, here). I will add that I use regular expressions all the time in scripts that I run locally. It is just that, for server code, I try to be more strict. |
I'm personally definitely open to that; I actually dislike the identification based on URL pattern. We need to discuss that at solid/specification#31
The alternative being looping over the string? Not sure if less error-prone; both implementations need lots of test cases anyway. Curious for examples in the wild if you have them (low priority). |
In this case, a regex is probably fine. I've seen regex-based host validators that aim for RFC-level compliance, and they are veritable monsters. My personal "don't use regular expression rule" is more of a general rule of thumb: one that has plenty of exceptions in practice. I just try not to reach for a regex out of habit or convenience. |
OK so can you summarize what works need to be done here? |
|
Fix typo and add comments, fix #10
Be aware that this code is pretty brittle and easy to circumvent.
One can URLencode the slash character (
/ => %2F
) which can be used to get around the conditional check, potentially, allowing one to do some filesystem navigation that should be strictly disallowed. Instead ofdecodeURI
, I think you meandecodeURIComponent
.Also, here you are assuming that an ACL resource involves a particular path-based construct. That necessarily excludes any query parameter-based approach. (I don't know if you want to leave that option open?)
It seems that non-alphanumeric characters are disallowed from domain names, including underscores? Is that a necessary restriction? Also, the VALID_HOST regex is pretty simplistic for a concrete class like this. (Just note, that I'm almost always 👎 on using regular expression in server code, so take that with a grain of salt)
The text was updated successfully, but these errors were encountered: