-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix parsing Issue 63 https://github.com/3rd-Eden/useragent/issues/63 #79
base: master
Are you sure you want to change the base?
Conversation
It looks great, but I do wonder about the potential performance impact of the newly introduced function call. |
replaceRegExResult() Given that to find the result for each user-agent many regexp against the UA have to be checked, this will IMHO hardly matter. Accessing "os" or "device" also runs more of the regex matching the UA. So again the time goes to matching those. It would be possible to reduce the inner loop
currently the maximum is $4. If you have an idea how to improve that otherwise, let me know. If performance is a big issue, here is an idea (but not one I will do any work on)
But detecting them must take into account, that the term is not optional (no ? or | ). With a good regex analyser terms can also be found in the middle of the regex. For those a "guardian" could be created (a regex or indexOf) checking if the term in the regex does exist. |
Btw, I thought about avoiding the duplication.
I tried keeping track (in update.js) whih $n are used. But then I ended up getting other issues. I might still look at something where the fallback stays as it is ($2 in 2nd field, $3 in 3rd. $4 in 4th), but is est to empty if that replacement was used before. That way it will not introduce using $1 or $5... except where they are in the downloaded data. |
I added some more to optimize the speed and remove duplicate fields. This also pushed another commit, which is unrelated.
That way one can localize the "Other" value (or in my case set it to undefined. *** Important
Does 2 thinks (sorry one commit)
Makes the 2nd field "$3". |
@3rd-Eden Ping. Is there a reasons this hasn't been merged? Just wondering if this isn't a valid improvement/fix then some feedback could probably move it in the right direction. |
Because I'm enjoying my free time. |
@3rd-Eden apologize, didn't mean to make that sound as snarky as I'm sure it came across. Could this PR be merged though? |
Proposed solution to issue 63.
Tested with nearly 3000 agents. (using updated reg-ex)
1)
used to be "Mac OS X" version: 10
are now "Mac OS X" version: Network
used to be "iOS version: 7 (or 8)
are now "iOS" version: Network
The version is wrong. This is because, I kept the behaviour of using the 2nd, 3rd and 4th match, if there is no "os_v1_replacement" (see file update.js, last argument to getReplacement())
Most of them only have "os_replacement":
Quit a lot of changes. the data is now much better allocated to
Some example for changes (old => new)
The Nokia now has some duplication. Probably due to keeping $2, $3, $4 / see above