[Bug]: categories sometimes missing in search (while visible in manual API calls!) #5749
Comments
Nice observation. Thanks for opening this issue, @mnalis! As you note, the warning about gacoffset is not a problem. The problem is this code snippet in our codebase, where we intentionally filter out categories that have a year in them. I don't think we aimed to filter out valid categories like this one. The intention was only to filter out the invalid ones. It looks like this one got filtered out as a side effect. We should explore tweaking the logic to avoid this.
That API call seems to return boolean field named However #1029 is another problem; those categories should not be hidden, but they might not be useful for upload of new pictures. However, I find such overreaching regex as I do not find in that ticket the reasoning to exclude everything resembling a year (closest is to filter out things "in the 1960s", but that has separate regex I would suggest just removing regex While there I'd also update (unless someone else would like to do their first code contribution, I could also make a PR for that -- if people agree with the reasoning above)
Categories with years were really polluting category suggestions before that regexp. Whatever you typed, you would get the same result 20 times with a different year each time, burying all but one useful result. The ship being unduly filtered is a sad side effect, but a quite rare occurrence. App-side filtering could be a solution.
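To make the trade-off concrete, here is a minimal sketch of how a broad "contains a year" filter behaves. The pattern, the function name `containsYear`, and all category names below are made up for illustration; the app's actual regex may differ.

```kotlin
// Hypothetical stand-in for the app's year filter; the real pattern may differ.
val broadYearPattern = Regex("""\b(1[0-9]{3}|20[0-9]{2})\b""")

// True when the category name contains something that looks like a year.
fun containsYear(category: String): Boolean =
    broadYearPattern.containsMatchIn(category)

fun main() {
    // Made-up category names, used only to illustrate the behaviour.
    val suggestions = listOf(
        "Example events in 2017",   // repetitive per-year category
        "Example events in 2018",   // repetitive per-year category
        "Example (ship, 1912)",     // legitimate category that names a build year
        "Example trees"
    )
    // Dropping everything that contains a year also drops the ship.
    val kept = suggestions.filterNot(::containsYear)
    println(kept) // prints [Example trees]
}
```

The per-year duplicates disappear as intended, but so does any category whose name carries a year for a legitimate reason.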
@mnalis could you please assign me for this task
@HrithikPadavala We have not yet reached a conclusion about what strategy to use. Would you mind trying another bug? Thanks! :-)
@nicolas-raoul sure, thanks
@HrithikPadavala There are some issues which we explicitly need for the upcoming 5.0.2 release. You could check them here: https://github.com/commons-app/apps-android-commons/milestone/11 Feel free to work on something that you may find interesting there 🙂
Yeah. That is unfortunate. Could we possibly discuss with the Commons community to explore the feasibility of turning the categories into hidden ones? That would help us weed out the filtering we do at our end.
I could understand this.
I think that's because the year filtering has been in place since before that ticket. The relevant discussion can be seen in #47.
It seems to me like #47 only focused on categories other than "in the 1990s" etc. So, it's tough to conclude whether removing the year filter altogether would be an ideal solution. Let's discuss to see if we can come up with a better solution to this problem.
That's a good suggestion.
I think for now you could feel free to raise a separate PR for updating the
Perhaps. I'm not sure where to request that though, and it would not really remove the need to filter on our side (as I noted above, not all problematic categories should be hidden) or simplify the code. It would just reduce the number of regexes slightly. Given that regexes are something I can write well at least, it's probably not a priority (unless I'm missing something).
I would disagree that it is rare. For example, it includes at least the following categories:
and probably many more (i.e. any human-built thing that was built to survive more than a few years).
Ah, thanks for the pointer, that reference helps a lot! While I can see the reasoning, the issue is that such a filter then leaves out way too much.
Done that part in #5761
It's not a blocker certainly. I was mostly suggesting that as a way to start discussions on this so that we could at some future point of time reduce the amount of regex we need to filter out categories. It does take time to arrive at a consensus with the community 🙂
Ah. That's a big list. Thanks for taking the time to mention the potential categories that we're filtering out accidentally as a consequence of that regex. It does seem like we should do something to improve the situation.
That's a good idea. We could certainly try this!
This is also a fine idea but I believe the sorting down approach would be a bit more tangible as a quick remedy to the situation. If anyone is willing to achieve it this way, feel free to chime in.
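As a rough illustration of the sorting-down approach, the sketch below keeps every category but moves those containing a year to the end of the list. The pattern, the function name `sortYearCategoriesLast`, and the category names are assumptions for illustration, not code from the app.

```kotlin
// Hypothetical year detector, only for illustration; not the app's actual regex.
private val yearPattern = Regex("""\b(1[0-9]{3}|20[0-9]{2})\b""")

// Keeps all categories but pushes the ones containing a year to the bottom.
// sortedBy is stable, so the original order is preserved within each group
// (false sorts before true).
fun sortYearCategoriesLast(categories: List<String>): List<String> =
    categories.sortedBy { yearPattern.containsMatchIn(it) }

fun main() {
    // Made-up category names.
    val suggestions = listOf(
        "Example events in 2017",
        "Example (ship, 1912)",
        "Example trees"
    )
    println(sortYearCategoriesLast(suggestions))
    // prints [Example trees, Example events in 2017, Example (ship, 1912)]
}
```

Nothing is hidden with this variant, so a category like the ship stays reachable, just lower in the suggestions.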
Great. Thanks for that! I've left a comment there. Kindly check it. Once that's resolved, it should be good to go.
@mnalis I just discovered that there's this long-standing PR of mine, #4902, which actually proposes to stop filtering out all categories that have a year in them. There's already some related discussion in issue #4901 too. Specifically, there's some consensus on what we should and should not filter out [ref]. So, I'm starting to wonder whether we could just take forward a combination of your changes (#5761) and the one proposed in #4902 in the next release and see how it goes 🤔
@sivaraam sounds great to me; I'm all for it. 🚀 If we find out that a few more unneeded categories start to pop up, we can tune the regexes to filter out only those problematic/spammy categories, instead of everything containing a year.
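A narrower filter along those lines might look like the sketch below, which targets only decade phrases such as "in the 1960s" and leaves other year-bearing names alone. The pattern and the name `isDecadeCategory` are assumptions for illustration; the regexes actually used in the app may be different.

```kotlin
// Assumed pattern for decade phrases such as "in the 1960s";
// this is not the app's actual regex.
val decadePhrase = Regex("""\bin the \d{3}0s\b""", RegexOption.IGNORE_CASE)

// True only for names that contain a decade phrase, not for any year at all.
fun isDecadeCategory(category: String): Boolean =
    decadePhrase.containsMatchIn(category)

fun main() {
    // Made-up category names.
    println(isDecadeCategory("Example photos in the 1960s")) // prints true
    println(isDecadeCategory("Example (ship, 1912)"))        // prints false
}
```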
Summary
While doing the research for #3179 (comment) I've noticed that manually run API calls do not always return the same results that I see in the Commons app.
According to @sivaraam, this API should be used for searching in such a case:
apps-android-commons/app/src/main/java/fr/free/nrw/commons/category/CategoryInterface.kt
Lines 34 to 39 in 06e25a0
but when I call that API manually (see "Expected behaviour" section), I get some elements missing (i.e. 2 results instead of 3). Could someone help debug this?
Probably unrelated, but the API also returns the warning "Unrecognized parameter: gacoffset.", so it seems like we shouldn't be using that parameter anymore. It does not seem like it should be the cause of this issue, though.
Steps to reproduce
Olea (s
Expected behaviour
3 categories displayed:
as that is what API call seems to return when run manually:
https://commons.wikimedia.org/w/api.php?format=json&action=query&formatversion=2&generator=allcategories&prop=categoryinfo|description|pageimages&piprop=thumbnail&pithumbsize=70&gaclimit=25&gacoffset=0&gacprefix=Olea+(s
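For anyone who wants to repeat the manual call with other prefixes, here is a small sketch that rebuilds the same query string. The function name `buildCategorySearchUrl` is hypothetical and is not how the app issues the request. `gacoffset` is left out on purpose, since the API reports it as unrecognized.

```kotlin
import java.net.URLEncoder

// Hypothetical helper that rebuilds the manual query above for a given prefix.
// Parameter names and values are copied from the URL in this report,
// except gacoffset, which the API reports as unrecognized.
fun buildCategorySearchUrl(prefix: String, limit: Int = 25): String {
    val params = linkedMapOf(
        "format" to "json",
        "action" to "query",
        "formatversion" to "2",
        "generator" to "allcategories",
        "prop" to "categoryinfo|description|pageimages",
        "piprop" to "thumbnail",
        "pithumbsize" to "70",
        "gaclimit" to limit.toString(),
        "gacprefix" to prefix
    )
    // Percent-encode each value; spaces become '+', as in the original URL.
    val query = params.entries.joinToString("&") { (key, value) ->
        "$key=${URLEncoder.encode(value, "UTF-8")}"
    }
    return "https://commons.wikimedia.org/w/api.php?$query"
}

fun main() {
    println(buildCategorySearchUrl("Olea (s"))
}
```

Comparing the titles returned by this URL with the suggestions shown in the app should reveal which entries are dropped on the app side.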
Actual behaviour
Only 2 categories displayed (See screenshot):
Device name
Samsung Galaxy S23+
Android version
Android 14 (OneUI 6.1)
Commons app version
5.0.1~af028cbdd (latest F-droid)
Device logs
nothing worthwhile (i.e. nothing mentioning the search or API calls or "Olea"...)
Screen-shots
Would you like to work on the issue?
None