-
Notifications
You must be signed in to change notification settings - Fork 70
Rename users clean or prune -- use aggregated user statistics #901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ange in utility functions
erzhtor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides, a comment regarding command (and flag) backward compatibility, PR looks good!
| Examples: | ||
| $ src users clean -days 182 | ||
| $ src users prune -days 182 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are such command/API changes usually made in src cli? Should this be a major version increase or should we keep backward compatibility leaving both clean and prune commands and marking one deprecated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost all the functionality changes in terms of the command run locally, the old clean command had some big timeout issues see #848. The name change from clean to prune was mostly just a style change. I don't think this command is being widely used so I think the change is alright. To my knowledge no one has put this on a cronjob or anything like that yet.
mrnugget
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this different than before in its performance characteristics? You're still fetching all users in the table, without doing any filtering. I don't see any improvement here?
| site { | ||
| users { | ||
| nodes { | ||
| username | ||
| siteAdmin | ||
| lastActiveAt | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intendation is off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is hitting the aggregated_user_statistics table, and the value of lastActiveAt doesn't need to access the event_logs table the same way usageStatistics used to. Try it out here, its much more performant.
aggregated_user_statistics table is updating in the background, so this isn't accessing the event logs table at the time of the request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/sourcegraph/src-cli/pull/903/files Indentation fix
|
Okay, so we're removing the N+1 error, got it. That's only 1 of 2 possible performance issues, the other is that we transfer all users over the wire. (I don't think we want that on an instance with 10k+ users). But I just checked and it turns out that the code here doesn't work: Here, run the query (note that I added that means you're only cleaning up the first 100 users. |
|
Ah, good catch @mrnugget will open a follow up on this to paginate and maybe implement a filter similar to what we were doing here: https://github.com/sourcegraph/sourcegraph/pull/43370 |
* added new type to process siteUser grahQL type -- still needing to change in utility functions * code compiling but not registering users to remove correctly * correctly structured json response * remove debugging log * rename command src users clean to src users prune * rename null flag * changelog entry * correct json formating * correct command name in users general help * style correct users.go
This PR refactors use of the
usageStatisticslastActiveUsagetable which caused the command to timeout during execution on instances with a sufficiently large number of users. See #848Instead the
aggregated_user_statisticstable is used via thesitegraphQL endpointThis has two benefits
lastActiveAtfrom thesitegraphQL endpoint is much more efficient thanusageStatistics, and eliminates timeout concerns on many users instancesaggregated_user_statisticspersists the datelastActiveAteliminating the concern ofnulllastActiveUsagevalues which are misleading.lastActiveUsageis computed by reference to theevent_logstable which only persists event entries for 93 days, when 93 days pass if a user had no entries inevent_logslastActiveUsagewas set tonull-- previouslysrc users cleaninterpreted anullvalue as a user never having used the instance. A flag-remove-null-usersis provided to removenulllastActiveAt` usersThis PR also changes the name of the
src users cleancommand renaming it tosrc users prunewhich follows thedockerandkubectlcleanup command conventionCloses #848
Test plan
Tested on local machine