Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve listing performance by using go-git #6478

Merged
merged 39 commits into from
Apr 19, 2019

Conversation

filipnavara
Copy link
Contributor

Contributes to #491.

This is rebased version of my changes which switch most of the read-only repository access logic to use go-git. We have been running version of Gitea with these changes in production environment for several months so it received some stress testing.

It's becoming harder to maintain this as separate branch so it would be nice if someone can take it over and upstream the changes.

Updating the go-git.v4 vendor module should bring some more performance improvements.

@filipnavara
Copy link
Contributor Author

Not sure how to update the vendor modules. make vendor followed by make test-vendor still prints a lot of diffs on my local machine.

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Apr 1, 2019
@lunny lunny added the type/enhancement An improvement of existing functionality label Apr 1, 2019
models/repo_branch.go Outdated Show resolved Hide resolved
modules/git/repo_tag.go Outdated Show resolved Hide resolved
@typeless
Copy link
Contributor

typeless commented Apr 3, 2019

Not sure how to update the vendor modules. make vendor followed by make test-vendor still prints a lot of diffs on my local machine.

@filipnavara you have to commit the following changes after make vendor.

  • go.mod
  • vendor/modules.txt

After that, make test-vendor will stop complaining.
I fetched this PR and tested with the above steps.

@filipnavara
Copy link
Contributor Author

Thanks, I will try that.

@filipnavara
Copy link
Contributor Author

filipnavara commented Apr 4, 2019

@typeless Unfortunately that didn't work for me. I suspect it has to do something with my local setup. I updated Go, but that alone didn't help.

make vendor generates a ton of changes locally. The make test-vendor target triggers vendor target too. Keeping only the two files doesn't make make vendor any better.

@typeless
Copy link
Contributor

typeless commented Apr 5, 2019

@filipnavara
I uploaded my changes at https://github.com/typeless/gitea/tree/6478. It passes make test-vendor.

@filipnavara filipnavara changed the title WIP: Improve listing performance by using go-git Improve listing performance by using go-git Apr 5, 2019
@codecov-io
Copy link

codecov-io commented Apr 5, 2019

Codecov Report

Merging #6478 into master will decrease coverage by 0.05%.
The diff coverage is 65.54%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6478      +/-   ##
==========================================
- Coverage    41.4%   41.35%   -0.06%     
==========================================
  Files         415      415              
  Lines       56382    56337      -45     
==========================================
- Hits        23344    23296      -48     
- Misses      29906    29921      +15     
+ Partials     3132     3120      -12
Impacted Files Coverage Δ
models/error.go 32.51% <ø> (+0.13%) ⬆️
modules/git/signature.go 0% <ø> (-64%) ⬇️
modules/git/sha1.go 73.91% <ø> (+23.91%) ⬆️
modules/context/repo.go 56.97% <0%> (-0.12%) ⬇️
routers/repo/issue.go 36.72% <0%> (-0.09%) ⬇️
routers/repo/setting_protected_branch.go 40.23% <0%> (ø) ⬆️
routers/repo/commit.go 24.69% <0%> (-0.11%) ⬇️
modules/repofiles/update.go 44.73% <0%> (ø) ⬆️
routers/api/v1/repo/branch.go 43.75% <0%> (ø) ⬆️
models/repo_branch.go 53.42% <100%> (-1.55%) ⬇️
... and 31 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 19ec260...c036283. Read the comment docs.

modules/git/commit_info.go Show resolved Hide resolved
@lafriks lafriks added this to the 1.9.0 milestone Apr 8, 2019
models/repo_branch.go Outdated Show resolved Hide resolved
Signed-off-by: Filip Navara <navara@emclient.com>
Signed-off-by: Filip Navara <navara@emclient.com>
Signed-off-by: Filip Navara <navara@emclient.com>
Signed-off-by: Filip Navara <navara@emclient.com>
…t and never modified after.

Signed-off-by: Filip Navara <navara@emclient.com>
…ing (.editorconfig code path is still hit).

Signed-off-by: Filip Navara <navara@emclient.com>
Signed-off-by: Filip Navara <navara@emclient.com>
Signed-off-by: Filip Navara <navara@emclient.com>
Signed-off-by: Filip Navara <navara@emclient.com>
Signed-off-by: Filip Navara <navara@emclient.com>
Signed-off-by: Filip Navara <navara@emclient.com>
Signed-off-by: Filip Navara <navara@emclient.com>
Signed-off-by: Filip Navara <navara@emclient.com>
… thus all the loaded packfile indexes).

Signed-off-by: Filip Navara <navara@emclient.com>
@filipnavara
Copy link
Contributor Author

@lunny Can you give me a link to the Git repository? I'll try to reproduce it locally.

@lunny
Copy link
Member

lunny commented Apr 19, 2019

@lunny
Copy link
Member

lunny commented Apr 19, 2019

I think maybe you missed commits counting cache in this PR.

@filipnavara
Copy link
Contributor Author

I do get pretty abysmal times on that repository on Windows (~12-13 seconds). I'll investigate it.

@filipnavara
Copy link
Contributor Author

I assume the regression you are seeing happens because I dropped supported for the cache introduced in #6045. It's relatively trivial to bring it back and I hope to test and submit a patch soon.

@lunny
Copy link
Member

lunny commented Apr 19, 2019

@filipnavara I don't think so because #6045 is not merged yet. As I know, for a big repo. Commits counting and Last Commit of every file or folder will spend most time when loading repo home page. I think maybe because you missed Commits counting which introduced on #2774

@filipnavara
Copy link
Contributor Author

@lunny Right. I did remove the usage of LastCommitCache in GetCommitsInfo though. I'll submit a patch to add it back, but it could not explain this.

I am pretty sure I didn't mess with the commit counting. I'll profile it and figure out what is wrong.

@filipnavara
Copy link
Contributor Author

Just a small update: The profile clearly shows the time is spent in GetCommitsInfo. It seems to hit some degenerate path in either my algorithm or go-git. I keep working on finding the underlying issue, but this may take some time to understand what is going on.

@lafriks
Copy link
Member

lafriks commented Apr 19, 2019

@filipnavara thanks, you are doing great work! Really appreciate it 👍

@filipnavara
Copy link
Contributor Author

filipnavara commented Apr 19, 2019

Thanks for all the positive comments! I managed to find a flaw in my algorithm that severly degraded performance on repositories with huge number of merge commits. I'm testing a fix now and hope to submit it soon.

Damn, not too soon though. There's still an issue with some 8 year old file in the root of the repository that is slow.

@filipnavara
Copy link
Contributor Author

filipnavara commented Apr 20, 2019

Update: I am still tracking down the regression, but I managed to close in on it a bit.

Draft PR #6686 significantly improves the speed on repositories containing massive amount of merge commits. The first commit in the PR fixes one correctness issue and speeds up the loads for cases where there are files with mostly recent commits. If I delete the ".yardopts" file (9 year old file in the root directory) I can load the listing in 2 seconds (compared to over 4 seconds before this PR). Second commit in the PR introduces optimization for reducing the search space in case of many merge commits. On stock Rails repository together it brings over 30% improvement for the listing times.

Second problem is actually something I tried to tackle before and that's enormous amount of I/O in go-git when loading relatively small objects. This was mostly caused by unnecessary seeks throwing away buffered input and then re-reading it again. Due to some correctness issues my previous fix didn't apply to delta-encoded objects. Unfortunately this seems to be hit very hard on traversing the tree/commit history. Naïve fix by changing this line to allow deltas reduces the loading time by another 40% to around 5s.

@filipnavara
Copy link
Contributor Author

filipnavara commented Apr 20, 2019

I have a WIP branch with few go-git fixes at https://github.com/filipnavara/gitea/tree/go-git-perf-fixes (directly in the vendor module directory for now). It contains PR #6686 + some optimizations to go-git. On my machine that's enough to push it to around 4 - 4.5 seconds on the root of the Rails repository. There's still room for some improvement but I wanted to address the low-hanging fruit first.

For the root directory it's now similar to the performance I got before #6478. That may not necessarily be good enough since invoking git is super slow on Windows by comparison. For subdirectories it's significantly faster.

Here are results from the profiler for the above branch and for a state before PR #6478: https://gist.github.com/filipnavara/0777cdeeba07f469314e4c05185fa29d

Notice that the expensive listing operation on before- profile is over 7 seconds! Yet I am fairly certain that the displayed page load time was around 4.6s. It may be some quirk or a clever parallelization I missed.

@filipnavara
Copy link
Contributor Author

filipnavara commented Apr 20, 2019

Notice that the expensive listing operation on before- profile is over 7 seconds! Yet I am fairly certain that the displayed page load time was around 4.6s. It may be some quirk or a clever parallelization I missed.

That's actually it. I was actually comparing apples to oranges. The previous implementation did parallelization on the maximum number of CPUs which is how it managed to achieve the speed. While that's beneficial in some cases (like a site with low load) the trade off comes with higher disk access and lower actual throughput under load.

Obviously I can do that with the new algorithm too by partitioning the set of input entries into groups and then for each of these groups running the history algorithm in parallel. It may yield better results in some cases but it's unlikely to be a good idea under higher load.

Long term there are much better approaches to solve it. Internally we used to run another fork of Gitea+go-git that implemented support for git commit-graph files and custom bloom filter format. While the commit-graph work is now part of official git, the bloom filter format is not and as such it was not viable to maintain. It was inspired by the work done for Azure DevOps (https://devblogs.microsoft.com/devops/super-charging-the-git-commit-graph-iv-bloom-filters/) and I highly recommend reading the whole series of the articles. Having this metadata allowed us to keep the load times almost always under one second for nearly any shape of repository.

@filipnavara
Copy link
Contributor Author

go-git fixes are submitted.

lunny pushed a commit that referenced this pull request Apr 21, 2019
…pository. (#6686)

* Fix flaw in the commit history lookup that caused unnecessary traversal
when the repository contains a lot of merge commits.

Also return the merge commit as the changed one if the file or
directory was changed as part of the merge, eg. through conflict
resolution.

Signed-off-by: Filip Navara <filip.navara@gmail.com>

* Perform history simplification. If a file is present on multiple parents
in a merge commit follow only the first parent.
@filipnavara
Copy link
Contributor Author

filipnavara commented Apr 21, 2019

I'm sure I am already boring everyone to hell with all the comments. :-)

Turns out that I was not the only person who noticed the performance shortcomings in go-git and started fixing them. I hope the pending PRs will get processed in timely manner. My latest measurements were with src-d/go-git#1119, src-d/go-git#1121, src-d/go-git#1123, src-d/go-git#1124, src-d/go-git#1125, src-d/go-git#1126 applied (for convenience I pushed the merged version to https://github.com/filipnavara/go-git/tree/wip-perf). Together they allow me to load the root of Rails repository in little less than 3 seconds (more than 50% improvement compared to currently released version).

With pregenerated commit graph file (#6701) it drops to 1.7 seconds.

I was actually worried that I caused some serious performance regression for some scenarios and that this PR, or part of it, would have to be reverted. I'm now fairly certain that should not be necessary. Thanks for feedback from everone.

@lunny
Copy link
Member

lunny commented Apr 23, 2019

@filipnavara I don't fell boring about what you are doing, but very thanks for your effort on that. I will try https://github.com/filipnavara/go-git/tree/wip-perf ASAP.

@lunny
Copy link
Member

lunny commented Apr 23, 2019

I have tested your wip-perf branch. The time on rails root page spent 2.5s ~ 4s and on release/v1.8 it's 2.3s. It's significant branch. But I found there are some different last commit display between two implements.

Notice: I'm using an old rails repository.

v1.8 :

image

branch wip-perf :

image

You can find actionview's last commit message is different.

@filipnavara
Copy link
Contributor Author

filipnavara commented Apr 23, 2019

@lunny Can you share the commit hash of the HEAD you are looking at?

I believe this is result of my last tweaks which I compared to GitHub behavior. I'd like to verify the correctness locally and I need to ensure that we are looking at the same thing.

@lunny
Copy link
Member

lunny commented Apr 24, 2019

The last commit id is 5abe612b0d on that rails repository.

@filipnavara
Copy link
Contributor Author

GitHub displays the same thing as you have on the wip-perf screenshot: https://github.com/rails/rails/tree/5abe612b0d

It boils down to the treatment of merge commits on directories. I don't have hard preference on either one, but I would not consider it a bug.

@filipnavara
Copy link
Contributor Author

The time on rails root page spent 2.5s ~ 4s and on release/v1.8 it's 2.3s.

Keep in mind that these are numbers with basically no load. The go-git version of the code is single-threaded and uses algorithmic parallelization only. The previous version was massively multi-threaded and used all the CPUs and quite probably more I/O as well.

At the moment go-git is not as optimized in reading as the native git but it gets really close (with the recently merged patches and the ones in the pipeline). When there are no old commits in the listing then go-git handily beats the old version. The Rails repository has a file in the root of the repository that was not modified for 8 years. On Windows the go-git version still beats git on this due to overhead of process launches and the massive amount of I/O.

Algorithmically computing the newest commit for this file means that all those 8 years of commit objects have to be loaded from disk. That's true for both go-git and git. The approach with go-git however loads these objects only once and resolves the newest commits for all the files along the way.

Aside from the general go-git speed-ups I would like to implement the support for git commit-graph files. It's additional data that has to be generated, but it offers significant speed up. It is specifically designed to solve the problem of history traversal and it will greatly help things like generating branch ahead/behind numbers (#6695).

@lunny
Copy link
Member

lunny commented Apr 24, 2019

@filipnavara It seems all above pipeline PRs on go-git merged. Could you send a PR from your wip-perf branch?

@filipnavara
Copy link
Contributor Author

@lunny don't we need to wait official go-git release for the vendor module to get updated? (I can ask the go-git guys to make one)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. type/enhancement An improvement of existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants