-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Faster graph rendering for repositories with huge amount of branches. #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
GitUp is not designed for very large repos - see http://forums.gitup.co/t/gitup-limitations-known-issues/29. This change will make things a bit faster, but the reason it's an NSMutableArray right now and not hashed-based container is because ordering of tips must be preserved. |
I'm still profiling the code, but this change reduced the startup time on my computer from ~1 minute to ~10 seconds which is already a good thing, IMO. I would love to be able to use it at work, it is a really cool client :) Ok, so the hash table approach per se doesn't work here.
Can there be any case when the same commit will be in this array twice? |
Of course I will encapsulate the code in a separate class that will both preserve commit order and have a faster way to check if commit has already been added. |
I unfortunately don't have bandwidth to help understand the code. It's been written to be pretty clear regarding how it works and you can always trace what is happening using the debugger 😄 |
Sure, no problem :) |
Perhaps a separate data structure could be used to keep track of ordering, while still getting the speed benefits from using the NSMutableDictionary? |
Yes, I'm currently working on it. |
Ok, tests are passing now, but I have no idea why they were failing in commit |
FYI I updated the contribution requirements at https://github.com/git-up/GitUp#contributing. |
👍 Will fix my pull request then. |
Done |
So what's the performance like with this change? GitUp will log exactly how long graph generation and graph rendering take. Note that like I was saying GitUp is not designed to handle very large repos, so a number of things are going to be inefficient, not just this. Improving the speed from say 1mn to 10s is interesting but doesn't solve the actual problem: for it to be usable it needs to be below 1s. |
Also there are still a number of areas that don't match the GitUp style e.g. |
Test set: the repository of ~237000 commits. I run it on my MacBook Pro (2015). The performance before this change: The performance after this change: It is still not that comfortable, but looks like a good step forward. Yes, there are a few places that can be improved, but I wanted to take an iterative approach and try to optimise one thing at a time, until it gets usable on the repo of our size. If you want me to measure all the bottlenecks first and make a bigger PR, I'm ok with this :) I will fix codestyle issues and also I noticed one warning I have introduced, will fix that as well. |
Done. I tried to imitate the code style you have in an app, but please, tell me if anything left... |
@@ -15,6 +15,8 @@ | |||
|
|||
#import <Foundation/Foundation.h> | |||
|
|||
@class GCRepository; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a separate commit.
} | ||
|
||
- (void)addCommit:(GCCommit *)commit { | ||
if (![self hasCommit:commit]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to call -hasCommit, access _actualCommitHashes directly.
I had a first pass look and pointed out a few things. You need to keep crafting this whole thing until it is as concise and elegant as possible - keep iterating! 😄 For instance, I would use CFMutableSets with no retain / release callbacks (since the commits are already in the array anyway and therefore retained there) and low-overhead hash/equal callbacks accessing directly the underlying In any case what makes you think the number of commits in your repo is the problem here? This code only cares about tips i.e. references. What's the output of |
Yes, we have a lot of tips in our repo as well. Anyway it looks like a good test set for this product :)
~22000 |
I've fixed most of the things you pointed.
Currently I see only a small overhead from this class in profiler. |
#endif | ||
|
||
#import "GCOrderedSet.h" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace this by #import GCPrivate.h
like other implementation files.
If this is a critical code path, which it is, then I would say yes, might as well go 100%, instead of stopping at 90% 😄 In any case this PR is certainly a step in the right direction, but still need some polish before it can land. Keep reviewing that diff over and over again until you're 100% confident this is it. Also note that the problem here is not the number of commits but the number of branches in your repo (22,000), which is plain huge. In my experience however, repos that have such an abnormally high number of branches are also going to have hundreds of thousands of commits if not millions, so other parts of GitUp become really slow. BTW the commit title should be more precise since this change optimizes only drawing graphs with lots of refs, not huge repos in general. |
I did some measurements, and found out that I can get a somewhat visible performance improvements only by using
I did some more profiling and the next important bottlenecks are:
|
The problem with ARC when you have code dealing with lots of objects in tight loops or similar is that since it has to be extremely conservative to avoid crashes, it retains / releases everything. So you end up with hundreds of thousands or retain/release calls which take locks and slow down everything. A repo can be huge along 3 axis (and any combination of course):
Each of them will affect GitUp differently, requires appropriate performance analysis, and dedicated fixes. My understanding is that you're hitting (2). The important thing is that GitUp has been deliberately architectured to handle very efficiently repos below these limits. In the real world, the vast vast majority of repos are an order of magnitude smaller than these limits. The direct consequences is that properly addressing these performance issues (and going from say 10s to 2s is still not enough as a reactive app needs to be <1s on representative hardware) is most likely going to require rewriting entire parts of GitUpKit, which may be a lot of work and is certainly very high-risk in terms of potential regressions. It's interesting work no matter what, so by all means, feel free to continue, but I thought that it's important you understand what you're getting into 😄 |
I do agree, making app fast & responsive on huge amount of data is hard. There are 3 types of performance fixes that can be introduced for an app:
I will try to focus on (1) first.
I guess our repo hit all 3 of this... Just refs had the worst performance penalty. |
Ok, I did change the commit message, rebased it on the latest version of master. Is there anything left to be done? |
It's looking pretty good now. Just some minor things left. Did you make sure to add the new Please agree to GitUp CLA at http://forums.gitup.co/t/gitup-contributor-license-agreement/323. |
Yes, I did (OSX+iOS Targets) ;)
Done |
NSMutableArray usage in [GIGraph _generateGraph] caused a bottleneck due to a lot of calls to the [NSMutableArray containsObject:] and [NSMutableArray removeObject:] methods. These methods are really slow on a huge amount of tips (~24000 in my case). To solve this issue CGOrderedSet class was created: * optimized for speed of adding/removal objects; * preserving the objects order. NSMutableArray was replaced to GCOrderedSet which solved this particular bottleneck.
Your screenshot is for the OSX target though :) |
All right, great work, thanks! |
Faster graph rendering for repositories with lots of branches
Awesome! |
The header was missing but I fixed it and added a test in Travis CI to catch this going forward: 9d06064. |
oops... |
(https://travis-ci.org/git-up/GitUp.svg?branch=master)](https://travis-ci.org/git-up/GitUp) GitUp ===== **Work quickly, safely, and without headaches. The Git interface you've been missing all your life has finally arrived.** <p align="center"> <img src="https://i.imgur.com/JuQIxJu.png" width="50%" height="50%"><img src="https://i.imgur.com/9rgXktz.png" width="50%" height="50%"> </p> Git recently celebrated its 10 years anniversary, but most engineers are still confused by its intricacy (3 of the [top 5 questions of all time](http://stackoverflow.com/questions?sort=votes) on Stack Overflow are Git related). Since Git turns even simple actions into mystifying commands (“git add” to stage versus “git reset HEAD” to unstage anyone?), it’s no surprise users waste time, get frustrated, distract the rest of their team for help, or worse, screw up their repo! GitUp is a bet to invent a new Git interaction model that lets engineers of all levels work quickly, safely, and without headaches. It's unlike any other Git client out there from the way it’s built (it interacts directly with the Git database on disk), to the way it works (you manipulate the repository graph instead of manipulating commits). With GitUp, you get a truly efficient Git client for Mac: - A **live and interactive repo graph** (edit, reorder, fixup, merge commits…), - **Unlimited undo / redo** of almost all operations (even rebases and merges), - Time Machine like **snapshots for 1-click rollbacks** to previous repo states, - Features that don’t even exist natively in Git like a **visual commit splitter** or a **unified reflog browser**, - **Instant search across the entire repo** including diff contents, - A **ridiculously fast UI**, often faster than the command line. *GitUp was created by [@swisspol](https://github.com/swisspol) in late 2014 as a bet to reinvent the way developers interact with Git. After several months of work, it was made available in pre-release early 2015 and reached the [top of Hacker News](https://news.ycombinator.com/item?id=9653978) along with being [featured by Product Hunt](http://www.producthunt.com/tech/gitup-1) and [Daring Fireball](http://daringfireball.net/linked/2015/06/04/gitup). 30,000 lines of code later, GitUp reached 1.0 mid-August 2015 and was released open source as a gift to the developer community.* Getting Started =============== - Official website: https://gitup.co ## Download: - Latest release on GitHub: https://github.com/git-up/GitUp/releases - Homebrew (Not maintained by GitUp developers): `brew install homebrew/cask/gitup` (Note: There is already a formula called gitup, so the full name must be specified!) **Read the [docs](https://github.com/git-up/GitUp/wiki) and use [GitHub Issues](https://github.com/git-up/GitUp/issues) for support & feedback.** Releases notes are available at https://github.com/git-up/GitUp/releases. Builds tagged with a `v` (e.g. `v1.2.3`) are released on the "Stable" channel, while builds tagged with a `b` (e.g. `b1234`) are only released on the "Continuous" channel. You can change the update channel used by GitUp in the app preferences. ## Build To build GitUp yourself, simply run the command `git clone --recursive https://github.com/git-up/GitUp.git` in Terminal, then open the `GitUp/GitUp.xcodeproj` Xcode project and hit Run. **IMPORTANT:** If you do not have an Apple ID with a developer account for code signing Mac apps, the build will fail with a code signing error. Simply delete the "Code Signing Identity" build setting of the "Application" target to work around the issue: <p align="center"> <img src="http://i.imgur.com/dWpJExk.png"> </p> **Alternatively**, if you do have a developer account, you can create the file "Xcode-Configurations/DEVELOPMENT_TEAM.xcconfig" with the following build setting as its content: > DEVELOPMENT_TEAM = [Your TeamID] For a more detailed description of this, you can have a look at the comments at the end of the file "Xcode-Configurations/Base.xcconfig". GitUpKit ======== **GitUp is built as a thin layer on top of a reusable generic Git toolkit called "GitUpKit". This means that you can use that same GitUpKit framework to build your very own Git UI!** *GitUpKit has a very different goal than [ObjectiveGit](https://github.com/libgit2/objective-git). Instead of offering extensive raw bindings to [libgit2](https://github.com/libgit2/libgit2), GitUpKit only uses a minimal subset of libgit2 and reimplements everything else on top of it (it has its own "rebase engine" for instance). This allows it to expose a very tight and consistent API, that completely follows Obj-C conventions and hides away the libgit2 complexity and sometimes inconsistencies. GitUpKit adds on top of that a number of exclusive and powerful features, from undo/redo and Time Machine like snapshots, to entire drop-in UI components.* Architecture ------------ The GitUpKit source code is organized as 2 independent layers communicating only through the use of public APIs: **Base Layer (depends on Foundation only and is compatible with OS X and iOS)** - `Core/`: wrapper around the required minimal functionality of [libgit2](https://github.com/libgit2/libgit2), on top of which is then implemented all the Git functionality required by GitUp (note that GitUp uses a [slightly customized fork](https://github.com/git-up/libgit2/tree/gitup) of libgit2) - `Extensions/`: categories on the `Core` classes to add convenience features implemented only using the public APIs **UI Layer (depends on AppKit and is compatible with OS X only)** - `Interface/`: low-level view classes e.g. `GIGraphView` to render the GitUp Map view - `Utilities/`: interface utility classes e.g. the base view controller class `GIViewController` - `Components/`: reusable single-view view controllers e.g. `GIDiffContentsViewController` to render a diff - `Views/`: high-level reusable multi-views view controllers e.g. `GIAdvancedCommitViewController` to implement the entire GitUp Advanced Commit view **IMPORTANT**: If the preprocessor constant `DEBUG` is defined to a non-zero value when building GitUpKit (this is the default when building in "Debug" configuration), a number of extra consistency checks are enabled at run time as well as extra logging. Be aware that this overhead can significantly affect performance. GitUpKit API ------------ Using the GitUpKit API should be pretty straightforward since it is organized by functionality (e.g. repository, branches, commits, interface components, etc...) and a best effort has been made to name functions clearly. Regarding the "Core" APIs, the best way to learn them is to peruse the associated unit tests - for instance see [the branch tests](GitUpKit/Core/GCBranch-Tests.m) for the branch API. Here is some sample code to get you started (error handling is left as an exercise to the reader): **Opening and browsing a repository:** ```objc // Open repo GCRepository* repo = [[GCRepository alloc] initWithExistingLocalRepository:<PATH> error:NULL]; // Make sure repo is clean assert([repo checkClean:kGCCleanCheckOption_IgnoreUntrackedFiles error:NULL]); // List all branches NSArray* branches = [repo listAllBranches:NULL]; NSLog(@"%@", branches); // Lookup HEAD GCLocalBranch* headBranch; // This would be nil if the HEAD is detached GCCommit* headCommit; [repo lookupHEADCurrentCommit:&headCommit branch:&headBranch error:NULL]; NSLog(@"%@ = %@", headBranch, headCommit); // Load the *entire* repo history in memory for fast access, including all commits, branches and tags GCHistory* history = [repo loadHistoryUsingSorting:kGCHistorySorting_ReverseChronological error:NULL]; assert(history); NSLog(@"%lu commits total", history.allCommits.count); NSLog(@"%@\n%@", history.rootCommits, history.leafCommits); ``` **Modifying a repository:** ```objc // Take a snapshot of the repo GCSnapshot* snapshot = [repo takeSnapshot:NULL]; // Create a new branch and check it out GCLocalBranch* newBranch = [repo createLocalBranchFromCommit:headCommit withName:@"temp" force:NO error:NULL]; NSLog(@"%@", newBranch); assert([repo checkoutLocalBranch:newBranch options:0 error:NULL]); // Add a file to the index [[NSData data] writeToFile:[repo.workingDirectoryPath stringByAppendingPathComponent:@"empty.data"] atomically:YES]; assert([repo addFileToIndex:@"empty.data" error:NULL]); // Check index status GCDiff* diff = [repo diffRepositoryIndexWithHEAD:nil options:0 maxInterHunkLines:0 maxContextLines:0 error:NULL]; assert(diff.deltas.count == 1); NSLog(@"%@", diff); // Create a commit GCCommit* newCommit = [repo createCommitFromHEADWithMessage:@"Added file" error:NULL]; assert(newCommit); NSLog(@"%@", newCommit); // Restore repo to saved snapshot before topic branch and commit were created BOOL success = [repo restoreSnapshot:snapshot withOptions:kGCSnapshotOption_IncludeAll reflogMessage:@"Rolled back" didUpdateReferences:NULL error:NULL]; assert(success); // Make sure topic branch is gone assert([repo findLocalBranchWithName:@"temp" error:NULL] == nil); // Update workdir and index to match HEAD assert([repo resetToHEAD:kGCResetMode_Hard error:NULL]); ``` Complete Example #1: GitDown ---------------------------- [GitDown](Examples/GitDown) is a very basic app that prompts the user for a repo and displays an interactive and live-updating list of its stashes (all with ~20 lines of code in `-[AppDelegate applicationDidFinishLaunching:]`): <p align="center"> <img src="http://i.imgur.com/ZfxM7su.png"> </p> Through GitUpKit, this basic app also gets for free unlimited undo/redo, unified and side-by-side diffs, text selection and copy, keyboard shortcuts, etc... This source code also demonstrates how to use some other GitUpKit view controllers as well as building a customized one. Complete Example git-up#2: GitDiff ---------------------------- [GitDiff](Examples/GitDiff) demonstrates how to create a view controller that displays a live updating diff between `HEAD` and the workdir à la `git diff HEAD`: <p align="center"> <img src="http://i.imgur.com/29hxDcJ.png"> </p> Complete Example git-up#3: GitY ------------------------- [GitY](Examples/GitY) is a [GitX](http://gitx.frim.nl/) clone built using GitUpKit and less than 200 lines of code: <p align="center"> <img src="http://i.imgur.com/6cuPcT4.png"> </p> Complete Example git-up#4: iGit ------------------------- [iGit](Examples/iGit) is a test iOS app that simply uses GitUpKit to clone a GitHub repo and perform a commit. Contributing ============ See [CONTRIBUTING.md](CONTRIBUTING.md). Credits ======= - [@swisspol](https://github.com/swisspol): concept and code - [@wwayneee](https://github.com/wwayneee): UI design - [@jayeb](https://github.com/jayeb): website *Also a big thanks to the fine [libgit2](https://libgit2.github.com/) contributors without whom GitUp would have never existed!* License ======= GitUp is copyright 2015-2018 Pierre-Olivier Latour and available under [GPL v3 license](http://www.gnu.org/licenses/gpl-3.0.txt). See the [LICENSE](LICENSE) file in the project for more information. **IMPORTANT:** GitUp includes some other open-source projects and such projects remain under their own license.
We have a huge repo at work (~22000 branches at the moment) and GitUp is really slow on it.
While profiling it turns to be that GitUp makes a lot of comparisons while building an object graph.

I have made a special optimised class which is much faster (also it checks SHA strings, not objects). According to Instruments, it is not the heaviest stack trace anymore (and the app seems to be far more usable on our repo right now).