Skip to content

use history cache when getting last revision of a file #3861

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jan 6, 2022

Conversation

vladak
Copy link
Member

@vladak vladak commented Jan 4, 2022

When browsing FreeBSD project (main branch from https://github.com/freebsd/freebsd-src) on internal OpenGrok instance, I noticed that getting xrefs takes way too long - usually bunch of seconds. When looking at what the web app is doing, I noticed CPU load spikes and jstack revealed that the web app actually retrieved the revision string by getting the history via repository specific code:

"http-nio-8080-exec-5982" #237111 daemon prio=5 os_prio=64 cpu=3475.16ms elapsed=49069.29s tid=0x000000001359a800 nid=0x3aef1 runnable  [0x00007ff2e0556000]
   java.lang.Thread.State: RUNNABLE
        at java.util.zip.Inflater.inflateBytesBytes(java.base@11.0.7-internal/Native Method)
        at java.util.zip.Inflater.inflate(java.base@11.0.7-internal/Inflater.java:385)
        - locked <0x00007ff851064fd0> (a java.util.zip.Inflater$InflaterZStreamRef)
        at org.eclipse.jgit.internal.storage.file.WindowCursor.inflate(WindowCursor.java:284)
        at org.eclipse.jgit.internal.storage.file.Pack.decompress(Pack.java:370)
        at org.eclipse.jgit.internal.storage.file.Pack.load(Pack.java:805)
        at org.eclipse.jgit.internal.storage.file.Pack.get(Pack.java:274)
        at org.eclipse.jgit.internal.storage.file.PackDirectory.open(PackDirectory.java:211)
        at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openPackedObject(ObjectDirectory.java:390)
        at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openPackedFromSelfOrAlternate(ObjectDirectory.java:354)
        at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openObjectWithoutRestoring(ObjectDirectory.java:345)
        at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openObject(ObjectDirectory.java:330)
        at org.eclipse.jgit.internal.storage.file.WindowCursor.open(WindowCursor.java:132)
        at org.eclipse.jgit.revwalk.RevWalk.getCachedBytes(RevWalk.java:1109)
        at org.eclipse.jgit.revwalk.RevCommit.parseHeaders(RevCommit.java:126)
        at org.eclipse.jgit.revwalk.TreeRevFilter.include(TreeRevFilter.java:113)
        at org.eclipse.jgit.revwalk.PendingGenerator.next(PendingGenerator.java:108)
        at org.eclipse.jgit.revwalk.BlockRevQueue.<init>(BlockRevQueue.java:40)
        at org.eclipse.jgit.revwalk.FIFORevQueue.<init>(FIFORevQueue.java:37)
        at org.eclipse.jgit.revwalk.StartGenerator.next(StartGenerator.java:133)
        at org.eclipse.jgit.revwalk.RevWalk.next(RevWalk.java:591)
        at org.eclipse.jgit.revwalk.RevWalk.nextForIterator(RevWalk.java:1526)
        at org.eclipse.jgit.revwalk.RevWalk.iterator(RevWalk.java:1550)
        at org.opengrok.indexer.history.GitRepository.getHistory(GitRepository.java:520)
        at org.opengrok.indexer.history.GitRepository.getLastHistoryEntry(GitRepository.java:471)
        at org.opengrok.indexer.history.HistoryGuru.getLastHistoryEntry(HistoryGuru.java:234)
        at org.opengrok.web.PageConfig.getLastRevFromHistory(PageConfig.java:1313)
        at org.opengrok.web.PageConfig.getLatestRevision(PageConfig.java:1303)
        at org.apache.jsp.list_jsp._jspService(list_jsp.java:292)
        at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:71)
        at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:770)

Given that the FreeBSD Git repository is on the heavier side, this action takes a bit of time.

Looking at the code of PageConfig#getLastRevFromHistory() and HistoryGuru#getLastHistoryEntry() I realized that history cache is avoided. This change fixes that.

@vladak vladak added the webapp web application label Jan 4, 2022
vladak added 3 commits January 4, 2022 23:46
also remove public modifier from pre-existing tests
@vladak
Copy link
Member Author

vladak commented Jan 5, 2022

There are 2 issues with the change:

  • the HistoryGuru cannot be tested with Mockito because it is final class so the fallback to repository.getLastHistoryEntry() in HistoryGuru#getLastHistoryEntry() is not really covered
  • the ui parameter used in getLastHistoryEntry() is largely ignored. Firstly, in the repository.getLastHistoryEntry() fallback path - none of the Repository#getHistory() methods actually have the ui parameter. Also, the historyCache.get() can fall back to repository.getHistory() as is the case with FileHistoryCache#get(). Even the classes that override Repository#getLastHistoryEntry(File file, boolean ui) ignore the ui parameter - because the getHistory() method they are usually based on lacks it.

@vladak
Copy link
Member Author

vladak commented Jan 5, 2022

The fallback control in FileHistoryCache#get() will not work for repositories that do not have history per directory (e.g. SCCS) in order to maintain the semantics of the fetchHistoryWhenNotInCache tunable however there the performance hit of double getHistory() for these is arguably not so big.

@vladak vladak force-pushed the page_config_history_get_last_entry branch from 8a9e797 to e443d5d Compare January 5, 2022 11:40
@ahornace
Copy link
Contributor

ahornace commented Jan 5, 2022

the HistoryGuru cannot be tested with Mockito because it is final class so the fallback to repository.getLastHistoryEntry() in HistoryGuru#getLastHistoryEntry() is not really covered

There are workarounds for mocking final classes: e.g. https://stackoverflow.com/questions/14292863/how-to-mock-a-final-class-with-mockito (just don't use PowerMock :D)

Otherwise looks good to me.

@vladak vladak merged commit ec5b57e into oracle:master Jan 6, 2022
@vladak vladak deleted the page_config_history_get_last_entry branch January 6, 2022 10:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
webapp web application
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants