Skip to content

Optimise logs #111

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 9, 2025
Merged

Optimise logs #111

merged 6 commits into from
Jun 9, 2025

Conversation

fpozzobon
Copy link
Contributor

Usage of logs slows down significantly generation of the docs, adding a cache mechanism

Usage of logs slows down significantly generation of the docs, adding a cache mechanism
@fpozzobon
Copy link
Contributor Author

@timvink I realise that logs slows down significantly generation of document, adding a caching mechanism on page level, let me know if you would prefer it somewhere else.
With this optimisation it takes 30 seconds instead of 170 seconds locally for one of my project.

self._authors.append(co_author)
co_author.add_lines(self, commit)

def _get_git_log(self, sha, commit):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename this function? It's only used for parsing co-authors?

Also, we should have already parsed all the commit earlier:

self._commits[sha] = Commit(self, sha, **kwargs)

It's better to extract and add the co-author information in the Commit class:

https://github.com/timvink/mkdocs-git-authors-plugin/blob/master/src/mkdocs_git_authors_plugin/git/commit.py

then use it here. You won't need to have the caching in that case, and it should be even faster as you're not running git log again.

Copy link
Contributor Author

@fpozzobon fpozzobon Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, although that raises the question on which class should be responsible for calling git log? Updated using the repository to deduce git logs, let me know if this is not desirable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative I can pass a function callback "fetch_co_authors" in repo.get_commit or "break it down" between calling constructor and returning cached commit.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I didn't read up enough, my bad. Your approach seems very reasonable. We're running git blame, and you added (a cached) git_log to get the co-author information. Either way, we're going to have to make a lot more git calls, which is a lot slower.

Maybe the better option is to not only have the caching, but also make processing co-authors optional (and disabled by default). The reason being is that the mainstream user flows will take a big performance hit to support a niche feature (git co-authors are not very common).

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, added option add_co_authors with false by default.
I noticed that repo already uses a git command, maybe it's fine to do git log as well?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one, left one comment.

I noticed that repo already uses a git command, maybe it's fine to do git log as well?

We need git blame for line-by-line authors. We need git log to find co-authors. So we'd have to do two git calls, which is expensive (compute time). Or maybe you have a better, more efficient way in mind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately no, need to call git log in addition but with cache it is much faster :) (unless each line is a different commit ofc)

fpozzobon added 4 commits June 9, 2025 11:55
- moving responsibility of deducing co-author to repo
- added co-author inside commit
docs/options.md Outdated
@@ -8,6 +8,7 @@ plugins:
show_contribution: true
show_line_count: true
show_email_address: true
add_co_authors: true
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

false (this is the default)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also add the documentation of the option lower down in this file?

self._authors.append(co_author)
co_author.add_lines(self, commit)

def _get_git_log(self, sha, commit):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one, left one comment.

I noticed that repo already uses a git command, maybe it's fine to do git log as well?

We need git blame for line-by-line authors. We need git log to find co-authors. So we'd have to do two git calls, which is expensive (compute time). Or maybe you have a better, more efficient way in mind?

@timvink timvink merged commit dff4931 into timvink:master Jun 9, 2025
18 checks passed
@fpozzobon
Copy link
Contributor Author

@timvink thanks a lot for your fast support on this change (and sorry I didn't see it before :) ) , will you release a new version for this change?

@timvink
Copy link
Owner

timvink commented Jun 10, 2025

Yes, doing it now.

@timvink
Copy link
Owner

timvink commented Jun 10, 2025

@fpozzobon fpozzobon deleted the optimise_logs branch June 10, 2025 05:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants