Dive deep into the history of any Git repository. This Python framework provides comprehensive tools for analyzing source code changes, commit metadata, and developer contributions at a granular level.

codingwithshawnyt/GitAnalyzer

GITANALYZER

A powerful Python library for mining and analyzing Git repositories

Disclosure: This project received funding from Texas A&M University for research purposes.


📍 Overview

GitAnalyzer is a Python library for mining and analyzing Git repositories. It provides a powerful interface for extracting detailed information about commits, developers, and code changes. The tool supports both local and remote repositories, with features including:

  • Commit history traversal and filtering
  • Code change analysis
  • Developer contribution tracking
  • Process metrics calculation
  • Support for multiple repository analysis

👾 Features

  • Flexible Repository Access: Analyze both local and remote Git repositories
  • Comprehensive Commit Analysis: Extract detailed information about commits, including:
    • Author and committer details
    • Modified files and their changes
    • Code churn metrics
    • Commit relationships
  • Developer Analytics: Track developer contributions and experience
  • Process Metrics: Calculate various software process metrics
  • Multiple Repository Support: Analyze multiple repositories in sequence
  • Mailmap Support: Proper handling of author mappings via .mailmap files
  • Configurable Filters: Filter commits by:
    • Date ranges
    • Commit hashes
    • Tags
    • File types
    • Authors
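
The filter categories above can be approximated on the client side. The following is a minimal sketch, assuming only the commit attributes that appear in the usage example (`author.name`, `author_date`, `modified_files`, `filename`). The helper `filter_commits`, its parameters, and the stand-in commits are hypothetical and not part of GitAnalyzer; this README does not document the library's own filter options.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def filter_commits(commits, since=None, until=None, authors=None, extensions=None):
    """Yield commits that pass every filter that is set (hypothetical helper)."""
    for commit in commits:
        if since is not None and commit.author_date < since:
            continue
        if until is not None and commit.author_date > until:
            continue
        if authors is not None and commit.author.name not in authors:
            continue
        # Keep the commit only if at least one modified file has a wanted extension.
        if extensions is not None and not any(
            f.filename.endswith(tuple(extensions)) for f in commit.modified_files
        ):
            continue
        yield commit


def make_commit(hash_, author, date, filenames):
    """Build a stand-in object shaped like the commits in the usage example."""
    return SimpleNamespace(
        hash=hash_,
        author=SimpleNamespace(name=author),
        author_date=date,
        modified_files=[SimpleNamespace(filename=n) for n in filenames],
    )


sample_commits = [
    make_commit("a1", "Ada", datetime(2024, 1, 5, tzinfo=timezone.utc), ["core.py"]),
    make_commit("b2", "Bob", datetime(2024, 2, 1, tzinfo=timezone.utc), ["README.md"]),
    make_commit("c3", "Ada", datetime(2024, 3, 9, tzinfo=timezone.utc), ["utils.py", "notes.txt"]),
]

selected = list(
    filter_commits(
        sample_commits,
        since=datetime(2024, 2, 1, tzinfo=timezone.utc),
        extensions=[".py"],
    )
)
print([c.hash for c in selected])
```

If the real commit dates are timezone-aware, the `since` and `until` bounds need to be timezone-aware as well, because Python refuses to compare aware and naive datetimes.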

📁 Project Structure

└── GitAnalyzer/
    ├── LICENSE
    ├── Makefile                      # Build automation configuration
    ├── dev-requirements.txt          # Development dependencies
    ├── docs                          # Documentation directory
    │   ├── Makefile                  # Documentation build configuration
    │   ├── commit.rst                # Commit analysis documentation
    │   ├── conf.py                   # Sphinx configuration
    │   ├── deltamaintainability.rst  # Maintainability metrics docs
    │   ├── git.rst                   # Git interface documentation
    │   ├── index.rst                 # Documentation index
    │   ├── intro.rst                 # Introduction guide
    │   ├── modifiedfile.rst          # File modification docs
    │   ├── processmetrics.rst        # Process metrics documentation
    │   ├── reference.rst             # API reference
    │   ├── repository.rst            # Repository handling docs
    │   ├── requirements.txt          # Documentation dependencies
    │   └── tutorial.rst              # Usage tutorial
    ├── gitanalyzer                   # Main package directory
    │   ├── domain                    # Core domain models
    │   ├── git.py                    # Git interface implementation
    │   ├── metrics                   # Analysis metrics implementations
    │   ├── repository.py             # Repository management
    │   └── utils                     # Utility functions and helpers
    ├── pytest.ini                    # PyTest configuration
    ├── requirements.txt              # Core dependencies
    ├── setup.py                      # Package installation setup
    ├── test-requirements.txt         # Testing dependencies
    └── tests                         # Test suite directory
        ├── integration               # Integration tests
        ├── metrics                   # Metrics tests
        └── test_*.py                 # Unit test files

📂 Project Index

GITANALYZER/

  • __root__
    • dev-requirements.txt: Development dependencies including mypy, flake8, and pytest-cov
    • pytest.ini: PyTest configuration for test suite
    • test-requirements.txt: Testing-specific dependencies
    • requirements.txt: Core package dependencies including GitPython and pytz
    • Makefile: Build and development automation tasks
    • setup.py: Package installation and distribution configuration
  • gitanalyzer
    • git.py: Core Git interaction and repository management
    • repository.py: High-level repository analysis interface
    • metrics
      • process
        • commits_count.py: Commit frequency analysis
        • change_set.py: Change set size metrics
        • contributors_count.py: Contributor participation metrics
        • contributors_experience.py: Developer experience analysis
        • lines_count.py: Code line modification metrics
        • hunks_count.py: Code change block analysis
        • process_metric.py: Base process metric implementation
        • history_complexity.py: Repository history complexity metrics
        • code_churn.py: Code churn and volatility metrics
    • utils
      • mailmap.py: Git mailmap handling utilities
      • check_git_version.py: Git version compatibility checker
      • conf.py: Configuration management utilities
    • domain
      • commit.py: Commit entity model and analysis
      • developer.py: Developer entity model and tracking
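
To give a feel for what process metrics such as contributor counts and code churn measure, here is a minimal sketch that computes both from a stream of commit-like objects. It is not the implementation in `contributors_count.py` or `code_churn.py`: the function names are hypothetical, churn is assumed here to mean added plus deleted lines per file, and the stand-in commits only mimic the attributes shown in the usage example.

```python
from collections import defaultdict
from types import SimpleNamespace


def contributors_per_file(commits):
    """Map each filename to the number of distinct authors who touched it."""
    authors = defaultdict(set)
    for commit in commits:
        for f in commit.modified_files:
            authors[f.filename].add(commit.author.name)
    return {name: len(people) for name, people in authors.items()}


def churn_per_file(commits):
    """Map each filename to added + deleted lines (one simple churn definition)."""
    churn = defaultdict(int)
    for commit in commits:
        for f in commit.modified_files:
            churn[f.filename] += f.added_lines + f.deleted_lines
    return dict(churn)


def make_commit(author, changes):
    """Stand-in commit; `changes` is a list of (filename, added, deleted)."""
    return SimpleNamespace(
        author=SimpleNamespace(name=author),
        modified_files=[
            SimpleNamespace(filename=n, added_lines=a, deleted_lines=d)
            for n, a, d in changes
        ],
    )


history = [
    make_commit("Ada", [("core.py", 10, 2)]),
    make_commit("Bob", [("core.py", 3, 1), ("docs.md", 5, 0)]),
    make_commit("Ada", [("core.py", 1, 1)]),
]

print(contributors_per_file(history))
print(churn_per_file(history))
```

Keying on `filename` alone merges files that share a name in different directories; a real metric would more likely key on the full path.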

🚀 Getting Started

☑️ Prerequisites

Before getting started with GitAnalyzer, ensure your runtime environment meets the following requirements:

  • Python: Version 3.8 or higher
  • Git: Any recent version
  • Operating System: Linux, macOS, or Windows
  • Package Manager: pip
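
These requirements can be checked from Python before installing. The snippet below is a minimal sketch using only the standard library; `check_prerequisites` is a hypothetical helper, not something GitAnalyzer ships, and it only checks that Git is on the `PATH`, not which Git version is installed.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(min_python=(3, 8)):
    """Return a dict mapping each prerequisite name to True or False."""
    return {
        # The running interpreter must be at least the minimum version.
        "python": sys.version_info >= min_python,
        # Git must be discoverable on the PATH.
        "git": shutil.which("git") is not None,
        # pip must be importable by this interpreter (covers `python -m pip`).
        "pip": importlib.util.find_spec("pip") is not None,
    }


for name, ok in check_prerequisites().items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```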

⚙️ Installation

Build GitAnalyzer from source:

  1. Clone the GitAnalyzer repository:
❯ git clone https://github.com/codingwithshawnyt/GitAnalyzer
  2. Navigate to the project directory:
❯ cd GitAnalyzer
  3. Install the project dependencies using pip:
❯ pip install -r requirements.txt -r dev-requirements.txt -r test-requirements.txt

🤖 Usage

Here's a basic example of using GitAnalyzer:

# The package directory is lowercase (gitanalyzer), so import it by that name
from gitanalyzer import Repository

# Initialize repository (local or remote)
repo = Repository('path/to/repository')

# Traverse commits
for commit in repo.traverse_commits():
    print(f'Commit: {commit.hash}')
    print(f'Author: {commit.author.name}')
    print(f'Date: {commit.author_date}')
    
    # Access modified files
    for modification in commit.modified_files:
        print(f'Modified file: {modification.filename}')
        print(f'Changes: +{modification.added_lines}, -{modification.deleted_lines}')
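
The same traversal can feed a flat export for later analysis. The following is a minimal sketch that writes one CSV row per modified file, assuming only the attributes used in the example above; `commits_to_rows`, `write_csv`, and the stand-in commits are hypothetical and not part of GitAnalyzer.

```python
import csv
import io
from datetime import datetime
from types import SimpleNamespace

FIELDS = ["hash", "author", "date", "file", "added", "deleted"]


def commits_to_rows(commits):
    """Yield one flat dict per (commit, modified file) pair."""
    for commit in commits:
        for f in commit.modified_files:
            yield {
                "hash": commit.hash,
                "author": commit.author.name,
                "date": commit.author_date.isoformat(),
                "file": f.filename,
                "added": f.added_lines,
                "deleted": f.deleted_lines,
            }


def write_csv(commits, stream):
    """Write the flattened rows to a text stream and return the row count."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for row in commits_to_rows(commits):
        writer.writerow(row)
        count += 1
    return count


# Stand-in commit shaped like the objects in the example above.
demo_commits = [
    SimpleNamespace(
        hash="a1",
        author=SimpleNamespace(name="Ada"),
        author_date=datetime(2024, 1, 5),
        modified_files=[
            SimpleNamespace(filename="core.py", added_lines=10, deleted_lines=2),
            SimpleNamespace(filename="docs.md", added_lines=4, deleted_lines=0),
        ],
    )
]

buffer = io.StringIO()
row_count = write_csv(demo_commits, buffer)
print(buffer.getvalue())
```

With a real repository, the stand-in list would be replaced by the iterator from `repo.traverse_commits()`, and the `StringIO` buffer by a file opened with `newline=''`.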

🧪 Testing

Run the test suite using the following command:

❯ pytest

For coverage report:

❯ pytest --cov=gitanalyzer

📌 Project Roadmap

  • Core Functionality: Basic commit traversal and analysis
  • Process Metrics: Implementation of various process metrics
  • Multiple Repository Support: Ability to analyze multiple repositories
  • Documentation: Comprehensive documentation with Sphinx
  • Additional Metrics: Implementation of more advanced metrics
  • Performance Optimization: Improve analysis speed for large repositories

🔰 Contributing

Contributing Guidelines
  1. Fork the Repository: Start by forking the project repository to your GitHub account.
  2. Clone Locally: Clone the forked repository to your local machine using a git client.
    git clone https://github.com/<your-username>/GitAnalyzer
  3. Create a New Branch: Always work on a new branch, giving it a descriptive name.
    git checkout -b new-feature-x
  4. Make Your Changes: Develop and test your changes locally.
  5. Commit Your Changes: Commit with a clear message describing your updates.
    git commit -m 'Implemented new feature x.'
  6. Push to GitHub: Push the changes to your forked repository.
    git push origin new-feature-x
  7. Submit a Pull Request: Create a PR against the original project repository. Clearly describe the changes and their motivations.
  8. Review: Once your PR is reviewed and approved, it will be merged into the main branch. Congratulations on your contribution!
🎗 License

This project is licensed under the Apache License 2.0. For more details, refer to the LICENSE file.


🙌 Acknowledgments

  • GitPython: Core Git interaction functionality
  • Sphinx: Documentation generation
  • pytest: Testing framework
  • All contributors who have helped improve GitAnalyzer
