
A tool for analysing codebases using parse trees to estimate structural distance, suggest refactors, and detect AI-generated code, plagiarism, or collusion.


Arsngrobg/SourceDiff


SourceDiff

SourceDiff is a code analysis tool that measures the structural distance between codebases and suggests ways to reduce it. It helps developers, educators, and code reviewers judge how similar two codebases are.

By leveraging static analysis, parse trees (PTs), and pattern recognition techniques, SourceDiff suggests refactors that bring codebases closer together and highlights significant divergences between them. This makes it particularly useful for detecting:

  • Plagiarism or collusion in academic environments
  • Unoriginal code that was copy-pasted or generated by an AI agent
  • Redundant code that may produce unused compilation artifacts
  • Duplicate logic that can be extracted into shared methods/functions
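
To make the idea of structural distance concrete, here is a minimal sketch. It is not SourceDiff's algorithm: it swaps Tree-sitter for Python's built-in `ast` module and compares only how often each node type occurs (a weighted Jaccard distance). The function names `node_type_counts` and `structural_distance` are hypothetical and exist only for this illustration.

```python
import ast
from collections import Counter


def node_type_counts(source: str) -> Counter:
    """Parse Python source and count how often each AST node type occurs."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def structural_distance(source_a: str, source_b: str) -> float:
    """Return a distance in [0, 1]; 0 means identical node-type profiles."""
    a = node_type_counts(source_a)
    b = node_type_counts(source_b)
    shared = sum((a & b).values())  # multiset intersection (min of counts)
    total = sum((a | b).values())   # multiset union (max of counts)
    return 1.0 - shared / total if total else 0.0


original = "def area(w, h):\n    return w * h\n"
renamed = "def size(x, y):\n    return x * y\n"
different = "for i in range(10):\n    print(i)\n"

# Renaming identifiers leaves the tree shape untouched, so the distance is 0.0.
print(structural_distance(original, renamed))
# A loop and a function definition share few node types, so this is larger.
print(structural_distance(original, different))
```

Because the comparison ignores identifier names, a copy with renamed variables still scores as structurally identical, which is the property that makes parse trees useful for plagiarism detection. A real tool would also need to account for tree shape and node ordering, which this sketch ignores.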

Use Cases

  • Academia
    • Detecting plagiarism, whether copy-pasted or AI-generated
    • Evaluating code quality: sudden shifts in quality may hint at cheating
  • Professional
    • Identifying AI-generated code in code reviews
    • Maintaining a consistent programming style across the codebase

Plug-n-Play

SourceDiff uses the Tree-sitter API to parse source code into parse trees, and the Tree-sitter project provides a large collection of pre-generated incremental parsers that SourceDiff can use. A C compiler must be installed on the target device, because the plug-n-play behaviour requires parsers to be compiled and dynamically linked before they can be used.
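
The compile-and-link step might look roughly like the sketch below. It is an assumption about the general approach, not SourceDiff's actual build code. It relies on two Tree-sitter conventions: a grammar repository ships generated C sources under `src/` (`parser.c`, plus an optional `scanner.c`), and the compiled library exports a function named `tree_sitter_<language>`. The helper names `build_command` and `load_language`, the `cc` compiler name, and the `-shared -fPIC` flags (GCC/Clang on Unix-like systems) are illustrative choices.

```python
import ctypes
import shutil
import subprocess
from pathlib import Path


def build_command(grammar_dir: str, output: str, compiler: str = "cc") -> list[str]:
    """Assemble the compiler invocation for a grammar's generated C sources."""
    src = Path(grammar_dir) / "src"
    sources = [src / "parser.c"]
    scanner = src / "scanner.c"  # optional external scanner, present in some grammars
    if scanner.exists():
        sources.append(scanner)
    return [compiler, "-shared", "-fPIC", "-I", str(src), "-o", output,
            *map(str, sources)]


def load_language(grammar_dir: str, name: str, output: str) -> int:
    """Compile a grammar, load it dynamically, and return its language pointer."""
    if shutil.which("cc") is None:
        raise RuntimeError("a C compiler is required to build parsers")
    subprocess.run(build_command(grammar_dir, output), check=True)
    library = ctypes.CDLL(str(Path(output).resolve()))
    entry = getattr(library, f"tree_sitter_{name}")  # exported by the grammar
    entry.restype = ctypes.c_void_p                  # pointer to a TSLanguage
    return entry()


# Show the command that would be run for a hypothetical local grammar checkout.
print(build_command("tree-sitter-python", "python.so"))
```

Calling `load_language("tree-sitter-python", "python", "python.so")` would need an actual checkout of that grammar in the working directory. The returned pointer would then be handed to the Tree-sitter runtime.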
