Code for the paper "Detection of LLM-Generated Java Code Using Discretized Nested Bigrams" (arXiv:2502.15740). The paper reports state-of-the-art performance in distinguishing human-written from LLM-generated Java code.
nlp machine-learning natural-language-processing bigrams feature-engineering discretization authorship-attribution binary-classification abstract-syntax-tree source-code-analysis nested-bigrams llm-generated-code code-stylometry
Updated May 15, 2025 - Java