dineshr93/copyright_detection_ner_model

copyright_detection_ner_model

A basic pipeline for training a copyright-text detection model with spaCy NER.

An attempt to create a model dedicated to detecting the literal copyright texts present in source code files.

Installation

copyright_detection_ner_model requires Python 3.10+ and ScanCode v32.3.2 to run.

Download the packages you want to process into the input folder, then use extractcode to unpack the archive files:

extractcode --shallow --replace-originals input/your_archive
python -m venv venv && source venv/bin/activate
git clone git@github.com:dineshr93/copyright_detection_ner_model.git && cd copyright_detection_ner_model && \
pip install -r requirements.txt
make b #starts the pipeline

NER Model Training Flow

(Figure: NER model training flow diagram)
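
The repository's own training steps are not spelled out in this README, so the following is only a minimal sketch of one step an NER pipeline like this typically needs: turning a known copyright string (for example, one reported by a scanner) into character-offset span annotations. The helper name `to_ner_example` and the label `COPYRIGHT` are hypothetical and not taken from the repository. The output uses the `{"entities": [(start, end, label)]}` shape that spaCy's `Example.from_dict` accepts.

```python
def to_ner_example(text, copyright_strings, label="COPYRIGHT"):
    """Build a (text, annotations) pair with character-offset entity spans.

    `to_ner_example` and the default label are illustrative assumptions,
    not names from the repository.
    """
    entities = []
    for needle in copyright_strings:
        if not needle:
            continue  # an empty search string would match everywhere
        start = text.find(needle)
        while start != -1:
            end = start + len(needle)
            entities.append((start, end, label))
            start = text.find(needle, end)

    # NER spans must not overlap: walk spans in start order and
    # skip any span that begins before the previous kept span ends.
    entities.sort()
    kept = []
    last_end = -1
    for start, end, span_label in entities:
        if start >= last_end:
            kept.append((start, end, span_label))
            last_end = end
    return text, {"entities": kept}


source = "# Copyright (c) 2025 Dinesh Ravi\nimport os\n"
example = to_ner_example(source, ["Copyright (c) 2025 Dinesh Ravi"])
print(example[1])
```

Character offsets are used here because that is what spaCy expects for entity annotations; how the real pipeline aligns scanner output with source text may differ.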

License

Copyright (c) 2025 Dinesh Ravi

AGPL-3.0+
