An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
-
Updated
Nov 8, 2024 - Python
An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Bitextor generates translation memories from multilingual websites
Python scripts preprocessing Penn Treebank and Chinese Treebank
OpusFilter - Parallel corpus processing toolkit
Utilities for Processing the Switchboard Dialogue Act Corpus
A Serverless Text Annotation Tool for Corpus Development
A parser for annotated MuseScore 3 files.
Reading the data from OPIEC - an Open Information Extraction corpus
Utilities for Processing the Meeting Recorder Dialogue Act Corpus
A simple collocation-driven recognition of rhymes. Contains pre-trained models for Czech, Dutch, English, French, German, Russian, and Spanish poetry
Korpuslinguistik war noch nie so einfach...
A library of functions enabling complex corpus search in context (KWIC), search aggregation, bag-of-words building & keyphrase extraction.
Hard-Forked from JuliaText/TextAnalysis.jl
ALvisNLP corpus processing engine
Measure the similarity of text corpora for 74 languages
Plotly-Dash NLP project. Document similarity measure using Latent Dirichlet Allocation, principal component analysis and finally follow with KMeans clustering. Project is completed with dynamic visual interaction.
A set of corpus-based sampling & analysis M4L devices
Script that sets up and configures an entire CQPweb server installation
Scripts for building a geo-located web corpus using Common Crawl data
Add a description, image, and links to the corpus-processing topic page so that developers can more easily learn about it.
To associate your repository with the corpus-processing topic, visit your repo's landing page and select "manage topics."