Generator of random grammars, and corpora from them

glicerico/rangram

rangram

Generator of random grammars, and corpora from them.

This repo is an attempt to test different corpus-processing tools on a number of randomly generated grammars.

The src folder contains the code used to generate random grammars, as well as corpora from them. This folder includes:

  • grammar_generator.py creates a random grammar given a set of parameters. Each generated grammar records these parameters in the header of the file containing it:

        % RANDOM GRAMMAR with parameters:
        % num_words = 20
        % num_classes = 4
        % num_class_connectors = 7
        % connectors_limit = 2

  • sentence_generator.py generates random sentences using a specified grammar.

  • corpus_generator.py uses the two scripts above to generate a random corpus.
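When comparing tools across many generated grammars, it can help to read the parameters back from a grammar file's header. The following is a minimal sketch, not code from this repo: the function name parse_grammar_header is hypothetical, and it assumes the header is a contiguous run of "%" lines at the top of the file, each parameter written as "% name = integer" as in the sample header above.

```python
import re

# Matches header lines such as "% num_words = 20".
# The format is inferred from the sample header; other layouts are not handled.
PARAM_LINE = re.compile(r"^%\s*(\w+)\s*=\s*(-?\d+)\s*$")


def parse_grammar_header(text):
    """Collect integer parameters from the leading '%' lines of a grammar file.

    Hypothetical helper for illustration; it is not part of rangram.
    """
    params = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        if not stripped.startswith("%"):
            break  # assumption: the first non-'%' line ends the header
        match = PARAM_LINE.match(stripped)
        if match:
            # '%' lines without "name = value" (e.g. the title line) are skipped
            params[match.group(1)] = int(match.group(2))
    return params


sample = """% RANDOM GRAMMAR with parameters:
% num_words = 20
% num_classes = 4
% num_class_connectors = 7
% connectors_limit = 2
"""

print(parse_grammar_header(sample))
```

Returning a plain dictionary keeps the sketch independent of how the generator scripts represent their parameters internally.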

The utils folder contains auxiliary scripts, e.g. to process a given grammar with some tool and evaluate the results.
