MeasureCrossEntropy

First, split the data into 10 equal portions for 10-fold cross-validation using Splitter/FileSplitter.java.
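For readers unfamiliar with the fold layout, here is a minimal Python sketch of a 10-way split. It is not the logic of FileSplitter.java, which is not shown here. The function names `split_into_folds` and `train_test_pairs` are hypothetical, and the round-robin assignment by line is an assumption made for illustration.

```python
def split_into_folds(lines, k=10):
    """Distribute lines round-robin into k folds whose sizes differ by at most one."""
    folds = [[] for _ in range(k)]
    for i, line in enumerate(lines):
        folds[i % k].append(line)
    return folds


def train_test_pairs(folds):
    """Yield (train, test) pairs: each fold is held out once, the rest form the training set."""
    for held_out in range(len(folds)):
        train = [line
                 for j, fold in enumerate(folds) if j != held_out
                 for line in fold]
        yield train, folds[held_out]
```

Each of the 10 (train, test) pairs would then be used for one round of model estimation and evaluation.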

Then run estimate_evaluate_Ngram.py.

The script outputs the average cross-entropy for each n-gram order from 1 to 10, averaged over the 10 folds.
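The averaging step can be sketched as follows. This is an illustration written for Python 3, not the code of estimate_evaluate_Ngram.py. It assumes that each fold yields one perplexity value per n-gram order and that cross-entropy in bits is taken as the base-2 logarithm of perplexity. The function name `average_cross_entropy` and the input layout are hypothetical.

```python
import math


def average_cross_entropy(perplexities_by_order):
    """Map each n-gram order to its mean cross-entropy (in bits) across folds.

    perplexities_by_order: dict of order -> list of per-fold perplexities
    (an assumed layout, chosen for this sketch).
    """
    averages = {}
    for order, perplexities in perplexities_by_order.items():
        # Cross-entropy in bits is log2 of perplexity.
        entropies = [math.log2(p) for p in perplexities]
        averages[order] = sum(entropies) / len(entropies)
    return averages
```

With 10 folds and orders 1 through 10, the input would hold 10 lists of 10 perplexities each, and the result would hold one averaged value per order.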

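Running the estimate_evaluate_Ngram.py script requires the following Python modules:

os, codecs, commands, math

All four are part of the Python standard library. The commands module exists only in Python 2, so run the script with a Python 2 interpreter.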

Before running the script, make sure MITLM is properly installed.

Follow these steps:

git clone https://github.com/orezpraw/MIT-Language-Modeling-Toolkit.git mitlm

cd mitlm

sudo ./autogen.sh

sudo make

sudo make install

sudo ldconfig
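After installation, a quick check that the MITLM command-line tools are on the PATH can save a failed run. The sketch below is written for Python 3 and uses `shutil.which`. The helper name `missing_tools` is hypothetical, and it is an assumption that the script relies on MITLM's `estimate-ngram` and `evaluate-ngram` binaries.

```python
import shutil


def missing_tools(names):
    """Return the subset of names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed tool names: the MITLM binaries for estimating and evaluating models.
    missing = missing_tools(["estimate-ngram", "evaluate-ngram"])
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("MITLM tools found.")
```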