This project implements a Bigram Language Model using Python. The purpose of the model is to predict the next word in a sentence based on the previous word using bigram probabilities.
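For reference, the quantity a bigram model estimates is usually written as the maximum-likelihood ratio below. This is the textbook form and is only an assumption about this project: the text does not say whether bigram_model.py applies smoothing. Here C(·) denotes a count over the training data.

```latex
P(w_i \mid w_{i-1}) = \frac{C(w_{i-1},\, w_i)}{C(w_{i-1})}
```

The predicted next word is then the w_i that maximizes this probability for the given previous word w_{i-1}.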
- bigram_model.py: This file contains the implementation of the Bigram Language Model. It includes methods for preprocessing sentences, loading data, calculating unigram and bigram counts, converting counts to probabilities, generating sentences, calculating sentence log probabilities, and calculating perplexity.
- test_bigram_model.py: This file contains test cases for validating the functionality and accuracy of the Bigram Language Model.
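The steps listed for bigram_model.py (counting, converting counts to probabilities, sentence log probability, perplexity) can be sketched roughly as follows. This is a minimal illustration, not the repository's code: every function name, the `<s>`/`</s>` boundary markers, the whitespace tokenization, and the absence of smoothing are assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical names and boundary markers; bigram_model.py may differ.
START, END = "<s>", "</s>"


def preprocess(sentence):
    """Lowercase, split on whitespace, and add sentence boundary markers."""
    return [START] + sentence.lower().split() + [END]


def count_ngrams(sentences):
    """Return (unigram_counts, bigram_counts) for a list of raw sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = preprocess(sentence)
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Unsmoothed maximum-likelihood estimate of P(word | prev)."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]


def sentence_log_prob(sentence, unigrams, bigrams):
    """Sum of natural-log bigram probabilities; -inf if any bigram is unseen."""
    tokens = preprocess(sentence)
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        p = bigram_prob(prev, word, unigrams, bigrams)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total


def perplexity(sentences, unigrams, bigrams):
    """exp of the negative average log probability per predicted token."""
    log_prob = 0.0
    n_predictions = 0
    for sentence in sentences:
        log_prob += sentence_log_prob(sentence, unigrams, bigrams)
        n_predictions += len(preprocess(sentence)) - 1
    if log_prob == float("-inf"):
        return float("inf")
    return math.exp(-log_prob / n_predictions)


corpus = ["the cat sat", "the dog sat"]
unigrams, bigrams = count_ngrams(corpus)
print(bigram_prob("the", "cat", unigrams, bigrams))
print(perplexity(corpus, unigrams, bigrams))
```

Because the sketch uses no smoothing, any sentence containing an unseen bigram gets a log probability of negative infinity and an infinite perplexity; a real implementation would typically smooth the counts to avoid this.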
To use the Bigram Language Model in your own project, follow these steps:
- Clone the repository:
  git clone https://github.com/Alicechui/Bigram-Language-Models.git
To run the test cases, execute the test_bigram_model.py file:
python test_bigram_model.py
The tests cover various functionalities of the Bigram Language Model and ensure the correctness of the implementation.
This project is licensed under the MIT License.
Please refer to the documentation for more information and usage examples.