Log-odds-ratio with Informative Dirichlet priors

This is an implementation of the log-odds ratio with informative Dirichlet priors, based on the paper Fightin’ Words: Lexical Feature Selection and Evaluation for Identifying the Content of Political Conflict.

It is used for language modeling for stance detection in the paper Knowledge Enhanced Masked Language Model for Stance Detection.

Please see our stance detection repo 🚀

Usage

Run the following command.

python log_odds_ratio.py \
    --filepath_corpus_i=$FP_CORPUS_I \
    --filepath_corpus_j=$FP_CORPUS_J \
    --filepath_background_corpus=$BACKGROUND_CORPUS

Among the generated files, check z_scores.txt, which contains words sorted by Z-score. Words at the top are more likely to belong to corpus I, while words at the bottom are more likely to belong to corpus J, with respect to the background corpus.
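
For intuition about how such Z-scores can be computed, here is a minimal sketch of the log-odds ratio with an informative Dirichlet prior as described in the Fightin’ Words paper. It is not the code in log_odds_ratio.py: the function name `log_odds_z_scores` is hypothetical, inputs are assumed to be already-tokenized lists of words, and the background corpus is assumed to cover the vocabulary of both corpora (words missing from the background are skipped).

```python
import math
from collections import Counter


def log_odds_z_scores(tokens_i, tokens_j, tokens_background):
    """Return {word: z-score}; positive scores lean toward corpus I.

    Hypothetical helper for illustration only. Background counts act as
    the Dirichlet prior pseudo-counts (alpha_w).
    """
    y_i = Counter(tokens_i)
    y_j = Counter(tokens_j)
    alpha = Counter(tokens_background)

    n_i = sum(y_i.values())
    n_j = sum(y_j.values())
    alpha_0 = sum(alpha.values())

    z_scores = {}
    # Assumption: only words seen in the background corpus are scored,
    # so every word has a strictly positive prior count.
    for word, a_w in alpha.items():
        c_i = y_i[word] + a_w
        c_j = y_j[word] + a_w
        rest_i = n_i + alpha_0 - c_i
        rest_j = n_j + alpha_0 - c_j
        if rest_i <= 0 or rest_j <= 0:
            continue  # degenerate case: the word is the whole vocabulary

        # Difference of the prior-smoothed log-odds in the two corpora.
        delta = math.log(c_i / rest_i) - math.log(c_j / rest_j)
        # Approximate variance of the difference.
        variance = 1.0 / c_i + 1.0 / c_j
        z_scores[word] = delta / math.sqrt(variance)
    return z_scores


def rank_words(z_scores):
    """Sort words from most corpus-I-like to most corpus-J-like."""
    return sorted(z_scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank_words(log_odds_z_scores(tokens_i, tokens_j, tokens_background))` gives a list whose head leans toward corpus I and whose tail leans toward corpus J, which mirrors the ordering described for z_scores.txt.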
