oacore/sdg_classification

Multi-label SDG Classification

Install the Python dependencies inside a virtual environment:

cd sdg_classification
virtualenv venv
source venv/bin/activate
pip3 install -r requirements.txt

Train a multi-label SDG classifier

Multi-label SBERT fine-tuning + Classification on synthetic dataset

python3 "$PROJECT_DIR/src/multi_label_sdg.py" --multi_label_finetuning --dataset=synthetic --do_train

Label Description SBERT fine-tuning + Classification on synthetic dataset

python3 "$PROJECT_DIR/src/multi_label_sdg.py" --label_desc_finetuning --dataset=synthetic --do_train

Two-stage SBERT fine-tuning + Classification

python3 "$PROJECT_DIR/src/multi_label_sdg.py" --label_desc_finetuning --multi_label_finetuning --dataset=synthetic --do_train
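The three training commands above differ only in which fine-tuning flags they pass. As a rough sketch, they could be generated from one small helper. The flag names and script path are copied from the commands above; `build_train_command` and `CONFIGS` are hypothetical names introduced here, and the helper assumes that `PROJECT_DIR` points to the repository root, which this README does not state explicitly.

```python
import os
import shlex

# Flags copied from the commands above; nothing else is assumed about the CLI.
CONFIGS = {
    "multi_label": ["--multi_label_finetuning"],
    "label_desc": ["--label_desc_finetuning"],
    "two_stage": ["--label_desc_finetuning", "--multi_label_finetuning"],
}


def build_train_command(config, dataset="synthetic", project_dir=None):
    """Return the argv list for one training run (hypothetical helper)."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    # Assumption: PROJECT_DIR is the repository root; fall back to the cwd.
    root = project_dir or os.environ.get("PROJECT_DIR", ".")
    script = os.path.join(root, "src", "multi_label_sdg.py")
    return ["python3", script, *CONFIGS[config], f"--dataset={dataset}", "--do_train"]


if __name__ == "__main__":
    # Print the three commands instead of running them.
    for name in CONFIGS:
        print(shlex.join(build_train_command(name)))
```

The helper only builds the argument list. It could be passed to `subprocess.run(cmd, check=True)` to launch a run.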

The synthetic dataset is available at data/synthetic_data/synthetic_final.tsv
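This README does not document the column layout of the TSV file, so it may help to look at the first few rows before training. The following is a minimal sketch that makes no assumption about column names or count; `peek_tsv` is a hypothetical helper, not part of the repository.

```python
import csv
from itertools import islice


def peek_tsv(path, n_rows=5):
    """Return the first n_rows of a tab-separated file as lists of strings."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        return list(islice(reader, n_rows))
```

Calling `peek_tsv("data/synthetic_data/synthetic_final.tsv")` from the repository root would return the leading rows, including a header row if the file has one. If the text fields contain unbalanced double quotes, passing `quoting=csv.QUOTE_NONE` to `csv.reader` may be needed.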

To train the model on the Out-of-Domain (OOD) Knowledge Hub dataset,

python3 "$PROJECT_DIR/src/multi_label_sdg.py" --label_desc_finetuning --multi_label_finetuning --dataset=knowledge_hub --do_train

To perform evaluation on the manually annotated multi-label scientific SDG dataset,

python3 "$PROJECT_DIR/src/multi_label_sdg.py" --multi_label_finetuning --dataset=synthetic --do_train --do_in_domain_eval

To perform evaluation on the synthetic SDG dataset,

python3 "$PROJECT_DIR/src/multi_label_sdg.py" --multi_label_finetuning --dataset=synthetic --do_train --do_synthetic_eval

The source code for SBERT fine-tuning and linear classification is largely inspired by SetFit.
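For readers unfamiliar with the multi-label setting, the idea of a linear classification head on top of sentence embeddings can be sketched in a few lines. This is an illustration only, not the repository's implementation: the weights are placeholders, the 0.5 threshold is an assumption, and `to_multi_hot` and `predict_labels` are hypothetical names. The only fixed fact used is that there are 17 SDGs.

```python
import math

NUM_SDGS = 17  # the UN defines 17 Sustainable Development Goals


def to_multi_hot(sdg_ids, num_labels=NUM_SDGS):
    """Encode a collection of 1-based SDG ids as a 0/1 vector."""
    vector = [0] * num_labels
    for sdg in sdg_ids:
        if not 1 <= sdg <= num_labels:
            raise ValueError(f"SDG id out of range: {sdg}")
        vector[sdg - 1] = 1
    return vector


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_labels(embedding, weights, biases, threshold=0.5):
    """Score each label independently with a linear layer plus sigmoid.

    weights holds one row per label; a label is predicted when its
    probability reaches the threshold, so a text can receive several
    SDGs or none.
    """
    predicted = []
    for label_index, (w_row, bias) in enumerate(zip(weights, biases)):
        score = sum(w * x for w, x in zip(w_row, embedding)) + bias
        if sigmoid(score) >= threshold:
            predicted.append(label_index + 1)  # back to 1-based SDG ids
    return predicted
```

Unlike a softmax over classes, the per-label sigmoid does not force the probabilities to compete, which is what allows one paper to be tagged with several SDGs.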

Manually annotated Multi-label SDG dataset

The manually annotated dataset of papers from Open Research Online (ORO) is available at data/manually_annotated_oro/oro_gold_dataset.txt (final version).

Demo page

The source code for the demo page, CORE Labs, is available here:

https://github.com/oacore/about
