[EMNLP 2025] Towards Transferable Personality Representation Learning based on Triplet Comparisons and Its Applications
PTCD

This repository contains the official implementation of the paper "Towards Transferable Personality Representation Learning based on Triplet Comparisons and Its Applications" (EMNLP 2025, Main Conference).

Environment Requirements

The code requires the following dependencies (see requirements.txt):

torch>=1.8.0
numpy
pandas
scikit-learn
transformers
datasets
tqdm
scipy

Datasets

Download: The datasets are available on Google Drive via the Download Datasets link.

Organization: The dataset contains three main folders:

  1. utterances: Raw single-sentence corpora.
  2. triplet: Generated and filtered triplets used for encoder training.
  3. by-product: By-product datasets used for downstream verification (personality detection).

After downloading, please organize the data files into the data/ directory.
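The expected layout mirrors the three downloaded folders under data/. A minimal sketch of that layout (the exact sub-paths consumed by the training scripts may differ):

```python
import os

# Hypothetical layout: mirror the three downloaded folders under data/.
for sub in ("utterances", "triplet", "by-product"):
    os.makedirs(os.path.join("data", sub), exist_ok=True)
```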

Training Pipeline

The training process consists of two stages: contrastive pre-training and downstream personality detection.

Step 1: Pre-training (Embedding)

We use contrastive learning to fine-tune the BERT embeddings.

1. Warm-up Training: This step performs Masked Language Modeling (MLM) or similar warm-up tasks.

cd embedding
bash scripts/train_warm_ml.sh
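The MLM objective behind the warm-up randomly masks a fraction of input tokens and trains the model to recover them. The sketch below is purely illustrative: the 15% masking rate and the [MASK] token id 103 are BERT's conventional defaults, assumed here rather than taken from the scripts.

```python
import numpy as np

MASK_ID = 103                                      # hypothetical [MASK] token id (BERT's default)
rng = np.random.default_rng(0)
token_ids = rng.integers(1000, 30000, size=128)    # toy token-id sequence
mask = rng.random(token_ids.shape) < 0.15          # conventional 15% MLM masking rate
masked = np.where(mask, MASK_ID, token_ids)
# The model is then trained to recover token_ids[mask] from `masked`.
```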

2. Contrastive Pre-training: Train the encoder using contrastive loss.

cd embedding
bash scripts/train.sh

The trained model will be saved in embedding/output_embedding/ (or as configured in the script).
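Conceptually, the contrastive step optimizes a triplet-style objective over the filtered triplets: for each (anchor, positive, negative) triple of utterance embeddings, the anchor is pulled toward the positive and pushed away from the negative. A minimal numpy sketch (illustrative only; the actual loss, distance, and margin are defined in the training scripts):

```python
import numpy as np

def triplet_contrastive_loss(anchor, positive, negative, margin=0.5):
    """Hinge loss on cosine distances for a batch of embedding triplets."""
    def cos_dist(a, b):
        # 1 - cosine similarity, computed row-wise.
        return 1 - np.sum(a * b, axis=1) / (
            np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    # Require the positive to sit closer than the negative by at least `margin`.
    return np.maximum(
        cos_dist(anchor, positive) - cos_dist(anchor, negative) + margin, 0).mean()

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 768))   # stand-ins for 768-d encoder outputs
loss = triplet_contrastive_loss(emb,
                                emb + 0.01 * rng.normal(size=(8, 768)),
                                rng.normal(size=(8, 768)))
```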

Step 2: Downstream Personality Detection

Train the classifier for personality traits (e.g., MBTI/Big5) using the pre-trained embeddings.

1. Train:

cd personality_detection
bash scripts/run.sh

2. Test:

cd personality_detection
bash scripts/test.sh
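The downstream stage fits a light classifier on top of the pre-trained embeddings. As a stand-in, the sketch below trains scikit-learn's LogisticRegression on synthetic vectors with one binary trait label; the repository's actual classifier, labels, and features may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))   # stand-in for pre-trained sentence embeddings
w = rng.normal(size=32)
y = (X @ w > 0).astype(int)      # synthetic binary trait label (e.g. one MBTI axis)

clf = LogisticRegression(max_iter=1000).fit(X, y)
train_acc = clf.score(X, y)      # high by construction: data is linearly separable
```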

Citation

If you find this code useful, please cite our paper:

@inproceedings{tang2025towards,
  title = {Towards Transferable Personality Representation Learning based on Triplet Comparisons and Its Applications},
  author = {Kai Tang and Rui Wang and Renyu Zhu and Minmin Lin and Xiao Ding and Tangjie Lv and Changjie Fan and Runze Wu and Haobo Wang},
  booktitle = {The Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year = {2025}
}
