Official Code for Paper: "Translingual Language Markers for Cognitive Assessment from Spontaneous Speech" Bao Hoang, Yijiang Pang, Hiroko Dodge, and Jiayu Zhou, InterSpeech 2024
Mild Cognitive Impairment (MCI) is considered a prodromal stage of dementia, including Alzheimer's disease. It is characterized by behavioral changes and decreased cognitive function, while individuals can still maintain their independence. Early detection of MCI is critical, as it allows for timely intervention, enrichment of clinical trial cohorts, and the development of therapeutic approaches. Recently, language markers have been shown to be a promising approach to identifying MCI in a non-intrusive, affordable, and accessible fashion. In the InterSpeech 2024 TAUKADIAL Challenge, we study language markers from spontaneous speech in English and Chinese and use the bilingual language markers to identify MCI cases and predict Mini-Mental State Examination (MMSE) scores.
Our proposed framework combines the power of:
- feature extraction of a comprehensive set of bilingual acoustic features, and semantic and syntactic features from language models;
- careful treatment of model complexity for small sample size;
- consideration of imbalanced demographic structure, potential outlier removal, and a multi-task treatment that uses the clinical classification prediction as a prior for MMSE prediction.
The proposed approach delivers an average balanced accuracy of 78.2% in MCI detection and an average RMSE of 2.705 in MMSE prediction on validation splits held out from the competition's training data. Our empirical evaluation shows that translingual language markers can improve the detection of MCI from spontaneous speech. Our code is provided at https://github.com/illidanlab/translingual-language-markers.
Use `conda env create -f environment.yml` to create a conda environment and activate it with `conda activate InterSpeech`.
Data is available upon request to the TAUKADIAL Challenge Organizers. More information can be found at https://taukadial-luzs-69e3bf4b9878b99a6f03aea43776344580b77b9fe54725f4.gitlab.io/.
After successfully requesting the dataset, please place the `TAUKADIAL-24` folder in this repository.
We use the Whisper Large model to transcribe speech: `python WhisperTranscribe.py`.
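A minimal sketch of what this transcription step looks like, assuming the `openai-whisper` package and a hypothetical audio folder layout (the actual `WhisperTranscribe.py` may differ in model options and output handling):

```python
import glob
import whisper

# Load the Whisper Large model used for transcription.
model = whisper.load_model("large")

# Hypothetical location of the challenge audio files.
for wav_path in sorted(glob.glob("TAUKADIAL-24/train/*.wav")):
    # Whisper detects the spoken language (English or Chinese) automatically.
    result = model.transcribe(wav_path)
    print(wav_path, result["language"], result["text"][:80])
```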
We use the M2M100 model for translation: `python WhisperTranscribe.py`.
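For reference, a minimal sketch of Chinese-to-English translation with M2M100 through Hugging Face Transformers; the checkpoint name and the single-sentence input are assumptions, and the repository's script may batch transcripts differently:

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

# Translate a Chinese transcript snippet into English.
tokenizer.src_lang = "zh"
encoded = tokenizer("今天天气很好。", return_tensors="pt")
generated = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id("en"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```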
We use the librosa and opensmile libraries to extract acoustic features: `python acoustic.py`.
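The sketch below illustrates the two acoustic feature families with default settings (MFCC via librosa, eGeMAPS functionals via openSMILE); the file path is hypothetical, and `acoustic.py` may use different frame parameters or summary statistics:

```python
import librosa
import opensmile

wav_path = "TAUKADIAL-24/train/example.wav"  # hypothetical file

# MFCCs via librosa, summarized by their mean over time.
y, sr = librosa.load(wav_path, sr=None)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
mfcc_features = mfcc.mean(axis=1)  # shape: (13,)

# eGeMAPS functionals via openSMILE (88 utterance-level descriptors).
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,
    feature_level=opensmile.FeatureLevel.Functionals,
)
egemaps_features = smile.process_file(wav_path)  # one row per file
print(mfcc_features.shape, egemaps_features.shape)
```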
Here we provide several demos of results in the paper. You can change the arguments of `main.py` to try different strategies.
- `--embedding-model` (string, optional, default: `"bert-base-uncased"`): Specifies the embedding model to use. Options include `"bert-base-uncased"`, `"bert-base-multilingual-uncased"`.
- `--features` (list of strings, optional, default: `["embedding", "acoustic"]`): Defines the features to use. Options include `"embedding"`, `"acoustic"`.
- `--acoustic` (list of strings, optional, default: `["MFCC"]`): Specifies the acoustic features to use. Options include `"MFCC"`, `"GeMAPS"`, `"eGeMAPS"`.
- `--iteration` (integer, optional, default: `100`): The number of different train/test splits.
- `--embedding-layer` (string, optional, default: `"last_hidden_state"`): Specifies the embedding layer to use. Options include `"last_hidden_state"`, `"pooler_output"` (see the embedding sketch after this list).
- `--translate` (flag, optional): If set, uses the back-translation method. Use as `--translate` to enable.
- `--clf` (string, optional, default: `"logistic"`): Specifies the classifier to use. Options include `"logistic"`, `"mlp"`.
- `--reg` (string, optional, default: `"RandomForest"`): Specifies the regression model to use. Options include `"svr"`, `"RandomForest"`.
- `--ft_sel` (flag, optional): If set, enables feature selection. Use as `--ft_sel` to enable.
- `--ft_num` (integer, optional, default: `1600`): The number of features to select.
- `--flag_bad_train_filter` (flag, optional): If set, enables filtering of bad subjects with the 2-stage method. Use as `--flag_bad_train_filter` to enable.
- `--flag_multi_reg` (flag, optional, default: `True`): If set, enables multiple regressors for multi-tasking with MMSE distribution priors. Use as `--flag_multi_reg` to enable.
- `--img_index_sbj` (list of integers, optional, default: `[0, 1, 2]`): Specifies the image indices to use as features. Examples include `[0, 1, 2]`, `[0]`.
- Use the MFCC acoustic feature: `python main.py --features acoustic --acoustic MFCC`.
- Use the embedding feature with the back-translation method: `python main.py --features embedding --embedding-model bert-base-uncased --translate`.
- Use both acoustic and embedding features: `python main.py --features embedding acoustic --translate`.
- Use the LASSO feature selection method to select the top 1500 features (see the feature-selection sketch after these examples): `python main.py --ft_sel --ft_num 1500`.
- Use the remove-bad-subjects strategy together with multiple regressors: `python main.py --flag_bad_train_filter --flag_multi_reg`.
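The following is a hypothetical sketch of LASSO-based selection of the top `--ft_num` features with scikit-learn; the estimator, regularization strength, and targets used by `main.py` are assumptions:

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 2000))           # stand-in feature matrix
y = rng.normal(loc=25, scale=3, size=150)  # stand-in MMSE-like targets

selector = SelectFromModel(
    Lasso(alpha=0.01, max_iter=10000),
    max_features=1500,   # keep at most the 1500 largest-|coefficient| features
    threshold=-np.inf,   # rank purely by coefficient magnitude
)
X_selected = selector.fit_transform(X, y)
print(X_selected.shape)  # (150, 1500)
```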
This material is based in part upon work supported by the National Science Foundation under Grants IIS-2212174 and IIS-1749940, the Office of Naval Research under Grant N00014-24-1-2168, and the National Institute on Aging (NIA) under Grant RF1AG072449.