
SensorLLM

Aligning Large Language Models with Motion Sensors for Human Activity Recognition

EMNLP 2025 Main Conference

Zechen Li¹   Shohreh Deldari¹   Linyao Chen²   Hao Xue¹   Flora D. Salim¹

¹ University of New South Wales, Sydney
² University of Tokyo

arXiv: https://arxiv.org/abs/2410.10624

🌟 Overview

SensorLLM is a two-stage framework that aligns sensor time series with human-intuitive text, enabling LLMs to interpret complex numerical sensor data and achieve state-of-the-art human activity recognition across varying sensor types, sensor counts, and sequence lengths.

[Figure: overview of the SensorLLM model]

🔑 Key Features

  • Aligns sensor time-series with human-intuitive, annotation-free textual trend descriptions and summaries via a QA-based framework.
  • Sensor–Language Alignment Stage operates on single-channel, variable-length segments for fine-grained trend-text alignment.
  • Task-Aware Tuning Stage handles multi-channel, multi-sensor data for downstream human activity recognition (HAR); see the input-shape sketch below.
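
To make the two stages concrete, the sketch below illustrates the kind of input each stage consumes. The tensor names and segment lengths are illustrative assumptions, not values fixed by the framework:

import torch

# Stage 1 (sensor-language alignment): one channel, variable-length segments,
# e.g. a single accelerometer axis (the length 356 is arbitrary).
stage1_segment = torch.randn(1, 356)   # (channels=1, time)

# Stage 2 (task-aware tuning): multi-channel, multi-sensor windows,
# e.g. 3-axis accelerometer + 3-axis gyroscope = 6 channels.
stage2_window = torch.randn(6, 200)    # (channels, time)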

📂 Datasets

The current implementation supports five HAR datasets: USC-HAD, UCI-HAR, MHealth, Capture-24, and PAMAP2.

To apply SensorLLM to other datasets, please refer to the code and configuration examples provided for the supported datasets. In particular, you may need to modify the corresponding entries in ts_backbone.yaml and adapt the data loading logic in the ./sensorllm/data folder to match your dataset’s format.
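
As a rough illustration of what such an adaptation can involve, below is a minimal, hypothetical loader stub; the class name, array layout, and file format are assumptions and do not mirror the repository's actual data classes:

import numpy as np

class MyHARDataset:
    """Hypothetical loader for a new HAR dataset (illustrative only).

    Assumes windows are stored as a NumPy array of shape
    (num_windows, num_channels, window_len) with integer labels.
    """

    def __init__(self, ts_path, label_path):
        self.windows = np.load(ts_path)    # (N, C, T)
        self.labels = np.load(label_path)  # (N,)

    def __len__(self):
        return len(self.windows)

    def __getitem__(self, idx):
        # SensorLLM's loaders pair each window with its QA text;
        # this stub returns only the raw window and label.
        return self.windows[idx], int(self.labels[idx])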

🚀 Getting started

SensorLLM requires two pretrained checkpoints: an LLM ([LLM_PATH]) and a time-series embedder ([TS_EMBEDDER_PATH]), as used in the training commands below. Other pretrained models can be used with minor modifications to the SensorLLM framework.

Sensor-Language QA Pairs Generation

We provide two example notebooks to generate QA pairs for aligning sensor time-series data with human-intuitive text:

  • mhealth_stage1.ipynb: Generates QA pairs for Stage 1 by aligning single-channel sensor segments with trend-based natural language descriptions.
  • mhealth_stage2.ipynb: Generates statistical summary text for Stage 2, which performs HAR classification using multi-channel sensor data.

You can also customize or extend the QA templates in these notebooks to generate more diverse types of sensor–language QA pairs for your own use cases.
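
For a flavour of what these templates produce, here is a minimal, hypothetical Stage 1 QA pair built from a toy trend heuristic; the helper function and wording are illustrative and are not taken from the notebooks:

def trend_description(values):
    """Describe the overall trend of a single-channel segment (toy heuristic)."""
    delta = values[-1] - values[0]
    if abs(delta) < 0.05:
        return "remains roughly stable"
    return "trends upward" if delta > 0 else "trends downward"

segment = [0.12, 0.18, 0.25, 0.31, 0.40]  # one accelerometer axis (illustrative)
qa_pair = {
    "question": "How does this sensor reading change over the segment?",
    "answer": f"The signal starts at {segment[0]:.2f}, ends at {segment[-1]:.2f}, "
              f"and {trend_description(segment)}.",
}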

Sensor–Language Alignment

To align sensor time-series data with text, run the following command:

torchrun --nproc_per_node=[NUM_GPUS] sensorllm/train/train_mem.py \
  --model_name_or_path [LLM_PATH] \
  --pt_encoder_backbone_ckpt [TS_EMBEDDER_PATH] \
  --tokenize_method 'StanNormalizeUniformBins' \
  --dataset [DATASET_NAME] \
  --data_path [TS_TRAIN_PATH] \
  --eval_data_path [TS_EVAL_PATH] \
  --qa_path [QA_TRAIN_PATH] \
  --eval_qa_path [QA_EVAL_PATH] \
  --output_dir [OUTPUT_PATH] \
  --model_max_length [MAX_LEN] \
  --num_train_epochs [EPOCH] \
  --per_device_train_batch_size [TRAIN_BATCH] \
  --per_device_eval_batch_size [EVAL_BATCH] \
  --evaluation_strategy "steps" \
  --save_strategy "steps" \
  --save_steps [SAVE_STEPS] \
  --eval_steps [EVAL_STEPS] \
  --learning_rate 2e-3 \
  --weight_decay 0.0 \
  --warmup_ratio 0.03 \
  --lr_scheduler_type "cosine" \
  --logging_steps 1 \
  --gradient_checkpointing True \
  --save_total_limit 1 \
  --bf16 True \
  --fix_llm True \
  --fix_ts_encoder True \
  --model_type CasualLM \
  --load_best_model_at_end True
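
Note that --fix_llm True and --fix_ts_encoder True indicate that the LLM and the time-series encoder stay frozen during this stage, so only the remaining alignment parameters are trained. As a rough PyTorch illustration of what freezing a submodule means (the module attribute names here are placeholders, not the repository's):

# Freeze a submodule by disabling gradients for its parameters.
# "model.llm" and "model.ts_encoder" are hypothetical attribute names.
for param in model.llm.parameters():
    param.requires_grad = False
for param in model.ts_encoder.parameters():
    param.requires_grad = False

# Only the remaining trainable parameters (e.g. the alignment layers)
# are then updated by the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]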

Evaluation or Inference

To perform evaluation or inference for the Sensor–Language Alignment stage, run the following command:

python sensorllm/eval/eval.py \
  --model_name_or_path [STAGE1_MODEL_PATH] \
  --pt_encoder_backbone_ckpt [TS_EMBEDDER_PATH] \
  --torch_dtype bfloat16 \
  --tokenize_method 'StanNormalizeUniformBins' \
  --dataset [DATASET_NAME] \
  --data_path [TS_DATASET_PATH] \
  --qa_path [QA_DATASET_PATH] \
  --output_file_name [OUTPUT_FILE_NAME] \
  --model_max_length [MAX_LEN] \
  --shuffle False
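
After the script writes its outputs, predictions can be scored offline. The snippet below is a generic sketch that assumes scikit-learn is installed and that the output file is a JSON list of label/prediction records; the actual format written by eval.py may differ, so adjust the field names accordingly:

import json
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical output layout: a list of {"label": ..., "pred": ...} records.
with open("outputs/predictions.json") as f:
    records = json.load(f)

y_true = [r["label"] for r in records]
y_pred = [r["pred"] for r in records]

print("accuracy:", accuracy_score(y_true, y_pred))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))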

Task-Aware Tuning

To perform a HAR task, use the following command:

torchrun --nproc_per_node=[NUM_GPUS] sensorllm/train/train_mem.py \
  --model_name_or_path [STAGE1_MODEL_PATH] \
  --pt_encoder_backbone_ckpt [TS_EMBEDDER_PATH] \
  --model_type "SequenceClassification" \
  --num_labels [ACTIVITY_NUM] \
  --use_weighted_loss True \
  --tokenize_method 'StanNormalizeUniformBins' \
  --dataset [DATASET_NAME] \
  --data_path [TS_TRAIN_PATH] \
  --eval_data_path [TS_EVAL_PATH] \
  --qa_path [QA_TRAIN_PATH] \
  --eval_qa_path [QA_EVAL_PATH] \
  --output_dir [OUTPUT_PATH] \
  --model_max_length [MAX_LEN] \
  --num_train_epochs [EPOCH] \
  --per_device_train_batch_size [TRAIN_BATCH] \
  --per_device_eval_batch_size [EVAL_BATCH] \
  --evaluation_strategy "steps" \
  --save_strategy "steps" \
  --save_steps [SAVE_STEPS] \
  --eval_steps [EVAL_STEPS] \
  --save_total_limit 1 \
  --load_best_model_at_end True \
  --learning_rate 2e-3 \
  --weight_decay 0.0 \
  --warmup_ratio 0.03 \
  --lr_scheduler_type "cosine" \
  --logging_steps 1 \
  --bf16 True \
  --fix_llm True \
  --fix_cls_head False \
  --fix_ts_encoder True \
  --gradient_checkpointing True \
  --metric_for_best_model "f1_macro" \
  --preprocess_type "smry+Q" \
  --greater_is_better True \
  --stage_2 True \
  --shuffle True

See ./sensorllm/data/utils.py for all available preprocess_type options, or edit that file to add your own.
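
Since HAR datasets are often class-imbalanced, --use_weighted_loss True suggests the classification loss is weighted by class frequency. A common way to achieve this in PyTorch, shown here as a sketch rather than the repository's exact implementation, is to pass inverse-frequency weights to CrossEntropyLoss:

import torch
import torch.nn as nn

# Toy label distribution: class 0 dominates (illustrative counts).
labels = torch.tensor([0, 0, 0, 0, 1, 1, 2])
counts = torch.bincount(labels, minlength=3).float()

# Inverse-frequency weights, normalized so they average to 1.
weights = counts.sum() / (len(counts) * counts)
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(7, 3)          # (batch, num_labels)
loss = criterion(logits, labels)    # rare classes contribute more per example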

🌍 Citation

If you find this repository useful for your research, please cite our paper:

@misc{li2025sensorllm,
      title={SensorLLM: Aligning Large Language Models with Motion Sensors for Human Activity Recognition}, 
      author={Zechen Li and Shohreh Deldari and Linyao Chen and Hao Xue and Flora D. Salim},
      year={2025},
      eprint={2410.10624},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2410.10624}, 
}

📄 License

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

📩 Contact

If you have any questions or suggestions, feel free to contact Zechen at zechen.li(at)unsw(dot)edu(dot)au.
