Data and Code for "Forecasting the Buzz: Enriching Hashtag Popularity Prediction with LLM Reasoning" (CIKM 2025)

This repo contains the data and code for the following paper:

Yifei Xu, Jiaying Wu, Herun Wan, Yang Li, Zhen Hou, Min-Yen Kan. Forecasting the Buzz: Enriching Hashtag Popularity Prediction with LLM Reasoning, ACM International Conference on Information and Knowledge Management (CIKM) 2025.

Abstract

Hashtag trends ignite campaigns, shift public opinion, and steer millions of dollars in advertising spend, yet forecasting which tag goes viral is elusive. Classical regressors digest surface features but ignore context, while large language models (LLMs) excel at contextual reasoning but misestimate numbers. We present BuzzProphet, a reasoning-augmented hashtag popularity prediction framework that (1) instructs an LLM to articulate a hashtag’s topical virality, audience reach, and timing advantage; (2) utilizes these popularity-oriented rationales to enrich the input features; and (3) regresses on these inputs. To facilitate evaluation, we release HashView, a 7,532-hashtag benchmark curated from social media. Across diverse regressor-LLM combinations, BuzzProphet reduces RMSE by up to 2.8% and boosts correlation by 30% over baselines, while producing human-readable rationales. Results demonstrate that using LLMs as context reasoners rather than numeric predictors injects domain insight into tabular models, yielding an interpretable and deployable solution for social media trend forecasting.

🔧 Installation

Install the required dependencies:

pip install -r requirement.txt

📂 HashView dataset (`data/`)

new_processed_time-sorted_data.csv: The original HashView dataset for hashtag popularity prediction, collected from Weibo, a Chinese social media platform.

This dataset includes the following attributes: id, title, datetime, browse_count, and browse_log_norm.

id: hashtag ID.
title: hashtag text.
datetime: posting time of the hashtag, in the format YYYY-MM-DD hh:mm:ss.
browse_count: view count of the hashtag, which serves as the main indicator of popularity.
browse_log_norm: log-normalized value of browse_count, used as the prediction target.
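
As a minimal sketch of reading this file, assuming it is a standard comma-separated file with a header row containing the five columns listed above, the snippet below parses each row with Python's standard library. The helper name `load_hashview` and the sample row are hypothetical and only illustrate the schema; they are not part of the repository.

```python
import csv
import io
from datetime import datetime


def load_hashview(fileobj):
    """Parse HashView rows from an open text file (hypothetical helper)."""
    rows = []
    for rec in csv.DictReader(fileobj):
        rows.append({
            "id": rec["id"],
            "title": rec["title"],
            # Documented format: YYYY-MM-DD hh:mm:ss
            "datetime": datetime.strptime(rec["datetime"], "%Y-%m-%d %H:%M:%S"),
            # Assumption: counts may be stored as "123" or "123.0"
            "browse_count": int(float(rec["browse_count"])),
            "browse_log_norm": float(rec["browse_log_norm"]),
        })
    return rows


# Made-up sample row for illustration only; the real data is in
# data/new_processed_time-sorted_data.csv.
sample = io.StringIO(
    "id,title,datetime,browse_count,browse_log_norm\n"
    "1,#example#,2024-01-01 08:30:00,12345,0.5\n"
)
rows = load_hashview(sample)
```

To read the real file, pass a handle opened in text mode, for example with `open(path, encoding="utf-8", newline="")`; the UTF-8 encoding is an assumption and may need adjusting.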

o3_instruction.csv: popularity-oriented reasoning elicited from the o3-mini model.

id, title, datetime, browse_log_norm: same as new_processed_time-sorted_data.csv.
category_instruction: o3-mini reasoning about the hashtag's topic category attribute.
audience_instruction: o3-mini reasoning about the hashtag's target audience attribute.
time_instruction: o3-mini reasoning about the hashtag's posting time attribute.
merge_instruction: o3-mini reasoning about the hashtag's overall popularity by jointly considering all three attributes.
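
The reasoning columns above can be combined with the hashtag title into one text input per hashtag before feature extraction. The sketch below shows one possible way to do this with the standard library; it is an assumption about usage, not the pipeline implemented under `codes/`. The helper `build_enriched_text`, the `" [SEP] "` separator, and the sample row are all hypothetical.

```python
import csv
import io

# Column names as documented for o3_instruction.csv.
REASONING_COLUMNS = (
    "category_instruction",
    "audience_instruction",
    "time_instruction",
    "merge_instruction",
)


def build_enriched_text(rec, columns=REASONING_COLUMNS, sep=" [SEP] "):
    """Join the title with every non-empty reasoning field (hypothetical helper)."""
    parts = [rec["title"]]
    for col in columns:
        value = (rec.get(col) or "").strip()
        if value:
            parts.append(value)
    return sep.join(parts)


# Made-up sample row for illustration only; real rows come from
# data/o3_instruction.csv.
sample = io.StringIO(
    "id,title,datetime,browse_log_norm,category_instruction,"
    "audience_instruction,time_instruction,merge_instruction\n"
    "1,#example#,2024-01-01 08:30:00,0.5,topic note,audience note,,overall note\n"
)
enriched = [build_enriched_text(rec) for rec in csv.DictReader(sample)]
```

Passing a subset of `columns` (for example only `merge_instruction`) makes it easy to compare how much each kind of reasoning contributes.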

🚀 Run BuzzProphet

To run BuzzProphet with different base regression models, use the following shell scripts:

Run RandomForest + BuzzProphet:
```
sh run_RF_BuzzProphet.sh
```
Run CatBoost + BuzzProphet:
```
sh run_CB_BuzzProphet.sh
```

After either script finishes, the results are saved under browse_trained_results/ in CSV format.

