Mining Misconception in Mathematics

This repository contains the code for our "Mining Misconception in Mathematics" project, developed for Tsinghua's 2024 "Machine Learning (80245013-0)" course. A detailed project report is included; see the Project Report section below.

Getting Started

Multiple-choice questions are widely used to evaluate student knowledge. Well-designed questions use distractors that are associated with common misconceptions. Large language models (LLMs) have performed well on math reasoning benchmarks, but they struggle to understand misconceptions. In this work, we propose a method to determine the misconception that leads to an incorrect answer. We use an LLM from the Qwen 2.5 family to hypothesize a potential misconception, which assists the retrieval of related misconceptions from a list of 2,586 categories. The retrieval step leverages embeddings generated by a fine-tuned Mistral-based LLM trained on a synthetic dataset. The retrieved misconceptions are then analyzed by Qwen 2.5, which uses a logits processor to determine the most likely one. We evaluate this method with mean average precision on a Kaggle dataset of 1,868 math-related multiple-choice questions, achieving a maximum score of 0.4706. Our results demonstrate the potential of LLMs for assessing incorrect answers and identifying misconceptions in math education.
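
For reference, below is a minimal sketch of the evaluation metric, assuming each question has exactly one ground-truth misconception (as in the Kaggle competition); function names are illustrative, not the notebook's actual code:

```python
# Minimal sketch of mean average precision at k (MAP@k). Assumes each
# question has exactly one ground-truth misconception ID, so AP reduces
# to 1/rank of the first (and only) hit within the top k.

def apk(actual_id, predicted_ids, k=25):
    """Average precision at k for a single ground-truth label."""
    for rank, pred in enumerate(predicted_ids[:k], start=1):
        if pred == actual_id:
            return 1.0 / rank
    return 0.0

def mapk(actual_ids, predictions, k=25):
    """Mean average precision at k over all questions."""
    return sum(apk(a, p, k) for a, p in zip(actual_ids, predictions)) / len(actual_ids)

# Example: ground truth 1337 ranked 2nd -> AP = 0.5
print(mapk([1337], [[42, 1337, 7]], k=25))
```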

Prerequisites

Before running the code, several models must be downloaded.
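
As a hedged example, the pipeline described below relies on a Qwen 2.5 chat model and the SFR-Embedding base model, which can be fetched from the Hugging Face Hub; the exact checkpoint names and sizes here are assumptions, so substitute the variants you intend to run:

```python
# Hedged sketch of fetching the models via the Hugging Face Hub.
from huggingface_hub import snapshot_download

# Reasoning / reranking LLM (Qwen 2.5 family; the 14B size is an assumption)
snapshot_download("Qwen/Qwen2.5-14B-Instruct")

# Base embedding model that is later finetuned with LoRA
snapshot_download("Salesforce/SFR-Embedding-2_R")
```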

There are also some required datasets:

  • Mining-Misconception-Dataset. Four files must be downloaded: mining_misconception_mapping.csv, sample_submission.csv, test.csv, train.csv.
  • MATH-Dataset.

All of the notebooks are guaranteed to run on a single H100 GPU.

Usage

This project is divided into three main phases: Dataset Generation, Finetuning, and Inference Pipeline. Below are descriptions and instructions for each Jupyter notebook. Note that the file paths read in these notebooks may correspond to our local environment; when running a notebook locally, you may need to change these paths to match the file locations on your device.

Synthetic Dataset Generation Phase

The notebook data_generation.ipynb creates a synthetic dataset in the Mining-Misconception-Dataset format. In this notebook, we reformat the MATH-Dataset to follow Mining-Misconception-Dataset's format. The resulting synthetic dataset is included in this project as eedi_synthetic.csv.
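
As a hedged illustration of the target layout, the snippet below builds competition-style rows; the column names follow the Kaggle train.csv, while the example question, distractors, and misconception IDs are invented placeholders (the actual distractor generation in data_generation.ipynb uses an LLM):

```python
# Hedged sketch of reshaping MATH-Dataset problems into the
# Mining-Misconception-Dataset layout. The LLM-based distractor
# generation performed in the notebook is stubbed out here.
import pandas as pd

def to_misconception_row(question_id, problem, correct, distractors, misconception_ids):
    row = {
        "QuestionId": question_id,
        "QuestionText": problem,
        "CorrectAnswer": "A",
        "AnswerAText": correct,
    }
    # Distractor options B-D each carry the misconception that produces them.
    for letter, (text, mid) in zip("BCD", zip(distractors, misconception_ids)):
        row[f"Answer{letter}Text"] = text
        row[f"Misconception{letter}Id"] = mid
    return row

rows = [to_misconception_row(0, "Simplify 2/4 + 1/4.", "3/4",
                             ["3/8", "2/16", "1/2"], [101, 102, 103])]
pd.DataFrame(rows).to_csv("eedi_synthetic.csv", index=False)
```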

Embedding Model Finetuning Phase

In this project, we opted to finetune our embedding model, SFR-Embedding-Mistral_2R, with LoRA to query related misconceptions. The notebook train_and_infer_hardNegatives.ipynb contains the finetuning script. Note that finetuning requires the Initial Reasoning of the training data as input; the script that produces it is part of the Inference Phase (see the LLM Reasoning section there for details). Our best LoRA adapter is included in this project as SFR-Embedding-2_R_ZeroShot_CleanLatex_UsingNEWLINES_Quantization_HardNegatives_12batch_8accumulation_20negatives_moreSteps.
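
A minimal sketch of the LoRA setup with the PEFT library follows; the rank, alpha, dropout, and target modules shown are assumptions, not the exact values used in the notebook:

```python
# Hedged sketch of attaching LoRA adapters to the Mistral-based
# embedding model with PEFT. Hyperparameters are illustrative.
import torch
from transformers import AutoModel
from peft import LoraConfig, get_peft_model

base = AutoModel.from_pretrained("Salesforce/SFR-Embedding-2_R",
                                 torch_dtype=torch.bfloat16)
config = LoraConfig(
    r=16,                      # adapter rank (assumption)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Freezing the base model and training only the adapters is what makes finetuning a Mistral-scale embedding model feasible on a single H100.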

Inference Phase

The notebook train_and_infer_hardNegatives.ipynb performs the final inference for the whole misconception classification process. First, we use the LLM to generate initial reasoning for HyDE; the output of this step is also needed as input for the embedding-model finetuning. Next, we run the finetuned embedding model to retrieve the 25 most similar misconceptions. Finally, we apply a reranking step in which the LLM, guided by a logits processor, judges the top-k most likely misconceptions.
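
The sketch below illustrates two of these building blocks under stated assumptions: cosine-similarity retrieval of the top-25 candidates, and a LogitsProcessor that masks all tokens except the candidate option letters so the judge LLM must choose among them. Identifiers are illustrative, not the notebook's actual code:

```python
# Hedged sketch of retrieval and constrained judging.
import torch
from transformers import LogitsProcessor

def top_k_misconceptions(query_emb, misconception_embs, k=25):
    # Cosine similarity between the query embedding (1, d) and all
    # misconception embeddings (N, d); returns indices of the top k.
    sims = torch.nn.functional.cosine_similarity(query_emb, misconception_embs)
    return sims.topk(k).indices

class AllowedTokens(LogitsProcessor):
    """Mask every logit except the token IDs of the candidate letters."""
    def __init__(self, allowed_ids):
        self.allowed_ids = allowed_ids
    def __call__(self, input_ids, scores):
        mask = torch.full_like(scores, float("-inf"))
        mask[:, self.allowed_ids] = 0.0
        return scores + mask
```

In practice one would pass logits_processor=[AllowedTokens(ids)] to model.generate and rank the candidates by the resulting letter probabilities.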

Project Report

A detailed project report, written in NeurIPS format, is included as [ML]7. Mining Misconception in Mathematics.pdf.
