Authors: Nguyen Huu Loc, Van Tuan Kiet
Supervisor: Dr. Do Nhu Tai
Institution: Faculty of Information Technology - Saigon University
In the context of Education 4.0, traditional Learning Management Systems (LMS) typically apply a uniform learning pathway to all learners, which limits personalization. This project proposes an adaptive learning framework based on the Q-learning algorithm, integrated into the Moodle platform via the LTI 1.3 standard.
The learning process is modeled as a Markov Decision Process (MDP), combined with K-means behavioral clustering to construct a multi-dimensional learner state space. Experimental results from 500 simulation episodes demonstrate that the system improves average scores by 22.5% and reduces weak skills by up to 51.0%.
Keywords: Reinforcement Learning • Q-learning • Personalized Learning • STEM Education • Moodle LMS • Adaptive Learning
- Project Information
- Abstract
- Introduction
- Proposed Method
- Experimental Results
- System Architecture
- Installation
- References
- License
- Contributing
- Contact
STEM education faces significant challenges due to substantial differences in students' abilities, foundational knowledge, and learning pace. Learning Management Systems (LMS) like Moodle typically function only as content repositories and grade trackers, lacking behavioral analysis capabilities and timely pedagogical intervention.
This project proposes an adaptive learning framework based on Reinforcement Learning (Q-learning), in which an AI Agent autonomously explores and optimizes teaching strategies through trial and error, continuously adapting to learner feedback.
The system models the learning process as a Markov Decision Process (MDP) with three components: a multi-dimensional state space (6 features), an action space (15 pedagogical actions), and a multi-objective reward function.
6-dimensional learner state representation:
| Dimension | Description | Values |
|---|---|---|
| Cluster | Behavioral cluster (K-means) | 0-4 |
| Module | Current learning module | 1-N |
| Progress | Completion progress | 0.0-1.0 |
| Score Level | Performance level | 0-4 |
| Phase | Learning phase (Quiz/Forum/Assignment) | 0-2 |
| Engagement | Interaction level | 0-4 |
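As a minimal sketch, the six dimensions in the table can be packed into a hashable tuple so that a tabular Q-learning agent can use it as a dictionary key. The names `LearnerState`, `bin_progress`, and `make_state`, and the choice of four progress bins, are illustrative assumptions and are not taken from the project code.

```python
from typing import NamedTuple


class LearnerState(NamedTuple):
    """Discrete learner state; fields follow the six dimensions in the table."""
    cluster: int       # behavioral cluster from K-means, 0-4
    module: int        # current learning module, 1-N
    progress_bin: int  # completion progress mapped to a bin (assumed binning)
    score_level: int   # performance level, 0-4
    phase: int         # learning phase, 0-2
    engagement: int    # interaction level, 0-4


def bin_progress(progress: float, n_bins: int = 4) -> int:
    """Map progress in [0.0, 1.0] to an integer bin in [0, n_bins - 1]."""
    clipped = min(max(progress, 0.0), 1.0)
    return min(int(clipped * n_bins), n_bins - 1)


def make_state(cluster, module, progress, score_level, phase, engagement):
    """Build a hashable state, discretizing the continuous progress value."""
    return LearnerState(cluster, module, bin_progress(progress),
                        score_level, phase, engagement)


state = make_state(cluster=2, module=3, progress=0.6,
                   score_level=1, phase=0, engagement=4)
```

Discretizing progress keeps the state space finite, which a lookup-table Q-function needs; the actual system may bin progress differently.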
15 pedagogical actions organized by temporal axis:
- Past (Remedial): Review weak Learning Outcomes (LO)
- Present (Standard): Follow standard learning pathway
- Future (Advanced): Preview advanced content
The reward function combines the following terms:
- $R_{base}$: Base reward from score performance
- $R_{LO}$: Reward for improving weak skills
- $R_{bonus}$: Bonus for active engagement
- $P_{penalty}$: Penalty for inappropriate actions
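The exact reward formula and any weighting coefficients are not reproduced in this README. Purely as an illustration, the sketch below assumes the simplest possible combination: the three reward terms are summed and the penalty is subtracted, all with unit weights. The names `RewardTerms` and `total_reward` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardTerms:
    """One step's reward components (names mirror the symbols above)."""
    base: float     # R_base: reward from score performance
    lo: float       # R_LO: reward for improving weak skills
    bonus: float    # R_bonus: bonus for active engagement
    penalty: float  # P_penalty: penalty for inappropriate actions


def total_reward(terms: RewardTerms) -> float:
    """Assumed combination: unweighted sum of rewards minus the penalty."""
    return terms.base + terms.lo + terms.bonus - terms.penalty


step_reward = total_reward(RewardTerms(base=5.0, lo=2.0, bonus=1.0, penalty=3.0))
```

If the real system weights the terms, each field would be multiplied by its coefficient before summing.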
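The Q-learning algorithm uses the Bellman update rule, with an epsilon-greedy strategy to balance exploration and exploitation. In its standard form, the update is:

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$

Where:
- $s$, $a$: current state and the action taken
- $r$: reward received after taking action $a$ in state $s$
- $s'$: next state
- $\alpha$: learning rate
- $\gamma$: discount factor
- $\epsilon$: probability of choosing a random action instead of the action with the highest Q-value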
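A minimal tabular sketch of Q-learning with epsilon-greedy action selection is shown below. It uses the standard textbook update; the function names, the `defaultdict` table, and the default values `alpha=0.1` and `gamma=0.9` are assumptions for illustration, not the project's actual hyperparameters.

```python
import random
from collections import defaultdict

N_ACTIONS = 15  # size of the action space described in this README


def make_q_table():
    """Q-table mapping each state to a list of action values, initialized to 0."""
    return defaultdict(lambda: [0.0] * N_ACTIONS)


def choose_action(q_table, state, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    values = q_table[state]
    return max(range(N_ACTIONS), key=values.__getitem__)


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard Q-learning update toward the bootstrapped TD target."""
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


q = make_q_table()
action = choose_action(q, "s0", epsilon=0.1)
q_update(q, "s0", action, reward=10.0, next_state="s1")
```

States only need to be hashable, so a tuple of the six state features can serve as the key.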
To interpret Agent decisions, the system integrates SHAP (SHapley Additive exPlanations), which measures each state feature's contribution to action selection.
This helps educators understand why the system recommends specific actions for each student.
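To make the idea of a per-feature contribution concrete, the sketch below computes exact Shapley values by brute-force enumeration of feature orderings. This is not the `shap` library and not necessarily how the project computes its explanations; `toy_q_value` is a hypothetical stand-in for the Q-value of one action, and the all-zero baseline state is an assumption.

```python
from itertools import permutations


def shapley_values(value_fn, x, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    names = list(x)
    contrib = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)      # start from the baseline state
        prev = value_fn(current)
        for name in order:
            current[name] = x[name]   # switch one feature to its actual value
            new = value_fn(current)
            contrib[name] += new - prev
            prev = new
    return {name: total / len(orderings) for name, total in contrib.items()}


def toy_q_value(state):
    """Hypothetical Q-value of a single action as a function of state features."""
    return (2.0 * state["cluster"]
            - 1.5 * state["score_level"]
            + 0.5 * state["cluster"] * state["engagement"])


x = {"cluster": 2, "score_level": 1, "engagement": 4}
baseline = {"cluster": 0, "score_level": 0, "engagement": 0}
phi = shapley_values(toy_q_value, x, baseline)
```

The contributions in `phi` sum to `toy_q_value(x) - toy_q_value(baseline)`, which is the additivity property that lets an educator read them as shares of the Agent's preference for an action. Enumeration is only practical for a handful of features; with six state features it needs 720 orderings.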
- Scale: 500 episodes × 100 virtual students = 50,000 interaction trajectories
- Dataset: Moodle Log & Grades - Course ID 670 (public dataset)
- Baseline: Param Policy (historical behavior simulation)
- Learner modeling: 70% Linear learners, 20% Video-first, 10% Practice-first
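The simulation setup above can be sketched as follows: a cohort of virtual students is drawn according to the 70/20/10 learner mix, and the total number of trajectories is the product of episodes and students. The identifiers, the fixed seed, and the choice to draw a cohort by weighted random sampling are assumptions; the README does not state how the project assigns learner types.

```python
import random

# Learner mix taken from the setup list above.
LEARNER_MIX = {"linear": 0.70, "video_first": 0.20, "practice_first": 0.10}
N_EPISODES = 500
STUDENTS_PER_EPISODE = 100


def sample_cohort(n_students, rng):
    """Draw learner types independently according to LEARNER_MIX."""
    kinds = list(LEARNER_MIX)
    weights = list(LEARNER_MIX.values())
    return rng.choices(kinds, weights=weights, k=n_students)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
cohort = sample_cohort(STUDENTS_PER_EPISODE, rng)
total_trajectories = N_EPISODES * STUDENTS_PER_EPISODE  # 500 * 100 = 50,000
```

Because the draw is random, the realized proportions in a single cohort of 100 will only approximate 70/20/10.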
Figure: Q-learning convergence over 500 episodes
| Metric | Param Policy (Baseline) | Q-learning (Ours) | Improvement |
|---|---|---|---|
| Average Score (scale 0-10) | 5.82 ± 0.48 | 7.14 ± 0.82 | +22.5% |
| Weak Skills Count | 3.02 | 1.48 | -51.0% |
| Average Reward | 59.95 ± 12.38 | 264.26 ± 27.33 | +340.8% |
Conclusion: Q-learning significantly outperforms Param Policy across all metrics, demonstrating its capability to optimize personalized learning pathways.
Figure: SHAP values reveal that Cluster and Score Level are the two most important features in the Agent's decision-making process.
moodle-adaptive-learning-plugin/
├── user-segmentation-service/   # Student behavioral clustering (K-means)
├── course-service/              # Course and content management
├── user-service/                # User information management
├── question-service/            # Question bank management
├── recommend-service/           # Learning recommendation (Q-learning Agent)
├── lti-service-python/          # LTI 1.3 Authentication & Integration
├── FE-service-v3/               # Frontend React + TypeScript
└── kong-gateway/                # API Gateway & Load Balancer
- Docker & Docker Compose: 20.10+
- Moodle: 4.5+ with LTI 1.3 enabled
# Clone repository
git clone https://github.com/kltn-moolde/moodle-adaptive-learning-plugin.git
cd moodle-adaptive-learning-plugin
# Launch entire system
docker compose --env-file .env.production -f docker-compose.prod.yml up -d --pull always --build
The system will automatically:
- Build all microservices
- Initialize database
- Configure API Gateway (Kong)
- Deploy frontend React app
[1] M. T. Chi and R. Wylie, "The ICAP framework: Linking cognitive engagement to active learning outcomes," Educational Psychologist, 2014.
[2] R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction, MIT Press, 1998.
[3] S. M. Lundberg and S.-I. Lee, "A unified approach to interpreting model predictions," Advances in Neural Information Processing Systems, 2017.
[4] IMS Global Learning Consortium, "LTI 1.3 Core Specification," 2019. [Online]. Available: https://www.imsglobal.org/spec/lti/v1p3/
[5] Moodle Documentation, "LTI and Moodle," 2023. [Online]. Available: https://docs.moodle.org/
This project is licensed under the MIT License - see the LICENSE file for details.
We welcome contributions to this project!
- Fork the repository
- Create a new branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
- Python: Follow PEP 8
- JavaScript/TypeScript: Use ESLint + Prettier
- Commit messages: Conventional Commits format
Research Team:
- Email: lockbkbang@gmail.com
- GitHub Issues: Report bugs
This project was conducted with support from:
- Faculty of Information Technology - Saigon University
- Dr. Do Nhu Tai (Supervisor)
⭐ If you find this project useful, please give us a star! ⭐
Made with ❤️ by Adaptive Learning Team




