NLP applied to Wikipedia. DurHack 2025 submission.
Connects any two Wikipedia articles using pretrained vector embeddings.
Euclidean distance and cosine similarity are used as heuristics for A* search.
Interface for adding and subtracting article embeddings to find related articles.
Tool that checks whether an article's references match the content of its in-text citations, using RAG and Gemini.
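The path-finding feature can be pictured with a small sketch. This is not the project's actual code: the `links` graph, the 2-d `vectors`, and the function names are all made up for illustration, and every link is assumed to cost one hop. Cosine distance is not an admissible heuristic for hop counts, so a search like this favours semantically "closer" articles but is not guaranteed to return the shortest path.

```python
import heapq
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def a_star(links, vectors, start, goal, heuristic=cosine_distance):
    """A* over an article link graph, guided by embedding distance to the goal."""
    # Frontier entries: (priority, hops so far, article, path taken)
    frontier = [(heuristic(vectors[start], vectors[goal]), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, article, path = heapq.heappop(frontier)
        if article == goal:
            return path
        if cost > best_cost.get(article, math.inf):
            continue  # stale entry; a cheaper route was already found
        for nxt in links.get(article, []):
            if nxt not in vectors:
                continue  # no embedding for this article, so skip it
            new_cost = cost + 1  # assumption: every link costs one hop
            if new_cost < best_cost.get(nxt, math.inf):
                best_cost[nxt] = new_cost
                priority = new_cost + heuristic(vectors[nxt], vectors[goal])
                heapq.heappush(frontier, (priority, new_cost, nxt, path + [nxt]))
    return None  # goal not reachable from start


# Hypothetical toy data standing in for real article links and embeddings.
links = {
    "Durham": ["England", "Cathedral"],
    "England": ["London", "Europe"],
    "Cathedral": ["Architecture"],
    "London": ["Europe"],
}
vectors = {
    "Durham": (1.0, 0.0),
    "England": (0.8, 0.6),
    "Cathedral": (1.0, -0.5),
    "London": (0.6, 0.8),
    "Europe": (0.0, 1.0),
    "Architecture": (0.9, -0.9),
}

print(a_star(links, vectors, "Durham", "Europe"))
```

Swapping `heuristic` for a Euclidean distance function would give the other variant mentioned above.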
Make a .env file in the wikitrace directory and fill out the following fields:
wikitrace/.env
GEMINI_API_KEY=[insert key]
USER_AGENT=[insert user agent]
Then, either download a pretrained model from the Wikipedia2Vec website or train one yourself.
English 100d (bin) is recommended.
Copy the model file to wikitrace/model.pkl.
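With the model file in place, the add/subtract interface comes down to vector arithmetic followed by a nearest-neighbour lookup. The sketch below shows the idea with hypothetical 2-d toy vectors and made-up helper names (`combine`, `nearest`). In the real project the vectors would presumably come from the Wikipedia2Vec model instead, for example via the library's `Wikipedia2Vec.load` and `get_entity_vector` calls.

```python
import math


def combine(vectors, add, subtract=()):
    """Sum the embeddings in `add` and subtract those in `subtract`."""
    dim = len(next(iter(vectors.values())))
    result = [0.0] * dim
    for title in add:
        result = [r + v for r, v in zip(result, vectors[title])]
    for title in subtract:
        result = [r - v for r, v in zip(result, vectors[title])]
    return result


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(vectors, query, exclude=(), count=3):
    """Return the `count` titles most similar to `query`, skipping `exclude`."""
    scored = [
        (cosine_similarity(query, vec), title)
        for title, vec in vectors.items()
        if title not in exclude
    ]
    scored.sort(reverse=True)
    return [title for _, title in scored[:count]]


# Hypothetical toy embeddings: axis 0 is "royalty", axis 1 is "gender".
toy_vectors = {
    "King": (1.0, 1.0),
    "Queen": (1.0, -1.0),
    "Man": (0.0, 1.0),
    "Woman": (0.0, -1.0),
    "Castle": (1.0, 0.0),
}

query = combine(toy_vectors, add=["King", "Woman"], subtract=["Man"])
print(nearest(toy_vectors, query, exclude={"King", "Woman", "Man"}, count=1))
```

Excluding the input articles from the results keeps the lookup from simply returning one of the terms that went into the query.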
Dependencies are managed with uv on the backend:
backend/
$ uv sync
To run the backend:
backend/
$ fastapi dev main.py
Make a .env file in the frontend directory and fill out the following fields:
frontend/.env
PUBLIC_GEMINI_API_KEY=[insert key]
Dependencies are managed with npm on the frontend:
frontend/
$ npm install
To run the frontend:
frontend/
$ npm run dev
