Caribou is largely a prototype. This project is an attempt to better understand how to build an internet index and how to layer a primitive search engine on top of it.
If you want to see the end result of this project, see caribou. If you want to read my random musings on this project, see this blog post.
This loosely covers the applications in this repository.
This is a simple Flask app that allows the crawler to be run via scheduled jobs.
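A minimal sketch of what such an app could look like, assuming a hypothetical `crawl()` entry point and using APScheduler for the recurring job; the real crawler's endpoints and scheduling mechanism may differ:

```python
# Minimal sketch of a crawler server: schedule a recurring crawl and expose
# an endpoint to trigger one on demand. crawl(), the /crawl route, and the
# use of APScheduler are illustrative assumptions, not the repo's actual code.
from threading import Thread

from apscheduler.schedulers.background import BackgroundScheduler
from flask import Flask, jsonify

app = Flask(__name__)


def crawl():
    """Placeholder for the real crawl logic (fetch pages, store results)."""
    print("crawling...")


# Run the crawl on a fixed interval in the background.
scheduler = BackgroundScheduler()
scheduler.add_job(crawl, "interval", hours=1)
scheduler.start()


@app.route("/crawl", methods=["POST"])
def trigger_crawl():
    # Kick off an ad-hoc crawl without blocking the HTTP request.
    Thread(target=crawl, daemon=True).start()
    return jsonify({"status": "started"})
```

Saved as `app.py`, a file like this starts the background scheduler alongside the web server when you run `flask --app app run`.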
The admin application allows for the configuration of the crawler. It also houses some high-level stats about the crawled content and lets you invoke the crawler, so long as the crawler server is running.
This is a search engine built on the results from the crawler. It has two modes: a basic search engine view and an interactive star map for exploring the crawled content.
You can run the full application in Docker using Docker Compose. This assumes that you have a folder called db_test in the source directory. Depending on the profile, this will use either SQLite or Postgres as the backing store.
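A hedged sketch of how the two compose profiles could be laid out; the service names, ports, and volumes below are illustrative assumptions rather than the repository's actual docker-compose.yml:

```yaml
# Illustrative sketch only -- service names, ports, and volumes are assumptions.
services:
  crawler:
    build: ./crawler
    profiles: ["sqlite-backed", "postgres-backed"]
    volumes:
      - ./db_test:/app/db_test   # backing store / crawl data on the host

  admin:
    build: ./admin
    profiles: ["sqlite-backed", "postgres-backed"]
    ports:
      - "8000:8000"

  grepper:
    build: ./grepper
    profiles: ["sqlite-backed", "postgres-backed"]
    ports:
      - "8080:8080"

  postgres:
    image: postgres:16
    profiles: ["postgres-backed"]   # only started for the postgres-backed profile
    environment:
      POSTGRES_PASSWORD: example
```

With profiles, `docker compose --profile sqlite-backed up --build` only starts the services tagged `sqlite-backed`, so the Postgres container never comes up for the SQLite-backed run.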
    $ docker compose --profile sqlite-backed up --build
    $ docker compose --profile postgres-backed up --build

To run the applications individually without Docker:

- Change directories to `crawler`
- Create a new Python environment and install the requirements (see the sketch after this list)
- Run `flask --app app run`
- Change directories to `admin`
- Run `cargo run`
- Change directories to `grepper`
- Run `cargo run`
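For the `crawler` step above, a typical environment setup might look like the following, assuming the conventional `requirements.txt` layout:

```sh
cd crawler
python -m venv .venv
source .venv/bin/activate        # on Windows: .venv\Scripts\activate
pip install -r requirements.txt
flask --app app run
```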