Plex KG builds a knowledge graph from Plex media data, linking movies, genres, and people through semantic relationships. It enables querying and simple recommendations.
- PlexAPI: collect Plex data.
- rdflib: only used to extract/map Plex data to Turtle. It doesn't handle large graphs well because it works in memory.
- pyshacl: validate graphs.
- Fuseki: handle RDF storage, query and reasoning.
- FastAPI: easy access to user-facing functions.
- Get data from Plex.
- Transform Plex data to a graph in the Turtle format.
- Create media relationship graph.
- Validate graphs against expected shapes in ./rdf/shapes/.
- Upload graphs to Fuseki.
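To make the "transform Plex data to Turtle" step a bit more concrete, here is a minimal standard-library sketch. It is not the project's code: the project uses rdflib for this step, while the sketch hand-writes the Turtle so it runs without dependencies. The record shape (iri, title, genres), the example.org IRI, and the choice of schema:Movie, schema:name, and schema:genre are assumptions made for illustration.

```python
def turtle_literal(value: str) -> str:
    """Quote a string as a Turtle literal, escaping backslashes, quotes, and line breaks."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def movie_to_turtle(movie: dict) -> str:
    """Serialize one movie record (hypothetical shape) as a small Turtle document."""
    # Predicate/object pairs for the movie node; the schema.org terms are assumptions.
    pairs = [
        ("a", "schema:Movie"),
        ("schema:name", turtle_literal(movie["title"])),
    ]
    if movie.get("genres"):
        genres = ", ".join(turtle_literal(g) for g in movie["genres"])
        pairs.append(("schema:genre", genres))

    body = " ;\n    ".join(f"{pred} {obj}" for pred, obj in pairs)
    return (
        "@prefix schema: <https://schema.org/> .\n\n"
        f"<{movie['iri']}>\n    {body} .\n"
    )


if __name__ == "__main__":
    example = {
        "iri": "http://example.org/movie/1",  # hypothetical IRI
        "title": "Heat",
        "genres": ["Crime", "Thriller"],
    }
    print(movie_to_turtle(example))
```

For the real shape of the graphs, ./rdf/example.ttl is the better reference; the sketch only shows the general idea of mapping a record to triples.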
Note
The Docker Compose setup automagically creates a dataset called 'plex', which is used as the dataset throughout the project.
You can view the graphs in Fuseki or download them.
If you'd like to see an example graph, you can check out ./rdf/example.ttl. That file has example nodes for the media graph and relationship graph.
Additionally, ./rdf/ontology.ttl adds to the existing schema.org ontology.
Developed on Python 3.12.
- Generate X-Plex-Client-Identifier for the Python client. You can generate it in many different ways, but I prefer to do it on the CLI using uuidgen.
- Get your authToken by sending a request to https://plex.tv/api/v2/users/signin with the following items in your request body:
  - X-Plex-Client-Identifier: from step 1.
  - login: email
  - password: password
- Add X-Plex-Client-Identifier and authToken to your .env.
- You might need to change the ownership of the directories inside ./fuseki-data/ to 100:100 (fuseki UID and GID):
    chown -R 100:100 ./fuseki-data/*
- docker compose up
URLs:
- http://localhost:3030/ - Fuseki
- http://localhost:8000/ - FastAPI
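If you'd rather script the first two setup steps than do them by hand, the following is a rough standard-library sketch. It generates a client identifier (similar in spirit to uuidgen) and builds the sign-in request with the three body fields listed above. The helper names are made up for this example, and the form-encoded body and the Accept: application/json header are assumptions; the request is only built here, not sent.

```python
import uuid
from urllib.parse import urlencode
from urllib.request import Request

SIGNIN_URL = "https://plex.tv/api/v2/users/signin"


def new_client_identifier() -> str:
    """Return a random UUID string to use as X-Plex-Client-Identifier."""
    return str(uuid.uuid4())


def build_signin_request(client_id: str, login: str, password: str) -> Request:
    """Build (but do not send) the sign-in POST request."""
    # Assumption: the fields are sent as a form-encoded request body.
    body = urlencode(
        {
            "X-Plex-Client-Identifier": client_id,
            "login": login,
            "password": password,
        }
    ).encode("utf-8")
    return Request(
        SIGNIN_URL,
        data=body,
        method="POST",
        headers={"Accept": "application/json"},  # assumption: ask for a JSON reply
    )


if __name__ == "__main__":
    client_id = new_client_identifier()
    request = build_signin_request(client_id, "you@example.com", "your-password")
    print(client_id)
    print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen sends it; the authToken should then be somewhere in the response body. Keep the generated identifier around, since the same value goes into your .env.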
Run queries using the Fuseki API:
    curl -X POST \
    --data-urlencode "query@{path-to-rq-file}" \
    http://localhost:3030/plex/query | jq
- If it says unauthorized, use curl's -u parameter:
    curl -u user:pw ...
- If you get an error saying that the URL doesn't support POST requests, ensure that the dataset name is correct.
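The same query call can be sketched in Python with only the standard library. The sketch follows the SPARQL 1.1 Protocol's "query via URL-encoded POST" form, which is what the curl command does. The function name is made up for this example, the Accept header assumes a SELECT or ASK query, and the optional basic auth mirrors curl's -u parameter. The request is only built here, not sent.

```python
import base64
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

DEFAULT_ENDPOINT = "http://localhost:3030/plex/query"


def build_query_request(
    query: str,
    endpoint: str = DEFAULT_ENDPOINT,
    user: Optional[str] = None,
    password: Optional[str] = None,
) -> Request:
    """Build (but do not send) a SPARQL query request using a form-encoded POST."""
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        # Assumption: SELECT/ASK query, so ask for SPARQL JSON results.
        "Accept": "application/sparql-results+json",
    }
    if user is not None:
        # HTTP basic auth, the equivalent of curl -u user:pw.
        raw = f"{user}:{password or ''}".encode("utf-8")
        headers["Authorization"] = "Basic " + base64.b64encode(raw).decode("ascii")

    data = urlencode({"query": query}).encode("utf-8")
    return Request(endpoint, data=data, headers=headers, method="POST")


if __name__ == "__main__":
    request = build_query_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

To run a query from a file, read the .rq file's text and pass it as query, then hand the request to urllib.request.urlopen and parse the response with json.load.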
Note
If you'd like to see the queries being run in the project, you can find them in ./rdf/queries/.
To reduce the project's complexity, the media is limited to a single Plex section and a single user. Note: Movies and TV Shows can be considered Plex sections. The project was developed and tested using only movies so the other sections might not even work.
If you would like to see the available Plex sections, you can use the FastAPI endpoint /library, "Get Plex Libraries". I know... confusing names, but that's Plex's naming scheme. The section ID is called "key" in the dataset.
Another major limitation is that I did not implement any error handling... 😅 I just didn't feel like it, lol. You can check the docker logs to help clear any blockers you might have.
Oh, and no OWL...
Note
Check out prereqs.md for some key points on Semantic Web.