Description
Several options can be considered for storing files to be used with the new API. While kernelci-backend has file storage integrated in the API, a better approach for the new API seems to be keeping storage entirely separate, with the meta-data containing only URLs to the artifacts. This would make it possible to use different storage mechanisms without changing the API implementation.
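As a rough illustration of meta-data that only carries URLs, the sketch below builds a record whose artifacts are plain HTTP(S) links. The function name `make_node`, the field names `name` and `artifacts`, and the example host are hypothetical and not taken from the actual API schema.

```python
from urllib.parse import urlsplit


def make_node(name: str, artifacts: dict[str, str]) -> dict:
    """Build a metadata record whose artifacts are plain URLs (hypothetical schema)."""
    for label, url in artifacts.items():
        parts = urlsplit(url)
        # Reject anything that is not an absolute HTTP(S) URL, e.g. a local path.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"artifact {label!r} is not an HTTP(S) URL: {url!r}")
    return {"name": name, "artifacts": dict(artifacts)}
```

With this shape, moving the files to a different storage service would only change the URL strings, not the code that handles the record.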
Simple solution: direct copy or upload via SSH and serve via nginx
#15
The easiest solution for a local setup is probably to have a directory mounted as a volume in a container running nginx. The user can directly copy files to the directory and they will be served by nginx via HTTP. In production, the files could be sent over SSH to the host. Alternatively, another container could be run with the SSH server on a special port number and the same volume mounted in the 2 containers.
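A minimal sketch of both variants, assuming the nginx volume is visible on the local filesystem and that `scp` is available for the SSH case. The helper names, directory layout, and base URL are assumptions made for illustration only.

```python
import shutil
from pathlib import Path
from urllib.parse import quote


def publish_local(src: Path, storage_root: Path, rel_dir: str, base_url: str) -> str:
    """Copy src into the nginx volume and return the URL it would be served at."""
    rel = rel_dir.strip("/")  # keep the destination inside storage_root
    dest_dir = storage_root / rel
    dest_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest_dir / src.name)
    return f"{base_url.rstrip('/')}/{quote(f'{rel}/{src.name}')}"


def scp_command(src: Path, host: str, remote_root: str, rel_dir: str,
                port: int = 22) -> list[str]:
    """Build the scp invocation for sending src to the storage host over SSH."""
    remote = f"{host}:{remote_root.rstrip('/')}/{rel_dir.strip('/')}/"
    # -P selects the SSH port, e.g. the special port of a dedicated SSH container.
    return ["scp", "-P", str(port), str(src), remote]
```

The list returned by `scp_command` can be passed to `subprocess.run(..., check=True)`. The remote directory is assumed to exist already.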
API solution: upload via FastAPI and serve via nginx
FastAPI support for uploading files: https://fastapi.tiangolo.com/tutorial/request-files/
This is similar to the simple solution except that, instead of using SSH to send files, they would be uploaded via FastAPI. The same user token can be used as for other API endpoints. Some meta-data about the files could be kept in MongoDB, although it's not clear how much value this would add. This breaks the idea of decoupling the API from storage, so it's not great for that reason. It's still technically a possibility.
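A minimal sketch of such an upload endpoint, assuming FastAPI and python-multipart are installed. The `/upload` route, the `create_app` factory, and the `store_upload` helper are hypothetical names, and token authentication is left out for brevity.

```python
import shutil
from pathlib import Path
from typing import BinaryIO


def store_upload(storage_dir: Path, filename: str, stream: BinaryIO) -> Path:
    """Copy an uploaded stream into storage_dir and return the stored path."""
    # Keep only the final path component so a client cannot escape storage_dir.
    safe_name = Path(filename).name
    if safe_name in ("", ".", ".."):
        raise ValueError(f"invalid file name: {filename!r}")
    storage_dir.mkdir(parents=True, exist_ok=True)
    destination = storage_dir / safe_name
    with destination.open("wb") as out:
        shutil.copyfileobj(stream, out)
    return destination


def create_app(storage_dir: Path, base_url: str):
    """Build a FastAPI app exposing a hypothetical POST /upload endpoint."""
    # Imported here so that store_upload() stays usable without FastAPI.
    from fastapi import FastAPI, HTTPException, UploadFile

    app = FastAPI()

    @app.post("/upload")
    def upload(file: UploadFile):
        try:
            stored = store_upload(storage_dir, file.filename or "", file.file)
        except ValueError as exc:
            raise HTTPException(status_code=400, detail=str(exc))
        # The response carries the URL under which nginx would serve the file.
        return {"url": f"{base_url.rstrip('/')}/{stored.name}"}

    return app
```

Keeping the file handling in `store_upload` limits how much of the API code depends on the storage layout, although the coupling noted above remains.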
Advanced option: file / artifact manager
A dedicated service could be set up, a bit like Artifactory but with an open source framework. This would provide an API just for uploading and managing files. Ideally, it would have extra features such as automatically deleting files after a period of time, managing permissions with user accounts, quotas, backups and, if possible, replication to maintain mirrors in different geographic locations. It's not clear yet which framework could be used for that. There are several file management packages for Django. There's also Nextcloud.
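To make one of these features concrete, here is a rough sketch of age-based deletion over a plain directory tree. It is not a substitute for a real artifact manager, and the function name and parameters are hypothetical.

```python
import time
from pathlib import Path
from typing import Optional


def purge_expired(root: Path, max_age_seconds: float,
                  now: Optional[float] = None) -> list[Path]:
    """Delete files under root last modified more than max_age_seconds ago."""
    now = time.time() if now is None else now
    removed = []
    # sorted() materialises the listing before any file is deleted.
    for path in sorted(root.rglob("*")):
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

A function like this could be run periodically, for example from a cron job. Empty directories are left in place.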
MongoDB: GridFS
Another possibility is to use MongoDB's GridFS feature to store files directly in a Mongo database. It would probably also require an API endpoint to upload files. One advantage is that we can directly rely on MongoDB replication to distribute the files geographically. Otherwise, it doesn't necessarily seem like a very good fit but it's worth keeping in mind.
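A minimal sketch of storing and reading a file through GridFS, assuming PyMongo (which ships the `gridfs` package) and a reachable MongoDB server. The helper names are hypothetical, and the upload endpoint mentioned above is not shown.

```python
from pathlib import Path


def connect_gridfs(uri: str, db_name: str):
    """Return a gridfs.GridFS handle; needs PyMongo and a reachable server."""
    # Imported lazily so the helpers below can be exercised with a stand-in.
    import gridfs
    from pymongo import MongoClient

    return gridfs.GridFS(MongoClient(uri)[db_name])


def put_artifact(fs, path: Path):
    """Store a file in GridFS and return the id assigned to it."""
    with path.open("rb") as stream:
        return fs.put(stream, filename=path.name)


def get_artifact(fs, file_id) -> bytes:
    """Read a stored file back in full."""
    return fs.get(file_id).read()
```

For example, `fs = connect_gridfs("mongodb://localhost:27017", "artifacts")` followed by `put_artifact(fs, Path("bzImage"))`, where the URI and database name are placeholders.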