
Add Object Storage #969

@nagem

A central ticket to discuss all changes required to support object storage in addition to the local file system. Feel free to edit the requirements and to add tickets for the requirements listed below.

First pass

  • file input will still stream through API via file forms
  • remove CAS from existing file storage
  • abstract internal file I/O for local and S3 storage (use python library: <>)
  • write migration script to move file objects from local storage to object storage
    • support accessing files in either location during transition
  • add ability to view/access zip members
  • serve files stored in object storage (OS) directly from the OS provider via redirection rather than streaming them through the API (files in local storage will still stream through the API as before)
  • support range reads on individual files (no need to support individual zip member range reads)
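
To make the storage abstraction, the transition-period lookup, and range reads concrete, here is a minimal sketch. It is not the project's implementation: `Storage`, `LocalStorage`, and `FallbackStorage` are hypothetical names, and two local directories stand in for "object storage" and "legacy local storage". An S3-backed class could implement the same `read` using the `Range` parameter of boto3's `get_object`.

```python
import os
from abc import ABC, abstractmethod
from typing import Optional


class Storage(ABC):
    """Hypothetical minimal interface shared by all storage backends."""

    @abstractmethod
    def exists(self, key: str) -> bool:
        """Return True if an object with this key is stored."""

    @abstractmethod
    def read(self, key: str, start: int = 0, end: Optional[int] = None) -> bytes:
        """Return bytes start..end (inclusive), like an HTTP Range request."""


class LocalStorage(Storage):
    """Backend that keeps objects as plain files under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = root

    def _path(self, key: str) -> str:
        return os.path.join(self.root, key)

    def exists(self, key: str) -> bool:
        return os.path.isfile(self._path(key))

    def read(self, key: str, start: int = 0, end: Optional[int] = None) -> bytes:
        with open(self._path(key), "rb") as fh:
            fh.seek(start)
            if end is None:
                return fh.read()  # read to end of file
            return fh.read(end - start + 1)  # inclusive end offset


class FallbackStorage(Storage):
    """Checks the new backend first, then the legacy one (migration period)."""

    def __init__(self, primary: Storage, legacy: Storage) -> None:
        self.primary = primary
        self.legacy = legacy

    def exists(self, key: str) -> bool:
        return self.primary.exists(key) or self.legacy.exists(key)

    def read(self, key: str, start: int = 0, end: Optional[int] = None) -> bytes:
        backend = self.primary if self.primary.exists(key) else self.legacy
        return backend.read(key, start, end)
```

With this shape, the migration script can copy objects one at a time; readers go through `FallbackStorage` and do not need to know which location currently holds a file.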

Second pass

  • Add a service that mimics S3 signed URLs for upload and serves files for download
  • Update API endpoints to use signed URLs for upload; this can be done one upload strategy at a time
    • The upload flow will change to 3 steps:
      • Request permission to upload from the API, specifying file information for the intended upload; the request is saved in a Mongo collection
      • Use the signed URL to upload directly to the OS service
      • Complete the upload process, adding file objects and updating containers as necessary
    • reaper
    • packfile
    • label
    • uid
    • uid-match
    • engine
    • targeted
    • More (?)
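
The three-step upload flow could look roughly like the sketch below. Everything here is an assumption for illustration: the HMAC-based signature scheme, the `SECRET` and `BASE_URL` values, the function names, and the in-memory `tickets` dict that stands in for the Mongo collection. Real S3 presigned URLs use a different, more involved signing process.

```python
import hashlib
import hmac
import time
import uuid
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"change-me"  # hypothetical secret shared by the API and the OS service
BASE_URL = "https://storage.example.com/upload"  # placeholder host
tickets = {}  # stands in for the Mongo collection of pending uploads


def _sign(ticket_id, expires):
    """HMAC-SHA256 over the ticket id and expiry time."""
    msg = f"{ticket_id}:{expires}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def request_upload(filename, size, ttl=300, now=None):
    """Step 1: record the intended upload and return a signed URL."""
    now = int(time.time()) if now is None else now
    ticket_id = uuid.uuid4().hex
    expires = now + ttl
    tickets[ticket_id] = {"filename": filename, "size": size, "done": False}
    query = urlencode(
        {"ticket": ticket_id, "expires": expires, "sig": _sign(ticket_id, expires)}
    )
    return f"{BASE_URL}?{query}"


def verify_upload_url(url, now=None):
    """Step 2 (OS service side): return the ticket id if the URL is valid, else None."""
    now = int(time.time()) if now is None else now
    params = parse_qs(urlparse(url).query)
    try:
        ticket_id = params["ticket"][0]
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return None
    if expires < now or ticket_id not in tickets:
        return None
    # Constant-time comparison on bytes avoids timing leaks and str/ASCII errors.
    if not hmac.compare_digest(sig.encode(), _sign(ticket_id, expires).encode()):
        return None
    return ticket_id


def complete_upload(ticket_id):
    """Step 3: mark the upload finished and return the file info to attach."""
    ticket = tickets[ticket_id]
    ticket["done"] = True
    return {"name": ticket["filename"], "size": ticket["size"]}
```

Because each strategy only needs to call these three steps in order, the endpoints can be converted one strategy at a time while the others keep streaming through the API.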

Future work

  • Break files out into a separate Mongo collection
    • Consider constraining file names to be unique per container to support download
    • Each file will get a unique Mongo (or other) id
    • Many places in the db use a “FileReference” that identifies a file by filename and container id/type, mostly for job inputs. Consider how breaking files out into their own collection will affect existing jobs.
  • Redesign /download to stream from OS service (unknowns about implementation)
  • Cache zip member list
    • Cache the member list on first access rather than building it asynchronously as zip files are added
    • 2 options:
      1. Store a .meta file for each zip document uploaded to S3
      2. functools (or other) in-memory caching
