Discuss for .raw Type and Preservation of File Metadata #44

Closed
@chlins

Background

The current model-spec (as outlined in docs/spec.md) supports bundling model artifacts in various formats, including .tar and potentially .raw. For .tar archives, file metadata such as file mode, ownership, and other attributes can be embedded in the tar header, which is useful for runtimes that rely on this information to properly handle the files (e.g., setting executable permissions or preserving ownership).
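To make the tar side of this concrete, here is a minimal sketch using Python's standard `tarfile` module. It builds a one-file archive in memory and reads the header attributes back. The file name `run.sh` and the uid/gid/mtime values are made up for illustration and do not come from model-spec.

```python
import io
import tarfile


def build_tar_with_script() -> bytes:
    """Create an in-memory tar archive holding one executable script."""
    payload = b"#!/bin/sh\necho hello\n"
    info = tarfile.TarInfo(name="run.sh")  # hypothetical file name
    info.size = len(payload)
    info.mode = 0o755        # executable bits recorded in the tar header
    info.uid = 1000          # numeric owner (illustrative value)
    info.gid = 1000          # numeric group (illustrative value)
    info.mtime = 1700000000  # modification time, seconds since the epoch
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_metadata(archive: bytes) -> dict:
    """Return the header attributes of each member, keyed by member name."""
    result = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        for member in tar.getmembers():
            result[member.name] = {
                "mode": oct(member.mode),
                "uid": member.uid,
                "gid": member.gid,
                "mtime": member.mtime,
            }
    return result


metadata = read_metadata(build_tar_with_script())
print(metadata)
```

The point is only that the attributes travel inside the archive itself, so a runtime can restore them on extraction without any side channel.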

However, the .raw type, being a flat, uncompressed file format, does not inherently provide a mechanism to store such metadata. This poses a challenge for runtimes that may depend on these attributes to function correctly, especially in scenarios where permissions or other file properties are critical (e.g., executable scripts or binaries within a model artifact).

Problem

When using the .raw type, there is no clear way to preserve or convey file metadata that was previously available in .tar headers. This limitation could lead to:

  • Loss of critical information (e.g., file modes like 755 for executables).
  • Inconsistent behavior across runtimes that expect this metadata.
  • Potential security or operational issues if the runtime cannot infer the intended file attributes.

For example, a model artifact might include a script or binary that needs to be executable, but without metadata, the runtime would either need to assume defaults (potentially incorrect) or fail to execute it properly.
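
The loss described above can be reproduced with a small sketch. It assumes a POSIX system and treats the .raw payload as nothing more than the file's bytes. The name `run.sh` is hypothetical. A file created this way gets default permissions derived from the process umask, so the executable bit is never set.

```python
import os
import stat
import tempfile


def write_raw(path: str, payload: bytes) -> int:
    """Write payload as a plain file and return its permission bits."""
    with open(path, "wb") as f:
        f.write(payload)
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "run.sh")  # hypothetical file name
    mode = write_raw(script, b"#!/bin/sh\necho hello\n")
    # No execute bit for user, group, or other: the script cannot be run directly.
    is_executable = bool(mode & 0o111)
    print(oct(mode), is_executable)
```

A runtime in this position has to either guess that the file should be executable or leave it non-executable, which is exactly the ambiguity raised here.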

Questions for Discussion

  • Should .raw artifacts be required to preserve the same metadata as .tar artifacts? If so, what’s the minimum set of attributes we need (e.g., mode, ownership, timestamps)?
  • How should runtimes handle cases where metadata is missing for .raw files? Should there be a fallback mechanism?
  • Are there existing standards or tools (e.g., OCI image spec) we can borrow from to address this?
  • What would be the best way to store this metadata for .raw files while aligning with the goals of simplicity and portability in the model-spec?
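
As one possible starting point for the fallback and storage questions, the sketch below assumes the metadata is carried as a JSON string in a per-layer annotation and that a runtime falls back to a fixed default when it is absent or malformed. The annotation key `org.example.model.file.metadata`, the JSON shape, and the `0644` default are all hypothetical; none of them are defined by model-spec or the OCI image spec.

```python
import json
import os
import tempfile

# Hypothetical annotation key, used here only for illustration.
METADATA_KEY = "org.example.model.file.metadata"
DEFAULT_MODE = 0o644  # assumed fallback when no usable metadata is present


def resolve_mode(annotations: dict) -> int:
    """Return the recorded file mode, or the fallback if missing or invalid."""
    raw = annotations.get(METADATA_KEY)
    if raw is None:
        return DEFAULT_MODE
    try:
        mode = int(json.loads(raw)["mode"], 8)  # mode stored as an octal string
    except (ValueError, KeyError, TypeError):
        return DEFAULT_MODE
    # Keep only the rwx bits; drop setuid/setgid/sticky from untrusted input.
    return mode & 0o777


def materialize(path: str, payload: bytes, annotations: dict) -> int:
    """Write a raw payload to disk and apply the resolved mode."""
    with open(path, "wb") as f:
        f.write(payload)
    mode = resolve_mode(annotations)
    os.chmod(path, mode)
    return mode


with_metadata = {METADATA_KEY: json.dumps({"mode": "0755"})}
without_metadata = {}

with tempfile.TemporaryDirectory() as tmp:
    applied = materialize(os.path.join(tmp, "run.sh"), b"#!/bin/sh\n", with_metadata)
    fallback = materialize(os.path.join(tmp, "weights.bin"), b"\x00", without_metadata)
    print(oct(applied), oct(fallback))
```

Masking to `0o777` is a deliberate choice in this sketch: it avoids honoring setuid or setgid bits from an artifact the runtime may not trust, which relates to the security concern raised under Problem.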
