Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(wasm): clean up download, byte handling and hashing #379

Merged
merged 4 commits into from
Dec 10, 2024

Conversation

akacase
Copy link
Contributor

@akacase akacase commented Dec 6, 2024

What kind of change does this PR introduce?

#378

What is the current behavior?

#378

What is the new behavior?

The wasmtime errors were probably because:

  • The file was incomplete (partial download)
  • The file was corrupted during the save
  • The file wasn't properly closed/flushed after writing

These issues might not have appeared until recently because:

  • Those servers might have better-behaved HTTP responses
  • The download paths might have been more thoroughly tested

The new code is more defensive about:

  • Complete downloads
  • Proper byte handling
  • Atomic file operations
  • Error propagation

Additional context

I have added bytes here for a specific reason:

The bytes crate is specifically designed for working with raw binary data
It ensures we don't accidentally corrupt binary data through string conversions or character encoding issues. This is crucial for WASM files, which must maintain exact byte-level integrity. Bytes itself is highly efficient (zero-copy).

Thanks!

@akacase
Copy link
Contributor Author

akacase commented Dec 6, 2024

fixed logs leaking, we should look into a log framework, where debug level logs can be used and then those paths are not valid in production.

@akacase
Copy link
Contributor Author

akacase commented Dec 6, 2024

another failure mode is cache permission, this would be more ideal:

   let mut path = std::env::var("PGDATA")
        .map(PathBuf::from)
        .map_err(|_| "PGDATA environment variable not set")?
        .join("extension_cache")
        .join("wasm_fdw");

Three issues I see with the current approach:

  • dirs::cache_dir() uses the system's default cache directory
  • Could lead to permission problems when Postgres is running as a different user
  • Cache poisoning, because files are being stored in a user-writable location rather than a Postgres-controlled directory

@burmecia
Copy link
Member

burmecia commented Dec 9, 2024

Thanks for the PR! All the improvements look good to me but I am not much in favour of using PGDATA as the cache dir. The PGDATA dir is mainly used by Postgres internally and stored the core files and folders. IMO Wrappers as an extension should not touch the internal Postgres storage, rather it is better to use cache dir outside Postgres core system.

dirs::cache_dir() is not the system's default cache directory, it is specific to current user which is the user running Postgres process. For its documentation,

Platform Value Example
Linux $XDG_CACHE_HOME or $HOME/.cache /home/alice/.cache
macOS $HOME/Library/Caches /Users/Alice/Library/Caches
Windows {FOLDERID_LocalAppData} C:\Users\Alice\AppData\Local

So the cache dir is still owned by Postgres, and normally it is not writable by other users.

@akacase
Copy link
Contributor Author

akacase commented Dec 9, 2024

Thanks for the PR! All the improvements look good to me but I am not much in favour of using PGDATA as the cache dir. The PGDATA dir is mainly used by Postgres internally and stored the core files and folders. IMO Wrappers as an extension should not touch the internal Postgres storage, rather it is better to use cache dir outside Postgres core system.

dirs::cache_dir() is not the system's default cache directory, it is specific to current user which is the user running Postgres process. For its documentation,

Platform Value Example
Linux $XDG_CACHE_HOME or $HOME/.cache /home/alice/.cache
macOS $HOME/Library/Caches /Users/Alice/Library/Caches
Windows {FOLDERID_LocalAppData} C:\Users\Alice\AppData\Local
So the cache dir is still owned by Postgres, and normally it is not writable by other users.

understood, thanks!

@akacase
Copy link
Contributor Author

akacase commented Dec 9, 2024

I'll hold back on the patch for cache path, but all the commits here I believe are good to go. Let me know if you need anything else.

@burmecia
Copy link
Member

burmecia commented Dec 10, 2024

@akacase I've made some minor changes on top of your patch, please have a look at your repo and review it. If everything is fine, we will be ready to merge. Thanks.

chore: improve code to make it consistent and clippy happy
@akacase
Copy link
Contributor Author

akacase commented Dec 10, 2024

merged, thanks!

@burmecia burmecia added the wasm label Dec 10, 2024
@burmecia burmecia merged commit 2ea964b into supabase:main Dec 10, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants