Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test access points #1

Open
wants to merge 93 commits into
base: accessPointTest
Choose a base branch
from
Open

Test access points #1

wants to merge 93 commits into from

Conversation

sauraank
Copy link
Owner

Description of change

Added AccessPoints

Refactored a bit of code and added some tests.

Relevant issues:
awslabs#4

Does this change impact existing behavior?

No

No


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

passaro and others added 30 commits July 7, 2023 11:16
* Implement forget

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* fix comparison

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* fix fmt

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* Split readdir and readdirplus in fs

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* Remove TODO

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

---------

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
Linux filesystems allow UTF-8 in filenames, and S3 allows UTF-8 in keys,
so we expect this to work. It does! But it did expose a bug in our mock
client's ListObjects implementation, which was dealing with bytes
instead of characters (unlike the real S3).

Signed-off-by: James Bornholt <bornholt@amazon.com>
Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>
We adopt two Github Action to automate release for Mounpoint-S3
* Github Action ID: [taiki-e/create-gh-release-action](https://github.com/taiki-e/create-gh-release-action) for create Github Release.
    * During the release, operator must have consensus with team on version,
      draft, prefix and title as workflow inputs.
    * This workflow can also take changelog as inputs, we can opt in
      that once we have a change log format ready.
* Github Action ID: [taiki-e/upload-rust-binary-action](https://github.com/taiki-e/upload-rust-binary-action) for uploading Mountpoint-S3 binary and all other artifacts.
    * This workflow uploads compiled binary on target platforms. When we
      do the release, artifacts must includes: bin, LICENSE, archive and
      checksum.

Signed-off-by: Charles Zhang <zyaoshen@amazon.com>
Co-authored-by: Yaosheng Zhang <zyaoshen@amazon.com>
This is new in 1.71, but seems to get confused by the test_case macro.
It was supposedly fixed in
rust-lang/rust-clippy#10992 but that seems to
not have entirely worked.

Signed-off-by: James Bornholt <bornholt@amazon.com>
This is pretty annoying because of some weird edge cases around implicit
directories being removed while local children are present. I think in
the long term we want to fix this, but it's a similar problem to what we
saw in awslabs#359 -- we need `readdir` to clean up removed directories
properly. So for now, I've changed the reference model to match our
current semantics, which is that if an ancestor of a local
file/directory is removed, that file/directory will no longer be visible
through the filesystem. Of course, once that file/directory becomes
remote it will become visible, and the model still captures that.

Signed-off-by: James Bornholt <bornholt@amazon.com>
Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>
Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>
…bs#367)

* Revert workaround for empty PutObject requests in awslabs#295

CRT now supports empty PutObject requests with no ContentLength header

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* Move crc32c-related functions into Crc32c type

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* Add support for upload review callback

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* Compute checksums on write and review on upload

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* Move checksum types to mountpoint-s3-client

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* Rename/review UploadReview types

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* Simplify handling of upload review callback

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* Remove ChecksummedSlice

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* Address PR feedback

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

* Move checksums back to mountpoint-s3-crt

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>

---------

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
…lease (awslabs#376)

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>
* Disable logging to file by default

This is a breaking change.

Logging to disk is now disabled by default.
Logs will not longer be written to `$HOME/.mountpoint-s3/` and should be configured using `--log-directory <DIRECTORY>`.

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

* Cleanup argument a little

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

* Improve clarity of documentation around foreground logging

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

* Update log file name to improve timestamp clarity

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

---------

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>
:(

Signed-off-by: James Bornholt <bornholt@amazon.com>
- desired throughput -> maximum throughput
- move AWS credentials above mount options
- change placeholder name for mount point to match `mount`'s docs

Signed-off-by: James Bornholt <bornholt@amazon.com>
* Support open with O_RDWR flag

Currently, Mountpoint supports either open with O_WRONLY or O_RDONLY
because we don't allow applications to do both read and write at the same
time. However, it's possible support O_RDWR flag too since we can decide
at open time whether to give a read handle or a write handle back, and
for any inode it's never possible for both start_reading and start_writing
to work.

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

* Update semantics document

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

* Update mountpoint-s3/tests/fuse_tests/write_test.rs

Co-authored-by: James Bornholt <jamesbornholt@gmail.com>
Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

* Add logs

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

* Update document

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

---------

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>
Co-authored-by: James Bornholt <jamesbornholt@gmail.com>
* adds region when running fuse tests

Signed-off-by: Ahmar Suhail <ahmarsu@amazon.co.uk>

* adds in secondary region

Signed-off-by: Ahmar Suhail <ahmarsu@amazon.co.uk>

* removes extra new lines

Signed-off-by: Ahmar Suhail <ahmarsu@amazon.co.uk>

* adds domain env variable

Signed-off-by: Ahmar Suhail <ahmarsu@amazon.co.uk>

* adds fips_tests feature to CI

Signed-off-by: Ahmar Suhail <ahmarsu@amazon.co.uk>

* use full domain

Signed-off-by: Ahmar Suhail <ahmarsu@amazon.co.uk>

* rustfmt

Signed-off-by: James Bornholt <bornholt@amazon.com>

---------

Signed-off-by: Ahmar Suhail <ahmarsu@amazon.co.uk>
Signed-off-by: James Bornholt <bornholt@amazon.com>
Co-authored-by: Ahmar Suhail <ahmarsu@amazon.co.uk>
Co-authored-by: James Bornholt <bornholt@amazon.com>
We missed this in awslabs#375 and it broke mainline.

Signed-off-by: James Bornholt <bornholt@amazon.com>
* Remove inodes from their parent in `forget`

The parent directory still holds onto an `Inode` (an `Arc<InodeInner>`),
so right now our `forget` is leaking the actual inode. We need to remove
it from its parent at `forget` time. Also updated the tests to check
that the inode is in fact free'd.

I tested this by listing a directory with 2M objects on an instance with
1GiB of memory, and saw constant memory usage.

Signed-off-by: James Bornholt <bornholt@amazon.com>

* Don't forget the wrong inode

Signed-off-by: James Bornholt <bornholt@amazon.com>

---------

Signed-off-by: James Bornholt <bornholt@amazon.com>
* Report current file size during writes

Mountpoint currently reports file size as 0 until the upload is complete.
In this commit, we instead report how many bytes have been streamed to S3
as some applications want to know current size of the file during writes.

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

* Add some tests

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

* Address PR comment

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

---------

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>
* Reconcile remote and existing inodes at `update` time

To date we haven't thought too carefully about what happens if objects
are put/deleted from the S3 bucket while conflicting state is present
locally. There are a lot of edge cases here -- the Cartesian product of
existing state (local/remote file/directory) and new remote state
(file/directory), as well as two paths for inodes to be updated (readdir
vs lookup).

This change defines a semantics for these permutations. The overall idea
is that (a) remote state shadows local state, and (b) directories shadow
files. But those axioms alone aren't enough to break all ties; for
example, what if the existing state is a local directory but the new
state is a remote file -- which should win? I chose to break the tie by
saying that remote directories > any local state > remote files. So, for
example, if a user creates a local directory, and then a conflicting
object appears in the remote bucket, the directory will still be
visible instead of the new file.

I spent some time trying to patch the existing inode update path to do
what I needed but it ended up being easier to just refactor it. I think
we could still find a better factoring for this path, but it now
explicitly accounts for all the permutations above and does the right
thing (at least according to our reference model) for them all.
Happily, proptest has done a good job at rooting out the many edge
cases, as you can see by all the new regression tests in this change.

Signed-off-by: James Bornholt <bornholt@amazon.com>

* PR feedback

Signed-off-by: James Bornholt <bornholt@amazon.com>

---------

Signed-off-by: James Bornholt <bornholt@amazon.com>
* Add detailed CRT metrics and tweak per-request logging

This change is a few related things to make logging more useful:
- Include verbose CRT request metrics at trace level. We emit what we
  think is the "interesting" stuff at higher levels, but for detailed
  investigation we might want to see the raw CRT view.
- Add parameters to request spans. This ensures that we know _which_
  request is going wrong when we see log messages about requests.

Signed-off-by: James Bornholt <bornholt@amazon.com>

* Make ThreadId work on macOS

Signed-off-by: James Bornholt <bornholt@amazon.com>

---------

Signed-off-by: James Bornholt <bornholt@amazon.com>
…s#392)

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>
Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
…labs#397)

* Add always-successful workflow for DCO on merge_group event only

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

* Address PR feedback adding comment

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

---------

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>
…s#387)

* Move logging module into its own file

No code changes, just relocating the module in preparation for the next
commit.

Signed-off-by: James Bornholt <bornholt@amazon.com>

* Emit warning-level logs to syslog when log directory is unset

Signed-off-by: James Bornholt <bornholt@amazon.com>

* Appease clippy

Signed-off-by: James Bornholt <bornholt@amazon.com>

* Document a little better

Signed-off-by: James Bornholt <bornholt@amazon.com>

* PR feedback

Signed-off-by: James Bornholt <bornholt@amazon.com>

---------

Signed-off-by: James Bornholt <bornholt@amazon.com>
…bs#394)

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>
* Implement setattr to support changing time attributes

Some applications like `touch` requires the file system to support
changing file last access and modification times. We don't support this
operation because the last modification time for objects can't be set
via S3 API. However, it's possible to allow this only for the files that
are being written because at that time it's still a temporary stat in
Mountpoint.

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

* Update doc/SEMANTICS.md

Co-authored-by: Alessandro Passaro <alessandro.passaro@gmail.com>
Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

* Fix unit test

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

---------

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>
Co-authored-by: Alessandro Passaro <alessandro.passaro@gmail.com>
Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
monthonk and others added 4 commits August 16, 2023 15:13
* Run throughput benchmark multiple times

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

* Update name for sequential write direct io job

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

* Update benchmark doc

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

* Update config for write benchmarks

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

---------

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>
Right now if you try to do something that's totally unsupported, like
rename, there won't be a log entry unless you're debug logging, and even
then it will only be a fuser log entry rather than something more
specific to Mountpoint. This makes it hard for customers to know what's
happening when an operation fails, and hard for us to debug with them.
So let's stub out all the FUSE methods we haven't implemented in a way
that will log the failure as a warning, like our other "unsupported"
semantics.

Signed-off-by: James Bornholt <bornholt@amazon.com>
This change adds a bunch of new metrics for investigating performance.
It lets us track per-IO read/write size, number of open read/write
handles, directory listing throughput, and meta request throughput for
uploads and downloads.

Signed-off-by: James Bornholt <bornholt@amazon.com>
Signed-off-by: James Bornholt <bornholt@amazon.com>
@sauraank sauraank force-pushed the test_access_points branch from c3f5369 to c33859f Compare August 23, 2023 15:10
jamesbornholt and others added 17 commits August 23, 2023 21:28
This removes webpki from our dependencies to fix this: https://rustsec.org/advisories/RUSTSEC-2023-0052

Signed-off-by: James Bornholt <bornholt@amazon.com>
Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>
* Fixes for Rust 1.72

Signed-off-by: James Bornholt <bornholt@amazon.com>

* Fix tracing

Signed-off-by: James Bornholt <bornholt@amazon.com>

---------

Signed-off-by: James Bornholt <bornholt@amazon.com>
* improves encryption documentation

Signed-off-by: Ahmar Suhail <ahmarsu@amazon.co.uk>

* Update doc/CONFIGURATION.md

Co-authored-by: James Bornholt <jamesbornholt@gmail.com>
Signed-off-by: ahmarsuhail <ahmar.suhail@gmail.com>

---------

Signed-off-by: Ahmar Suhail <ahmarsu@amazon.co.uk>
Signed-off-by: ahmarsuhail <ahmar.suhail@gmail.com>
Co-authored-by: Ahmar Suhail <ahmarsu@amazon.co.uk>
Co-authored-by: James Bornholt <jamesbornholt@gmail.com>
Bumps [aws-actions/configure-aws-credentials](https://github.com/aws-actions/configure-aws-credentials) from 2 to 3.
- [Release notes](https://github.com/aws-actions/configure-aws-credentials/releases)
- [Changelog](https://github.com/aws-actions/configure-aws-credentials/blob/main/CHANGELOG.md)
- [Commits](aws-actions/configure-aws-credentials@v2...v3)

---
updated-dependencies:
- dependency-name: aws-actions/configure-aws-credentials
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
In order to build with Clang 16, the bindgen dependency needs to be
upgraded to at least version 0.62.0 (this change bumps to the latest,
which is 0.66.1). See awslabs#485

Signed-off-by: Sean Arms <67096+lesserwhirls@users.noreply.github.com>
This fixes two issues that were preventing Mountpoint from working
against Outposts buckets:
1. Outposts doesn't include the bucket name in ListObjectsV2 responses.
   We weren't actually using that output anyway, so I just removed it.
2. For GetObject requests, we were sending a HTTP header like
   `Accept: application/xml,*/*`. While technically valid HTTP, it's
   weird to accept */* as well as something else, and it was confusing
   Outposts' request signing. So I switched to overwriting the existing
   header, which is what the comment suggested the code was intended to
   do anyway.

I also took this chance to make a little cleanup to parsing
ListObjectsV2 responses: the `parse` functions shouldn't be defined on
the generic `ListObjectsResult` structs, which are shared by all
clients.

Signed-off-by: James Bornholt <bornholt@amazon.com>
Signed-off-by: Vlad Volodkin <vlaad@amazon.com>
Co-authored-by: Vlad Volodkin <vlaad@amazon.com>
When we run in background mode, the child process inherits the
stdin/stdout/stderr of the parent. That's good because we can print
mount errors from the child and have them reach the parent. But once
we're mounted and the parent exits, the child still holds onto those
handles. This is bad if those handles are pipes, which are often used
when trying to launch a daemon (e.g. Python subprocess.check_output). In
that case, the pipes will never close and the caller will keep waiting
for output on them forever.

We need to close these handles once we're successfully daemonized. This
will prevent us from seeing anything the process prints after they're
closed, but from that point we should be logging anyway, so shouldn't be
printing. Printing still works (doesn't panic or anything), just doesn't
go anywhere.

With this change, a Python script like

    import subprocess
    subprocess.check_output(['mount-s3', 'doc-example-bucket', '/bucket'])

works correctly: once the mount has succeeded, it returns. Without this
change, this program blocks until the bucket is unmounted.

Signed-off-by: James Bornholt <bornholt@amazon.com>
* Bump version of Mountpoint to v1.0.1

Signed-off-by: Ankit Saurabh <sauraank@amazon.co.uk>

* Added latest PRs to CHANGELOG.md

Signed-off-by: Ankit Saurabh <sauraank@amazon.co.uk>

* Added latest PRs to CHANGELOG.md

Signed-off-by: Ankit Saurabh <sauraank@amazon.co.uk>

* Added description of changes in changelog

Signed-off-by: Ankit Saurabh <sauraank@amazon.co.uk>

* Added PR in the changelog

Signed-off-by: Ankit Saurabh <sauraank@amazon.co.uk>

* Added PR in the changelog

Signed-off-by: Ankit Saurabh <sauraank@amazon.co.uk>

---------

Signed-off-by: Ankit Saurabh <sauraank@amazon.co.uk>
* Make it clear that live editing is not a good fit for mountpoint

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

* Update README.md

Co-authored-by: James Bornholt <jamesbornholt@gmail.com>
Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>

---------

Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>
Co-authored-by: James Bornholt <jamesbornholt@gmail.com>
Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>
Signed-off-by: Ankit Saurabh <sauraank@amazon.co.uk>
Signed-off-by: Monthon Klongklaew <monthonk@amazon.com>
Previously, mountpoint-s3 would not cancel prefetch tasks that it was going to ignore.
Instead, they would continue to be polled by the executor despite the results never being checked.
This change ensures that the task handles are dropped which cancels the task/future.

In the future, we may want to retain some of these tasks where the prefetcher may still be able to make use of them.

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>
Signed-off-by: Ankit Saurabh <sauraank@amazon.co.uk>
Signed-off-by: Ankit Saurabh <sauraank@amazon.co.uk>
@sauraank sauraank force-pushed the test_access_points branch 6 times, most recently from 7b8334f to fed323b Compare September 14, 2023 13:49
Signed-off-by: Ankit Saurabh <sauraank@amazon.co.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.