Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot]: Add a VendoredFileSystem implementation #11863

Merged
merged 43 commits into from
Jun 18, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
43 commits
Select commit Hold shift + click to select a range
41a1aff
Refactor `file` and `vendored` methods and make them standalone funct…
MichaReiser Jun 10, 2024
d3c8e83
Make `file_system_path_to_file` return an Option
MichaReiser Jun 10, 2024
1a69745
Improve documentation, rewrite inline comment, rename `file_system_pa…
MichaReiser Jun 13, 2024
43fa7bf
Update crates/ruff_db/src/vfs.rs
MichaReiser Jun 13, 2024
b054315
Don't use non-platform agnostic `is_absolute` method
MichaReiser Jun 13, 2024
383691e
Add `ModuleName` struct with tests
MichaReiser Jun 10, 2024
5a1da62
Module resolver
MichaReiser Jun 10, 2024
01041cd
Port module resolver to salsa
MichaReiser Jun 11, 2024
f16c045
Add `remove_file` and `remove_directory` tests
MichaReiser Jun 11, 2024
56f2caf
Suppress warning about unused `Os` file system that are only used on …
MichaReiser Jun 11, 2024
bcaea19
Change `ModuleName::new` to return an `Option`
MichaReiser Jun 13, 2024
4df3bea
Use `touch` over `set_permission` and `set_revision` in Tests
MichaReiser Jun 13, 2024
4441c9e
[red-knot]: Add a `VendoredFileSystem` implementation
AlexWaygood Jun 13, 2024
ae49c71
Fix many easily addressed review comments
AlexWaygood Jun 14, 2024
bd94c60
More relatively easily addressed comments
AlexWaygood Jun 14, 2024
43e3626
Just use the file hash for `FileRevision`
AlexWaygood Jun 14, 2024
13ee2d2
simplify
AlexWaygood Jun 14, 2024
05846ae
Refactor `file` and `vendored` methods and make them standalone funct…
MichaReiser Jun 10, 2024
88a6aa4
Make `file_system_path_to_file` return an Option
MichaReiser Jun 10, 2024
64b67ef
Improve documentation, rewrite inline comment, rename `file_system_pa…
MichaReiser Jun 13, 2024
cbb8992
Update crates/ruff_db/src/vfs.rs
MichaReiser Jun 13, 2024
00477c7
Add `ModuleName` struct with tests
MichaReiser Jun 10, 2024
de1ec29
Module resolver
MichaReiser Jun 10, 2024
92b3465
Port module resolver to salsa
MichaReiser Jun 11, 2024
2ee6973
Add `remove_file` and `remove_directory` tests
MichaReiser Jun 11, 2024
96d75a7
Suppress warning about unused `Os` file system that are only used on …
MichaReiser Jun 11, 2024
e5e8da6
Change `ModuleName::new` to return an `Option`
MichaReiser Jun 13, 2024
2348fe6
Use `touch` over `set_permission` and `set_revision` in Tests
MichaReiser Jun 13, 2024
eea6023
Merge branch 'salsa-memory-resolver' into vendored-filesystem
AlexWaygood Jun 14, 2024
07e5a80
Reduce clones and make them more explicit
AlexWaygood Jun 15, 2024
514e9b6
Use interior mutability to avoid cloning when unnecessary
AlexWaygood Jun 15, 2024
1f68832
Do not implement `FileSystem`
AlexWaygood Jun 16, 2024
035ea53
Simplify normalization logic
AlexWaygood Jun 17, 2024
5c33610
Update crates/ruff_db/src/vendored.rs
AlexWaygood Jun 17, 2024
ecf97a9
Get rid of unneeded methods
AlexWaygood Jun 17, 2024
be8f952
Return `Result` from `read()`
AlexWaygood Jun 17, 2024
f4f97d4
Replace a wrapper type with a type alias
AlexWaygood Jun 17, 2024
df248ce
fix tests
AlexWaygood Jun 17, 2024
f05bcb2
Bring back some notion of vendored-stub-file "metadata"
AlexWaygood Jun 17, 2024
89ebc91
Merge remote-tracking branch 'upstream/main' into vendored-filesystem
AlexWaygood Jun 18, 2024
13c2802
add some newlines
AlexWaygood Jun 18, 2024
5d511bb
More docs etc.
AlexWaygood Jun 18, 2024
3127ff6
See if bumping `zstd-sys` changes anything
AlexWaygood Jun 18, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Do not implement FileSystem
  • Loading branch information
AlexWaygood committed Jun 16, 2024
commit 1f68832384541cb9b480f33142e665aa7cac0836
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 2 additions & 3 deletions crates/ruff_db/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -20,11 +20,10 @@ camino = { workspace = true }
countme = { workspace = true }
dashmap = { workspace = true }
filetime = { workspace = true }
is-macro = { workspace = true }
itertools = { workspace = true }
once_cell = { workspace = true }
salsa = { workspace = true }
tracing = { workspace = true }
rustc-hash = { workspace = true }
zip = { workspace = true }

[dev-dependencies]
once_cell = { workspace = true }
2 changes: 0 additions & 2 deletions crates/ruff_db/src/file_system.rs
Original file line number Diff line number Diff line change
Expand Up @@ -7,11 +7,9 @@ use filetime::FileTime;

pub use memory::MemoryFileSystem;
pub use os::OsFileSystem;
pub use vendored::VendoredFileSystem;

mod memory;
mod os;
mod vendored;

pub type Result<T> = std::io::Result<T>;

Expand Down
1 change: 1 addition & 0 deletions crates/ruff_db/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ use crate::vfs::{Vfs, VfsFile};
pub mod file_system;
pub mod parsed;
pub mod source;
pub mod vendored;
pub mod vfs;

pub(crate) type FxDashMap<K, V> = dashmap::DashMap<K, V, BuildHasherDefault<FxHasher>>;
Expand Down
2 changes: 1 addition & 1 deletion crates/ruff_db/src/parsed.rs
Original file line number Diff line number Diff line change
Expand Up @@ -69,7 +69,7 @@ mod tests {
use crate::file_system::FileSystemPath;
use crate::parsed::parsed_module;
use crate::tests::TestDb;
use crate::vfs::VendoredPath;
use crate::vendored::VendoredPath;
use crate::vfs::{system_path_to_file, vendored_path_to_file};

#[test]
Expand Down
Original file line number Diff line number Diff line change
@@ -1,114 +1,113 @@
use std::cell::{RefCell, RefMut};
use std::io::{Cursor, Read};
use std::io::{self, Read};
use std::ops::{Deref, DerefMut};
use std::sync::{Mutex, MutexGuard};

use itertools::Itertools;
use zip::{read::ZipFile, ZipArchive};
use once_cell::sync::Lazy;
use zip::{read::ZipFile, ZipArchive, ZipWriter};

use crate::file_system::{FileRevision, FileSystem, FileSystemPath, FileType, Metadata, Result};
pub use path::{VendoredPath, VendoredPathBuf};

pub mod path;

type Result<T> = io::Result<T>;

#[derive(Debug)]
pub struct VendoredFileSystem {
AlexWaygood marked this conversation as resolved.
Show resolved Hide resolved
AlexWaygood marked this conversation as resolved.
Show resolved Hide resolved
inner: VendoredFileSystemInner,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I would remove the inner file type. It doesn't provide that much functionality but the nesting becomes somewhat hard to understand. It also allows to remove other wrapper types, reducing the overall code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got rid of one of the other wrapper types in f4f97d4, but I think I would prefer to keep using an inner file type for this. I tried changing it to a type alias, and in my opinion it made the VendoredFileSystem unduly coupled to the inner representation that used a lock/RefCell; I also found it less readable.

}

impl VendoredFileSystem {
const READ_ONLY_PERMISSIONS: u32 = 0o444;
impl Default for VendoredFileSystem {
fn default() -> Self {
static DATA: Lazy<Box<[u8]>> = Lazy::new(|| {
let mut buffer = Vec::new();
let cursor = io::Cursor::new(&mut buffer);
{
let mut archive = ZipWriter::new(cursor);
archive.finish().unwrap();
}
AlexWaygood marked this conversation as resolved.
Show resolved Hide resolved
AlexWaygood marked this conversation as resolved.
Show resolved Hide resolved
buffer.into_boxed_slice()
});
Self::new(&DATA).unwrap()
}
}

impl VendoredFileSystem {
pub fn new(raw_bytes: &'static [u8]) -> Result<Self> {
Ok(Self {
inner: VendoredFileSystemInner::new(raw_bytes)?,
})
}
}

impl FileSystem for VendoredFileSystem {
fn exists(&self, path: &FileSystemPath) -> bool {
pub fn file_type(&self, path: &VendoredPath) -> Option<FileType> {
let normalized = normalize_vendored_path(path);
let inner_locked = self.inner.lock();
let mut archive = inner_locked.borrow_mut();
if let Ok(zip_file) = archive.lookup_path(&normalized) {
if zip_file.is_dir() {
return Some(FileType::Directory);
}
return Some(FileType::File);
}
if normalized.has_trailing_slash() {
return None;
}
if let Ok(zip_file) = archive.lookup_path(&normalized.with_trailing_slash()) {
if zip_file.is_dir() {
return Some(FileType::Directory);
}
return Some(FileType::File);
}
None
}

pub fn exists(&self, path: &VendoredPath) -> bool {
let normalized = normalize_vendored_path(path);
let inner_locked = self.inner.lock();
let mut archive = inner_locked.borrow_mut();
archive.lookup_path(&normalized).is_ok()
|| (!normalized.has_trailing_slash()
&& archive
.lookup_path(&normalized.with_trailing_slash())
.is_ok())
}

fn is_directory(&self, path: &FileSystemPath) -> bool {
/// Return the crc32 hash of the file
pub fn file_hash(&self, path: &VendoredPath) -> Option<u32> {
let normalized = normalize_vendored_path(path);
let inner_locked = self.inner.lock();
let mut archive = inner_locked.borrow_mut();
let normalized = normalize_vendored_path(path);
if let Ok(zip_file) = archive.lookup_path(&normalized) {
return zip_file.is_dir();
return Some(zip_file.crc32());
}
if normalized.has_trailing_slash() {
return false;
return None;
}
archive
.lookup_path(&normalized.with_trailing_slash())
.is_ok_and(|zip_file| zip_file.is_dir())
.map(|zip_file| zip_file.crc32())
.ok()
}

fn is_file(&self, path: &FileSystemPath) -> bool {
if path.as_str().ends_with('/') {
return false;
}
/// Read the entire contents of the path into a string
pub fn read(&self, path: &VendoredPath) -> Option<String> {
let inner_locked = self.inner.lock();
let mut archive = inner_locked.borrow_mut();
archive
.lookup_path(&normalize_vendored_path(path))
.is_ok_and(|zip_file| zip_file.is_file())
let mut zip_file = archive.lookup_path(&normalize_vendored_path(path)).ok()?;
let mut buffer = String::new();
zip_file.read_to_string(&mut buffer).ok()?;
AlexWaygood marked this conversation as resolved.
Show resolved Hide resolved
Some(buffer)
}
}

fn metadata(&self, path: &FileSystemPath) -> Result<Metadata> {
let normalized = normalize_vendored_path(path);
let inner_locked = self.inner.lock();
let mut archive = inner_locked.borrow_mut();
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, is_macro::Is)]
AlexWaygood marked this conversation as resolved.
Show resolved Hide resolved
pub enum FileType {
/// The path exists in the zip archive and represents a vendored file
File,

let metadata = archive
.lookup_path(&normalized)
.map(|zip_file| (zip_file.is_dir(), zip_file.crc32()));

let (is_dir, file_hash) = match metadata {
Ok((is_dir, file_hash)) => (is_dir, file_hash),
Err(err) => {
if normalized.has_trailing_slash() {
return Err(err);
}
let lookup_result = archive.lookup_path(&normalized.with_trailing_slash());
if let Ok(zip_file) = lookup_result {
(zip_file.is_dir(), zip_file.crc32())
} else {
return Err(err);
}
}
};

let file_type = if is_dir {
FileType::Directory
} else {
FileType::File
};

Ok(Metadata {
// Using the file hash is fine here,
// as `FileRevision`s are not guaranteed to be monotonically increasing
// for newer revisions
revision: FileRevision::new(u128::from(file_hash)),
permissions: Some(Self::READ_ONLY_PERMISSIONS),
file_type,
})
}

fn read(&self, path: &FileSystemPath) -> Result<String> {
let inner_locked = self.inner.lock();
let mut archive = inner_locked.borrow_mut();
let mut zip_file = archive.lookup_path(&normalize_vendored_path(path))?;
let mut buffer = String::new();
zip_file.read_to_string(&mut buffer)?;
Ok(buffer)
}
/// The path exists in the zip archive and represents a vendored directory of files
Directory,
}

#[derive(Debug)]
Expand Down Expand Up @@ -159,11 +158,11 @@ impl<'a> DerefMut for ArchiveReader<'a> {

/// Newtype wrapper around a ZipArchive.
#[derive(Debug)]
struct VendoredZipArchive(ZipArchive<Cursor<&'static [u8]>>);
struct VendoredZipArchive(ZipArchive<io::Cursor<&'static [u8]>>);

impl VendoredZipArchive {
pub fn new(data: &'static [u8]) -> Result<Self> {
AlexWaygood marked this conversation as resolved.
Show resolved Hide resolved
Ok(Self(ZipArchive::new(Cursor::new(data))?))
Ok(Self(ZipArchive::new(io::Cursor::new(data))?))
}

fn lookup_path(&mut self, path: &NormalizedVendoredPath) -> Result<ZipFile> {
Expand Down Expand Up @@ -200,7 +199,7 @@ impl NormalizedVendoredPath {
/// If a path with an unsupported component for vendored paths is passed.
/// Unsupported components are path prefixes,
/// and path root directories appearing anywhere except at the start of the path.
fn normalize_vendored_path(path: &FileSystemPath) -> NormalizedVendoredPath {
fn normalize_vendored_path(path: &VendoredPath) -> NormalizedVendoredPath {
debug_assert!(
!path.as_std_path().is_absolute(),
"Attempted to lookup an absolute filesystem path relative to a vendored zip archive"
Expand Down Expand Up @@ -248,7 +247,7 @@ mod tests {

static MOCK_ZIP_ARCHIVE: Lazy<Box<[u8]>> = Lazy::new(|| {
let mut typeshed_buffer = Vec::new();
let typeshed = Cursor::new(&mut typeshed_buffer);
let typeshed = io::Cursor::new(&mut typeshed_buffer);

let options = FileOptions::default()
.compression_method(CompressionMethod::Zstd)
Expand Down Expand Up @@ -282,16 +281,13 @@ mod tests {
fn test_directory(dirname: &str) {
let mock_typeshed = mock_typeshed();

let path = FileSystemPath::new(dirname);
let path = VendoredPath::new(dirname);

assert!(mock_typeshed.exists(path));
assert!(mock_typeshed.is_directory(path));
assert!(!mock_typeshed.is_file(path));

let path_metadata = mock_typeshed.metadata(path).unwrap();
assert!(path_metadata.file_type().is_directory());
assert!(path_metadata.permissions().is_some());
path_metadata.revision();
assert!(mock_typeshed
.file_type(path)
.is_some_and(|file_type| file_type.is_directory()));
assert!(mock_typeshed.file_hash(path).is_some())
}

#[test]
Expand Down Expand Up @@ -326,12 +322,11 @@ mod tests {

fn test_nonexistent_path(path: &str) {
let mock_typeshed = mock_typeshed();
let path = FileSystemPath::new(path);
let path = VendoredPath::new(path);
assert!(!mock_typeshed.exists(path));
assert!(!mock_typeshed.is_directory(path));
assert!(!mock_typeshed.is_file(path));
assert!(mock_typeshed.metadata(path).is_err());
assert!(mock_typeshed.read(path).is_err());
assert!(mock_typeshed.file_type(path).is_none());
assert!(mock_typeshed.file_hash(path).is_none());
assert!(mock_typeshed.read(path).is_none());
}

#[test]
Expand All @@ -354,21 +349,18 @@ mod tests {
test_nonexistent_path("./foo/../../../foo")
}

fn test_file(mock_typeshed: &VendoredFileSystem, path: &FileSystemPath) {
fn test_file(mock_typeshed: &VendoredFileSystem, path: &VendoredPath) {
assert!(mock_typeshed.exists(path));
assert!(mock_typeshed.is_file(path));
assert!(!mock_typeshed.is_directory(path));

let path_metadata = mock_typeshed.metadata(path).unwrap();
assert!(path_metadata.file_type().is_file());
assert!(path_metadata.permissions().is_some());
path_metadata.revision();
assert!(mock_typeshed
.file_type(path)
.is_some_and(|file_type| file_type.is_file()));
assert!(mock_typeshed.file_hash(path).is_some());
}

#[test]
fn functools_file_contents() {
let mock_typeshed = mock_typeshed();
let path = FileSystemPath::new("stdlib/functools.pyi");
let path = VendoredPath::new("stdlib/functools.pyi");
test_file(&mock_typeshed, path);
let functools_stub = mock_typeshed.read(path).unwrap();
assert_eq!(functools_stub.as_str(), FUNCTOOLS_CONTENTS);
Expand All @@ -382,14 +374,14 @@ mod tests {
fn functools_file_other_path() {
test_file(
&mock_typeshed(),
FileSystemPath::new("stdlib/../stdlib/../stdlib/functools.pyi"),
VendoredPath::new("stdlib/../stdlib/../stdlib/functools.pyi"),
)
}

#[test]
fn asyncio_file_contents() {
let mock_typeshed = mock_typeshed();
let path = FileSystemPath::new("stdlib/asyncio/tasks.pyi");
let path = VendoredPath::new("stdlib/asyncio/tasks.pyi");
test_file(&mock_typeshed, path);
let asyncio_stub = mock_typeshed.read(path).unwrap();
assert_eq!(asyncio_stub.as_str(), ASYNCIO_TASKS_CONTENTS);
Expand All @@ -399,7 +391,7 @@ mod tests {
fn asyncio_file_other_path() {
test_file(
&mock_typeshed(),
FileSystemPath::new("./stdlib/asyncio/../asyncio/tasks.pyi"),
VendoredPath::new("./stdlib/asyncio/../asyncio/tasks.pyi"),
)
}
}
Loading
Loading