handle archives #28

Merged 19 commits on Sep 13, 2020

11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -9,8 +9,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- Added a method `Cache::cached_path_with_options` and a corresponding `Options` struct.
- Added ability to automatically extract archives through the `Cache::cached_path_with_options` method.
- Added integration tests.

### Changed

- `Meta` struct is no longer public.
- `Cache::cached_path_in_subdir` is now deprecated.

### Removed

- Removed the `only_keep_latest` setting for the `Cache`.

## [v0.4.2](https://github.com/epwalsh/rust-cached-path/releases/tag/v0.4.3) - 2020-09-11

### Changed
6 changes: 5 additions & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "cached-path"
-version = "0.4.3"
+version = "0.4.4"
authors = ["epwalsh <epwalsh10@gmail.com>"]
edition = "2018"
keywords = ["http", "caching"]
@@ -32,6 +32,10 @@ serde_json = "1.0"
rand = "0.7"
glob = "0.3"
thiserror = "1.0"
flate2 = "1.0"
tar = "0.4"
zip = "0.5"
zip-extensions = "0.5"
env_logger = { version = "0.7", optional = true }
structopt = { version = "0.3", optional = true }
anyhow = { version = "1.0", optional = true }
30 changes: 25 additions & 5 deletions README.md
@@ -32,8 +32,9 @@ cached version:
```rust
use cached_path::cached_path;

-let path =
-cached_path("https://github.com/epwalsh/rust-cached-path/blob/master/README.md").unwrap();
+let path = cached_path(
+    "https://github.com/epwalsh/rust-cached-path/blob/master/README.md"
+).unwrap();
assert!(path.is_file());
```

@@ -54,13 +55,32 @@ assert_eq!(path.to_str().unwrap(), "README.md");

```bash
# From the command line:
-$ cached-path https://github.com/epwalsh/rust-cached-path/blob/master/README.md
+$ cached-path README.md
README.md
```

For resources that are archives, like `*.tar.gz` files, `cached-path` can also
automatically extract the files:

```rust
use cached_path::{cached_path_with_options, Options};

let path = cached_path_with_options(
"https://raw.githubusercontent.com/epwalsh/rust-cached-path/master/test_fixtures/utf-8_sample/archives/utf-8.tar.gz",
&Options::default().extract(),
).unwrap();
assert!(path.is_dir());
```

```bash
# From the command line:
$ cached-path --extract https://raw.githubusercontent.com/epwalsh/rust-cached-path/master/test_fixtures/utf-8_sample/archives/utf-8.tar.gz
README.md
```
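
As a rough illustration of which resources `--extract` can handle: judging from the `src/archives.rs` file added in this PR, the archive format is picked purely from the resource's suffix. The sketch below mirrors that check using only the standard library. `UnsupportedFormat` is a hypothetical stand-in for the crate's own error type, and the function is a standalone copy for illustration, not part of the crate's public API.

```rust
/// Archive formats recognized by suffix (the same two variants as in this PR).
#[derive(Debug, PartialEq)]
pub enum ArchiveFormat {
    TarGz,
    Zip,
}

/// Hypothetical stand-in for the crate's error type; carries the rejected resource.
#[derive(Debug, PartialEq)]
pub struct UnsupportedFormat(pub String);

/// Guess the archive format from the end of a resource string (URL or local path).
pub fn parse_from_extension(resource: &str) -> Result<ArchiveFormat, UnsupportedFormat> {
    if resource.ends_with(".tar.gz") {
        Ok(ArchiveFormat::TarGz)
    } else if resource.ends_with(".zip") {
        Ok(ArchiveFormat::Zip)
    } else {
        // Anything else is rejected rather than guessed at.
        Err(UnsupportedFormat(resource.to_string()))
    }
}
```

Because the check is a case-sensitive `ends_with`, a name such as `data.TAR.GZ` or a URL with a trailing query string (`archive.zip?raw=true`) would be rejected under this sketch.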

-It's easy to customize the cache location, the HTTP client, and other options
+It's also easy to customize the cache location, the HTTP client, and other options
using a [`CacheBuilder`](https://docs.rs/cached-path/*/cached_path/struct.CacheBuilder.html) to construct a custom
-[`Cache`](https://docs.rs/cached-path/*/cached_path/struct.Cache.html) object. This is also the recommended thing
+[`Cache`](https://docs.rs/cached-path/*/cached_path/struct.Cache.html) object. This is the recommended thing
to do if your application makes multiple calls to `cached_path`, since it avoids the overhead
of creating a new HTTP client on each call:

56 changes: 56 additions & 0 deletions src/archives.rs
@@ -0,0 +1,56 @@
use crate::error::Error;
use flate2::read::GzDecoder;
use std::fs::{self, File};
use std::path::Path;
use tempfile::tempdir_in;
use zip_extensions::read::zip_extract;

/// Supported archive types.
pub(crate) enum ArchiveFormat {
    TarGz,
    Zip,
}

impl ArchiveFormat {
    /// Parse archive type from resource extension.
    pub(crate) fn parse_from_extension(resource: &str) -> Result<Self, Error> {
        if resource.ends_with(".tar.gz") {
            Ok(Self::TarGz)
        } else if resource.ends_with(".zip") {
            Ok(Self::Zip)
        } else {
            Err(Error::ExtractionError("unsupported archive format".into()))
        }
    }
}

pub(crate) fn extract_archive<P: AsRef<Path>>(
    path: P,
    target: P,
    format: &ArchiveFormat,
) -> Result<(), Error> {
    // We'll first extract to a temp directory in the same parent as the target directory.
    let target_parent_dir = target.as_ref().parent().unwrap();
    let temp_target = tempdir_in(target_parent_dir)?;

    match format {
        ArchiveFormat::TarGz => {
            let tar_gz = File::open(path)?;
            let tar = GzDecoder::new(tar_gz);
            let mut archive = tar::Archive::new(tar);
            archive.unpack(&temp_target)?;
        }
        ArchiveFormat::Zip => {
            zip_extract(
                &path.as_ref().to_path_buf(),
                &temp_target.path().to_path_buf(),
            )
            .map_err(|e| Error::ExtractionError(format!("{:?}", e)))?;
        }
    };

    // Now rename the temp directory to the final target directory.
    fs::rename(temp_target, target)?;

    Ok(())
}
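
The extract-then-rename approach used by `extract_archive` can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the crate's implementation: `publish_dir` and its `fill` callback are hypothetical names, the scratch directory name is an assumed scheme (the PR uses `tempfile::tempdir_in`), and a target without a parent is reported as an error instead of being unwrapped.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Build a directory's contents in a sibling scratch directory, then rename
/// the scratch directory to `target` in a single step.
///
/// `fill` is a hypothetical callback that writes the extracted files.
pub fn publish_dir<F>(target: &Path, fill: F) -> io::Result<()>
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    let invalid = |msg: &str| io::Error::new(io::ErrorKind::InvalidInput, msg.to_string());
    let parent = target.parent().ok_or_else(|| invalid("target has no parent"))?;
    let name = target.file_name().ok_or_else(|| invalid("target has no file name"))?;

    // Assumed naming scheme for the scratch directory: "<name>.tmp-<pid>".
    let mut scratch_name = name.to_os_string();
    scratch_name.push(format!(".tmp-{}", std::process::id()));
    let scratch = parent.join(scratch_name);
    fs::create_dir(&scratch)?;

    // Fill the scratch directory, then move it into place.
    let result = fill(scratch.as_path()).and_then(|_| fs::rename(&scratch, target));
    if result.is_err() {
        // Best-effort cleanup so a failed run leaves no partial directory behind.
        let _ = fs::remove_dir_all(&scratch);
    }
    result
}
```

Building the directory next to its final location keeps both paths on the same filesystem, where a rename is typically atomic, so other readers of the cache should see either no directory or a complete one.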