Compression (DBTL-1) #29

Merged
merged 17 commits into from
Aug 2, 2024
Conversation

seb-hyland
Member

This code is heavily WIP at the moment and is untested. Exec.rs likely contains bugs or problems.

The goal of this branch is to create:

  1. A function that can run arbitrary binaries on Windows/Linux (the binary will be embedded in a variable at compile time, then run as an anonymous file at runtime)
  2. The DBTL-1 variant of the compression implementation.

@lhao03 lhao03 marked this pull request as draft July 20, 2024 21:37
@seb-hyland seb-hyland marked this pull request as ready for review July 22, 2024 22:14
@seb-hyland
Member Author

Provides compression and decompression mechanisms for lz4 and ts_zip. Exec.rs has been removed.

Compression takes inpath: PathBuf as input and returns Result<PathBuf, io::Error>, where the PathBuf is the path of the output file.
Decompression takes inpath: PathBuf and outpath: PathBuf as input and returns Result<()>.
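
For readers skimming the thread, the interface described above could look roughly like the sketch below. Only the argument and return types come from the description; the trait layout, the `CopyCompressor` type, and the `.copy` extension are hypothetical stand-ins, and the real implementations in this PR wrap lz4 and ts_zip rather than copying bytes.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Assumed trait shape; only the signatures are taken from the PR description.
pub trait Compressor {
    /// Compresses the file at `inpath` and returns the path of the output file.
    fn compress(&self, inpath: PathBuf) -> Result<PathBuf, io::Error>;
    /// Decompresses the file at `inpath` and writes the result to `outpath`.
    fn decompress(&self, inpath: PathBuf, outpath: PathBuf) -> io::Result<()>;
}

/// Hypothetical pass-through implementation that copies bytes unchanged.
pub struct CopyCompressor;

impl Compressor for CopyCompressor {
    fn compress(&self, inpath: PathBuf) -> Result<PathBuf, io::Error> {
        // The output sits next to the input with an illustrative ".copy" extension.
        let outpath = inpath.with_extension("copy");
        fs::copy(&inpath, &outpath)?;
        Ok(outpath)
    }

    fn decompress(&self, inpath: PathBuf, outpath: PathBuf) -> io::Result<()> {
        fs::copy(&inpath, &outpath)?;
        Ok(())
    }
}
```

Returning the output path from `compress` lets the caller read the compressed file without knowing the naming scheme the compressor uses.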

@seb-hyland seb-hyland changed the title WIP: Arbitrary binary execution function and compression (DBTL-1) Compression (DBTL-1) Jul 22, 2024
let mut output_file = File::create(outpath.as_path())?;
io::copy(&mut decoder, &mut output_file)?;

Ok(())
}
}

@lhao03 lhao03 Jul 23, 2024

Can you add a test with quickcheck, just to quickly verify that compress and decompress work?

#[quickcheck]
fn rotation_encode_decode(bytes: Vec<u8>) -> bool {
    let encoder = RotationEncoder {};
    let decoder = RotationDecoder {};
    let bits = BitVec::from_vec(bytes);
    bits == decoder.decode(&encoder.encode(&bits))
}

#[quickcheck]
fn quaternary_encode_decode(bytes: Vec<u8>) -> bool {
    // Odd-length inputs are treated as trivially passing.
    if bytes.len() % 2 == 1 {
        true
    } else {
        let encoder = QuaternaryEncoder {};
        let decoder = QuaternaryDecoder {};
        let bits = BitVec::from_vec(bytes);
        bits == decoder.decode(&encoder.encode(&bits))
    }
}
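
The same round-trip property can be written for compression. The sketch below is only an illustration and makes several assumptions: it uses a toy run-length codec (`rle_compress` / `rle_decompress`) in place of the lz4 and ts_zip code in this PR, and a small hand-rolled xorshift generator in place of the quickcheck crate so that it builds with the standard library alone. All names in it are hypothetical.

```rust
/// Stand-in codec (run-length encoding): emits (count, byte) pairs.
pub fn rle_compress(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let byte = input[i];
        let mut run = 1usize;
        // Extend the run while the byte repeats; a count must fit in one u8.
        while i + run < input.len() && input[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

/// Inverse of `rle_compress`: expands each (count, byte) pair.
pub fn rle_decompress(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for pair in input.chunks_exact(2) {
        out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
    }
    out
}

/// Tiny xorshift generator; the seed must be non-zero.
pub struct XorShift(pub u64);

impl XorShift {
    pub fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Returns true if decompress(compress(x)) == x for `cases` pseudo-random inputs.
pub fn round_trip_holds(seed: u64, cases: usize) -> bool {
    let mut rng = XorShift(seed);
    (0..cases).all(|_| {
        let len = (rng.next_u64() % 64) as usize;
        // A four-symbol alphabet makes repeated runs likely.
        let bytes: Vec<u8> = (0..len).map(|_| (rng.next_u64() % 4) as u8).collect();
        rle_decompress(&rle_compress(&bytes)) == bytes
    })
}
```

With quickcheck available, `round_trip_holds` would collapse into a single `#[quickcheck]` function taking `bytes: Vec<u8>`, in the style of the encoder tests above.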


// Trait for all compression and decompression functions
pub trait Compressor {

Also, can you edit main.rs to add the compression step to encode_sequence?

#[tauri::command]
fn encode_sequence(encoder_type: &str, file_path: &str) -> Result<Vec<Base>, String> {
    let bytes = fs::read(file_path).map_err(|err| err.to_string())?;
    let bits = BitVec::<_, Msb0>::from_vec(bytes);
    let encoder: Box<dyn Encoder> = match encoder_type {
        "quaternary" => Box::new(QuaternaryEncoder {}),
        "rotation" => Box::new(RotationEncoder {}),
        _ => return Err("Selected encoder does not exist.".to_string()),
    };
    Ok(encoder.encode(&bits).into())
}

It should follow the old, deleted sequence.rs almost exactly:

fn encode(
    path: impl AsRef<Path>,
    primer: Primer,
    compressor: impl Compressor,
    encoder: impl Encoder,
) -> io::Result<Vec<Base>> {
    let file = fs::read(path)?;
    let compressed = compressor.compress(file);
    let bit_sequence = BitVec::from_vec(compressed);
    Ok(encoder.encode(bit_sequence))
}

Comment on lines 36 to 45
fn encode_sequence(
encoder_type: &str,
file_path: PathBuf,
) -> Result<Vec<Base>, String> {
let compressor = VoidCompressor{};
let compressed = compressor
.compress(file_path)
.map_err(|err| err.to_string())?;
let bytes = fs::read(compressed).map_err(|err| err.to_string())?;
let bits = BitVec::<_, Msb0>::from_slice(&bytes);

To test this, do the below:
[image]

Comment on lines 91 to 94
dbg!("here");
let out_dir = "compressed_lz4/";
fs::create_dir_all(out_dir);
let outpath = Path::new(out_dir).join(filename).with_extension("lz4");

This code assumes the directory exists, which causes an error to be thrown when it does not, and that case is not handled. I pushed a fix for this spot, but the same assumption is made in several other places in the code. Please fix those and handle the missing-directory case properly: either create the directory from Rust, or prompt the user to create it themselves.
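
One way to handle this from Rust is sketched below, assuming the output path is built the same way as in the diff above. The helper name `prepare_outpath` and its signature are hypothetical; `fs::create_dir_all` is the standard-library call, and it returns `Ok` when the directory already exists.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: makes sure `out_dir` exists, then returns
/// `<out_dir>/<filename>` with its extension replaced by "lz4".
pub fn prepare_outpath(out_dir: &Path, filename: &str) -> io::Result<PathBuf> {
    // Creates missing parent directories too. Any failure (for example, a
    // regular file already occupying the path) is returned to the caller
    // instead of being silently dropped.
    fs::create_dir_all(out_dir)?;
    Ok(out_dir.join(filename).with_extension("lz4"))
}
```

Propagating the error with `?` lets the caller decide whether to surface it to the user or fall back to another location.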

@seb-hyland
Member Author

Hi Lucy, I pushed up a hefty merge with my changes, your changes, and changes in main. This should include:

  1. Your changes to compress_lz4 to ensure the directory exists
  2. My updates to compressor.rs and the updated encode_sequence function with my changes
  3. I have kept your dbg!("here"); macros; do you wish to keep those for now?

Although I haven't gone over everything again since the merge (except for resolving conflicts), the project is compiling and tests are passing (including the quickcheck for (de)compression). Do you wish to make a test for encode_sequence? I'm not sure how we would do that save for a test similar to chaosDNA (no assertion, just use some test sentence and print the output to CLI).

@lhao03
Member

lhao03 commented Aug 2, 2024

Yeah, we can have a test for encode_sequence using quickcheck; I can set that up. (It would be a test sentence or str that we check through compress -> bit -> base -> bit -> decompress.)
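
A minimal sketch of that round trip is below, under loud assumptions: the `Base` enum here is a local stand-in for the project's type, the two-bits-per-base mapping (A=00, C=01, G=10, T=11) is illustrative rather than the project's actual quaternary encoder, and compression is passed in as closures so any codec can be plugged in.

```rust
/// Local stand-in for the project's base type (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Base {
    A,
    C,
    G,
    T,
}

/// Maps each byte to four bases, most significant bit pair first.
pub fn bytes_to_bases(bytes: &[u8]) -> Vec<Base> {
    let mut bases = Vec::with_capacity(bytes.len() * 4);
    for &byte in bytes {
        for shift in [6u8, 4, 2, 0] {
            bases.push(match (byte >> shift) & 0b11 {
                0 => Base::A,
                1 => Base::C,
                2 => Base::G,
                _ => Base::T,
            });
        }
    }
    bases
}

/// Inverse of `bytes_to_bases`; a trailing partial group of bases is ignored.
pub fn bases_to_bytes(bases: &[Base]) -> Vec<u8> {
    bases
        .chunks_exact(4)
        .map(|chunk| {
            chunk.iter().fold(0u8, |acc, base| {
                let pair = match base {
                    Base::A => 0,
                    Base::C => 1,
                    Base::G => 2,
                    Base::T => 3,
                };
                (acc << 2) | pair
            })
        })
        .collect()
}

/// Checks compress -> bases -> bytes -> decompress against the original input.
pub fn round_trips(
    input: &[u8],
    compress: impl Fn(&[u8]) -> Vec<u8>,
    decompress: impl Fn(&[u8]) -> Vec<u8>,
) -> bool {
    let compressed = compress(input);
    let bases = bytes_to_bases(&compressed);
    let restored = decompress(&bases_to_bytes(&bases));
    restored == input
}
```

Because `round_trips` returns a `bool`, it could serve directly as the body of a quickcheck property over `Vec<u8>` once the real compressor and encoder are wired in.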

@lhao03 lhao03 left a comment

Thanks for working on this Sebastian!

@lhao03 lhao03 merged commit b8d7fba into main Aug 2, 2024
1 check passed
@lhao03 lhao03 mentioned this pull request Aug 2, 2024
@lhao03 lhao03 deleted the compression branch August 2, 2024 15:48