-
Notifications
You must be signed in to change notification settings - Fork 63
feat: Massive refactor with a new API #45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,203 @@ | ||
use std::convert::TryFrom; | ||
use std::{cmp, fmt, hash}; | ||
|
||
use unsigned_varint::{decode as varint_decode, encode as varint_encode}; | ||
|
||
use crate::errors::{DecodeError, DecodeOwnedError}; | ||
use crate::hashes::Code; | ||
use crate::storage::Storage; | ||
|
||
/// Represents a valid multihash. | ||
#[derive(Clone)] | ||
pub struct Multihash { | ||
storage: Storage, | ||
} | ||
|
||
impl fmt::Debug for Multihash { | ||
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { | ||
f.debug_tuple("Multihash").field(&self.as_bytes()).finish() | ||
} | ||
} | ||
|
||
impl PartialEq for Multihash { | ||
fn eq(&self, other: &Self) -> bool { | ||
self.storage.bytes() == other.storage.bytes() | ||
} | ||
} | ||
|
||
impl Eq for Multihash {} | ||
|
||
impl hash::Hash for Multihash { | ||
fn hash<H: hash::Hasher>(&self, state: &mut H) { | ||
self.storage.bytes().hash(state); | ||
} | ||
} | ||
|
||
impl Multihash { | ||
/// Verifies whether `bytes` contains a valid multihash, and if so returns a `Multihash`. | ||
pub fn from_bytes(bytes: Vec<u8>) -> Result<Multihash, DecodeOwnedError> { | ||
if let Err(err) = MultihashRef::from_slice(&bytes) { | ||
return Err(DecodeOwnedError { | ||
error: err, | ||
data: bytes, | ||
}); | ||
} | ||
Ok(Multihash { | ||
storage: Storage::from_slice(&bytes), | ||
}) | ||
} | ||
|
||
/// Returns the bytes representation of the multihash. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I apologize if I'm misunderstanding this but as far as I understand this would return bytes without consuming If my assumption is correct this method probably should either be removed or changed so it consumes multihash to match String behavior. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is correct. A method named For naming guidelines, see https://rust-lang.github.io/api-guidelines/naming.html#ad-hoc-conversions-follow-as_-to_-into_-conventions-c-conv |
||
pub fn into_bytes(self) -> Vec<u8> { | ||
self.to_vec() | ||
} | ||
|
||
/// Returns the bytes representation of the multihash. | ||
pub fn to_vec(&self) -> Vec<u8> { | ||
Vec::from(self.as_bytes()) | ||
} | ||
|
||
/// Returns the bytes representation of this multihash. | ||
pub fn as_bytes(&self) -> &[u8] { | ||
self.storage.bytes() | ||
} | ||
|
||
/// Builds a `MultihashRef` corresponding to this `Multihash`. | ||
pub fn as_ref(&self) -> MultihashRef { | ||
MultihashRef { | ||
bytes: self.as_bytes(), | ||
} | ||
} | ||
|
||
/// Returns which hashing algorithm is used in this multihash. | ||
pub fn algorithm(&self) -> Code { | ||
self.as_ref().algorithm() | ||
} | ||
|
||
/// Returns the hashed data. | ||
pub fn digest(&self) -> &[u8] { | ||
self.as_ref().digest() | ||
} | ||
} | ||
|
||
impl AsRef<[u8]> for Multihash { | ||
fn as_ref(&self) -> &[u8] { | ||
self.as_bytes() | ||
} | ||
} | ||
|
||
impl<'a> PartialEq<MultihashRef<'a>> for Multihash { | ||
fn eq(&self, other: &MultihashRef<'a>) -> bool { | ||
&*self.as_bytes() == other.as_bytes() | ||
} | ||
} | ||
|
||
impl TryFrom<Vec<u8>> for Multihash { | ||
type Error = DecodeOwnedError; | ||
|
||
fn try_from(value: Vec<u8>) -> Result<Self, Self::Error> { | ||
Multihash::from_bytes(value) | ||
} | ||
} | ||
|
||
impl PartialOrd for Multihash { | ||
fn partial_cmp(&self, other: &Self) -> Option<cmp::Ordering> { | ||
Some(self.cmp(other)) | ||
} | ||
} | ||
|
||
impl Ord for Multihash { | ||
fn cmp(&self, other: &Self) -> cmp::Ordering { | ||
self.as_ref().cmp(&other.as_ref()) | ||
} | ||
} | ||
|
||
/// Represents a valid multihash. | ||
#[derive(Debug, Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)] | ||
pub struct MultihashRef<'a> { | ||
bytes: &'a [u8], | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since this hasn't been brought up before, structures like this make me wonder if
This would remove the need for separate There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think a multihash with a lifetime would be a world of hurt in terms of ergonomics. I can already see myself fighting lifetime issues to do utterly trivial things with multihashes and futures. Also, I kinda dislike Cow since it has some surprising footguns... E.g. rust-lang/rust#50160 I think an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree, to use The downside of using |
||
|
||
impl<'a> MultihashRef<'a> { | ||
/// Creates a `MultihashRef` from the given `input`. | ||
pub fn from_slice(input: &'a [u8]) -> Result<Self, DecodeError> { | ||
if input.is_empty() { | ||
return Err(DecodeError::BadInputLength); | ||
} | ||
|
||
let (_code, bytes) = varint_decode::u64(&input).map_err(|_| DecodeError::BadInputLength)?; | ||
|
||
let (hash_len, bytes) = | ||
varint_decode::u64(&bytes).map_err(|_| DecodeError::BadInputLength)?; | ||
if (bytes.len() as u64) != hash_len { | ||
return Err(DecodeError::BadInputLength); | ||
} | ||
|
||
Ok(MultihashRef { bytes: input }) | ||
} | ||
|
||
/// Returns which hashing algorithm is used in this multihash. | ||
pub fn algorithm(&self) -> Code { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So wrap allows creating a multihash with non standard code, but such multihash will cause panic if it's Personally I would prefer to have custom |
||
let (code, _bytes) = | ||
varint_decode::u64(&self.bytes).expect("multihash is known to be valid algorithm"); | ||
Code::from_u64(code) | ||
} | ||
|
||
/// Returns the hashed data. | ||
pub fn digest(&self) -> &'a [u8] { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would much rather have no way to creating invalid |
||
let (_code, bytes) = | ||
varint_decode::u64(&self.bytes).expect("multihash is known to be valid digest"); | ||
let (_hash_len, bytes) = | ||
varint_decode::u64(&bytes).expect("multihash is known to be a valid digest"); | ||
&bytes[..] | ||
} | ||
|
||
/// Builds a `Multihash` that owns the data. | ||
/// | ||
/// This operation allocates. | ||
pub fn to_owned(&self) -> Multihash { | ||
Multihash { | ||
storage: Storage::from_slice(self.bytes), | ||
} | ||
} | ||
|
||
/// Returns the bytes representation of this multihash. | ||
pub fn as_bytes(&self) -> &'a [u8] { | ||
&self.bytes | ||
} | ||
} | ||
|
||
impl<'a> PartialEq<Multihash> for MultihashRef<'a> { | ||
fn eq(&self, other: &Multihash) -> bool { | ||
self.as_bytes() == &*other.as_bytes() | ||
} | ||
} | ||
|
||
/// The `MultihashDigest` trait specifies an interface common for all multihash functions. | ||
pub trait MultihashDigest { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is the functional benefit to having the builtin functionality tied to this trait? Seems unnecessary to have to import the MultihashDigest trait just to use a builtin hash. I only see the need for the DynMultihashDigest trait, unless I'm missing the use case for this trait? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see this trait as the thing you would actually want. The What do others think? Does it make sense to have both traits, or should we only have the hacky There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. to be clear about what I am suggesting: austinabell@131cc94 Just quickly put together, could add code as another base impl as non associative and it would be functionally the same but not require you to import this trait whenever you want to hash in a non-dynamic context. Definitely more verbose, and could be cleaned up, but more clean usage imo There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Having There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah I was just too lazy to move the associated code const to the base impl, does this satisfy that criteria: austinabell@fd07805? fixed changes in edit There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just to be clear, I have no opinion or preference, just offering an alternative since it seems weird to me to require importing that trait that does nothing just to be able to hash using builtin implementations. Seems to be an even tradeoff since declaring custom implementations could be slightly confusing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's (from the end-user perspective) quite nice. We could also have a derive macro as the trait implementation would be the same. Could please someone with more Rust experience comment on which approach is better? The one proposed by @austinabell (#45 (comment)) feels a bit un-idiomatic, but the resulting API is quite nice. @dignifiedquire perhaps? |
||
/// The Mutlihash byte value. | ||
fn code(&self) -> Code; | ||
|
||
/// Hash some input and return the digest. | ||
/// | ||
/// # Panics | ||
/// | ||
/// Panics if the digest length is bigger than 2^32. This only happens for identity hasing. | ||
fn digest(&self, data: &[u8]) -> Multihash; | ||
} | ||
|
||
/// Wraps a hash digest in Multihash with the given Mutlihash code. | ||
/// | ||
/// The size of the hash is determoned by the size of the input hash. If it should be truncated | ||
/// the input data must already be the truncated hash. | ||
pub fn wrap(code: Code, data: &[u8]) -> Multihash { | ||
let mut code_buf = varint_encode::u64_buffer(); | ||
let code = varint_encode::u64(code.to_u64(), &mut code_buf); | ||
|
||
let mut size_buf = varint_encode::u64_buffer(); | ||
let size = varint_encode::u64(data.len() as u64, &mut size_buf); | ||
|
||
Multihash { | ||
storage: Storage::from_slices(&[code, &size, &data]), | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand it
MultihashRef
is dropped here but we knowbytes
are valid or we'd haveErr
. I do however find it confusing / awkward, probably my lack of Rust experience, makes me react with Parse don't validateI think parse approach would not work here as
MultihashRef
contains&[u8]
, but maybe it make sense to have separate fn that does not constructMultihashRef
?Maybe this is idiomatic in Rust, but maybe a comment for newbies (like myself) would be worth it.