[API Proposal]: One shot hashing of Stream #62489

Closed
@vcsjones

Background and motivation

As of .NET 5 (Hash) and .NET 6 (HMAC) we have static one-shots for computing the hash of bytes, either in the form of an array or a ReadOnlySpan<byte>.

I think these one-shots are a good direction for .NET. To recap what we have today in terms of APIs:

  1. HASHALG.HashData - one shot of fixed-length buffers, array or span
  2. IncrementalHash - updatable hash
  3. HASHALG.Create - .NET Framework 1.0 design

My personal belief is that number three should be mostly de-emphasized for common use cases. One shots should use option one, and incremental should use two. Create should be there for cases where polymorphic behavior is desired, if ever.

There is one remaining case (I believe) that options one and two don't handle as well as option three: hashing streams, because HashAlgorithm instances have ComputeHash(Stream stream).
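
For reference, the instance-based pattern described here looks roughly like the sketch below. The class and helper names are illustrative only and are not part of any existing or proposed API.

```csharp
using System.IO;
using System.Security.Cryptography;

static class OptionThreeSketch
{
    // Illustrative helper name (an assumption for this sketch).
    public static byte[] HashWithInstance(Stream source)
    {
        // Option three: create a HashAlgorithm instance, use it once, dispose it.
        using SHA256 sha256 = SHA256.Create();
        return sha256.ComputeHash(source);
    }
}
```

The caller has to create and dispose the hash object even though it is used for a single operation.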

It's not uncommon to need to hash a stream all in one go (thus still a one-shot). I would propose that these should also be one shots to further reduce the need for option three.

As a matter of implementation, the goal is not necessarily to use platform APIs to produce the most optimal hashing API (though we can certainly do our best to optimize certain situations). Rather, the goal is to provide an API surface that does not require developers to manage an instance of a hash object.

API Proposal

namespace System.Security.Cryptography {
    public abstract partial class MD5 {
        public static byte[] HashData(Stream source);
        public static int HashData(Stream source, Span<byte> destination);

        public static ValueTask<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
        public static ValueTask<int> HashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
    }

    public abstract partial class SHA1 {
        public static byte[] HashData(Stream source);
        public static int HashData(Stream source, Span<byte> destination);

        public static ValueTask<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
        public static ValueTask<int> HashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
    }

    public abstract partial class SHA256 {
        public static byte[] HashData(Stream source);
        public static int HashData(Stream source, Span<byte> destination);

        public static ValueTask<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
        public static ValueTask<int> HashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
    }

    public abstract partial class SHA384 {
        public static byte[] HashData(Stream source);
        public static int HashData(Stream source, Span<byte> destination);

        public static ValueTask<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
        public static ValueTask<int> HashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
    }

    public abstract partial class SHA512 {
        public static byte[] HashData(Stream source);
        public static int HashData(Stream source, Span<byte> destination);

        public static ValueTask<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
        public static ValueTask<int> HashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
    }

    public abstract partial class HMACMD5 {
        public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
        public static int HashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);

        public static ValueTask<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
        public static ValueTask<int> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
    }

    public abstract partial class HMACSHA1 {
        public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
        public static int HashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);

        public static ValueTask<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
        public static ValueTask<int> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
    }

    public abstract partial class HMACSHA256 {
        public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
        public static int HashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);

        public static ValueTask<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
        public static ValueTask<int> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
    }

    public abstract partial class HMACSHA384 {
        public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
        public static int HashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);

        public static ValueTask<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
        public static ValueTask<int> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
    }

    public abstract partial class HMACSHA512 {
        public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
        public static int HashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);

        public static ValueTask<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
        public static ValueTask<int> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
    }
}

API Usage

static void Example1() {
    using (FileStream fs = File.Open("/please/hash/me", FileMode.Open)) {
        byte[] fileHash = SHA256.HashData(fs);
    }
}

static void Example2() {
    using (FileStream fs = File.Open("/please/hash/me", FileMode.Open)) {
        Span<byte> buffer = stackalloc byte[32];
        int written = SHA256.HashData(fs, buffer);
    }
}

static async Task Example3() {
    using (FileStream fs = File.Open("/please/hash/me", FileMode.Open)) {
        byte[] fileHash = await SHA256.HashDataAsync(fs);
    }
}

static async Task Example4() {
    using (FileStream fs = File.Open("/please/hash/me", FileMode.Open)) {
        Memory<byte> existingBuffer = default; // Something reasonable here.
        int written = await SHA256.HashDataAsync(fs, existingBuffer);
    }
}
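
The examples above cover only the unkeyed hash classes. The keyed overloads would presumably be used the same way. Here is a minimal sketch, assuming the HMAC overloads listed in the proposal are available with exactly those signatures; the class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Threading;
using System.Threading.Tasks;

static class HmacUsageSketch
{
    // Hypothetical wrapper: synchronous one-shot HMAC over a stream.
    public static byte[] Mac(ReadOnlySpan<byte> key, Stream source)
    {
        return HMACSHA256.HashData(key, source);
    }

    // Hypothetical wrapper: asynchronous variant. The key is passed as
    // ReadOnlyMemory<byte> because spans cannot be held across awaits.
    public static ValueTask<byte[]> MacAsync(
        ReadOnlyMemory<byte> key,
        Stream source,
        CancellationToken cancellationToken = default)
    {
        return HMACSHA256.HashDataAsync(key, source, cancellationToken);
    }
}
```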

Alternative Designs

"Do nothing" is likely the best alternative to this. It's possible to do this today using IncrementalHash (read from the stream, update, done) or HASHALG.ComputeHash{Async}. Both of these options however require managing an instance of the hash object.
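
The IncrementalHash route mentioned above (read from the stream, update, done) can be sketched as follows. The helper name and the 4096-byte buffer size are assumptions made for illustration, not part of the proposal.

```csharp
using System.IO;
using System.Security.Cryptography;

static class StreamHashSketch
{
    // Hypothetical helper: hashes the rest of a stream with IncrementalHash.
    public static byte[] HashStream(Stream source, HashAlgorithmName algorithm)
    {
        // The hash object still has to be created and disposed by the caller's code.
        using IncrementalHash hash = IncrementalHash.CreateHash(algorithm);
        byte[] buffer = new byte[4096]; // arbitrary buffer size for this sketch

        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Feed only the bytes actually read in this iteration.
            hash.AppendData(buffer, 0, read);
        }

        return hash.GetHashAndReset();
    }
}
```

This works today, but every caller has to write the read loop and manage the lifetime of the hash object.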
