Add ChaCha and Salsa intrinsics. #128

Open
wants to merge 65 commits into
base: master
Changes from 1 commit
65 commits
6ecc467
Initial integration of intrinsics
macaba Apr 6, 2020
24901e3
Update Program.cs
macaba Apr 6, 2020
6424c40
Removed 5.0 target to allow CI to build
macaba Apr 6, 2020
b675b83
Resolve conflicts
TimothyMakkison Oct 7, 2022
f1f3460
Remove treat warning as errors
TimothyMakkison Oct 8, 2022
b3d9aa6
Alter Dencrypt test
TimothyMakkison Oct 8, 2022
a6b8a4a
Add Salsa20 and XSalsa20 benchmarks
TimothyMakkison Oct 10, 2022
565ad2c
Added back .net 6 support
TimothyMakkison Oct 11, 2022
2184dba
Parameterise Salsa20TestVectors
TimothyMakkison Oct 11, 2022
e18036f
Added Intrinsics check
TimothyMakkison Oct 11, 2022
fdc1e1c
Added intrinsics support for Salsa20 (64 <= is broken)
TimothyMakkison Oct 11, 2022
0b81a0d
Fix Salsa20 intrinsics for bytes >= 64
TimothyMakkison Oct 11, 2022
c134b8b
Merge benchmark updates
TimothyMakkison Oct 11, 2022
e149e3c
Fix benchmark
TimothyMakkison Oct 11, 2022
f6e1680
Use transpose rotate transpose trick - 10-20% faster
TimothyMakkison Oct 11, 2022
cbf3778
Add Transpose method
TimothyMakkison Oct 11, 2022
3ff01b1
Revert "Add Transpose method"
TimothyMakkison Oct 11, 2022
d8b341a
Add transpose method
TimothyMakkison Oct 11, 2022
c286d7b
Fix XSalsa benchmark
TimothyMakkison Oct 11, 2022
e770b25
Code cleanup
TimothyMakkison Oct 11, 2022
836ae77
Correct nonce size
TimothyMakkison Oct 11, 2022
e4d35c6
Inline pre processor variable
TimothyMakkison Oct 11, 2022
288ae8c
Add variable length test for ChaCha and Salsa
TimothyMakkison Oct 11, 2022
e090aa9
Refactor ChaCha20BaseIntrinsics
TimothyMakkison Oct 12, 2022
b2478ac
Refactor Salsa20BaseIntrinsics
TimothyMakkison Oct 12, 2022
27b544d
Added HChaCha and HSalsa
TimothyMakkison Oct 12, 2022
79ca4ce
Added little endian check
TimothyMakkison Oct 13, 2022
589912e
Added intrinsics process stream
TimothyMakkison Oct 14, 2022
5deec51
Added Salsa ProcessKeyStreamBlock test
TimothyMakkison Oct 14, 2022
37b41ab
Minor HSalsa & KeyStream api changes
TimothyMakkison Oct 14, 2022
e26c7fb
Remove pre processor per method system checks. Instead using either a…
TimothyMakkison Oct 15, 2022
58d6c25
Use ChaChaCore or ChaChaIntrinsics instead of pre processor checks
TimothyMakkison Oct 16, 2022
24b3357
Delete Snuffle pre processor checks
TimothyMakkison Oct 16, 2022
8c88ea5
Delete Snuffle pre processor checks
TimothyMakkison Oct 16, 2022
ae42b81
Resolve
TimothyMakkison Oct 16, 2022
9270136
Fix process stackoverflow bug
TimothyMakkison Oct 16, 2022
b46ca34
Fix incorrect system checks
TimothyMakkison Oct 17, 2022
b340bc4
Fix incorrect system checks
TimothyMakkison Oct 17, 2022
a583d02
Resolve
TimothyMakkison Oct 17, 2022
f3605ba
Rewrite HSalsa to use only Sse2
TimothyMakkison Oct 17, 2022
8db68e1
Minor refactoring, add comments
TimothyMakkison Oct 17, 2022
202ff15
Refactor BaseIntrinsics to use pointers
TimothyMakkison Oct 17, 2022
4d68791
Resolve conflicts
TimothyMakkison Oct 17, 2022
59797d5
Move Core files into folders
TimothyMakkison Oct 17, 2022
cd7cd48
Fix process methods
TimothyMakkison Oct 17, 2022
6b3298d
Added powershell test file, runs tests with various simd modes enable…
TimothyMakkison Oct 18, 2022
d1ede2a
Update core namespaces
TimothyMakkison Oct 18, 2022
2ba36af
Added Intrinsic/Scalar tests
TimothyMakkison Oct 18, 2022
d7d04d4
Updated Intrinsics test powershell script
TimothyMakkison Oct 18, 2022
e559677
Code cleanup and formatting changes
TimothyMakkison Oct 18, 2022
1613592
Remove unused code
TimothyMakkison Nov 8, 2022
6ca8d65
Rename internal protected to protected internal
TimothyMakkison Nov 8, 2022
6c2b54b
Remove unnecessary usings
TimothyMakkison Nov 8, 2022
f54e5c0
Simplified slices
TimothyMakkison Nov 8, 2022
6319ec1
Removed unnecessary using statements
TimothyMakkison Nov 8, 2022
8d0d596
Add explcit access modifiers
TimothyMakkison Nov 8, 2022
3a5051b
Edit error message, move namespace and change visibility to internal
TimothyMakkison Nov 8, 2022
c8ff6fc
Remove internal core code, replaced with pre processor functions
TimothyMakkison Nov 9, 2022
2c55739
Delete core scalar and itnrinsics tests
TimothyMakkison Nov 9, 2022
23722b4
Code cleanup
TimothyMakkison Nov 9, 2022
3970397
Deleted powershell script and updated cake script
TimothyMakkison Nov 9, 2022
4cebc30
Correct visibility, remove unneeded InternalsVisibleTo, minor chnages
TimothyMakkison Nov 9, 2022
dd6178c
Add Salsa64 SSE41 shuffle optimisation, updated guards/checks and add…
TimothyMakkison Nov 9, 2022
1efadaa
Add Salsa64 SSE41 shuffle optimisation, updated guards/checks, change…
TimothyMakkison Nov 9, 2022
a3a7bb6
Merge branch 'merge_intrinsics' of https://github.com/TimothyMakkison…
TimothyMakkison Nov 9, 2022
Initial integration of intrinsics
macaba committed Apr 6, 2020
commit 6ecc4671d73cac6f601ff826f9289cf1a9b1b219
13 changes: 13 additions & 0 deletions src/NaCl.Core/Base/ChaCha20Base.cs
@@ -64,6 +64,19 @@ public override void ProcessKeyStreamBlock(ReadOnlySpan<byte> nonce, int counter
             ArrayUtils.StoreArray16UInt32LittleEndian(block, 0, state);
         }

+#if INTRINSICS
+        public override unsafe void ProcessStream(ReadOnlySpan<byte> nonce, Span<byte> output, ReadOnlySpan<byte> input, int initialCounter, int offset = 0)
+        {
+            Span<uint> state = stackalloc uint[BLOCK_SIZE_IN_INTS];
+            SetInitialState(state, nonce, initialCounter);
+            fixed(uint* x = state)
+            fixed (byte* m = input, c = output.Slice(offset))
+            {
+                ChaCha20BaseIntrinsics.ChaCha20(x, m, c, (ulong)input.Length);
+            }
+        }
+#endif
+
         /// <summary>
         /// Process a pseudorandom keystream block, converting the key and part of the <paramref name="nonce"> into a <paramref name="subkey">, and the remainder of the <paramref name="nonce">.
         /// </summary>
570 changes: 570 additions & 0 deletions src/NaCl.Core/Base/ChaCha20BaseIntrinsics.cs

Large diffs are not rendered by default.

9 changes: 9 additions & 0 deletions src/NaCl.Core/Base/Snuffle.cs
@@ -59,6 +59,10 @@ public Snuffle(ReadOnlyMemory<byte> key, int initialCounter)
         /// <returns>ByteBuffer.</returns>
         public abstract void ProcessKeyStreamBlock(ReadOnlySpan<byte> nonce, int counter, Span<byte> block);

+#if INTRINSICS
+        public abstract void ProcessStream(ReadOnlySpan<byte> nonce, Span<byte> output, ReadOnlySpan<byte> input, int initialCounter, int offset = 0);
+#endif
+
         /// <summary>
         /// The size of the randomly generated nonces.
         /// ChaCha20 uses 12-byte nonces, but XSalsa20 and XChaCha20 use 24-byte nonces.
@@ -193,6 +197,10 @@ public virtual void Decrypt(ReadOnlySpan<byte> ciphertext, ReadOnlySpan<byte> no
             Process(nonce, plaintext, ciphertext);
         }

+
+#if INTRINSICS
+        private void Process(ReadOnlySpan<byte> nonce, Span<byte> output, ReadOnlySpan<byte> input, int offset = 0) => ProcessStream(nonce, output, input, InitialCounter, offset);
+#else
         /// <summary>
         /// Processes the Encryption/Decryption function.
         /// </summary>
@@ -237,6 +245,7 @@ private void Process(ReadOnlySpan<byte> nonce, Span<byte> output, ReadOnlySpan<b
                 }
             }
         }
+#endif

         /// <summary>
         /// Formats the nonce length exception message.
12 changes: 8 additions & 4 deletions src/NaCl.Core/NaCl.Core.csproj
@@ -1,8 +1,8 @@
-<Project Sdk="Microsoft.NET.Sdk">
+<Project Sdk="Microsoft.NET.Sdk">

   <PropertyGroup>
-    <TargetFrameworks>netstandard1.6;netstandard2.0;netstandard2.1;netcoreapp3.1</TargetFrameworks>
-    <TargetFrameworks Condition="'$(OS)' != 'Unix'">netstandard1.6;netstandard2.0;netstandard2.1;netcoreapp3.1;net45;net48</TargetFrameworks>
+    <TargetFrameworks>netstandard1.6;netstandard2.0;netstandard2.1;netcoreapp3.1;netcoreapp5.0</TargetFrameworks>
+    <TargetFrameworks Condition="'$(OS)' != 'Unix'">netstandard1.6;netstandard2.0;netstandard2.1;netcoreapp3.1;netcoreapp5.0;net45;net48</TargetFrameworks>
     <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
     <Version>1.2.0</Version>
     <Authors>David De Smet</Authors>
@@ -26,10 +26,14 @@
     <PackageReference Include="System.Memory" Version="4.5.3" />
   </ItemGroup>

-  <PropertyGroup Condition="$(TargetFramework) == 'netcoreapp3.1'">
+  <PropertyGroup Condition="$(TargetFramework) == 'netcoreapp3.1' OR $(TargetFramework) == 'netcoreapp5.0'">
     <DefineConstants>FCL_BITOPS</DefineConstants>
   </PropertyGroup>

+  <PropertyGroup Condition="$(TargetFramework) == 'netcoreapp3.1' OR $(TargetFramework) == 'netcoreapp5.0'">
+    <DefineConstants>INTRINSICS</DefineConstants>
+  </PropertyGroup>
+
   <Target Name="LogDebugInfo">
     <Message Text="Building for $(TargetFramework) on $(OS)" Importance="High" />
   </Target>
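Review note on the hunk above: the two conditional property groups each assign `DefineConstants` outright. In MSBuild a later assignment to a property replaces the earlier value unless it references `$(DefineConstants)`, so for `netcoreapp3.1` the second group would likely leave only `INTRINSICS` from these two and drop `FCL_BITOPS`. The fragment below is a minimal sketch of one way to keep both, assuming the intent is for both symbols to be defined on the same targets. The merged single group and the quoted condition operands are suggestions, not what this commit does.

```xml
<!-- Sketch only, not part of the commit: append to the existing value
     instead of replacing it, so earlier symbols are preserved.
     Assumes FCL_BITOPS and INTRINSICS should both apply to these targets. -->
<PropertyGroup Condition="'$(TargetFramework)' == 'netcoreapp3.1' OR '$(TargetFramework)' == 'netcoreapp5.0'">
  <DefineConstants>$(DefineConstants);FCL_BITOPS;INTRINSICS</DefineConstants>
</PropertyGroup>
```

Keeping the two groups separate would also work, as long as each one appends with `$(DefineConstants);` rather than overwriting.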
3 changes: 3 additions & 0 deletions test/NaCl.Core.Benchmarks/ChaCha20Benchmark.cs
@@ -8,8 +8,11 @@
     using Internal;

     using BenchmarkDotNet.Attributes;
+    using BenchmarkDotNet.Jobs;

     [BenchmarkCategory("Stream Cipher")]
+    [SimpleJob(RuntimeMoniker.NetCoreApp21, baseline: true)]
+    [SimpleJob(RuntimeMoniker.NetCoreApp31)]
     [MemoryDiagnoser]
     [RPlotExporter, RankColumn]
     public class ChaCha20Benchmark
4 changes: 2 additions & 2 deletions test/NaCl.Core.Benchmarks/NaCl.Core.Benchmarks.csproj
@@ -1,8 +1,8 @@
-<Project Sdk="Microsoft.NET.Sdk">
+<Project Sdk="Microsoft.NET.Sdk">

   <PropertyGroup>
     <OutputType>Exe</OutputType>
-    <TargetFramework>netcoreapp3.1</TargetFramework>
+    <TargetFrameworks>netcoreapp3.1;netcoreapp2.1</TargetFrameworks>
   </PropertyGroup>

   <ItemGroup>
8 changes: 4 additions & 4 deletions test/NaCl.Core.Benchmarks/Program.cs
@@ -11,11 +11,11 @@ static void Main(string[] args)
             // Execute following code:
             // $ dotnet run -c release --framework netcoreapp3.1
             //BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);
-            BenchmarkRunner.Run<Poly1305Benchmark>();
+            //BenchmarkRunner.Run<Poly1305Benchmark>();
             BenchmarkRunner.Run<ChaCha20Benchmark>();
-            BenchmarkRunner.Run<ChaCha20Poly1305Benchmark>();
-            BenchmarkRunner.Run<XChaCha20Benchmark>();
-            BenchmarkRunner.Run<XChaCha20Poly1305Benchmark>();
+            //BenchmarkRunner.Run<ChaCha20Poly1305Benchmark>();
+            //BenchmarkRunner.Run<XChaCha20Benchmark>();
+            //BenchmarkRunner.Run<XChaCha20Poly1305Benchmark>();

             Console.ReadLine();
         }
2 changes: 1 addition & 1 deletion test/NaCl.Core.Tests/NaCl.Core.Tests.csproj
@@ -1,7 +1,7 @@
 <Project Sdk="Microsoft.NET.Sdk">

   <PropertyGroup>
-    <TargetFramework>netcoreapp3.1</TargetFramework>
+    <TargetFrameworks>netcoreapp3.1;netcoreapp2.1</TargetFrameworks>
     <LangVersion>latest</LangVersion>
     <IsTestProject>true</IsTestProject>
   </PropertyGroup>