Tweak Rune.DecodeFromUtf8 #68799
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.
Tagging subscribers to this area: @dotnet/area-system-text-encoding
Issue Details
It's quite possible this isn't the right test or that other inputs would fare worse; @GrabYourPitchforks, please do let me know if I should be running something different (or if my machine may not be representative).
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Text;

[MemoryDiagnoser]
public class Program
{
    static void Main(string[] args) => BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);

    private byte[] _bytes;

    [Params(0, 1, 2, 3, 4)]
    public int Text { get; set; }

    [GlobalSetup]
    public void Setup()
    {
        string text = Text switch
        {
            // From https://lipsum.com/
            0 => "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.",
            // Below text generated with https://generator.lorem-ipsum.info/
            // Standard Lorem Ipsum
            1 => "In laoreet petentium hendrerit mel, posse perpetua id vim. Te vis esse pertinax. Sed ad laboramus intellegat. Esse nostro ceteros per ut, no idque solet melius usu, amet utinam populo vel no. No omnis moderatius vim.",
            // Cyrillic
            2 => "Лорем ипсум долор сит амет, хис адхуц дицта ат. Ест омнес партем нострум ан, нец цивибус денияуе перципитур ет. Ут анимал праесент цонтентионес вис, сеа хинц минимум ут, симул виртуте хендрерит дуо цу. Ех алияуид денияуе диссентиунт вел. Амет цонсецтетуер ид вел.",
            // Greek
            3 => "Λορεμ ιπσθμ δολορ σιτ αμετ, vελ ατ θνθμ θλλθμ λιβεραvισσε, vιξ σθασ γλοριατθρ αβηορρεαντ αν, σεδ ει ιθvαρετ ιντελλεγεβατ. Εα εστ ιδqθε ιντελλεγατ φορενσιβθσ, ειθσ φαστιδιι δισπθτανδο νε ηασ. Νεc αν αccομμοδαρε cονcλθδατθρqθε, ιγνοτα ριδενσ ινcιδεριντ θσθ αδ, θτ αλτερα αππελλαντθρ θσθ. Vισ θτ διcιτ qθοδσι δεβιτισ. Vελ εα ορατιο περσιθσ, ιδ οπορτεατ cονcλθσιονεμqθε ηασ, qθο ιν τραcτατοσ μνεσαρcηθμ. Πρι εqθιδεμ σθσcιπιτ ιμπερδιετ εα, ναμ vιδιτ λθδθσ νεγλεγεντθρ αν, qθι φαcετε qθαεqθε μεντιτθμ νε. Ηισ qθοτ εροσ νομιναvι ιδ, σθμο μαγνα νο σιτ, vενιαμ αντιοπαμ θτ vιξ.",
            // Chinese Simplified
            _ => "望敗必総減論元進東前世氏謙大。爆秋答染作構天団経処解室速物察書切。失医龍軽係府食君権題半感体。西標活街予女模行改円芸敷流比査岡海。歩読稿京慶王徳提陸情更典安要央汚。題羽次仲抜電伸華識表足前資通者撮存工局。安稿報会市芸喰職細療越変。報横必応設来向現菌米終半獲合代所泄写。右残保奏建週況将連部的際前済次質毎。",
        };
        _bytes = Encoding.UTF8.GetBytes(text);
    }

    [Benchmark]
    public int IterateFromUTF8()
    {
        int aggregate = 0;
        ReadOnlySpan<byte> bytes = _bytes;
        while (!bytes.IsEmpty)
        {
            Rune.DecodeFromUtf8(bytes, out Rune result, out int bytesConsumed);
            aggregate += result.Value;
            bytes = bytes.Slice(bytesConsumed);
        }
        return aggregate;
    }
}
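The benchmark loop above ignores the `OperationStatus` that `Rune.DecodeFromUtf8` returns, which is fine for timing well-formed input. As a sanity check to run alongside the numbers, here is a minimal sketch (not part of this PR; the `DecodeCheck` class and both method names are hypothetical) that sums scalar values the same way while verifying each decode reports `Done`, and compares the result against a reference sum taken from `string.EnumerateRunes()`:

```csharp
using System;
using System.Buffers;
using System.Text;

public static class DecodeCheck
{
    // Hypothetical helper: same aggregation as the benchmark loop, but it
    // throws if the decoder reports anything other than OperationStatus.Done.
    public static int SumScalarsChecked(ReadOnlySpan<byte> utf8)
    {
        int aggregate = 0;
        while (!utf8.IsEmpty)
        {
            OperationStatus status = Rune.DecodeFromUtf8(utf8, out Rune rune, out int bytesConsumed);
            if (status != OperationStatus.Done)
            {
                throw new InvalidOperationException($"Unexpected decode status: {status}");
            }
            aggregate += rune.Value;
            utf8 = utf8.Slice(bytesConsumed);
        }
        return aggregate;
    }

    // Reference sum computed from the UTF-16 string rather than the UTF-8 bytes.
    public static int SumScalarsReference(string text)
    {
        int aggregate = 0;
        foreach (Rune rune in text.EnumerateRunes())
        {
            aggregate += rune.Value;
        }
        return aggregate;
    }
}
```

For any well-formed `text`, `DecodeCheck.SumScalarsChecked(Encoding.UTF8.GetBytes(text))` should equal `DecodeCheck.SumScalarsReference(text)`; a mismatch or an exception would suggest a decoding change altered behavior rather than just speed.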
Thanks! 🚀