Description
There's a function in corelib used for determining whether a char is whitespace:
https://source.dot.net/#System.Private.CoreLib/shared/System/Char.cs,264
private static bool IsWhiteSpaceLatin1(char c)
{
    if ((c == ' ') || (c >= '\x0009' && c <= '\x000d') || c == '\x00a0' || c == '\x0085')
    {
        return (true);
    }
    return (false);
}
The JIT is generating this code for it:
G_M8395_IG01:
G_M8395_IG02:
0FB7C1 movzx rax, cx
83F820 cmp eax, 32
7418 je SHORT G_M8395_IG04
83F809 cmp eax, 9
7C05 jl SHORT G_M8395_IG03
83F80D cmp eax, 13
7E0E jle SHORT G_M8395_IG04
G_M8395_IG03:
3DA0000000 cmp eax, 160
7407 je SHORT G_M8395_IG04
3D85000000 cmp eax, 133
7506 jne SHORT G_M8395_IG06
G_M8395_IG04:
B801000000 mov eax, 1
G_M8395_IG05:
C3 ret
G_M8395_IG06:
33C0 xor eax, eax
G_M8395_IG07:
C3 ret
When I change it to return the condition directly:
private static bool IsWhiteSpaceLatin1(char c)
{
    return ((c == ' ') || (c >= '\x0009' && c <= '\x000d') || c == '\x00a0' || c == '\x0085');
}
the JIT instead generates:
G_M8395_IG01:
G_M8395_IG02:
0FB7C1 movzx rax, cx
83F820 cmp eax, 32
741D je SHORT G_M8395_IG05
83F809 cmp eax, 9
7C05 jl SHORT G_M8395_IG03
83F80D cmp eax, 13
7E13 jle SHORT G_M8395_IG05
G_M8395_IG03:
3DA0000000 cmp eax, 160
740C je SHORT G_M8395_IG05
3D85000000 cmp eax, 133
0F94C0 sete al
0FB6C0 movzx rax, al
G_M8395_IG04:
C3 ret
G_M8395_IG05:
B801000000 mov eax, 1
G_M8395_IG06:
C3 ret
I'd have expected (maybe naively?) these to generate the same asm. Is it expected that they result in different asm? At least on my machine, the change results in a measurable throughput difference for Char.IsWhiteSpace (which calls Char.IsWhiteSpaceLatin1): roughly a 20% improvement when the char isn't whitespace.
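For anyone who wants to try this locally, here is a rough sketch of a comparison harness. It is not the benchmark behind the numbers above. The class name WhiteSpaceShapes, the two method names, the iteration count, and the probe characters are all made up for illustration. It first checks that the two source shapes agree for every char value, then times each with a plain Stopwatch loop over non-whitespace input. Results from a loop like this are noisy and depend on inlining and tiering decisions, so a proper harness such as BenchmarkDotNet would likely be more trustworthy.

```csharp
using System;
using System.Diagnostics;

public static class WhiteSpaceShapes
{
    // Hypothetical copy of the original shape: if + explicit true/false returns.
    public static bool IsWhiteSpaceLatin1Branchy(char c)
    {
        if ((c == ' ') || (c >= '\x0009' && c <= '\x000d') || c == '\x00a0' || c == '\x0085')
        {
            return true;
        }
        return false;
    }

    // Hypothetical copy of the rewritten shape: return the condition directly.
    public static bool IsWhiteSpaceLatin1Direct(char c)
    {
        return (c == ' ') || (c >= '\x0009' && c <= '\x000d') || c == '\x00a0' || c == '\x0085';
    }

    // Counts the char values for which the two shapes disagree.
    public static int CountMismatches()
    {
        int mismatches = 0;
        for (int i = 0; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            if (IsWhiteSpaceLatin1Branchy(c) != IsWhiteSpaceLatin1Direct(c))
            {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void Main()
    {
        Console.WriteLine($"Mismatches across all chars: {CountMismatches()}");

        const int Iterations = 100_000_000; // arbitrary; adjust for your machine
        int hits = 0;                        // consumed below so the calls are not dead code

        // Probe with 'a'..'p', none of which are whitespace, since that is
        // the case where the difference was observed.
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            if (IsWhiteSpaceLatin1Branchy((char)('a' + (i & 15)))) hits++;
        }
        sw.Stop();
        Console.WriteLine($"Branchy: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        for (int i = 0; i < Iterations; i++)
        {
            if (IsWhiteSpaceLatin1Direct((char)('a' + (i & 15)))) hits++;
        }
        sw.Stop();
        Console.WriteLine($"Direct:  {sw.ElapsedMilliseconds} ms (hits: {hits})");
    }
}
```

Build in Release mode before comparing timings. The mismatch check is only there to confirm that the two shapes are semantically identical, which is why the differing codegen is surprising.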
category:cq
theme:basic-cq
skill-level:expert
cost:medium