Different codegen for return cond; vs if (cond) return true; return false; #8363

Closed
@stephentoub

There's a function in corelib used for determining whether a char is whitespace:
https://source.dot.net/#System.Private.CoreLib/shared/System/Char.cs,264

        private static bool IsWhiteSpaceLatin1(char c)
        {
            if ((c == ' ') || (c >= '\x0009' && c <= '\x000d') || c == '\x00a0' || c == '\x0085')
            {
                return (true);
            }
            return (false);
        }

The JIT is generating this code for it:

G_M8395_IG01:

G_M8395_IG02:
       0FB7C1               movzx    rax, cx
       83F820               cmp      eax, 32
       7418                 je       SHORT G_M8395_IG04
       83F809               cmp      eax, 9
       7C05                 jl       SHORT G_M8395_IG03
       83F80D               cmp      eax, 13
       7E0E                 jle      SHORT G_M8395_IG04

G_M8395_IG03:
       3DA0000000           cmp      eax, 160
       7407                 je       SHORT G_M8395_IG04
       3D85000000           cmp      eax, 133
       7506                 jne      SHORT G_M8395_IG06

G_M8395_IG04:
       B801000000           mov      eax, 1

G_M8395_IG05:
       C3                   ret

G_M8395_IG06:
       33C0                 xor      eax, eax

G_M8395_IG07:
       C3                   ret

When I change it to instead be:

        private static bool IsWhiteSpaceLatin1(char c)
        {
            return ((c == ' ') || (c >= '\x0009' && c <= '\x000d') || c == '\x00a0' || c == '\x0085');
        }

the JIT instead generates:

G_M8395_IG01:

G_M8395_IG02:
       0FB7C1               movzx    rax, cx
       83F820               cmp      eax, 32
       741D                 je       SHORT G_M8395_IG05
       83F809               cmp      eax, 9
       7C05                 jl       SHORT G_M8395_IG03
       83F80D               cmp      eax, 13
       7E13                 jle      SHORT G_M8395_IG05

G_M8395_IG03:
       3DA0000000           cmp      eax, 160
       740C                 je       SHORT G_M8395_IG05
       3D85000000           cmp      eax, 133
       0F94C0               sete     al
       0FB6C0               movzx    rax, al

G_M8395_IG04:
       C3                   ret

G_M8395_IG05:
       B801000000           mov      eax, 1

G_M8395_IG06:
       C3                   ret

I'd have expected (maybe naively?) these to generate the same asm. Is it expected that they result in different asm? At least on my machine, the change results in a measurable throughput difference for Char.IsWhiteSpace (which calls Char.IsWhiteSpaceLatin1): roughly a 20% improvement when the char isn't whitespace.
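
For anyone wanting to reproduce this, here is a minimal sketch that puts the two shapes side by side and confirms they agree on every possible char, so any difference really is codegen only. The class and method names (WhiteSpaceVariants, IfReturn, ReturnCond, CountMismatches) are made up for illustration and are not from corelib.

```csharp
using System;

public static class WhiteSpaceVariants
{
    // Shape 1 from the issue: "if (cond) return true; return false;".
    public static bool IfReturn(char c)
    {
        if ((c == ' ') || (c >= '\x0009' && c <= '\x000d') || c == '\x00a0' || c == '\x0085')
        {
            return true;
        }
        return false;
    }

    // Shape 2 from the issue: "return cond;".
    public static bool ReturnCond(char c)
    {
        return (c == ' ') || (c >= '\x0009' && c <= '\x000d') || c == '\x00a0' || c == '\x0085';
    }

    // Counts the char values on which the two shapes disagree.
    // The loop variable is an int so that it can pass char.MaxValue and terminate.
    public static int CountMismatches()
    {
        int mismatches = 0;
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            if (IfReturn((char)i) != ReturnCond((char)i))
            {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void Main()
    {
        Console.WriteLine(CountMismatches()); // prints 0: the two shapes are logically identical
    }
}
```

This only checks behavior. To compare throughput, each variant would need to be timed in a proper benchmark harness with a non-whitespace input, since a hand-rolled loop can easily be dominated by call overhead or inlining decisions.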

category:cq
theme:basic-cq
skill-level:expert
cost:medium

Labels: JitUntriaged, area-CodeGen-coreclr, enhancement, in-pr, optimization, tenet-performance