Description
Background and motivation
It turns out it's quite common to want to search for whitespace (char.IsWhiteSpace) or things other than whitespace (!char.IsWhiteSpace). This is true not only in regex (\s and \S; according to our NuGet regex database there are ~13,000 occurrences of a regex that's simply \s) but also in many open-coded loops, e.g.
- https://github.com/dotnet/sdk/blob/4a27d47440dd3eaefbdb31135ef8867f6758161f/src/Tasks/Microsoft.NET.Build.Tasks/LockFileExtensions.cs#L127-L138
- https://github.com/JimmyCushnie/JimmysUnityUtilities/blob/834059548b2b392d692ddcf28194692e3ae7b2c1/Scripts/Extensions/Csharp%20types/StringBuilderExtensions.cs#L68-L77
- https://github.com/cake-build/cake/blob/a0298c0b5f76f819f0cc0d16ac9ef55d8b26adf9/src/Cake.Core/Configuration/Parser/ConfigurationParser.cs#L107-L117
- https://github.com/rubberduck-vba/Rubberduck/blob/3a9b233cf6ab519773d188e77d09ee8d8111bf49/Rubberduck.Core/UI/Refactorings/AnnotateDeclaration/AnnotationArgumentViewModel.cs#L195-L198
- https://github.com/aspnet/Razor/blob/5439cfe540084edd673b7ed626f2ec9cf3f13b18/src/Microsoft.AspNetCore.Razor.Language/DirectiveTokenEditHandler.cs#L36-L47
- https://github.com/OmniSharp/omnisharp-roslyn/blob/3ae5c8acd7ea3f03ab9e24c28280a320c573721a/src/OmniSharp.Cake/Configuration/Parser/ConfigurationParser.cs#L94-L104
- https://github.com/Unity-Technologies/UnityCsReference/blob/332310b494c5416cdae6c1209dbae7cfa6847c8d/Editor/Mono/Scripting/Compilers/MicrosoftResponseFileParser.cs#L195-L202
- https://github.com/PowerShell/PowerShell/blob/a2ee05400f8cb4a44cd87742f95ebc2c3472e649/src/System.Management.Automation/engine/parser/DebugViewWriter.cs#L1197-L1205
- https://github.com/mono/mono/blob/e2c5f4b0ad1a6b21ca0735f0b35b8611d4ad87b3/mcs/class/referencesource/System.Core/Microsoft/Scripting/Ast/DebugViewWriter.cs#L1155-L1162
- https://github.com/stripe/stripe-dotnet/blob/42dbc8371c5a4ee36df8933e6d72c2c2e3e41d2e/src/Stripe.net/Infrastructure/StringUtils.cs#L9
- https://github.com/mono/mono/blob/e2c5f4b0ad1a6b21ca0735f0b35b8611d4ad87b3/mcs/class/referencesource/System.Web/UI/Util.cs#L995-L1002
- https://github.com/VahidN/EFSecondLevelCache.Core/blob/1de038417ba22c40d9ebe411b67c9e1a7e4ad838/src/EFSecondLevelCache.Core/EFQueryExpressionVisitor.cs#L883-L893
- https://github.com/InstaSharp/InstaSharp/blob/7ab2aad6bdef175dd63620bf39f74fcf02696898/src/InstaSharp/Extensions/StringExtensions.cs#L13-L18
- https://github.com/FirelyTeam/Fhir.Metrics/blob/dd574b76077280299fd7104c754481a2e143ca72/src/Fhir.Metrics/Utils/Parser.cs#L57
- https://github.com/baohaojun/beagrep/blob/b1d56ef14d1d663d43b6af198600caa21623d2f2/Util/StringFu.cs#L388-L394
Etc. We should expose these as dedicated helpers, whether or not we're able to improve performance over a simple loop (we might be able to, for at least some kinds of input).
API Proposal
namespace System;
public static class MemoryExtensions
{
+ public static int IndexOfAnyWhiteSpace(this ReadOnlySpan<char> span);
+ public static int IndexOfAnyExceptWhiteSpace(this ReadOnlySpan<char> span);
+ public static int LastIndexOfAnyWhiteSpace(this ReadOnlySpan<char> span);
+ public static int LastIndexOfAnyExceptWhiteSpace(this ReadOnlySpan<char> span);
}
- This is only proposed for ReadOnlySpan<char> and not also Span<char>, since the most common case by far is expected to be spans derived from strings. The existing MemoryExtensions.IsWhiteSpace is also only exposed for ReadOnlySpan<char>.
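For reference, the semantics of the four proposed methods can be sketched as plain loops over char.IsWhiteSpace. This is only an illustration of the expected behavior, not the proposed implementation: the class name WhiteSpaceSearchSketch is hypothetical (the real methods would live on MemoryExtensions), and an actual implementation might be vectorized for some inputs.

```csharp
using System;

// Hypothetical holder class for illustration only; the proposal places
// these methods on System.MemoryExtensions.
public static class WhiteSpaceSearchSketch
{
    // Index of the first char for which char.IsWhiteSpace is true, or -1.
    public static int IndexOfAnyWhiteSpace(this ReadOnlySpan<char> span)
    {
        for (int i = 0; i < span.Length; i++)
        {
            if (char.IsWhiteSpace(span[i])) return i;
        }
        return -1;
    }

    // Index of the first char for which char.IsWhiteSpace is false, or -1.
    public static int IndexOfAnyExceptWhiteSpace(this ReadOnlySpan<char> span)
    {
        for (int i = 0; i < span.Length; i++)
        {
            if (!char.IsWhiteSpace(span[i])) return i;
        }
        return -1;
    }

    // Index of the last char for which char.IsWhiteSpace is true, or -1.
    public static int LastIndexOfAnyWhiteSpace(this ReadOnlySpan<char> span)
    {
        for (int i = span.Length - 1; i >= 0; i--)
        {
            if (char.IsWhiteSpace(span[i])) return i;
        }
        return -1;
    }

    // Index of the last char for which char.IsWhiteSpace is false, or -1.
    public static int LastIndexOfAnyExceptWhiteSpace(this ReadOnlySpan<char> span)
    {
        for (int i = span.Length - 1; i >= 0; i--)
        {
            if (!char.IsWhiteSpace(span[i])) return i;
        }
        return -1;
    }
}
```

All four return -1 for an empty span, matching the convention of the existing IndexOf family.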
API Usage
e.g. MemoryExtensions.IsWhiteSpace could be rewritten as simply:
public static bool IsWhiteSpace(this ReadOnlySpan<char> span) => span.IndexOfAnyExceptWhiteSpace() < 0;
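Many of the open-coded loops linked above skip leading whitespace and then read up to the next whitespace character. A rough sketch of how such code might look with the proposed helpers follows. TokenSketch and FirstToken are hypothetical names, and the two private methods are loop-based stand-ins for the proposed APIs, included only so the example compiles on its own.

```csharp
using System;

// Hypothetical example type; not part of the proposal.
public static class TokenSketch
{
    // Stand-in for the proposed MemoryExtensions.IndexOfAnyWhiteSpace.
    private static int IndexOfAnyWhiteSpace(ReadOnlySpan<char> span)
    {
        for (int i = 0; i < span.Length; i++)
        {
            if (char.IsWhiteSpace(span[i])) return i;
        }
        return -1;
    }

    // Stand-in for the proposed MemoryExtensions.IndexOfAnyExceptWhiteSpace.
    private static int IndexOfAnyExceptWhiteSpace(ReadOnlySpan<char> span)
    {
        for (int i = 0; i < span.Length; i++)
        {
            if (!char.IsWhiteSpace(span[i])) return i;
        }
        return -1;
    }

    // Returns the first whitespace-delimited token, or an empty span
    // if the input is empty or entirely whitespace.
    public static ReadOnlySpan<char> FirstToken(ReadOnlySpan<char> input)
    {
        int start = IndexOfAnyExceptWhiteSpace(input);
        if (start < 0) return ReadOnlySpan<char>.Empty;

        ReadOnlySpan<char> rest = input.Slice(start);
        int end = IndexOfAnyWhiteSpace(rest);
        return end < 0 ? rest : rest.Slice(0, end);
    }
}
```

With the proposed APIs in place, the two stand-ins would disappear and FirstToken would reduce to two helper calls and two slices.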
Alternative Designs
If we want to expose these but don't want them to be so prominent, once #68328 is implemented (assuming it sticks with the proposed design), this could instead be exposed as a static property on IndexOfAnyValues:
public static class IndexOfAnyValues
{
+ public static IndexOfAnyValues<char> WhiteSpace { get; }
}
in which case the same functionality could be achieved with:
int wsIndex = span.IndexOfAny(IndexOfAnyValues.WhiteSpace); // or IndexOfAnyExcept
The WhiteSpace property would cache a specialized concrete implementation that does what the proposed IndexOfAnyWhiteSpace would do.
Risks
No response