Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removed unnecessary Regex constructors for #353 #354

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
43 changes: 21 additions & 22 deletions src/Humanizer/StringHumanizeExtensions.cs
Original file line number Diff line number Diff line change
Expand Up @@ -9,37 +9,37 @@ namespace Humanizer
/// </summary>
public static class StringHumanizeExtensions
{
private static readonly Regex PascalCaseWordPartsRegex;
private static readonly Regex FreestandingSpacingCharRegex;

static StringHumanizeExtensions()
{
PascalCaseWordPartsRegex = new Regex(@"[A-Z]?[a-z]+|[0-9]+|[A-Z]+(?=[A-Z][a-z]|[0-9]|\b)",
RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be RegexOptions.Compiled as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but that option is not supported in portable class libraries, so we're stuck with interpreted expressions. This could be worked around by removing the regex object entirely and "pre-compiling" the expression to a function ourselves, but I'm not sure the added complexity would be worth the performance. Perhaps I'll give it a try, just to see what benefit there is.

FreestandingSpacingCharRegex = new Regex(@"\s[-_]|[-_]\s");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

}

static string FromUnderscoreDashSeparatedWords (string input)
{
return String.Join(" ", input.Split(new[] {'_', '-'}));
}

static string FromPascalCase(string input)
{
var pascalCaseWordBoundaryRegex = new Regex(@"
(?# word to word, number or acronym)
(?<=[a-z])(?=[A-Z0-9])|
(?# number to word or acronym)
(?<=[0-9])(?=[A-Za-z])|
(?# acronym to number)
(?<=[A-Z])(?=[0-9])|
(?# acronym to word)
(?<=[A-Z])(?=[A-Z][a-z])|
(?# words/acronyms/numbers separated by space)
(?<=[^\s])(?=[\s])
", RegexOptions.IgnorePatternWhitespace);
if (input.Length == 1)
return Char.ToUpper(input[0]).ToString();

var result = pascalCaseWordBoundaryRegex
.Split(input)
.Select(word =>
word.Trim().ToCharArray().All(Char.IsUpper) && word.Length > 1
? word
: word.ToLower())
.Aggregate((res, word) => res + " " + word.Trim());
var result = PascalCaseWordPartsRegex
.Matches(input).Cast<Match>()
.Select(match => match.Value.ToCharArray().All(Char.IsUpper) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.ToCharArray() is unnecessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

(match.Value.Length > 1 || (match.Index > 0 && input[match.Index - 1] == ' ') || match.Value == "I")
? match.Value
: match.Value.ToLower())
.Aggregate((res, word) => res + " " + word);

result = Char.ToUpper(result[0]) +
result.Substring(1, result.Length - 1);
return result.Replace(" i ", " I "); // I is an exception
return result;
}

/// <summary>
Expand All @@ -55,8 +55,7 @@ public static string Humanize(this string input)

// if input contains a dash or underscore which preceeds or follows a space (or both, i.g. free-standing)
// remove the dash/underscore and run it through FromPascalCase
Regex r = new Regex(@"[\s]{1}[-_]|[-_][\s]{1}", RegexOptions.IgnoreCase);
if (r.IsMatch(input))
if (FreestandingSpacingCharRegex.IsMatch(input))
return FromPascalCase(FromUnderscoreDashSeparatedWords(input));

if (input.Contains("_") || input.Contains("-"))
Expand Down
4 changes: 2 additions & 2 deletions src/Humanizer/Transformer/ToTitleCase.cs
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Linq;

namespace Humanizer
{
Expand All @@ -25,7 +25,7 @@ public string Transform(string input)

static bool AllCapitals(string input)
{
return Regex.IsMatch(input, @"^[A-Z]+$");
return input.ToCharArray().All(Char.IsUpper);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.ToCharArray() is unnecessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this is a limitation with the portable class library framework. String doesn't implement IEnumerable<char> so we need that call. Also, it's consistent with everywhere else in the library that checks for all caps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok... thanks! Shame on me...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries! I thought the same thing, too. I knew RegexOptions.Compiled wasn't available, but this one took me for surprise.

}
}
}