- 
                Notifications
    You must be signed in to change notification settings 
- Fork 15k
[llvm][mustache] Optimize accessor splitting with a single pass #159198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: users/ilovepi/mustache-accessor-opt
Are you sure you want to change the base?
[llvm][mustache] Optimize accessor splitting with a single pass #159198
Conversation
| @llvm/pr-subscribers-llvm-support Author: Paul Kirth (ilovepi) ChangesThe splitMustacheString function previously used a loop of This change introduces a custom splitAndTrim function that 
 Full diff: https://github.com/llvm/llvm-project/pull/159198.diff 1 Files Affected: 
 diff --git a/llvm/lib/Support/Mustache.cpp b/llvm/lib/Support/Mustache.cpp
index fcb55c4edd815..8139e9eb4717f 100644
--- a/llvm/lib/Support/Mustache.cpp
+++ b/llvm/lib/Support/Mustache.cpp
@@ -35,6 +35,32 @@ static bool isContextFalsey(const json::Value *V) {
   return isFalsey(*V);
 }
 
+static void splitAndTrim(StringRef Str, SmallVectorImpl<StringRef> &Tokens) {
+  size_t CurrentPos = 0;
+  while (CurrentPos < Str.size()) {
+    // Find the next delimiter.
+    size_t DelimiterPos = Str.find('.', CurrentPos);
+
+    // If no delimiter is found, process the rest of the string.
+    if (DelimiterPos == StringRef::npos) {
+      DelimiterPos = Str.size();
+    }
+
+    // Get the current part, which may have whitespace.
+    StringRef Part = Str.slice(CurrentPos, DelimiterPos);
+
+    // Manually trim the part without creating a new string object.
+    size_t Start = Part.find_first_not_of(" \t\r\n");
+    if (Start != StringRef::npos) {
+      size_t End = Part.find_last_not_of(" \t\r\n");
+      Tokens.push_back(Part.slice(Start, End + 1));
+    }
+
+    // Move past the delimiter for the next iteration.
+    CurrentPos = DelimiterPos + 1;
+  }
+}
+
 static Accessor splitMustacheString(StringRef Str, MustacheContext &Ctx) {
   // We split the mustache string into an accessor.
   // For example:
@@ -47,13 +73,7 @@ static Accessor splitMustacheString(StringRef Str, MustacheContext &Ctx) {
     // It's a literal, so it doesn't need to be saved.
     Tokens.push_back(".");
   } else {
-    while (!Str.empty()) {
-      StringRef Part;
-      std::tie(Part, Str) = Str.split('.');
-      // Each part of the accessor needs to be saved to the arena
-      // to ensure it has a stable address.
-      Tokens.push_back(Part.trim());
-    }
+    splitAndTrim(Str, Tokens);
   }
   // Now, allocate memory for the array of StringRefs in the arena.
   StringRef *ArenaTokens = Ctx.Allocator.Allocate<StringRef>(Tokens.size());
 | 
410fdc8    to
    f5b3210      
    Compare
  
    b68de04    to
    1acd655      
    Compare
  
    2de3866    to
    c6e7926      
    Compare
  
    bd084ea    to
    735490e      
    Compare
  
    28deddc    to
    ca0de5e      
    Compare
  
    cb29414    to
    f36f86e      
    Compare
  
    919e389    to
    943c634      
    Compare
  
    8401695    to
    df30efc      
    Compare
  
    0cd6777    to
    d5cdf06      
    Compare
  
    1e2990f    to
    c6534b6      
    Compare
  
    d5cdf06    to
    66ea253      
    Compare
  
    c6534b6    to
    bfe51da      
    Compare
  
    66ea253    to
    520cdc1      
    Compare
  
    bfe51da    to
    0230277      
    Compare
  
    520cdc1    to
    37926e7      
    Compare
  
    0230277    to
    d6ce86b      
    Compare
  
    7adaaa3    to
    9afa122      
    Compare
  
    d6ce86b    to
    ee52188      
    Compare
  
    The splitMustacheString function previously used a loop of StringRef::split and StringRef::trim. This was inefficient as it scanned each segment of the accessor string multiple times. This change introduces a custom splitAndTrim function that performs both operations in a single pass over the string, reducing redundant work and improving performance, most notably in the number of CPU cycles executed. Metric | Baseline | Optimized | Change -------------- | -------- | --------- | ------- Time (ms) | 35.57 | 35.36 | -0.59% Cycles | 34.91M | 34.26M | -1.86% Instructions | 85.54M | 85.24M | -0.35% Branch Misses | 111.9K | 112.2K | +0.27% Cache Misses | 242.1K | 239.9K | -0.91%
ee52188    to
    9808298      
    Compare
  
    9afa122    to
    7d30130      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM w/ nit
| size_t DelimiterPos = Str.find('.', CurrentPos); | ||
|  | ||
| // If no delimiter is found, process the rest of the string. | ||
| if (DelimiterPos == StringRef::npos) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can omit the curly braces here

The splitMustacheString function previously used a loop of
StringRef::split and StringRef::trim. This was inefficient as
it scanned each segment of the accessor string multiple times.
This change introduces a custom splitAndTrim function that
performs both operations in a single pass over the string,
reducing redundant work and improving performance, most notably
in the number of CPU cycles executed.