Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce amount of expression objects created during evaluations #15552

Merged
merged 15 commits into from
Dec 15, 2023

Conversation

kgyrtkirk
Copy link
Member

@kgyrtkirk kgyrtkirk commented Dec 13, 2023

I was looking into a query which was performing a bit poorly because the case_searched was touching more than 1 columns (if there is only 1 column there is a cache based evaluator).
While I was doing that I've noticed that there are a few simple things which could help a bit:

  • use a static TRUE/FALSE instead of creating a new object every time
  • create the ExprEval early for ConstantExpr -s (except the one for BigInteger which seem to have some odd contract)
  • return early from type autodetection

these changes mostly reduce the amount of garbage the query creates during case_searched evaluation; although ExpressionSelectorBenchmark shows some improvements ~15% - but my manual trials on the taxi dataset with 60M rows showed more improvements - probably due to the fact that these changes mostly only reduce gc pressure.

q1.3
SELECT 
  case when 
    LOOKUP("dropoff", 'look') = 'x' or 
  "max_temperature" = '125711'  or min_temperature = '11' or min_temperature = '112' or min_temperature = '1212' or min_temperature = '32' then min_temperature end,
  sum("total_amount"),
  count(1)
FROM "trips_xaa"
group by 1

execution time improved from 6.69s to 4.08s

ExpressionSelectorBenchmark results
 Benchmark                                                  (rowsPerSegment)  Mode  Cnt    Score   Error  Units
-ExpressionSelectorBenchmark.arithmeticOnLong                        1000000  avgt   10    8.371 ±  0.100  ms/op
+ExpressionSelectorBenchmark.arithmeticOnLong                        1000000  avgt   10    8.254 ± 0.041  ms/op
-ExpressionSelectorBenchmark.caseSearched1                           1000000  avgt   10    9.612 ±  0.581  ms/op
+ExpressionSelectorBenchmark.caseSearched1                           1000000  avgt   10    9.197 ± 0.033  ms/op
-ExpressionSelectorBenchmark.caseSearched2                           1000000  avgt   10  275.416 ± 15.150  ms/op
+ExpressionSelectorBenchmark.caseSearched2                           1000000  avgt   10  233.649 ± 4.713  ms/op
-ExpressionSelectorBenchmark.caseSearchedWithLookup                  1000000  avgt   10  125.643 ±  6.343  ms/op
+ExpressionSelectorBenchmark.caseSearchedWithLookup                  1000000  avgt   10  104.813 ± 1.242  ms/op
-ExpressionSelectorBenchmark.caseSearchedWithLookup2                 1000000  avgt   10  132.319 ± 10.260  ms/op
+ExpressionSelectorBenchmark.caseSearchedWithLookup2                 1000000  avgt   10  116.626 ± 2.644  ms/op
-ExpressionSelectorBenchmark.stringConcatAndCompareOnLong            1000000  avgt   10    8.355 ±  0.078  ms/op
+ExpressionSelectorBenchmark.stringConcatAndCompareOnLong            1000000  avgt   10    8.429 ± 0.119  ms/op
-ExpressionSelectorBenchmark.strlenUsingExpressionAsLong             1000000  avgt   10    9.236 ±  0.205  ms/op
+ExpressionSelectorBenchmark.strlenUsingExpressionAsLong             1000000  avgt   10    9.110 ± 0.084  ms/op
-ExpressionSelectorBenchmark.strlenUsingExpressionAsString           1000000  avgt   10    6.457 ±  0.129  ms/op
+ExpressionSelectorBenchmark.strlenUsingExpressionAsString           1000000  avgt   10    6.389 ± 0.077  ms/op
-ExpressionSelectorBenchmark.strlenUsingExtractionFn                 1000000  avgt   10    3.499 ±  0.126  ms/op
+ExpressionSelectorBenchmark.strlenUsingExtractionFn                 1000000  avgt   10    3.466 ± 0.041  ms/op
-ExpressionSelectorBenchmark.timeFloorUsingCursor                    1000000  avgt   10    8.149 ±  0.035  ms/op
+ExpressionSelectorBenchmark.timeFloorUsingCursor                    1000000  avgt   10    8.117 ± 0.052  ms/op
-ExpressionSelectorBenchmark.timeFloorUsingExpression                1000000  avgt   10    9.051 ±  0.106  ms/op
+ExpressionSelectorBenchmark.timeFloorUsingExpression                1000000  avgt   10    8.928 ± 0.058  ms/op
-ExpressionSelectorBenchmark.timeFloorUsingExtractionFn              1000000  avgt   10    7.765 ±  0.167  ms/op
+ExpressionSelectorBenchmark.timeFloorUsingExtractionFn              1000000  avgt   10    7.713 ± 0.125  ms/op
-ExpressionSelectorBenchmark.timeFormatUsingExpression               1000000  avgt   10   10.423 ±  0.158  ms/op
+ExpressionSelectorBenchmark.timeFormatUsingExpression               1000000  avgt   10    9.779 ± 0.167  ms/op
-ExpressionSelectorBenchmark.timeFormatUsingExtractionFn             1000000  avgt   10    7.752 ±  0.134  ms/op
+ExpressionSelectorBenchmark.timeFormatUsingExtractionFn             1000000  avgt   10    7.773 ± 0.103  ms/op

Benchmark                                                  (rowsPerSegment)  Mode  Cnt    Score    Error  Units
ExpressionSelectorBenchmark.arithmeticOnLong                        1000000  avgt    5   83.731 ±  0.979  ms/op
ExpressionSelectorBenchmark.strlenUsingExpressionAsLong             1000000  avgt    5   93.103 ± 15.030  ms/op
ExpressionSelectorBenchmark.strlenUsingExpressionAsString           1000000  avgt    5   66.854 ±  0.532  ms/op
ExpressionSelectorBenchmark.strlenUsingExtractionFn                 1000000  avgt    5   32.302 ±  0.150  ms/op
ExpressionSelectorBenchmark.timeFloorUsingCursor                    1000000  avgt    5   83.572 ±  0.092  ms/op
ExpressionSelectorBenchmark.timeFloorUsingExpression                1000000  avgt    5   91.505 ±  0.646  ms/op
ExpressionSelectorBenchmark.timeFloorUsingExtractionFn              1000000  avgt    5   77.279 ±  0.757  ms/op
ExpressionSelectorBenchmark.timeFormatUsingExpression               1000000  avgt    5  100.554 ±  1.365  ms/op
ExpressionSelectorBenchmark.timeFormatUsingExtractionFn             1000000  avgt    5   77.211 ±  0.555  ms/op
Benchmark                                           (rowsPerSegment)  Mode  Cnt    Score   Error  Units
ExpressionSelectorBenchmark.caseSearched1                    1000000  avgt   10    9.264 ± 0.115  ms/op
ExpressionSelectorBenchmark.caseSearched2                    1000000  avgt   10  272.054 ± 9.239  ms/op
ExpressionSelectorBenchmark.caseSearchedWithLookup           1000000  avgt   10  307.767 ± 5.537  ms/op
Benchmark                                           (rowsPerSegment)  Mode  Cnt    Score   Error  Units
ExpressionSelectorBenchmark.caseSearched1                    1000000  avgt   10    9.243 ± 0.030  ms/op
ExpressionSelectorBenchmark.caseSearched2                    1000000  avgt   10  215.796 ± 1.799  ms/op
ExpressionSelectorBenchmark.caseSearchedWithLookup           1000000  avgt   10  285.949 ± 8.445  ms/op
@kgyrtkirk kgyrtkirk marked this pull request as ready for review December 13, 2023 14:59
Copy link
Contributor

@soumyava soumyava left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM minor nits

@@ -464,15 +473,18 @@ public String toString()

class ComplexExpr extends ConstantExpr<Object>
{
final ExprEval expr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious since we are evaluating this early, does the equals and hash method need to honor this ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not included in equals / hashCode as its a computed value from the value constructor argument;

as a matter of fact... ConstantExpr also has an outputType field which is also ignored in the equals / hashCode stuff - as that is passed in as a constructor parameter - actually this could probably be refactored to remove that field; however I'm not sure what benefits that would give...

the other way around could be to build the ContantExpr on top of an ExprEval - and use ExprEval.value() to get the inner value - but that approach would not like how the current BigIntegerExpr works...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how often Expr are used in equals/hashcode, my gut says not often since the thing that makes them also usually has the original string which was parsed into Expr and is likely cheaper to just use the string

@@ -464,15 +473,18 @@ public String toString()

class ComplexExpr extends ConstantExpr<Object>
{
final ExprEval expr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how often Expr are used in equals/hashcode, my gut says not often since the thing that makes them also usually has the original string which was parsed into Expr and is likely cheaper to just use the string

@@ -55,6 +55,9 @@ public static ExpressionType autoDetect(ExprEval eval, ExprEval otherEval)
{
ExpressionType type = eval.type();
ExpressionType otherType = otherEval.type();
if (type == otherType && eval.value() != null && otherEval.value() != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this need more than type == otherType? Later on we do

    type = eval.value() != null ? type : otherType;
    otherType = otherEval.value() != null ? otherType : type;

so if the value is null then it just takes on the other type, so i think being null or not here doesn't matter, because if the value is null and the same type then it is fine

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure - I was too cautious...if it will be null ; the type will be String as per the comment above the method - so it will be only == if the other is also a String

In this mode, null values are {@link ExprType#STRING} typed, despite
potentially coming from an underlying numeric column, or when an underlying column was completely missing and so all values are null

regarding those lines:
if eval.value() == null ; after the 1st line type = otherType ; independently from the the 2nd line that will mean that effectively numeric(otherType, otherType) will be called...

if(eval.value() == null) {
  return otherType;
}
if(otherEval.value() == null) {
  return type;
}
return numeric(type, otherType);

which made me wonder... can ExpressionType.getType() == COMPLEX for this method? if it could this change could alter its behaviour - as earlier DOUBLE was returned in that case

to avoid changing that behaviour - for now I've changed it to:

    if (type == otherType && type.getType().isPrimitive()) {
      return type;
    }

should these pass / fail / or shouldn't exists at all?

    final ExprEval<?> complexEval = ExprEval.ofComplex(ExpressionType.UNKNOWN_COMPLEX, new Object());
    Assert.assertEquals(ExpressionType.DOUBLE, ExpressionTypeConversion.autoDetect(longEval, complexEval));
    Assert.assertEquals(ExpressionType.DOUBLE, ExpressionTypeConversion.autoDetect(doubleEval, complexEval));
    Assert.assertEquals(ExpressionType.DOUBLE, ExpressionTypeConversion.autoDetect(arrayEval, complexEval));
    Assert.assertEquals(ExpressionType.DOUBLE, ExpressionTypeConversion.autoDetect(complexEval, complexEval));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question, so this function predates expressions supporting complex types, and shouldn't be used by anything that handles arrays or complex types.

additionally, the comment about nulls always being strings isn't quite as true as it once was, null values coming from columns use the correctly typed null constants, so its mostly null constants and nulls created without additional type information (bestEffortOf) which are string types.

I guess autoDetect should be an invalid exception if it is trying to process complex types, but autoDetect is also only used inside of eval methods as far as I can tell, Expr.getOutputType all use other methods in this class like function and operator which do handle complex and array types and ensure that they are compatible, so I'm not sure it proves to be much of a problem in practice.

Copy link
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure the isPrimitive check is needed, thinking about since this method is only used inside of eval and so will be called for every row the cheaper it can be the better, but its probably fine too if want to be conservative about the change

@kgyrtkirk
Copy link
Member Author

I wanted to remove the isPrimitive check; but then put it back - as I think the following test should pass:

    final ExprEval<?> complexEval = ExprEval.ofComplex(ExpressionType.UNKNOWN_COMPLEX, new Object());
    final ExprEval<?> complexEval2 = ExprEval.ofComplex(new ExpressionType(ExprType.COMPLEX, null, null), new Object());
    Assert.assertEquals(
        ExpressionTypeConversion.autoDetect(complexEval, complexEval),
        ExpressionTypeConversion.autoDetect(complexEval2, complexEval)
    );

DOUBLE is bad enough that if it starts returning that for COMPLEX it will be painfull enough to fix it...having at least consistent behaviuor will help to fix it correctly

@clintropolis
Copy link
Member

clintropolis commented Dec 14, 2023

DOUBLE is bad enough that if it starts returning that for COMPLEX it will be painfull enough to fix it...having at least consistent behaviuor will help to fix it correctly

If you really want to fix autoDetect, it should ensure that complex types are the same type, i think anything else is probably an error (though I imagine unknown complex could be allowed to be equal to complex with typeName to be consistent with leastRestrictiveType logic).

This would be a slight change of behavior from today where they are basically always handled by the functions that call autoDetect as nulls (ComplexExprEval returns true for isNumericNull and 0 for getDouble, so it doesn't fail), since it would now be invalid if complex types were not compatible.

@abhishekagarwal87 abhishekagarwal87 merged commit 7552dc4 into apache:master Dec 15, 2023
83 checks passed
@gianm
Copy link
Contributor

gianm commented Jan 6, 2024

create the ExprEval early for ConstantExpr -s (except the one for BigInteger which seem to have some odd contract)

This part is problematic since ExprEval is not thread-safe. For example, the numeric ExprEval do the following for asString(). There is no locking or volatile on the relevant fields, so there is no guarantee that this code will work properly when called from multiple threads. Perhaps some thread will erroneously see null when there really should be a value.

  @Nullable
  public String asString()
  {
    if (!stringValueCached) {
      stringValue = Evals.asString(value);
      stringValueCached = true;
    }

    return stringValue;
  }

To safely cache the ExprEval we have a couple of options.

One is to ensure that the ExprEval stashed in ConstantExpr are immutable. Perhaps by adding a method on ExprEval that is used to ensure all the caches for asDouble(), asString(), etc, are populated. Once those are all populated then the object becomes thread-safe.

Another is to add a singleThreaded() method to make a thread-unsafe + optimized version, like the technique used by GenericIndexed#singleThreaded. The equivalent here would be something like Expr#singleThreaded, and that would be the one that creates the ExprEval (in a copy of the Expr). Callers that are about to call expr.eval(bindings) in a hot loop would then first do singleThreadedExpr = expr.singleThreaded(), then do singleThreadedExpr.eval(bindings). Callers that don't do this will still function properly, but they won't get the stashed eval optimization. Sketch of the approach:

class DoubleExpr extends ConstantExpr<Double>
{
  @Nullable
  private final ExprEval<?> eval;

  // constructor used by Parser
  DoubleExpr(Double value)
  {
    super(ExpressionType.DOUBLE, Preconditions.checkNotNull(value, "value"));
  }

  // constructor used by singleThreaded
  DoubleExpr(ExprEval<?> eval)
  {
    super(ExpressionType.DOUBLE, Preconditions.checkNotNull(eval.value(), "value"));
    this.eval = eval;
  }

  @Override // overrides method from Expr
  public Expr singleThreaded()
  {
    return new DoubleExpr(ExprEval.ofDouble(value));
  }

  @Override
  public ExprEval eval(ObjectBindings bindings)
  {
    // thread-safe no matter what, but more optimized if someone had called singleThreaded()
    return eval != null ? eval : ExprEval.ofDouble(value);
  }
}

I'm partial to the second one, since I am thinking we might want to add a Utf8ExprEval in the future that has a ByteBuffer value. That would benefit from the second approach, since we can call ByteBuffer#duplicate in singleThreaded() and avoid calling it during eval. It wouldn't benefit from the first approach, since ByteBuffer cannot ever be shared across threads, due to the builtin position being changed by various operations. So we'd still need to call ByteBuffer#duplicate in ExprEval#eval.

@kgyrtkirk
Copy link
Member Author

honestly: I was not thinking that should be counted in when I was making this change; ExprEval seemed like an immutable class to me

I personally think that this will not end in erroreous behaviour

  • as it may only happen if stringValueCached would get its value later than stringValue (which is not true)
  • I believe in this case the only way this could end up causing a multithreading issue is when new stringValueCached value may become visible while stringValue is not
  • that's internally is kinda like when the stringValue pointer and the stringValueCached primitive boolean is not inside the same cacheLine
  • it seemed to me that cacheLine size is at least 64 bytes ; but it could highly depend on the model/etc
      stringValue = Evals.asString(value);
      stringValueCached = true;

this reasoning could be wrong and even if it may not happen - it would be better to fix it; even thru the ExprEval class seem to be pretty small.

Did it caused any real trouble (I deeply sorry for the it) - but I would be very interested in learning more about it if it did!

"makeItReallyImmutable" approach

it could achieved with a call to expr.getCachedStringValue() ; which is not nice...probably adding a makeImmutable() or something...

the singleThreaded appraoch
  • it seems to me it adds one more extra conditional to be evaluated...but no big deal I think the branch predictor can do a pretty good job with it..
  • wouldn't the above sample still retain the issue in ExprEval ? and it would still need the makeItReallyImmutable as well in some form?

I wonder about:

  • instead of caching I wonder why not remove this conditional caching layer -
  • if say on a DoubleExpr the asString() method will be called most of the time then why have a DoubleExpr with caching ? why not a ConstantStringExpr ?

or did you meaned something like:

  • pick up eval as field eval
  • have a singleThreaded method version on the DoubleExpr class as well
    something like:
  // constructor used by singleThreaded
  DoubleExpr(ExprEval<?> eval)
  {
    super(ExpressionType.DOUBLE, Preconditions.checkNotNull(eval.value(), "value"));
    this.eval = eval.singleThreaded();
  }

I think the ConstantExpr should also declare a new eval method without bind arguments and finalize method

note: when I was doing this change I really wanted to do it differently with a more immutable approach - I really wanted to make it more immutable:
but wasn't able to create these objects early for all cases: as for some odd reason BigIntegerExpr may throw an exception during evalution....but wanted to remain on the safe side and take my chances - for BigIntegerExpr its kinda true that its either a number which can be represented as a LONG or a STRING - but never a true BigInteger - or at least that's how I see it now...

I'll keep thinking about it and make some draft PR to address this issue

@gianm
Copy link
Contributor

gianm commented Jan 8, 2024

Did it caused any real trouble (I deeply sorry for the it) - but I would be very interested in learning more about it if it did!

I noticed this through reading the code, not through anything that happened in production. But this code hasn't shipped in a release yet, so probably not many people are running it in production yet.

As to whether it's a real issue I would refer to: https://shipilev.net/blog/2016/close-encounters-of-jmm-kind/#wishful-benign-is-resilient

There are now a bunch of races in various ExprEval implementations, so the question is are the races "benign" or not? Personally I try to avoid answering this question, and prefer to design things such that the races go away— by making objects thread-safe, or truly immutable, or by using them from only one thread.

However, I do suspect the asString() has a non-benign race:

  @Nullable
  public String asString()
  {
    if (!stringValueCached) {
      stringValue = Evals.asString(value);
      stringValueCached = true;
    }

    return stringValue;
  }

I think it's possible for thread 1 to write stringValue = Evals.asString(value) and stringValueCached = true in such a way that thread 2 reads them out-of-order, i.e., thread 2 sees true for stringValueCached but null for stringValue. (It breaks the "single read rule".)

Anyway, IMO even benign races are best avoided— since they are easy to mess up— so even if it was benign, I would still suggest changing the design of the class.

@gianm
Copy link
Contributor

gianm commented Jan 8, 2024

"makeItReallyImmutable" approach
it could achieved with a call to expr.getCachedStringValue() ; which is not nice...probably adding a makeImmutable() or something...

Yeah I was thinking of something like .immutable() or .threadSafe() on ExprEval.

the singleThreaded appraoch
* it seems to me it adds one more extra conditional to be evaluated...but no big deal I think the branch predictor can do a pretty good job with it..
* wouldn't the above sample still retain the issue in ExprEval ? and it would still need the makeItReallyImmutable as well in some form?

Ah, I forgot to add this.expr = expr; in the second DoubleExpr constructor. I just edited my comment to add that in. I had imagined that expr would only be set on DoubleExpr that are returned by singleThreaded(). The usage would be like:

// shared across threads
Expr expr = Parser.parse(expression, macroTable);

// in each processing thread
Expr localExpr = expr.singleThreaded();
ObjectBindings bindings = // build bindings somehow
while (... data ...) {
  // do stuff with localExpr.eval(bindings);
}

instead of caching I wonder why not remove this conditional caching layer -
if say on a DoubleExpr the asString() method will be called most of the time then why have a DoubleExpr with caching ? why not a ConstantStringExpr ?

That's a good question… it's definitely better to do this if we have the info about what type we're going to want from each Expr. Certainly sometimes we do. I'm not sure if we always have it. I think it's related to ExpressionPlan?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants