Is your feature request related to a problem? Please describe.
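Currently, the KLLSketch and DataType analyzers are implemented using UserDefinedAggregateFunction:

deequ/src/main/scala/com/amazon/deequ/analyzers/catalyst/StatefulKLLSketch.scala, line 29 in 3b1a3ec:
private[sql] class StatefulKLLSketch(

deequ/src/main/scala/com/amazon/deequ/analyzers/catalyst/StatefulDataType.scala, line 26 in 3b1a3ec:
private[sql] class StatefulDataType extends UserDefinedAggregateFunction {

UserDefinedAggregateFunction is considered deprecated and should be replaced with Aggregator, which offers much better performance, as outlined in apache/spark#25024 (comment).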
Describe the solution you'd like
Reimplement StatefulDataType and StatefulKLLSketch using Aggregator.
I am happy to help with this implementation.
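To make the proposal concrete, here is a minimal sketch of what an Aggregator-based analyzer could look like. It is not deequ's actual implementation: the `TypeCounts` buffer, the `DataTypeAggregator` name, and the simplified type-detection rules are all assumptions made for illustration. Only the `Aggregator` and `Encoders` APIs come from Spark itself.

```scala
import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical typed buffer: how many values were seen per inferred type.
// This is an illustrative layout, not the state used by StatefulDataType.
case class TypeCounts(
    nulls: Long,
    integrals: Long,
    fractionals: Long,
    booleans: Long,
    strings: Long)

// Hypothetical Aggregator that classifies string values by simple patterns.
object DataTypeAggregator extends Aggregator[String, TypeCounts, TypeCounts] {

  // Empty buffer for a new partition or group.
  def zero: TypeCounts = TypeCounts(0L, 0L, 0L, 0L, 0L)

  // Fold one input value into the buffer (simplified detection rules).
  def reduce(b: TypeCounts, value: String): TypeCounts =
    if (value == null) b.copy(nulls = b.nulls + 1)
    else if (value.matches("[-+]?\\d+")) b.copy(integrals = b.integrals + 1)
    else if (value.matches("[-+]?\\d*\\.\\d+")) b.copy(fractionals = b.fractionals + 1)
    else if (value == "true" || value == "false") b.copy(booleans = b.booleans + 1)
    else b.copy(strings = b.strings + 1)

  // Combine two partial buffers, e.g. from different partitions.
  def merge(b1: TypeCounts, b2: TypeCounts): TypeCounts =
    TypeCounts(
      b1.nulls + b2.nulls,
      b1.integrals + b2.integrals,
      b1.fractionals + b2.fractionals,
      b1.booleans + b2.booleans,
      b1.strings + b2.strings)

  // The final result is the buffer itself in this sketch.
  def finish(reduction: TypeCounts): TypeCounts = reduction

  def bufferEncoder: Encoder[TypeCounts] = Encoders.product[TypeCounts]
  def outputEncoder: Encoder[TypeCounts] = Encoders.product[TypeCounts]
}
```

Assuming Spark 3.0 or later, such an Aggregator can be wrapped for use on untyped DataFrame columns with `org.apache.spark.sql.functions.udaf(DataTypeAggregator)`. The buffer is a plain typed object rather than a generic `Row`, which is the difference from the UserDefinedAggregateFunction approach that the linked Spark discussion is about.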