Skip to content

Conversation

@natashasehgal
Copy link
Contributor

Summary:
Add MemoryPool support to SimpleFunction initialize() method to enable memory tracking for scalar functions (e.g merge_hll) that use HashStringAllocator.

Some types that require this include Hyperloglog, KHyperloglog, SetDigest, SfmSketch etc.

Changes
SimpleFunctionMetadata/SimpleFunctionAdapter/VectorFunction

  • New function signature with MemoryPool
  • Function caller tries method with MemoryPool first then without.

ExprCompiler

  • Gets MemoryPool from Execution Context passes through function factories
  • Create MemoryPool reference (with special no-op deleter so it doesn't try to delete memory it doesn't own)

ExprCompilerTest - Added test here

Differential Revision: D87932003

@netlify
Copy link

netlify bot commented Nov 26, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit efa6964
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/692767b5168ee8000814f4bb

@meta-codesync
Copy link

meta-codesync bot commented Nov 26, 2025

@natashasehgal has exported this pull request. If you are a Meta employee, you can view the originating Diff in D87932003.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 26, 2025
…ator#15641)

Summary:

Add MemoryPool support to SimpleFunction `initialize()` method to enable memory tracking for scalar functions (e.g merge_hll) that use HashStringAllocator.

Some types that require this include Hyperloglog, KHyperloglog, SetDigest, SfmSketch etc. 

**Main Change** 
- Modified SimpleFunctionAdapter to pass MemoryPool from ExprCompiler to UDF initialize()
- Two signatures in UDFHolder::initialize() for backward compatibility - 
  - New signature: initialize(vector<TypePtr>, QueryConfig, MemoryPool*, constantArgs...)
  - Old signature: initialize(vector<TypePtr>, QueryConfig, constantArgs...)

**Issue:** The initial straightforward implementation failed because some UDFs (e.g., Spark's https://www.internalfb.com/code/fbsource/fbcode/velox/functions/sparksql/Rand.h?lines=30 function) use template initialize() methods, that was matching fourth MemoryPool argument:

```cpp
template <typename TInput>
void initialize(vector<TypePtr>&, QueryConfig&, const TInput* param);

V2: Contains Template Detection via SFINAE
Compile-time template detection that probes initialize() with a DummyProbeType. If the method accepts this unrelated type, it's a template and is excluded from receiving MemoryPool.

Differential Revision: D87932003
natashasehgal added a commit to natashasehgal/velox that referenced this pull request Nov 26, 2025
…ator#15641)

Summary:

Add MemoryPool support to SimpleFunction `initialize()` method to enable memory tracking for scalar functions (e.g merge_hll) that use HashStringAllocator.

Some types that require this include Hyperloglog, KHyperloglog, SetDigest, SfmSketch etc. 

**Main Change** 
- Modified SimpleFunctionAdapter to pass MemoryPool from ExprCompiler to UDF initialize()
- Two signatures in UDFHolder::initialize() for backward compatibility - 
  - New signature: initialize(vector<TypePtr>, QueryConfig, MemoryPool*, constantArgs...)
  - Old signature: initialize(vector<TypePtr>, QueryConfig, constantArgs...)

**Issue:** The initial straightforward implementation failed because some UDFs (e.g., Spark's https://www.internalfb.com/code/fbsource/fbcode/velox/functions/sparksql/Rand.h?lines=30 function) use template initialize() methods, that was matching fourth MemoryPool argument:

```cpp
template <typename TInput>
void initialize(vector<TypePtr>&, QueryConfig&, const TInput* param);

V2: Contains Template Detection via SFINAE
Compile-time template detection that probes initialize() with a DummyProbeType. If the method accepts this unrelated type, it's a template and is excluded from receiving MemoryPool.

Differential Revision: D87932003
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant