@Jonas-Heinrich Jonas-Heinrich commented Sep 3, 2023

Hi,

While comparing QPL to TUM Umbra's related functionality, I noticed that the former has a large overhead for small numbers of tuples. After investigating with VTune and the microbenchmark below, I believe I found a typo that leads to an accidental copy of the function pointer table in input_stream_t::initialize_sw_kernels. Here's the benchmark:

#include "qpl/qpl.h"
#include <benchmark/benchmark.h>

#include <cstdint>
#include <iostream>
#include <vector>

#include "../DataDistribution.hpp"
#include "../Utils.hpp"

int main() {
   // pointer wrapper with destructor
   ExecutionContext fixture(qpl_path_software);
   qpl_job* job = reinterpret_cast<qpl_job*>(fixture.getExecutionContext());

   uint32_t numTuples = 1;
   double selectivity = 0.5;
   uint32_t inBitWidth = 8;
   qpl_out_format outFormat = qpl_out_format::qpl_ow_32;

   // private utilities, does what you would expect
   LittleEndianBufferBuilder builder(inBitWidth, numTuples);
   Dataset dataset = UniformDistribution::generateDataset(
      builder,
      inBitWidth,
      (1ul << inBitWidth) - 1,
      numTuples,
      umbra::VectorizedFunctions::Mode::Eq,
      selectivity);
   std::vector<uint8_t> destination;
   destination.resize(divceil(32 * numTuples, 8));

   for (size_t i = 0; i < 1'000'000'000; i++) {
      if (i % 1'000'000 == 0) {
         std::cout << i << std::endl;
      }

      // Parameterize jobs.
      {
         job->parser = builder.getQPLParser();
         job->next_in_ptr = dataset.data.data();
         job->available_in = dataset.data.size();
         job->next_out_ptr = destination.data();
         job->available_out = static_cast<uint32_t>(destination.size());
         job->op = map_umbra_to_qpl_op(umbra::VectorizedFunctions::Mode::Eq);
         job->src1_bit_width = inBitWidth;
         job->num_input_elements = numTuples;
         job->out_bit_width = outFormat;
         auto [param_low, param_high] = dataset.predicateArguments->template getValues<uint32_t, 2>();
         job->param_low = param_low;
         job->param_high = param_high;
         job->flags = static_cast<uint32_t>(QPL_FLAG_OMIT_CHECKSUMS | QPL_FLAG_OMIT_AGGREGATES);
      }

      qpl_status status = qpl_execute_job(job);
      if (status != QPL_STS_OK) {
         ERROR("An error occurred during job execution: " << status);
      }

      const auto indicesByteSize = job->total_out;
      const auto bytesPerHit = divceil(32, 8);
      const auto qplHits = indicesByteSize / bytesPerHit;
      if (outFormat != qpl_ow_nom && qplHits != dataset.predicateHits) {
         ERROR("Result does not fit expectations: " << qplHits << " != " << dataset.predicateHits);
      } else if (outFormat == qpl_ow_nom && job->total_out != divceil(numTuples, 8)) {
         ERROR("Result does not fit expectations: " << job->total_out << " != " << divceil(numTuples, 8));
      }

      benchmark::DoNotOptimize(job->total_out);
      benchmark::DoNotOptimize(status);
      benchmark::DoNotOptimize(destination);
      benchmark::ClobberMemory();
   }
}

The benchmark was run for 15 s on an Intel i9-13900K. VTune summary before the PR:

[Screenshot: VTune summary, 2023-09-03 12:45:58]

After the PR:

[Screenshot: VTune summary, 2023-09-03 13:39:15]

@Jonas-Heinrich Jonas-Heinrich (Author) commented Sep 3, 2023

After further investigation, I noticed that the issue is also present in other locations. The second force-push now fixes those locations as well (found by searching for core_sw::dispatcher::kernels_dispatcher::get_instance()).
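
For anyone auditing their own tree for the same slip, a search along these lines surfaces every dispatcher lookup so each call site can be checked for reference binding. A sketch (the sample file is fabricated for demonstration, and the member name get_unpack_table is illustrative, not necessarily QPL's API):

```shell
# Create a tiny sample source file containing a by-value lookup, then
# locate every dispatcher call site; each hit should bind the result by
# reference (auto& / const auto&), not by value (plain auto).
demo_dir=$(mktemp -d)
cat > "$demo_dir/sample.cpp" <<'EOF'
auto table = core_sw::dispatcher::kernels_dispatcher::get_instance().get_unpack_table();
EOF
grep -rn "kernels_dispatcher::get_instance()" "$demo_dir"
```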

@mzhukova mzhukova self-assigned this Sep 5, 2023
@mzhukova mzhukova (Contributor) commented
Hi @Jonas-Heinrich, thank you for the contribution!
The team will review and do thorough testing on our side in the upcoming weeks.

@mzhukova mzhukova added the enhancement New feature or request label Sep 18, 2023