MDEV-36737: Research and Estimation for Adapting VIDEX to MariaDB #4217
base: 11.8
A quick review, mostly compilation issues outlined.
To keep things organised in one place, we should do most of this effort under this pull request. If you face some issues and you feel a need to create new pull request, let us know. We could probably help you to edit this one.
Hi @svoj @gl-sergei @kr11, this PR implements the Videx storage engine with the Optimizer part. For videx server usage, you can refer to bytedance/videx#47. The server will be covered in a future PR. One of the CI tests is currently failing. Could you please help check it, or try re-running the workflow? Thanks a lot!
Looks very good. Would it be possible to squash all commits in one? Some comments inline.
@YoungHypo regarding failing test: please disregard it. It is unrelated to this PR, we will sort it out when we're ready to merge.
f8136b7 to c36a948
Thanks for your feedback @svoj. We’ve completed the changes: all non-Videx code has been removed, and the core files now consist of only two parts, ha_videx.cc and videx_utils. The mysql-test directory has also been simplified as you suggested, and the commits have been squashed. Please let me know if further adjustments are needed.
We should stabilise test suite.
With cmake -DPLUGIN_VIDEX=NO I get expected result:
videx.create-table-and-index [ skipped ] Need VIDEX engine
videx.set-debug-skip-http [ skipped ] Need VIDEX engine
With cmake -DPLUGIN_VIDEX=STATIC I get unexpected results:
videx.create-table-and-index [ fail ] Found warnings/errors in server log file!
Test ended at 2025-08-31 19:05:54
line
2025-08-31 19:05:53 4 [Warning] VIDEX: access videx_server failed res_code != curle_ok: 127.0.0.1:5001
2025-08-31 19:05:53 4 [Warning] VIDEX: access videx_server failed res_code != curle_ok: 127.0.0.1:5001
2025-08-31 19:05:53 4 [Warning] VIDEX: access videx_server failed res_code != curle_ok: 127.0.0.1:5001
2025-08-31 19:05:54 4 [Warning] VIDEX: access videx_server failed res_code != curle_ok: 127.0.0.1:5001
^ Found warnings in /dev/shm/build/videx-static/mysql-test/var/log/mysqld.1.err
ok
- saving '/dev/shm/build/videx-static/mysql-test/var/log/videx.create-table-and-index/' to '/dev/shm/build/videx-static/mysql-test/var/log/videx.create-table-and-index/'
videx.set-debug-skip-http [ fail ]
Test ended at 2025-08-31 19:05:54
CURRENT_TEST: videx.set-debug-skip-http
mysqltest: At line 16: query 'SET SESSION videx_debug_skip_http = 'True'' failed: ER_WRONG_VALUE_FOR_VAR (1231): Variable 'videx_debug_skip_http' can't be set to the value of 'True'
The result from queries just before the failure was:
CREATE TABLE `part` (
`P_PARTKEY` int NOT NULL,
`P_NAME` varchar(55) NOT NULL,
`P_MFGR` char(25) NOT NULL,
`P_BRAND` char(10) NOT NULL,
`P_TYPE` varchar(25) NOT NULL,
`P_SIZE` int NOT NULL,
`P_CONTAINER` char(10) NOT NULL,
`P_RETAILPRICE` decimal(15,2) NOT NULL,
`P_COMMENT` varchar(23) NOT NULL,
PRIMARY KEY (`P_PARTKEY`)
) ENGINE=VIDEX;
SET SESSION videx_debug_skip_http = 'True';
With cmake -DPLUGIN_VIDEX=DYNAMIC I get unexpected results:
videx.create-table-and-index [ skipped ] Need VIDEX engine
videx.set-debug-skip-http [ skipped ] Need VIDEX engine
We should load ha_videx.so in this case. There should be suite.pm:
package My::Suite::Videx;
@ISA = qw(My::Suite);
return "No VIDEX" unless $ENV{HA_VIDEX_SO} or
$::mysqld_variables{'videx'} eq "ON";
return "Not run for embedded server" if $::opt_embedded_server;
sub is_default { 1 }
bless { };
suite.opt should probably have:
--plugin-load-add=$HA_VIDEX_SO
If tests require videx server running, it should be checked by either have_videx.inc or suite.pm. We should probably even start/stop videx server for particular tests, but then we still have to check for videx server existence.
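One way to check for videx server existence before running server-dependent tests is a plain TCP probe. The following is a minimal sketch, not part of the PR: the function name `videx_server_reachable` is hypothetical, and the default address 127.0.0.1:5001 is taken from the warnings in the log above. It only tells whether something accepts connections on that port, not whether it is actually a videx server.

```python
import socket


def videx_server_reachable(host="127.0.0.1", port=5001, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it does not speak the videx protocol, it only
    checks that a listener accepts connections on the given address.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host and timeouts.
        return False


if __name__ == "__main__":
    if videx_server_reachable():
        print("videx server port is open")
    else:
        print("videx server not reachable, server-dependent tests should be skipped")
```

A wrapper script could run such a probe and export an environment variable that suite.pm or have_videx.inc then inspects; how that wiring would look in mtr is left open here.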
Anyway, it'd be great to make suite either "skipped" or "passed" in all of the above cases, that is -DPLUGIN_VIDEX=NO|STATIC|DYNAMIC.
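The "skipped or passed" criterion can be checked mechanically against mtr output. The sketch below is an illustration only: the helper names `suite_outcomes` and `suite_is_stable` are hypothetical, and it assumes result lines shaped like the ones quoted above (`<test name> [ <status> ] ...`), with `pass` as the status mtr prints for a passing test. Parallel runs that add worker tags to each line are not handled.

```python
import re

# Matches mtr result lines such as:
#   videx.create-table-and-index [ skipped ] Need VIDEX engine
RESULT_LINE = re.compile(r"^(?P<test>\S+)\s+\[\s*(?P<status>\w+)\s*\]")


def suite_outcomes(mtr_output):
    """Map each test name found in the mtr output to its reported status."""
    outcomes = {}
    for line in mtr_output.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            outcomes[match.group("test")] = match.group("status")
    return outcomes


def suite_is_stable(mtr_output, allowed=("pass", "skipped")):
    """True if at least one test was reported and all are passed or skipped."""
    outcomes = suite_outcomes(mtr_output)
    return bool(outcomes) and all(s in allowed for s in outcomes.values())
```

Running this over the output of each of the three builds (NO, STATIC, DYNAMIC) would flag the STATIC case above, where both tests are reported as `fail`.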
Thanks for your feedback @svoj! I've removed some unnecessary code from In my local testing, Currently, the tests set skip_http to True, so they do not depend on the Videx server. In the next PR, we plan to discuss how to introduce a Python-based server implementation. I look forward to your further review and guidance. Thanks again! BTW, once it’s ready to be merged, I’ll squash all commits into one again.
A few more minor issues outlined. Otherwise I'm happy to recommend it for merging.
Thanks @svoj. I’ve removed |
@YoungHypo I believe it is in a good enough shape for the merge, so please do squash. Still other developers may request some extra changes in the meantime.
4efc530 to f8f2810
@YoungHypo it'd also be good to reword main commit description so that it says something like:
|
f8f2810 to bb68295
I’ve squashed the commits and updated the description. Thanks again @svoj for your guidance and feedback, it’s truly great to have your help.
bb68295 to bd334bd
VIDEX is a Disaggregated and Extensible Virtual Index Engine designed to perform efficient and accurate what-if analysis for tasks such as index recommendation.
bd334bd to 2f8993d
@svoj May I ask what the next steps in the review process will be? We’re planning to share our current progress on Jira and Zulip. Do you think that’s a good idea? Really looking forward to receiving feedback from other developers.
I will ask some other developers for feedback here. Feel free to share our current progress via jira/zulip.
@vuvova, @spetrunia I believe the initial VIDEX version is in a good enough shape. I aim to get it merged to 11.8, disabled by default, plugin marked as experimental. Do you have any suggestions/objections? Will you want to review this PR too?
Hi @svoj, just following up on the status of this PR.
Hi @YoungHypo. No feedback yet. I will be trying to get things rolling. In the meantime, unless absolutely necessary, it'd be good to keep this PR intact, with no need to perform merges, so that we can anchor to a certain revision. We can update the tree when we're ready to merge. It'd be good to revert the recent merge to rev 2f8993d, as it was at the time I approved it.
7c09d20 to 2f8993d
Thanks @svoj! I've reset the branch to commit 2f8993d as suggested |
Hi @svoj , the branch has been reset as you suggested. It looks like the CI workflow is now awaiting approval ( |
@kr11 done, though it was just Windows on ARM, a rather minor builder.
@YoungHypo, just wanted to say that we're still testing VIDEX: a couple of developers have it installed and run various queries. Unfortunately, it's not very visible in the PR, but we are working on it.
@vuvova @svoj @kr11 VIDEX - PR 47 already includes the installation and execution steps for VIDEX in MariaDB and its dependency (Statistic Server), as well as the TPC-H benchmark results. If anything in the PR description is unclear, we can continue the discussion either here in this PR or in the Zulip channel. Please feel free to let me know if there’s anything I can assist with.
Description
This PR introduces VIDEX as a new storage engine plugin for MariaDB. VIDEX is a Disaggregated and Extensible Virtual Index Engine designed to perform efficient and accurate what-if analysis for tasks like index recommendation.
The VIDEX architecture is composed of two core parts:
The statistic server can be implemented in any language or framework. A reference implementation using Python/Flask has already been merged in bytedance/videx#47.
As discussed, this PR contains the implementation for the VIDEX-Optimizer. The corresponding VIDEX-Statistic-Server (developed mainly in Python) will be submitted in a follow-up PR.
As tested on the TPC-H benchmark, VIDEX is capable of producing query plans that are 100% identical to those from MariaDB's native InnoDB engine. The detailed results can be found in the description of bytedance/videx#47.
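The PR does not spell out how plan identity was measured, so the following is only a sketch of one possible comparison. It assumes each plan is available as a list of dicts keyed by standard EXPLAIN column names (`id`, `select_type`, `table`, `type`, `key`); the helper names and the choice of compared columns are assumptions made for illustration, not the benchmark's actual method.

```python
# Columns that describe the plan shape; estimates such as "rows" are ignored.
# This column choice is an assumption made for this sketch.
PLAN_COLUMNS = ("id", "select_type", "table", "type", "key")


def plan_signature(explain_rows, columns=PLAN_COLUMNS):
    """Reduce EXPLAIN rows to an ordered list of tuples over selected columns."""
    return [tuple(row.get(col) for col in columns) for row in explain_rows]


def plans_match(rows_a, rows_b, columns=PLAN_COLUMNS):
    """True if both plans have the same rows, in the same order, on `columns`."""
    return plan_signature(rows_a, columns) == plan_signature(rows_b, columns)
```

With this definition, two plans that use the same join order, access types and indexes count as identical even if their row estimates differ.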
Features
records_in_range, info_low, etc.
File Structure
Configuration Variables
debug_skip_http: skip HTTP requests for debugging
server_ip: VIDEX server address
options: connection options in JSON format
How to start VIDEX-Server
see videx/PR-47: Implemented VIDEX-Server on MariaDB
mysql-test
Test Case 1: create-table-and-index.test
Validates basic VIDEX engine functionality:
Test Case 2: set-debug-skip-http.test
Validates debugging functionality:
DEBUG_SKIP_HTTP variable setting
Build Configuration
Dependencies
CMake Options
PLUGIN_VIDEX=YES: Enable VIDEX plugin (default)
PLUGIN_VIDEX=STATIC: Static linking
PLUGIN_VIDEX=DYNAMIC: Dynamic loading
Compilation Command Example
Contributor Information
Future Plans
Summary
The VIDEX storage engine provides MariaDB with a flexible and extensible statistics information management solution. Through external statistics services, it achieves high compatibility with InnoDB while maintaining architectural flexibility and maintainability. This plugin is particularly suitable for scenarios requiring rapid iteration of statistics strategies or deployment of distributed statistics services.
Release Notes
Added a VIDEX engine in storage/videx to support what-if analysis for index strategies. It integrates AI-based cardinality and NDV (number of distinct values) estimation algorithms.
How can this PR be tested?
TODO: modify the automated test suite to verify that the PR causes MariaDB to behave as intended.
Consult the documentation on "Writing good test cases".
If the changes are not amenable to automated testing, please explain why not and carefully describe how to test manually.
Basing the PR against the correct MariaDB version
11.8 tag.
PR quality check