Fix InsertRelation on attached database #155
Merged
evertlammerts merged 2 commits into duckdb:v1.4-andium from evertlammerts:insert_rel_fix on Nov 10, 2025
Conversation
Tishj reviewed on Nov 5, 2025
Tishj requested changes on Nov 5, 2025
Tishj (Collaborator) left a comment:
I have some questions, but I'm also missing a test
Force-pushed from 57d7a05 to c26e02d
Tishj previously approved these changes on Nov 6, 2025
Tishj (Collaborator) left a comment:
LGTM!
lnkuiper added a commit to duckdb/duckdb that referenced this pull request on Nov 7, 2025:
Fixes #18396 Related PR in duckdb-python: duckdb/duckdb-python#155
Force-pushed from c26e02d to 97fd6c6
Force-pushed from 97fd6c6 to 20bfd52
mach-kernel added a commit to spiceai/duckdb that referenced this pull request on Nov 14, 2025:
Squashed commit of the following:
commit 68d7555f68bd25c1a251ccca2e6338949c33986a
Merge: 3d4d568674 9c6efc7d89
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 11 11:59:30 2025 +0100
Fix minor crypto issues (#19716)
commit 3d4d568674d1e05d221e8326c0d180336c350f18
Merge: 7386b4485d 0dea05daf8
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 11 10:58:18 2025 +0100
Logs to be case-insensitive also at enable_logging callsite (#19734)
Currently `CALL enable_logging('http');` would succeed, but then select
an empty subset of the available logs (`http` != `HTTP`), due to a quirk
in the code. This PR fixes that up.
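The `http` vs `HTTP` mismatch can be sketched in Python (names here are illustrative, not DuckDB's internals): resolving the requested log type against the registered types case-insensitively means `'http'` selects the same logs as `'HTTP'`.

```python
# Hypothetical sketch of case-insensitive log-type resolution.
AVAILABLE_LOG_TYPES = {"HTTP", "QueryLog", "PhysicalOperator"}

def resolve_log_type(requested: str) -> str:
    # Compare both sides case-insensitively against the registered names.
    for registered in AVAILABLE_LOG_TYPES:
        if registered.casefold() == requested.casefold():
            return registered
    raise ValueError(f"structured_log_schema: '{requested}' not found")
```

With this lookup, `resolve_log_type("http")` returns the canonical `"HTTP"` rather than matching an empty subset.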
commit 7386b4485d23bc99c9f6efab6ce0e33ecc23222b
Merge: 1ef3444f09 d4a77c801b
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 11 09:28:13 2025 +0100
Add explicit Initialize(HTTPParam&) method to HTTPClient (#19723)
This allows explicit re-initialization of specific parts of HTTPClient(s).
This diff would allow patterns such as reusing partially constructed (but
properly re-initialized) HTTPClient objects:
```c++
struct CrossQueryState {
// in some state kept around
unique_ptr<HTTPClient>& client;
};
void SomeFunction() {
// ...
http_util.Request(get_request, client);
// some more logic, same query
http_util.Request(get_request, client);
}
void SomeOtherFunction() {
// Re-initialize part of the client, given some settings might have changed
auto http_params = HTTPParams(http_util);
client->Initialize(http_params);
// ...
http_util.Request(get_request, client);
// some more logic, same query
http_util.Request(get_request, client);
}
```
Note that this PR is fully opt-in for users, while if you implement a
file-system abstraction inheriting from HTTPClient you should get a
compiler error pointing you to implement the relevant function.
commit 9c6efc7d89ee5ca60598c7e43778c0e9b34b266b
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 11 08:09:02 2025 +0100
Fix typo
commit e52f71387731da1202fc33755922999a472218a1
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 11 08:08:32 2025 +0100
Add require to test
commit 1ef3444f09b1df6e4a7cc3ad1d67868ecaa1a6a4
Merge: 8090b8d52e dff5b7f608
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 11 08:07:17 2025 +0100
Bump the Postgres scanner extension (#19730)
commit 0dea05daf823237a2de28ec7c0fec53dbb006475
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Tue Nov 11 06:42:36 2025 +0100
Logs to be case-insensitive also at enable_logging callsite
commit 8090b8d52ed6bfd31b72013f6800cea89539cc2f
Merge: 6667c7a3ec 5e9f88863f
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 10 21:34:42 2025 +0100
[Dev] Fix assertion failure for empty ColumnData serialization (#19713)
The `PersistentColumnData` constructor asserts that the pointers aren't
empty.
This assertion will fail if we try to serialize the child of a list when
all lists are empty (as the child will then be entirely empty).
Backported fix for problem found by: #19674
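A minimal Python sketch of the guard described above (hypothetical names; the real fix is in DuckDB's C++ serialization code): the persistent data is only constructed when data pointers exist, since the child of all-empty lists has no segments and hence no pointers.

```python
def persist_column(data_pointers):
    # Conditionally create the persistent column data: a list column whose
    # lists are all empty has a child with no segments, so no data pointers.
    if not data_pointers:
        return None
    # Mirrors the constructor's assertion that the pointers aren't empty.
    assert len(data_pointers) > 0
    return {"pointers": list(data_pointers)}
```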
commit 6667c7a3ecdc56cc144a9bcf8601001af66e6839
Merge: 3f0ad6958f 4a0f4b0b38
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 10 21:32:58 2025 +0100
Bump httpfs and resume testing on Windows (#19714)
commit dff5b7f608b732a0e7c5d9a68e7e8d7db3c48478
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Mon Nov 10 21:31:46 2025 +0100
Bump the Postgres scanner extension
commit 0e3d0b5af535fcde90d272d95b1d08cb5fb12d15
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 21:26:43 2025 +0100
remove deleted file from patch
commit ffb7be7cc5f27d9945d6868f76ef769a3f8a43d4
Merge: 2142f0b10d 3f0ad6958f
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 21:13:17 2025 +0100
Merge branch 'v1.4-andium' into fix-crypto-issue
commit 2142f0b10db72b89c9101fa65ead619182f8e5d1
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 20:55:18 2025 +0100
fix duplicate job id
commit 0a225cb99a130c2b1635d6ced03bc37f01ff9436
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 20:52:40 2025 +0100
fix ci for encryption
commit 3f0ad6958f1952a083bc499fc147f69504a3c6d2
Merge: f3fb834ef7 a1eeb0df6f
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 10 20:09:11 2025 +0100
Fix #19700: correctly sort output selection vector in nested selection operations (#19718)
Fixes #19700
This probably should be maintained during the actual select - but for
now just sorting it afterwards solves the issue.
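The workaround can be sketched in Python (illustrative only, not DuckDB's vector code): a nested selection may emit row indices out of order, so the output selection vector is sorted before being handed onward.

```python
def nested_select(sel, values, predicate):
    # A nested selection operates on an already-selected set of rows and,
    # depending on evaluation order, may produce indices out of order
    # (simulated here by iterating the input selection in reverse).
    out = [i for i in reversed(sel) if predicate(values[i])]
    out.sort()  # the fix: sort the output selection vector afterwards
    return out
```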
commit f3fb834ef7153b90ef3908eb51a5b85efa580ca5
Merge: 7333a0ae84 c8ddca6f3c
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 10 20:09:03 2025 +0100
Fix #19355: correctly resolve subquery in MERGE INTO action condition (#19720)
Fixes #19355
commit 7333a0ae84d51729fffe91e67f12c3cee526af2a
Merge: 95fcb8f188 6595848a27
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 10 16:46:31 2025 +0100
Bump: delta, ducklake, httpfs (#19715)
This PR bumps the following extensions:
- `delta` from `0747c23791` to `6515bb2560`
- `ducklake` from `022cfb1373` to `77f2512a67`
- `httpfs` from `b80c680f86` to `041a782b0b`
commit 35f98411037cb0499e236d0cbe20d6b3a0dcc43f
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 14:51:29 2025 +0100
install curl
commit d4a77c801bb1a88e634c12bc64e185ef2f147d2d
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Mon Nov 10 14:37:42 2025 +0100
Add explicit Initialize(HTTPParams&) method to HTTPClient
This allows explicit re-initialization of specific parts of HTTPClient(s)
commit 6595848a27bd7fb271c63a99551d8326417320dd
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 11:30:34 2025 +0100
bump extensions
commit 7a7726214c86267d476a2edbc68656ebd6253fe8
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 11:28:32 2025 +0100
fix: ci issues
commit 4a0f4b0b38b9d5660c8a5c848d8a1c71bc3220de
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Mon Nov 10 11:07:58 2025 +0100
Bump httpfs and resume testing on Windows
commit 5e9f88863f5f519620ae01f4ff873f6a2869343f
Author: Tishj <t_b@live.nl>
Date: Mon Nov 10 10:58:03 2025 +0100
conditionally create the PersistentColumnData, if there are no segments (as could be the case for a list's child), there won't be any data pointers
commit 95fcb8f18819b1a77df079a7fcb753a8c2f52844
Merge: 396c86228b 4f3df42f20
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Mon Nov 10 10:50:38 2025 +0100
Bump: aws, ducklake, httpfs, iceberg (#19654)
This PR bumps the following extensions:
- `aws` from `18803d5e55` to `55bf3621fb`
- `ducklake` from `2554312f71` to `022cfb1373`
- `httpfs` from `8356a90174` to `b80c680f86`
- `iceberg` from `5e22d03133` to `db7c01e92`
commit c8ddca6f3c32aa0d3a9536371f9e3ca8cb00753e
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Mon Nov 10 09:19:31 2025 +0100
Fix #19355: correctly resolve subquery in MERGE INTO action condition
commit a1eeb0df6ffc2f129638a2dfaab9a70720c8db1b
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Mon Nov 10 09:00:35 2025 +0100
Fix #19700: correctly sort output selection vector in nested selection operations
commit 396c86228bda46929560affde7effdbab7d4e905
Merge: e3d242509e e501fcbd1a
Author: Mark <mark.raasveldt@gmail.com>
Date: Sat Nov 8 17:34:13 2025 +0100
Add missing query location to blob cast (#19689)
commit e3d242509e5710314921a0d7debd0bedb4d10a3e
Merge: 7ce99bc041 1ba198d711
Author: Mark <mark.raasveldt@gmail.com>
Date: Sat Nov 8 17:34:04 2025 +0100
Add request timing to HTTP log (#19691)
Demo:
```SQL
D call enable_logging('HTTP');
D from read_csv_auto('s3://duckdblabs-testing/test.csv');
D select request.type, request.url, request.start_time, request.duration_ms from duckdb_logs_parsed('HTTP');
┌─────────┬────────────────────────────────────────────────────────────────┬───────────────────────────────┬─────────────┐
│ type │ url │ start_time │ duration_ms │
│ varchar │ varchar │ timestamp with time zone │ int64 │
├─────────┼────────────────────────────────────────────────────────────────┼───────────────────────────────┼─────────────┤
│ HEAD │ https://duckdblabs-testing.s3.us-east-1.amazonaws.com/test.csv │ 2025-11-07 10:17:56.052202+00 │ 417 │
│ GET │ https://duckdblabs-testing.s3.us-east-1.amazonaws.com/test.csv │ 2025-11-07 10:17:56.478847+00 │ 104 │
└─────────┴────────────────────────────────────────────────────────────────┴───────────────────────────────┴─────────────┘
```
commit ae518d0a4e439f80c768388fab8f51d667f7e4b7
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Fri Nov 7 13:58:22 2025 +0100
minor ci fixes
commit e501fcbd1af58cf147b80051b38ddf815d5e1b8c
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Fri Nov 7 12:37:21 2025 +0100
move
commit bc1a683d10150dfe15f2f4d69e505f6337c4fc27
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Fri Nov 7 11:22:35 2025 +0100
only load httpfs if necessary
commit 1ba198d71106a851fb8234ccfb208ec66b0e1d17
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Fri Nov 7 11:16:38 2025 +0100
fix: check if logger exists
commit f22e9a06ef6e1b6c999e8c7389b05e40ae9032fc
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Fri Nov 7 11:13:11 2025 +0100
add test for http log timing
commit f474ba123485377f94e5b57600fb720733050c98
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Fri Nov 7 11:00:19 2025 +0100
add http timings to logger
commit 02bb5d19b9fc7a702184ffcf7d9688b88f54071a
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Fri Nov 7 09:30:04 2025 +0100
Add query location to blob cast
commit 7ce99bc04130615dfc3a39dfb79177a8942fefba
Merge: 1555b0488e aea843492d
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Fri Nov 7 09:22:48 2025 +0100
Fix InsertRelation on attached database (#19583)
Fixes https://github.com/duckdb/duckdb/issues/18396
Related PR in duckdb-python:
https://github.com/duckdb/duckdb-python/pull/155
commit 1555b0488e322998e6fd06cc47e1909c7bb4eba4
Merge: 783f08ffd8 98e2c4a75f
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Fri Nov 7 08:31:05 2025 +0100
Log total probe matches in hash join (#19683)
This is usually evident from the number of tuples coming out of a join,
but it can be hard to understand what's going on when doing a
`LEFT`/`RIGHT`/`OUTER` join. This PR adds one log call at the end of the
hash join to report how many probe matches there were.
```sql
D CALL enable_logging('PhysicalOperator');
┌─────────┐
│ Success │
│ boolean │
├─────────┤
│ 0 rows │
└─────────┘
D SELECT count(*)
FROM range(3_000_000) t1(i)
LEFT JOIN range(1_000_000, 2_000_000) t2(i)
USING (i);
┌────────────────┐
│ count_star() │
│ int64 │
├────────────────┤
│ 3000000 │
│ (3.00 million) │
└────────────────┘
D CALL disable_logging();
┌─────────┐
│ Success │
│ boolean │
├─────────┤
│ 0 rows │
└─────────┘
D SELECT info.total_probe_matches::BIGINT total_probe_matches
FROM duckdb_logs_parsed('PhysicalOperator')
WHERE class = 'PhysicalHashJoin' AND event = 'GetData';
┌─────────────────────┐
│ total_probe_matches │
│ int64 │
├─────────────────────┤
│ 1000000 │
│ (1.00 million) │
└─────────────────────┘
```
Here we are able to see that the hash join produced 1M matches, but
emitted 3M tuples.
commit 783f08ffd89b1d1290b2d3dec0b3ba12d8c233bf
Merge: 6c6af22ea4 1d5c9f5f3d
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Thu Nov 6 15:57:35 2025 +0100
Fixup linking for LLVM (#19668)
See conversation at https://github.com/llvm/llvm-project/issues/77653
This allows again:
```
brew install llvm
CMAKE_LLVM_PATH=/opt/homebrew/Cellar/llvm/21.1.5 GEN=ninja make
```
to just work.
Arguably a very limited use case, but it may as well be fixed.
commit 6c6af22ea45effc67dc9e76feec3fb73208750bb
Merge: 2892abafa7 f483e95d1c
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Thu Nov 6 15:56:49 2025 +0100
Categorize ParseLogMessage as CAN_THROW_RUNTIME_ERROR (#19672)
Currently we rely on the filter on query type AND the execution of the scalar
function `parse_duckdb_log_message` not being reordered.
This is somewhat brittle, and I have found cases locally where this causes
problems resulting in wrong casts, such as:
```
Conversion Error:
Type VARCHAR with value 'ColumnDataCheckpointer FinalAnalyze(COMPRESSION_UNCOMPRESSED) result for main.big.0(VALIDITY): 15360' can't be cast to the destination type STRUCT(metric VARCHAR, "value" VARCHAR)
```
Looking at the executed plan, it would look like:
```
┌─────────────┴─────────────┐
│ FILTER │
│ ──────────────────── │
│ ((type = 'Metrics') AND │
│ (struct_extract │
│ (parse_duckdb_log_message(│
│ 'Metrics', message), │
│ 'metric') = 'CPU_TIME')) │
│ │
│ ~0 rows │
└─────────────┬─────────────┘
```
Tagging `parse_duckdb_log_message` as potentially throwing on some inputs
avoids reordering, which avoids the problem while improving the usability
of logs.
An alternative solution would be to use an explicit DefaultTryCast (instead of
TryCast) at
https://github.com/duckdb/duckdb/blob/v1.4-andium/src/function/scalar/system/parse_log_message.cpp#L70;
either approach would solve the problem.
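The reordering hazard can be illustrated with a toy Python model (not DuckDB's optimizer): when a parse that can fail is hoisted ahead of the type guard, rows the guard would have excluded reach the failing expression.

```python
rows = [
    {"type": "Metrics", "message": "CPU_TIME: 12"},
    {"type": "QueryLog", "message": "SELECT 1"},  # not a metrics message
]

def parse_metrics(message: str) -> dict:
    # Fails on messages that don't have the metrics shape.
    metric, sep, value = message.partition(": ")
    if not sep:
        raise ValueError("can't be cast to the destination type")
    return {"metric": metric, "value": value}

# Guard first (the intended order): the type check short-circuits, so
# parse_metrics never sees the QueryLog row.
matches = [r for r in rows
           if r["type"] == "Metrics"
           and parse_metrics(r["message"])["metric"] == "CPU_TIME"]

# A plan that evaluated parse_metrics on every row before the guard would
# raise on the QueryLog row; marking the function as able to throw keeps
# the optimizer from moving it ahead of the guard.
```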
commit 98e2c4a75f816eae6ef2893bbb581c9913293f2a
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Thu Nov 6 15:35:25 2025 +0100
log total probe matches in hash join
commit 2892abafa772fffc4402e5125cf16a26c094cb44
Merge: ecc73b2b4b 488069ec8d
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Thu Nov 6 14:21:05 2025 +0100
duckdb_logs_parsed to do case-insensitive matching (#19669)
This is something @Tmonster and I bumped into while helping a customer
debug an issue.
I think it's more intuitive and friendly for user-facing functions to be
case-insensitive, given that is the general user expectation around SQL.
I am not sure `ILIKE` is the best way to do so (an alternative would be
filtering on `lower(1) = lower(2)`).
Note that passing `%` signs is currently checked elsewhere, for example:
```sql
SELECT message FROM duckdb_logs_parsed('query%') WHERE starts_with(message, 'SELECT 1');
```
would throw
```
Invalid Input Error: structured_log_schema: 'query%' not found
```
(while `querylog` already works, see test case, since a case-insensitive
comparison was already used there)
commit aea843492da3f40c30e6e88c12eb6da690348f2e
Author: Evert Lammerts <evert.lammerts@gmail.com>
Date: Thu Nov 6 11:40:11 2025 +0100
review feedback
commit 094a54b890a2466aad743b1c372809849cdef283
Author: Evert Lammerts <evert.lammerts@gmail.com>
Date: Sat Nov 1 11:22:34 2025 +0100
Fix InsertRelation on attached database
commit 4f3df42f208d5e6dc602d2e688911ef13758d3aa
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Thu Nov 6 11:31:58 2025 +0100
bump iceberg further
commit f483e95d1c3983c2ba5758ebba1272f7ff12cd0d
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Fri Oct 31 12:25:01 2025 +0100
Improve tests using now working FROM duckdb_logs_parsed()
commit 6554c84a73b6c7857d2ec5ebf6f2019ceb56e6dc
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Tue Nov 4 12:56:31 2025 +0100
parse_logs_message might throw
commit 488069ec8d726d3b19093e8d57101c6c6af8910b
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Thu Nov 6 09:29:49 2025 +0100
duckdb_logs_parsed to do case-insensitive matching
commit 1d5c9f5f3d18c73e27b0bc4353d549680c5c82d5
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Thu Nov 6 09:13:41 2025 +0100
Fixup linking for LLVM
See conversation at https://github.com/llvm/llvm-project/issues/77653
commit ecc73b2b4b10beb175968e55e24e69241d00df1b
Merge: 2d69f075ee 4cb677238f
Author: Mark <mark.raasveldt@gmail.com>
Date: Thu Nov 6 08:58:09 2025 +0100
Always remember extra_metadata_blocks when checkpointing (#19639)
This is a follow-up to https://github.com/duckdb/duckdb/pull/19588,
adding the following:
- Re-enables block verification in a new test configuration. It further
adds new checks to ensure that the metadata blocks that the RowGroup
references after checkpointing correspond to those that it would see if
it were to reload them from disk. This verification would have caught
the issue addressed by https://github.com/duckdb/duckdb/pull/19588
- Adds a small tweak in `MetadataWriter::SetWrittenPointers`. This
ensures that the table writer does not track an `extra_metadata_block`
that did not ever receive any writes as part of that rowgroup (as it
immediately skipped to the next block when calling
`writer.GetMetaBlockPointer()` after `writer.StartWritingColumns`). With
the added verification, not having this tweak fails e.g. the following
test:
```
test/sql/storage/compression/bitpacking/bitpacking_compression_ratio.test_slow
CREATE TABLE test_bitpacked AS SELECT i//2::INT64 AS i FROM range(0, 120000000) tbl(i);
================================================================================
TransactionContext Error: Failed to commit: Failed to create checkpoint because of error: Reloading blocks just written does not yield same blocks: Written: {block_id: 2 index: 32 offset: 0}, {block_id: 2 index: 33 offset: 8},
Read: {block_id: 2 index: 33 offset: 8},
Read Detailed: {block_id: 2 index: 33 offset: 8},
Start pointers: {block_id: 2 index: 33 offset: 8},
Metadata blocks: {block_id: 2 index: 32 offset: 0},
```
- Ensures that we always update `extra_metadata_blocks` after
checkpointing a rowgroup. This speeds up subsequent checkpoints
significantly. Right now, if you have a large legacy database, and don't
update these old rowgroups, this field is kept as is, and every
checkpoint needs to recompute it (even if the database isn't reloaded).
Making sure we always have `RowGroup::has_metadata_blocks == true` after
each checkpoint, even in case of metadata reuse, will both benefit
checkpointing for databases in old storage formats, as well as when
starting to use newer storage format on large legacy databases.
- Only tangentially related to the issue / PR, but while debugging I
noticed that the `deletes_is_loaded` variable is not correctly
initialized in all RowGroup constructors (can also be triggered with the
assertion I added in `RowGroup::HasChanges()`)
commit 46028940c8e429739e73f4d345ec3cab5eb5b01c
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Wed Nov 5 19:33:58 2025 +0100
bump extension entries
commit 2d69f075ee91c42ad4fe4208a4d1f06d0034faff
Merge: 7043621a83 e3fb2eb884
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Wed Nov 5 15:27:27 2025 +0100
Enable running all extensions tests as part of the build step (#19631)
This is enabled via
https://github.com/duckdb/extension-ci-tools/pull/278, which introduced a
way to hook into running tests for all extensions of a given
configuration (as opposed to a single one).
Also few minor fixes I bumped into:
* disable unused platforms from the external extension builds
* remove `[persistence]` tests to be always run
* enable `vortex` tests
* avoid `httpfs` tests on Windows, to be reverted in a follow up
commit 4cb677238f7f4ad4d747f1a1045396fd74765724
Merge: b48cd982e0 7043621a83
Author: Yannick Welsch <yannick@welsch.lu>
Date: Wed Nov 5 14:59:47 2025 +0100
Merge remote-tracking branch 'origin/v1.4-andium' into yw/metadata-reuse-tweaks
commit b48cd982e0c59a03cf78a37175ba7272438c2525
Author: Yannick Welsch <yannick@welsch.lu>
Date: Wed Nov 5 14:59:34 2025 +0100
newline
commit 490411ab5ae614064e3e4fa94f631dcbbeea68d8
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Wed Nov 5 13:55:19 2025 +0100
fix: add more places to securely clear key from memory
commit e3fb2eb8843f9ff90ad29fd69938ee6961b644dc
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Wed Nov 5 11:07:40 2025 +0100
Avoid testing httpfs on Windows (fix incoming)
commit e719c837851f016ea614b28380685de8794ccf39
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Wed Nov 5 11:04:57 2025 +0100
Revert "Add ducklake tests"
This reverts commit b77a9615117de845fa48463f09be20a89dea7434.
commit 4242618a8d43c2004f55b27b63535ad979302e92
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Wed Nov 5 11:03:48 2025 +0100
only autoload if crypto util is not set
commit 19232fc414dc7f861dcbad788ba5466d10c27a67
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Wed Nov 5 10:14:12 2025 +0100
bump extensions
commit 7043621a83d1be17ba6b278f0f7a3ec65df98d93
Merge: db845b80c7 3584a93938
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Wed Nov 5 09:18:39 2025 +0100
Bump MySQL scanner (#19643)
Updating the MySQL scanner to include the time zone handling fix to
duckdb/duckdb-mysql#166.
commit db845b80c76452054e26cf7a2d715769592de925
Merge: f50618b48c 7eccc643ae
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Wed Nov 5 09:15:52 2025 +0100
Remove `FlushAll` from `DETACH` (#19644)
This was initially added to reduce RSS after `DETACH`ing, but it is now
creating a large bottleneck for workloads that aggressively
`ATTACH`/`DETACH`. RSS will be freed by further allocation activity, or
when `SET allocator_background_threads=true;` is enabled.
commit 4978ccd8ec15e7631fd9ed741d338da663b0ff48
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Nov 4 16:34:16 2025 +0100
fix: add patch file
commit 6ec168d508d9395306b29c62cb0b163b6a77bafb
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Nov 4 16:13:18 2025 +0100
format
commit 67ec072c0ea6a237213f680709773e1342b11065
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Nov 4 15:59:04 2025 +0100
fix: tests
commit 7eccc643ae57a76a49e61b905f9a9a1857a00084
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Tue Nov 4 15:47:29 2025 +0100
remove flush all from detach
commit 3584a93938a4852b0510b0c3d6b3bb13861c4147
Author: Alex Kasko <alex@staticlibs.net>
Date: Tue Nov 4 14:33:21 2025 +0000
Bump MySQL scanner
Updating the MySQL scanner to include the time zone handling fix to
duckdb/duckdb-mysql#166.
commit 250b917ed6f423b56efbd855b2359a498fe2ef8d
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Nov 4 14:41:32 2025 +0100
fix: various issues with encryption
commit f50618b48c3dd04f77ae557e3bb4863f96f74a76
Merge: 66100df7ae 8257973295
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 4 14:26:16 2025 +0100
Fix #19455: correctly extract root table in merge into when running a join that contains single-sided predicates that are transformed into filters (#19637)
Fixes #19455
commit 82579732952d68dec2b2a44cc1ca04243ac57151
Merge: 6efd4a4fde 66100df7ae
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Tue Nov 4 14:25:42 2025 +0100
Merge branch 'v1.4-andium' into mergeintointernalerror
commit 66100df7aeb321d37f2434416df59dc274948987
Merge: d54d36faae c53eb7a562
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 4 14:24:10 2025 +0100
Detect invalid merge into action and throw exception (#19636)
`WHEN NOT MATCHED (BY TARGET)` cannot be combined with `DELETE` or
`UPDATE`, since there are no rows in the target table to delete or
update. This PR ensures we throw an error when this is attempted.
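A sketch of the validation in Python (hypothetical names, not DuckDB's binder): reject actions that would modify target rows in a clause where, by definition, no target row matched.

```python
def validate_merge_action(when_clause: str, action: str) -> None:
    # WHEN NOT MATCHED (BY TARGET) means no target row exists, so there is
    # nothing to UPDATE or DELETE; only INSERT (or DO NOTHING) makes sense.
    if when_clause == "WHEN NOT MATCHED BY TARGET" and action in {"UPDATE", "DELETE"}:
        raise ValueError(f"{action} is not allowed in a {when_clause} clause")

validate_merge_action("WHEN NOT MATCHED BY TARGET", "INSERT")  # accepted
```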
commit ca88f5b2cf9480ac8e57f436fbc89d327d19422a
Author: Yannick Welsch <yannick@welsch.lu>
Date: Tue Nov 4 10:57:57 2025 +0100
Use reserve instead
commit 133a15ee61a64a831de46e4407f38d8bdd7b71f5
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Tue Nov 4 10:45:20 2025 +0100
Move also [persistence] tests back under ENABLE_UNITTEST_CPP_TESTS
commit eb322ce251b5c4347650afc455171d862c51bf34
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Tue Nov 4 10:41:40 2025 +0100
Switch from running on PRs wasm_mvp to wasm_eh
commit 9c5f82fa358fcf236cff21499351c1e739ca032a
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Tue Nov 4 10:40:15 2025 +0100
Currently no external extension works on wasm or windows or musl
To be expanded once that changes
commit d54d36faae00120f548b39d1e21d93ca25f17087
Merge: 97fdeddb2b c01c994085
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Tue Nov 4 09:03:51 2025 +0100
Bump: spatial (#19620)
This PR bumps the following extensions:
- `spatial` from `61ede09bec` to `d83faf88cd`
commit 6efd4a4fde180bf7d9c433977921818e5465c92a
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Tue Nov 4 08:13:56 2025 +0100
Fix #19455: correctly extract root table in merge into when running a join that contains single-sided predicates that are transformed into filters
commit c53eb7a56266157f0e9d97bd91be0d36285ec38b
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Tue Nov 4 08:01:24 2025 +0100
Detect invalid merge into action and throw exception
commit 97fdeddb2bd5c34862afd30177c9184f51f6dccd
Merge: a0a46d6ed0 87193fd5ab
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 4 07:48:43 2025 +0100
Try to prevent overshooting of `FILE_SIZE_BYTES` by pre-emptively increasing bytes written in Parquet writer (#19622)
Helps with #19552, but doesn't fully fix the problem. We should look
into a more robust fix for v1.5.0, but not for a bugfix release
commit a0a46d6ed06dd962a4d6eeb01f3e14f8b275cec4
Merge: 73c0d0db15 3838c4a1ed
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 4 07:48:27 2025 +0100
Increase cast-cost of old-style implicit cast to string (#19621)
This PR fixes https://github.com/duckdb/duckdb-python/issues/148
The issue is that `list_extract` now has two overloads, one for a
templated list `LIST<T>` and one for concrete `VARCHAR` inputs. When binding
a function we add a really high cost to selecting a templated overload
to ensure we always pick something more specific if available. With our
current casting rules, we are unable to cast `VARCHAR[]` to `VARCHAR`,
and therefore fall back to the list template as expected. But with
old-style casting rules we allow `VARCHAR[]` to `VARCHAR` by also adding
a high cost penalty, but it's still lower than the cost of casting to the
template - even though that would be the better alternative.
With old-style casting we basically always have a lower-cost "fallback"
option than selecting a template overload. While we should overhaul our
casting system to evaluate the cast cost along more axes than just
"score", this PR fixes this specific case by just cranking up the cost
of old-style implicit to-string casts.
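A toy cost model in Python (numbers and names are illustrative, not DuckDB's actual binder costs): binding picks the overload with the lowest total cast cost, and templates carry a large penalty so concrete overloads win whenever any cast path to them exists.

```python
TEMPLATE_PENALTY = 1_000_000  # high cost discourages templated overloads

def bind(arg_type, overloads, to_string_cost):
    # Pick the overload with the lowest cast cost for the argument type.
    costs = {}
    for name, param in overloads.items():
        if param == "LIST<T>":
            costs[name] = TEMPLATE_PENALTY
        elif param == "VARCHAR" and arg_type == "VARCHAR[]":
            costs[name] = to_string_cost  # old-style implicit to-string cast
    return min(costs, key=costs.get)

overloads = {"list_extract(LIST<T>, BIGINT)": "LIST<T>",
             "list_extract(VARCHAR, BIGINT)": "VARCHAR"}

# With a to-string cost below the template penalty the wrong overload wins;
# the fix cranks the to-string cost above the penalty so the template wins.
```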
commit c01c99408526b3c0d698028083481301af069824
Author: Max Gabrielsson <max@gabrielsson.com>
Date: Mon Nov 3 22:42:37 2025 +0100
extension entries
commit b77a9615117de845fa48463f09be20a89dea7434
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Mon Nov 3 17:35:39 2025 +0100
Add ducklake tests
commit bd58abcdfb4485a1a9dbb750bd0587803fd1c559
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Mon Nov 3 17:35:15 2025 +0100
Load vortex tests
commit 62fe1bff77a60fd690b9911aa7a38b7bc197f865
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Mon Nov 3 22:08:43 2025 +0100
Pass down extensions_test_selection -> complete
commit e2604e6f5259453f482e0c49ca10520e89ddf269
Author: Yannick Welsch <yannick@welsch.lu>
Date: Mon Nov 3 19:18:47 2025 +0100
Always has_metadata_blocks after checkpoint
commit 73c0d0db15621d3d1c2936816becf27e2c41e2ab
Merge: 286924e634 b518b2aa0b
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 3 18:24:26 2025 +0100
Improve error message around compression type deprecation/availability checks (#19619)
This PR fixes https://github.com/duckdblabs/duckdb-internal/issues/6436
The old code kept only a list of "deprecated" types, and returned a
boolean, losing the context whether the compression type was available
at one point and is now deprecated OR is newly introduced and not
available yet in the storage version that is currently used.
commit 0e5a33dae35aab5209a8e959cf48d7525fa7ec8d
Author: Yannick Welsch <yannick@welsch.lu>
Date: Thu Oct 30 19:28:54 2025 +0100
Verify blocks
commit 286924e6348723138ca4dfd55b749d847bce59a9
Merge: 535f905874 c248313a1d
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 3 17:12:32 2025 +0100
bump iceberg (#19618)
commit 87193fd5abf342d6ddce9d984e69007a4ccdc7d2
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Mon Nov 3 14:43:08 2025 +0100
try to prevent overshooting by pre-emptively increasing write size
commit 3838c4a1edd83dc1373b6077dc6ee478bb996e50
Author: Max Gabrielsson <max@gabrielsson.com>
Date: Mon Nov 3 13:55:53 2025 +0100
increase fallback string cast cost
commit 535f90587495e0c8f5974a0968b06b15ad01b32e
Merge: d643cefe13 06df593c60
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Mon Nov 3 13:49:57 2025 +0100
[DevEx] Improve error message when FROM clause is omitted (#18995)
This PR fixes #18954
If the "similar bindings" list is entirely empty, there are no bindings
at all, which can only happen if the FROM clause is missing entirely.
commit 9268637337a21b9c03fdc7dceb0a88fbbe001a73
Author: Max Gabrielsson <max@gabrielsson.com>
Date: Mon Nov 3 12:35:30 2025 +0100
bump extensions
commit d643cefe13de6873f6fb0ecc0bca1c14111cde11
Merge: 5f8cf7d7f8 c6434fd89a
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 3 12:28:33 2025 +0100
Avoid eagerly resolving the next on-disk pointer in the MetadataReader, as that pointer might not always be valid (#19588)
When enabling the new [experimental metadata
re-use](https://github.com/duckdb/duckdb/pull/18395), it is possible for
metadata of *some* row groups to be re-used. This can cause linked lists
of metadata blocks to contain invalid references.
For example, when writing a bunch of row groups, we might get this
layout:
```
METADATA BLOCK 1
ROW GROUP 1
ROW GROUP 2 (pt 1)
NEXT BLOCK: 2
->
METADATA BLOCK 2
ROW GROUP 2 (pt 2)
ROW GROUP 3
```
Metadata is stored in a linked list (block 1 -> block 2) - but we don't
need to traverse this linked list fully. We store pointers to individual
row groups, and can start reading from their position.
Now suppose we re-use metadata of `ROW GROUP 1`, but not of the other
row groups (because e.g. they have been updated / changed). Since this
is fully contained in `METADATA BLOCK 1`, we can garbage collect
`METADATA BLOCK 2`, leaving the following metadata block:
```
METADATA BLOCK 1
ROW GROUP 1
ROW GROUP 2 (pt 1)
NEXT BLOCK: 2
```
Now we can safely read this block and read the metadata for `ROW GROUP
1`, **however**, this block contains a reference to a metadata block
that is no longer valid and might have been garbage collected. This
revealed a problem in the `MetadataReader`. In the current
implementation of the `MetadataReader` - when pointing it towards a
block, it would eagerly try to figure out the metadata location of *the
next block*. This is normally not a problem, however, with these invalid
chains, we might try to resolve a block that has been freed up already -
causing an internal exception to trigger:
```
Failed to load metadata pointer (id %llu, idx %llu, ptr %llu)
```
This PR resolves the issue by making the MetadataReader lazy. Instead of
eagerly resolving the next pointer, we only do this when it is actually
required.
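The lazy-resolution idea described above can be sketched as follows (a minimal illustration with hypothetical names, not DuckDB's actual MetadataReader):

```python
# Sketch of eager vs. lazy next-pointer resolution in a linked-list
# block reader. With lazy resolution, a dangling next pointer is only
# an error if the read actually crosses into the next block.

class Block:
    def __init__(self, payload, next_id=None):
        self.payload = payload
        self.next_id = next_id  # may point at a garbage-collected block

class LazyReader:
    """Resolves the next block only when advancing, never eagerly."""
    def __init__(self, blocks, start_id):
        self.blocks = blocks              # id -> Block; freed blocks absent
        self.current = blocks[start_id]

    def read_current(self):
        return self.current.payload

    def advance(self):
        # Only now is the next pointer dereferenced; if the data we need
        # is fully contained in the current block, we never get here.
        nid = self.current.next_id
        if nid is None or nid not in self.blocks:
            raise KeyError("next metadata block is missing or freed")
        self.current = self.blocks[nid]

# Block 2 was garbage collected, but block 1 still references it.
blocks = {1: Block("ROW GROUP 1", next_id=2)}
reader = LazyReader(blocks, start_id=1)
print(reader.read_current())  # reading within block 1 stays safe
```

An eager reader would resolve `next_id` on construction and fail immediately, even though the caller only needed block 1.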
commit b518b2aa0b06372d583fb203f5cae0011a53a87f
Author: Tishj <t_b@live.nl>
Date: Mon Nov 3 12:24:43 2025 +0100
enum util fix
commit 5f8cf7d7f81981f4b2355959257fa82982c3dd11
Merge: 407720a348 2cdc7f922b
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Mon Nov 3 12:22:52 2025 +0100
add vortex external extension (#19580)
commit 7c2353cb06d867813b7725f893a6b1092821c807
Author: Tishj <t_b@live.nl>
Date: Mon Nov 3 11:21:32 2025 +0100
differentiate between deprecated/not available yet in the check, to improve error reporting
commit c248313a1dd40f1569b608b80bdec1229de0b6b4
Author: Tmonster <tom@ebergen.com>
Date: Mon Nov 3 10:54:40 2025 +0100
bump iceberg
commit 407720a34804f0da61d5ba6645c3c44ec6ddf0d8
Merge: 7764771eaa d4fb98d454
Author: Mark <mark.raasveldt@gmail.com>
Date: Sun Nov 2 15:01:29 2025 +0100
Wal index deletes (#19477)
This adds support for buffering and replaying Index delete operations
for WAL replay. During WAL replay, index operations are buffered since
the Indexes are not bound yet. During Index binding, the buffered
operations are applied to the Index. UnboundIndex is modified to support
buffering delete operations on top of inserts.
`BoundIndex::ApplyBufferedAppends` is changed to `BoundIndex::ApplyBufferedReplays`, which supports replaying both inserts and deletes.
Documentation along relevant code paths is added which clarifies the
ordering of mapped_column_ids and the index_chunks being buffered.
Before, the mapping could be in any order since it only came from Index insert paths. Now, buffering can come from both insert and delete paths, so both need to buffer index chunks and the mappings in the same order (which is just the sorted order of the physical Index column IDs).
There is also a bug fix for buffering index data on a table with
generated columns, since the table chunk being created for replaying
buffered operations contained all column types previously, including
generated columns, whereas now it only contains the physical column
layout which is needed for index operations. (ART Index operations take
a chunk of data with only the index columns containing any data, and the
non-Indexed columns are empty).
A catch block is added to Transaction CleanupState::Flush, which was silently throwing away any failures (which is how this WAL replay bug was caught in the first place). Also, test coverage for ART duplicate rowids was added, along with a LookupInLeaf function that allows searching for a rowid in a Leaf that is either inlined or a gate node to a nested ART.
@taniabogatsch
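The buffer-then-replay scheme described in this commit can be sketched roughly as follows (hypothetical names; the real code buffers DataChunks and column-ID mappings, not plain row sets):

```python
# Sketch: during WAL replay, index operations are buffered on an
# unbound index; once the index is bound, the buffered inserts and
# deletes are replayed in their original arrival order.

class UnboundIndex:
    def __init__(self):
        self.buffered = []  # list of (op, rows), in arrival order

    def buffer_insert(self, rows):
        self.buffered.append(("insert", rows))

    def buffer_delete(self, rows):
        self.buffered.append(("delete", rows))

class BoundIndex:
    def __init__(self):
        self.rows = set()

    def apply_buffered_replays(self, buffered):
        # Inserts and deletes must replay in order; otherwise a delete
        # could be applied before the insert it targets.
        for op, rows in buffered:
            if op == "insert":
                self.rows.update(rows)
            else:
                self.rows.difference_update(rows)

unbound = UnboundIndex()
unbound.buffer_insert([1, 2, 3])
unbound.buffer_delete([2])
unbound.buffer_insert([4])

bound = BoundIndex()
bound.apply_buffered_replays(unbound.buffered)
print(sorted(bound.rows))  # [1, 3, 4]
```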
commit c6434fd89a7391e428f2cb31e6e3d676d5257b0d
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Sun Nov 2 14:54:33 2025 +0100
Fix lock order inversion
commit eb514c01e4ea4ad434fb87fde70307f64992d52a
Merge: 2f3d2db509 7764771eaa
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Sun Nov 2 09:45:34 2025 +0100
Merge branch 'v1.4-andium' into metadatareusefixes
commit 7764771eaa654cb44f5c731e99f5d989951aefb8
Merge: 9ea6e07a29 fc2bf610d0
Author: Mark <mark.raasveldt@gmail.com>
Date: Sun Nov 2 09:44:54 2025 +0100
Skip compiling remote optimizer test when TSAN Is enabled (#19590)
This test uses `fork` which seems to mess up the thread sanitizer,
causing strange errors to occur sporadically.
commit fc2bf610d0c9851d1e3f6ad273dcfb47b6ec60a6
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Sat Nov 1 23:27:43 2025 +0100
Skip compiling entirely
commit a68390e2b1a6f09b899d248881d331e5dbbab89a
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Sat Nov 1 23:23:18 2025 +0100
Skip fork test with tsan
commit 2f3d2db50968fd917f253c2c34cf488290dadfa4
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Sat Nov 1 15:27:51 2025 +0100
Avoid eagerly resolving the next on-disk pointer in the MetadataReader, as that pointer might not always be valid
commit 9ea6e07a290db878c9da097d407b3a866c43c8e0
Merge: 5f1ce8ba5c a740840f97
Author: Mark <mark.raasveldt@gmail.com>
Date: Sat Nov 1 09:25:59 2025 +0100
Fix edge case in uncompressed validity scan with offset and fix off-by-one in ArrayColumnData::Select (#19567)
This PR fixes an off-by-one in the consecutive-array-scan optimization
implemented in https://github.com/duckdb/duckdb/pull/16356 as well as an
edge case in our uncompressed validity data scan.
Fixes https://github.com/duckdb/duckdb/issues/19377
I can't figure out how to write a test for this; no matter what I do, I'm unable to replicate the same storage characteristics as the database file provided in the issue above.
In the repro we do a scan+skip+scan, where part of the first
`validity_t` in the second scan contains a bunch of zeroes at the
positions "before" the scan window that remain even after shifting. I've
solved it by setting all lower bits up to `result_idx` in the first
`validity_t` we scan, but not sure if this is the most elegant solution.
Strangely enough, if we remove all the bitwise logic and just use the same "fall-back" logic that is ifdef:ed for `VECTOR_SIZE < 128`, it all works, so the issue has to be in the bit manipulation.
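The fix described above (setting all lower bits up to `result_idx` in the first validity word) can be illustrated with plain bit manipulation; this is a sketch of the idea, not the actual DuckDB code:

```python
# Sketch: a validity word whose bits *before* the scan window hold
# stale zeroes. Setting all lower bits up to result_idx makes those
# out-of-window positions read as "valid" so they don't pollute the
# shifted scan result.

def mask_lower_bits(word, result_idx):
    # Set bits [0, result_idx) so positions before the window are ignored.
    lower = (1 << result_idx) - 1
    return word | lower

# Bits 0..7 are stale zeroes from before the window; bits 8.. are real.
word = 0xFFFF_FF00
masked = mask_lower_bits(word, 8)
assert masked & 0xFF == 0xFF      # stale positions now read as valid
assert masked >> 8 == word >> 8   # in-window bits are unchanged
```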
commit 5f1ce8ba5c0000770412b35a763af417f8fb2b90
Merge: be0142d4ee dbe272dff0
Author: Mark <mark.raasveldt@gmail.com>
Date: Sat Nov 1 09:22:00 2025 +0100
[v1.4-andium] Add Profiler output to logger interface (#19572)
This is https://github.com/duckdb/duckdb/pull/19546 backported to
`v1.4-andium` branch, see conversation there.
---
The idea is: if both the profiler and the logger are enabled, then you can access the profiler output via the logger as well.
This is on top of / independent of the current choices for where to output
the profiler (JSON / graphviz / query-tree / ...). While this might be
somewhat wasteful, it allows for an easier PR and leaves unopinionated
what the SQL interface should be. Also, given that the ToLog() call is inexpensive
(in particular if the logger is disabled), and that it's unclear whether the
logger alone can satisfy the profiler's necessities, I think going additive is
the best path here.
Demo:
```sql
ATTACH 'my_db.db';
USE my_db;
---- enable profiling to json file
PRAGMA profiling_output = 'profiling_output.json';
PRAGMA enable_profiling = 'json';
---- enable logging (to in-memory table)
call enable_logging();
----
CREATE TABLE small AS FROM range(100);
CREATE TABLE medium AS FROM range(10000);
CREATE TABLE big AS FROM range(1000000);
PRAGMA disable_profiling;
SELECT query_id, type, metric, value FROM duckdb_logs_parsed('Metrics') WHERE metric == 'CPU_TIME';
```
Will result in for example in:
```
┌──────────┬─────────┬──────────┬───────────────────────┐
│ query_id │ type │ metric │ value │
│ uint64 │ varchar │ varchar │ varchar │
├──────────┼─────────┼──────────┼───────────────────────┤
│ 10 │ Metrics │ CPU_TIME │ 8.1041e-05 │
│ 11 │ Metrics │ CPU_TIME │ 0.0002499510000000001 │
│ 12 │ Metrics │ CPU_TIME │ 0.02776677799999981 │
└──────────┴─────────┴──────────┴───────────────────────┘
```
A more complex example would be for example:
With the duckdb cli, execute:
```sql
PRAGMA profiling_output = 'metrics_folder/tmp_profiling_output.json';
PRAGMA enable_profiling = 'json';
CALL enable_logging(storage='file', storage_path='./metrics_folder');
--- arbitrary queries
CREATE TABLE small AS FROM range(100);
CREATE TABLE medium AS FROM range(10000);
CREATE TABLE big AS FROM range(1000000);
```
then close, restart duckdb cli, and query what's persisted in the
`metric_folder` folder:
```sql
PRAGMA disable_profiling;
CALL enable_logging(storage='file', storage_path='./metrics_folder');
SELECT queries.message, metrics.metric, TRY_CAST(metrics.value AS DOUBLE) as value
FROM duckdb_logs_parsed('QueryLog') queries,
duckdb_logs_parsed('Metrics') metrics
WHERE queries.query_id = metrics.query_id AND metrics.metric = 'CPU_TIME';
```
```
┌─────────────────────────────────────────────┬──────────┬─────────────────────────────────────┐
│ message │ metric │ TRY_CAST(metrics."value" AS DOUBLE) │
│ varchar │ varchar │ double │
├─────────────────────────────────────────────┼──────────┼─────────────────────────────────────┤
│ CREATE TABLE small AS FROM range(100); │ CPU_TIME │ 8.1041e-05 │
│ CREATE TABLE medium AS FROM range(10000); │ CPU_TIME │ 0.0002499510000000001 │
│ CREATE TABLE big AS FROM range(1000000); │ CPU_TIME │ 0.02776677799999981 │
└─────────────────────────────────────────────┴──────────┴─────────────────────────────────────┘
```
commit be0142d4ee0385262520ae2488e8dd11ac213735
Merge: b68a1696de 7df4151c0d
Author: Mark <mark.raasveldt@gmail.com>
Date: Sat Nov 1 09:21:19 2025 +0100
fix inconsistent behavior in remote read_file/blob, and prevent union… (#19531)
Closes https://github.com/duckdb/duckdb-fuzzer/issues/4208
Closes https://github.com/duckdb/duckdb/issues/19090
Our remote filesystem doesn't actually check that files exist when
"globbing" a non-glob pattern. Now we check that the file exists in the
read_blob/text function even if we just access the file name.
The diff is a bit bigger because I also moved a bunch of templated stuff into the cpp file.
commit 06df593c60bb22973642d776c1c3c3aca85ee0d6
Author: Tishj <t_b@live.nl>
Date: Fri Oct 31 15:26:18 2025 +0100
fix up tests
commit 2cdc7f922bde5550aa1ecd24dabf23b05fbf202b
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Fri Oct 31 15:10:31 2025 +0100
add vortex external extension
commit b68a1696de1a603b59e39efc25da7fc2826a3135
Merge: 8169d4f15c 9414882f7f
Author: Mark <mark.raasveldt@gmail.com>
Date: Fri Oct 31 13:45:16 2025 +0100
Release relevant tests to still be run on all builds (#19559)
I would propose, at least for the Linux builds, to add back a minimal
amount of tests also on release builds.
They will ensure at a minimum that:
* for a given release, the corresponding storage_version is valid
* for a minor release, that the corresponding name has been set
There are more tests that we might consider basic enough AND connected
to behaviour specific of a release that we might want to add to the
`release` tag.
Fixes https://github.com/duckdb/duckdb/issues/19354 (together with
https://github.com/duckdb/duckdb/pull/19525 that actually added the
name).
Note that given the current release process happens in advance, eventual
test failures are annoying but not fatal, though they will require changes
to code. I am not sure if it's worth having a `keep_going_in_all_cases`
option, basically turning the boolean into a set, but I think that can be
done when the need arises.
commit 8169d4f15cf556d0ca0ec68d9c876c2bb84aae09
Merge: d9028d09d5 6e2c195859
Author: Mark <mark.raasveldt@gmail.com>
Date: Fri Oct 31 13:44:30 2025 +0100
Fix race condition between `Append` and `Scan` (#19571)
Update `ColumnData::count` only after actually appending the data, to
avoid a race condition with `Scan`. See
https://github.com/duckdb/duckdb/issues/19570 for details.
commit d4fb98d45409bcaaf8c3030c7aa7e40b1f60b9d1
Merge: 0743b590d3 d9028d09d5
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Fri Oct 31 11:16:23 2025 +0100
Merge remote-tracking branch 'upstream/v1.4-andium' into wal-index-deletes
commit 0743b590d361041cc167f0634250f78c20f4d332
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Fri Oct 31 11:15:04 2025 +0100
remove C++ test, add extra interleaved index replay SQL test
commit 5ca334715faa6c871c8e96029c142aacf53969a7
Author: Tishj <t_b@live.nl>
Date: Fri Oct 31 10:43:03 2025 +0100
fix up tests
commit 0a6b5fb4919a8092b38e19051a9286eeaaeb392c
Merge: 0b1f0e320a d9028d09d5
Author: Tishj <t_b@live.nl>
Date: Fri Oct 31 10:38:56 2025 +0100
Merge branch 'v1.4-andium' into missing_from_clause_better_error
commit a740840f9772a1702a5ffeec43694c48be3526c5
Author: Max Gabrielsson <max@gabrielsson.com>
Date: Thu Oct 30 18:04:39 2025 +0100
fix consecutive array range calculation, fix validity scanning when bits before result offset are null
commit 6e2c195859a496f1f98c20fd887fac944ba0e344
Author: zhangxizhe <zhangxizhe.zxz@alibaba-inc.com>
Date: Fri Oct 31 13:43:19 2025 +0800
Update `ColumnData::count` only after actually appending the data, to
avoid a race condition with `Scan`. See issue #19570 for details.
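The ordering fix can be sketched like this (illustrative only; DuckDB's real code uses its own locking and versioning, not this hypothetical `Column` class):

```python
# Sketch: publish the new count only after the data is actually
# written, so a concurrent Scan never observes a count that exceeds
# the rows that exist.

import threading

class Column:
    def __init__(self):
        self.data = []
        self.count = 0              # readers trust this value
        self.lock = threading.Lock()

    def append(self, rows):
        with self.lock:
            self.data.extend(rows)   # 1) write the data first
            self.count += len(rows)  # 2) only then publish the count

    def scan(self):
        with self.lock:
            # Invariant: count <= len(data), so this slice is always valid.
            return self.data[: self.count]

col = Column()
col.append([10, 20, 30])
print(col.scan())  # [10, 20, 30]
```

Updating `count` before the write would open a window in which a scan reads rows that have not been appended yet.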
commit d9028d09d56640599dd8307dd9ae6c8837267e9f
Merge: 307f9b41ff 6bc51dd58e
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Fri Oct 31 08:47:10 2025 +0100
Disable jemalloc on BSD (#19560)
Fixes https://github.com/duckdb/duckdb/issues/14363
commit dbe272dff0a63d0d01269cee05945a0b016d219f
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Wed Oct 29 23:51:42 2025 +0100
Add Profiler output to logger interface
The idea is: if both the profiler and the logger are enabled, then you can access the profiler output via the logger as well.
This is on top of / independent of the current choices for where to output the profiler (JSON / graphviz / query-tree / ...).
While this might be somewhat wasteful, it allows for an easier PR and leaves unopinionated what the SQL
interface should be. Also, given that the ToLog() call is inexpensive (in particular if the logger is disabled), and that it's unclear whether the logger alone can satisfy
the profiler's necessities, I think going additive is the best path here.
Demo:
```sql
ATTACH 'my_db.db';
USE my_db;
---- enable profiling to json file
PRAGMA profiling_output = 'profiling_output.json';
PRAGMA enable_profiling = 'json';
---- enable logging (to in-memory table)
call enable_logging();
----
CREATE TABLE small AS FROM range(1000);
CREATE TABLE medium AS FROM range(1000000);
CREATE TABLE big AS FROM range(1000000000);
PRAGMA disable_profiling;
SELECT * EXCLUDE timestamp FROM duckdb_logs() WHERE type == 'Metrics' ORDER BY message.split(',')[1], context_id;
```
Will result in for example in:
```
┌────────────┬─────────┬───────────┬────────────────────────────────────────────────────────────┐
│ context_id │ type │ log_level │ message │
│ uint64 │ varchar │ varchar │ varchar │
├────────────┼─────────┼───────────┼────────────────────────────────────────────────────────────┤
│ 39 │ Metrics │ INFO │ {'metric': CHECKPOINT_LATENCY, 'value': 0.0} │
│ 44 │ Metrics │ INFO │ {'metric': CHECKPOINT_LATENCY, 'value': 0.0} │
│ 49 │ Metrics │ INFO │ {'metric': CHECKPOINT_LATENCY, 'value': 0.017832} │
│ 39 │ Metrics │ INFO │ {'metric': COMMIT_WRITE_WAL_LATENCY, 'value': 0.000305292} │
│ 44 │ Metrics │ INFO │ {'metric': COMMIT_WRITE_WAL_LATENCY, 'value': 0.003793958} │
│ 49 │ Metrics │ INFO │ {'metric': COMMIT_WRITE_WAL_LATENCY, 'value': 0.0} │
│ 39 │ Metrics │ INFO │ {'metric': CPU_TIME, 'value': 0.000110209} │
│ 44 │ Metrics │ INFO │ {'metric': CPU_TIME, 'value': 0.009471759999999997} │
│ 49 │ Metrics │ INFO │ {'metric': CPU_TIME, 'value': 8.241736770029297} │
│ · │ · │ · │ · │
│ · │ · │ · │ · │
│ · │ · │ · │ · │
│ 39 │ Metrics │ INFO │ {'metric': SYSTEM_PEAK_BUFFER_MEMORY, 'value': 36864} │
│ 44 │ Metrics │ INFO │ {'metric': SYSTEM_PEAK_BUFFER_MEMORY, 'value': 6625280} │
│ 49 │ Metrics │ INFO │ {'metric': SYSTEM_PEAK_BUFFER_MEMORY, 'value': 63510528} │
│ 39 │ Metrics │ INFO │ {'metric': TOTAL_BYTES_WRITTEN, 'value': 0} │
│ 44 │ Metrics │ INFO │ {'metric': TOTAL_BYTES_WRITTEN, 'value': 262144} │
│ 49 │ Metrics │ INFO │ {'metric': TOTAL_BYTES_WRITTEN, 'value': 12587008} │
│ · │ · │ · │ · │
│ · │ · │ · │ · │
│ · │ · │ · │ · │
├────────────┴─────────┴───────────┴────────────────────────────────────────────────────────────┤
│ 57 rows (? shown) 4 columns │
└───────────────────────────────────────────────────────────────────────────────────────────────┘
```
commit 307f9b41ff0464dba0e0f2504c75747c7ead2ecc
Merge: 1cba2e741b 08bf725300
Author: Mark <mark.raasveldt@gmail.com>
Date: Thu Oct 30 15:03:25 2025 +0100
[ported from main] Fix bug initializing std::vector for column names (#19555)
This 4-line fix was merged into main in #19444. It should be in
v1.4-andium as well so that it makes it into v1.4.2.
commit 1cba2e741b6622f5be156c061478a6fa66c0f819
Merge: ecb6bfe5b4 80554e4d59
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Thu Oct 30 14:47:58 2025 +0100
Bugfixes: Parquet JSON+DELTA_LENGTH_BYTE_ARRAY and sorting iterator (#19556)
This PR fixes an issue introduced in v1.4.1 with the Parquet reader when
combining a `JSON` column with `DELTA_LENGTH_BYTE_ARRAY` encoding. The
issue was caused by trying to validate an entire block of strings in one
go, which is OK for UTF-8, but not for JSON. This PR makes it so we validate
individual strings if the column has the `JSON` type.
Fixes https://github.com/duckdb/duckdb/issues/19366
This PR also fixes an issue with the new sorting code, which had an
error in the calculation of subtraction under modulo. I've fixed this,
and unified the code for `InMemoryBlockIteratorState` and
`ExternalBlockIteratorState` with some templating, so now the erroneous
calculation should be gone from both state types.
Fixes https://github.com/duckdb/duckdb/issues/19498
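The commit does not show the exact erroneous expression, but subtraction under modulo has a standard pitfall worth illustrating (a sketch, not the DuckDB sorting code):

```python
# Sketch: subtraction under modulo. In Python, (a - b) % m already
# yields a non-negative result, but with unsigned integers (as in C++)
# a - b wraps around when b > a; adding m before subtracting keeps the
# intermediate value non-negative in both worlds.

def sub_mod(a, b, m):
    # Safe form for unsigned arithmetic: (a + m - b) % m
    return (a + m - b) % m

assert sub_mod(2, 5, 8) == 5   # 2 - 5 is congruent to 5 (mod 8)
assert sub_mod(5, 2, 8) == 3
```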
commit 9414882f7fc81be58af0ec914cbe8c6045af3517
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Thu Oct 30 12:39:48 2025 +0100
Allow back basic tests also in release mode
commit 2987acd0d19656e583f30447a91852793ef188f7
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Thu Oct 30 12:36:32 2025 +0100
Add test on codename being registered, and tag it as release
commit 6bc51dd58edaf76725810b595a5300044749c0cf
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Thu Oct 30 13:24:45 2025 +0100
disable jemalloc BSD
commit 80554e4d592ec793676a80b180469a572a247f2a
Merge: 5974ef8c03 ecb6bfe5b4
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Thu Oct 30 09:57:58 2025 +0100
Merge branch 'v1.4-andium' into bugfixes_v1.4
commit 08bf725300335d34f05cd6f6f508f78ef57c477b
Author: Curt Hagenlocher <curt@hagenlocher.org>
Date: Fri Oct 17 14:08:52 2025 -0700
Fix bug initializing std::vector for column names
commit ecb6bfe5b483ffd1a2a490275b48ec91501680c4
Merge: 09a36d2f73 94471b8e04
Author: Hannes Mühleisen <227792+hannes@users.noreply.github.com>
Date: Thu Oct 30 09:01:41 2025 +0200
Follow up to staging move (#19551)
Follow up to #19539, CF does not like AWS regions
commit 94471b8e0472a2507623b2408808156f6ddde764
Author: Hannes Mühleisen <hannes@muehleisen.org>
Date: Thu Oct 30 07:49:34 2025 +0200
this region does not exist in cf
commit 09a36d2f73d1b2f93682e315761bb3c4973f8ac9
Merge: a23f54fb54 c2a4fc29dc
Author: Mark <mark.raasveldt@gmail.com>
Date: Wed Oct 29 21:51:05 2025 +0100
[Dev] Disable the use of `ZSTD` if the block_manager is the `InMemoryBlockManager` (#19543)
This PR fixes https://github.com/duckdblabs/duckdb-internal/issues/6319
This has to be done because the InMemoryBlockManager doesn't support
GetFreeBlockId, which is required by the zstd compression method.
I couldn't produce a test for this because I can't reproduce the problem
in the unittester, only in the CLI
(I assume the storage version prevents in-memory compression???)
commit c2a4fc29dceb617c80ab9156d84f2320add29542
Author: Tishj <t_b@live.nl>
Date: Wed Oct 29 16:37:20 2025 +0100
add test for disabled zstd compression in memory
commit 5974ef8c03afcd01df670a42dd7be0bbb2a6c6ff
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Wed Oct 29 16:34:54 2025 +0100
properly set file path in test
commit a35ba26f267eca2fb144e07b14706af2b96270a8
Author: Tishj <t_b@live.nl>
Date: Wed Oct 29 15:19:03 2025 +0100
disable the use of ZSTD if the block_manager is the InMemoryBlockManager, since it doesn't support GetFreeBlockId
commit fd85508aa0065a18180a6f9af1d4c66842b28964
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Wed Oct 29 15:08:06 2025 +0100
re-add missing initialization
commit a23f54fb54c686614cdaf547778b4c6f47bcbf5c
Merge: f2e48a73d4 ab586dfaf6
Author: Hannes Mühleisen <227792+hannes@users.noreply.github.com>
Date: Wed Oct 29 14:52:40 2025 +0200
Creating separate OSX cli binaries for each arch (#19538)
Also no longer adding the shared library three times because of symlinks
commit f2e48a73d42ce538706529e51aec54cfd9f96d84
Merge: 5a6521ca7e ccefe12386
Author: Hannes Mühleisen <227792+hannes@users.noreply.github.com>
Date: Wed Oct 29 14:51:26 2025 +0200
Moving staging to cf and uploading to install bucket (#19539)
This adds a custom endpoint for staging uploads so we can move to R2 for
this. We also add functionality to upload to the R2 bucket behind
`install.duckdb.org`. Once merged, I will update/add the following
secrets:
- `S3_DUCKDB_STAGING_ENDPOINT`
- `S3_DUCKDB_STAGING_ID`
- `S3_DUCKDB_STAGING_KEY`
- `DUCKDB_INSTALL_S3_ENDPOINT`
- `DUCKDB_INSTALL_S3_ID`
- `DUCKDB_INSTALL_S3_SECRET`
commit f5bc9796be79b602ed1892484e060f0e79083610
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Wed Oct 29 13:43:05 2025 +0100
nicer templating and less code duplication
commit ccefe12386007dd65fae1fe3ff1d65bcb45df44d
Author: Hannes Mühleisen <227792+hannes@users.noreply.github.com>
Date: Wed Oct 29 14:18:15 2025 +0200
Update .github/workflows/StagedUpload.yml
Co-authored-by: Carlo Piovesan <piovesan.carlo@gmail.com>
commit 41fc70ae3312599e425d140f7db770f56c2c5c38
Author: Hannes Mühleisen <227792+hannes@users.noreply.github.com>
Date: Wed Oct 29 14:00:41 2025 +0200
Update .github/workflows/StagedUpload.yml
Co-authored-by: Carlo Piovesan <piovesan.carlo@gmail.com>
commit e8c2d9401b580c64ef5d3cad3cb8d301375ddbd3
Author: Hannes Mühleisen <hannes@muehleisen.org>
Date: Wed Oct 29 12:35:30 2025 +0200
moving staging to cf and uploading to install bucket
commit 7df4151c0d4967e2dd33eff7f426805df3c56442
Author: Max Gabrielsson <max@gabrielsson.com>
Date: Wed Oct 29 10:58:22 2025 +0100
remove named parameters
commit ab586dfaf6bf58fa8376944e599c51efea462cb8
Author: Hannes Mühleisen <hannes@muehleisen.org>
Date: Wed Oct 29 11:46:18 2025 +0200
creating separate osx cli binaries for each arch
commit 8f30296d7c05c277771bf1fe95b73fafe7fa9d0f
Merge: 5dac9f7504 5a6521ca7e
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Wed Oct 29 09:39:30 2025 +0100
Merge remote-tracking branch 'upstream/v1.4-andium' into wal-index-deletes
commit 5a6521ca7e744205e4c3b67cab8708e2df87073b
Merge: 8c7210f9b0 601d68526c
Author: Mark <mark.raasveldt@gmail.com>
Date: Wed Oct 29 07:55:06 2025 +0100
Add test that either 'latest' or 'vX.Y.Z' are supported STORAGE_VERSIONs (#19527)
Connected to https://github.com/duckdb/duckdb/pull/19525; adds a test
that would have triggered there.
That test is not built when actually building releases, so it's not
fool-proof, but I think adding it is helpful.
Tested locally to behave as intended both on a dev commit (success) and
on a tag (fails, fixed via the linked PR).
commit 8c7210f9b0270517e1dba11502dc196a3f0cb13c
Merge: 7b5c16f2d5 99f26bde2d
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Oct 28 18:58:35 2025 +0100
add upcoming patch release to internal versions (#19525)
commit 7b5c16f2d51dda602c9ddfed58d71bb6ae3275a0
Merge: 23228babba 295603915b
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Oct 28 18:58:16 2025 +0100
Bump multiple extensions (#19522)
This PR bumps the following extensions:
- `avro` from `7b75062f63` to `93da8a19b4`
- `delta` from `03aaf0f073` to `0747c23791`
- `ducklake` from `f134ad86f2` to `2554312f71`
- `iceberg` from `4f3c5499e5` to `30a2c66f10`
- `spatial` from `a6a607fe3a` to `61ede09bec`
commit 23228babba519ec70b183b03ea6bc4457b3ed84c
Merge: 71a64b5ab4 6a38ac0f69
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Oct 28 18:58:00 2025 +0100
Bump: inet (#19526)
This PR bumps the following extensions:
- `inet` from `f6a2a14f06` to `fe7f60bb60 (patches removed: 1)`
commit 067d6eb0d5c56270f1d24951966191d9c12c3008
Author: Max Gabrielsson <max@gabrielsson.com>
Date: Tue Oct 28 17:33:43 2025 +0100
fix inconsistent behavior in remote read_file/blob, and prevent union_by_name from crashing
commit 601d68526c9e616ff08a0e08d949f00dcfb76060
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Tue Oct 28 13:11:45 2025 +0100
Add test that either 'latest' or 'vX.Y.Z' are supported STORAGE_VERSIONs
commit c63c5060d01340dc11f39349bf7950fb8eaa455b
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Tue Oct 28 15:55:12 2025 +0100
fix #19498
commit 7e52dc5a75532c5413088fbb9f90e6a30f9e5d14
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Tue Oct 28 15:54:56 2025 +0100
add missing test
commit 71a64b5ab4005fd2eb63cb3912403fde29f4d7e0
Merge: 76ee047ce4 3856fa8ea8
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Oct 28 14:30:18 2025 +0100
Support non-standard NULL in Parquet again (#19523)
https://github.com/duckdb/duckdb/pull/19406 removed support for the
non-standard NULL by adding the safe enum casts.
Support for this was explicitly added in
https://github.com/duckdb/duckdb/pull/11774
We could consider removing support for this - but it shouldn't be done
as part of a bug-fix release imo. This also currently breaks merging
v1.4 -> main.
commit 05fb1249cab3404bc396ccaee0cdb1959ae11481
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Tue Oct 28 14:19:50 2025 +0100
fix #19366
commit 5dac9f750490e1ea601b03d8e3d11db7a9cc0197
Merge: 0d4a78c90f 76ee047ce4
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Tue Oct 28 13:14:30 2025 +0100
Merge remote-tracking branch 'upstream/v1.4-andium' into wal-index-deletes
commit 0d4a78c90f6288abe842afab521ba1e7a075307f
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Tue Oct 28 13:12:44 2025 +0100
remove int types
commit 6a38ac0f699f2f85adda33d61c94c6ec054d89ca
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Oct 28 13:08:40 2025 +0100
bump extensions
commit 3cd616b89657c5489844d8a76d26169554e5af96
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Tue Oct 28 12:57:05 2025 +0100
PR review fixes + more C++ test coverage
commit 0fde0c573099c317b0710ed42d87864ee4b75c00
Merge: baa522991e 76ee047ce4
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Tue Oct 28 12:32:44 2025 +0100
Merge branch 'v1.4-andium' into bugfixes_v1.4
commit 99f26bde2d03e9958ac4bd37f5f8a0ac67b2fcd3
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Oct 28 12:07:39 2025 +0100
add upcoming patch release to internal versions
commit 3856fa8ea82bd8b9c11166102aab602ddf165ee2
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Tue Oct 28 11:19:35 2025 +0100
Support non-standard NULL in Parquet again
commit 295603915b0ab3a1532cbbe6cf9547f9803e3c46
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Oct 28 10:58:22 2025 +0100
bump extensions
commit c1d826f2523bd8454426ad7401665e8e69f9dadc
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Tue Oct 28 08:55:00 2025 +0100
unnamed name space
commit 76ee047ce45bab9472068ea360f9894a3a456a83
Merge: b62b03c4b3 bd3eb153b1
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Tue Oct 28 08:34:42 2025 +0100
Make `DatabaseInstance::…
Fixes duckdb/duckdb#18396
Related PR in core: duckdb/duckdb#19583
The checks of this PR can only run after duckdb/duckdb#19583 lands