Fix InsertRelation on attached database #155
Merged
evertlammerts merged 2 commits into duckdb:v1.4-andium from evertlammerts:insert_rel_fix on Nov 10, 2025
Conversation
Tishj reviewed on Nov 5, 2025
Tishj requested changes on Nov 5, 2025
Tishj (Collaborator) left a comment:
I have some questions, but I'm also missing a test
Force-pushed from 57d7a05 to c26e02d
Tishj previously approved these changes on Nov 6, 2025
Tishj (Collaborator) left a comment:
LGTM!
lnkuiper added a commit to duckdb/duckdb that referenced this pull request on Nov 7, 2025:
Fixes #18396 Related PR in duckdb-python: duckdb/duckdb-python#155
Force-pushed from c26e02d to 97fd6c6
Force-pushed from 97fd6c6 to 20bfd52
mach-kernel added a commit to spiceai/duckdb that referenced this pull request on Nov 14, 2025:
Squashed commit of the following:
commit 68d7555f68bd25c1a251ccca2e6338949c33986a
Merge: 3d4d568674 9c6efc7d89
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 11 11:59:30 2025 +0100
Fix minor crypto issues (#19716)
commit 3d4d568674d1e05d221e8326c0d180336c350f18
Merge: 7386b4485d 0dea05daf8
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 11 10:58:18 2025 +0100
Logs to be case-insensitive also at enable_logging callsite (#19734)
Currently `CALL enable_logging('http');` would succeed, but then select
an empty subset of the available logs (`http` != `HTTP`), due to a quirk
in the code. This PR fixes that up.
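The `http` vs `HTTP` mismatch can be sketched in Python (names here are illustrative, not DuckDB's internals): resolving the requested log type against the registered types case-insensitively means `'http'` selects the same logs as `'HTTP'`.

```python
# Hypothetical sketch of case-insensitive log-type resolution.
AVAILABLE_LOG_TYPES = {"HTTP", "QueryLog", "PhysicalOperator"}

def resolve_log_type(requested: str) -> str:
    # Compare both sides case-insensitively against the registered names.
    for registered in AVAILABLE_LOG_TYPES:
        if registered.casefold() == requested.casefold():
            return registered
    raise ValueError(f"structured_log_schema: '{requested}' not found")
```

With this lookup, `resolve_log_type("http")` returns the canonical `"HTTP"` rather than matching an empty subset.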
commit 7386b4485d23bc99c9f6efab6ce0e33ecc23222b
Merge: 1ef3444f09 d4a77c801b
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 11 09:28:13 2025 +0100
Add explicit Initialize(HTTPParam&) method to HTTPClient (#19723)
This allows explicit re-initialization of specific parts of HTTPClient(s).
This diff would allow patterns such as reusing partially constructed (but
properly re-initialized) HTTPClient objects:
```c++
struct CrossQueryState {
// in some state kept around
unique_ptr<HTTPClient>& client;
};
void SomeFunction() {
// ...
http_util.Request(get_request, client);
// some more logic, same query
http_util.Request(get_request, client);
}
void SomeOtherFunction() {
// Re-initialize part of the client, given some settings might have changed
auto http_params = HTTPParams(http_util);
client->Initialize(http_params);
// ...
http_util.Request(get_request, client);
// some more logic, same query
http_util.Request(get_request, client);
}
```
Note that this PR is fully opt-in for users, while if you implement a
file-system abstraction inheriting from HTTPClient you should get a
compiler error pointing you to implement the relevant function.
commit 9c6efc7d89ee5ca60598c7e43778c0e9b34b266b
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 11 08:09:02 2025 +0100
Fix typo
commit e52f71387731da1202fc33755922999a472218a1
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 11 08:08:32 2025 +0100
Add require to test
commit 1ef3444f09b1df6e4a7cc3ad1d67868ecaa1a6a4
Merge: 8090b8d52e dff5b7f608
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 11 08:07:17 2025 +0100
Bump the Postgres scanner extension (#19730)
commit 0dea05daf823237a2de28ec7c0fec53dbb006475
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Tue Nov 11 06:42:36 2025 +0100
Logs to be case-insensitive also at enable_logging callsite
commit 8090b8d52ed6bfd31b72013f6800cea89539cc2f
Merge: 6667c7a3ec 5e9f88863f
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 10 21:34:42 2025 +0100
[Dev] Fix assertion failure for empty ColumnData serialization (#19713)
The `PersistentColumnData` constructor asserts that the pointers aren't
empty.
This assertion will fail if we try to serialize the child of a list when
all lists are empty (as the child will then be entirely empty).
Backported fix for problem found by: #19674
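A minimal Python sketch of the guard described above (hypothetical names; the real fix is in DuckDB's C++ serialization code): the persistent data is only constructed when data pointers exist, since the child of all-empty lists has no segments and hence no pointers.

```python
def persist_column(data_pointers):
    # Conditionally create the persistent column data: a list column whose
    # lists are all empty has a child with no segments, so no data pointers.
    if not data_pointers:
        return None
    # Mirrors the constructor's assertion that the pointers aren't empty.
    assert len(data_pointers) > 0
    return {"pointers": list(data_pointers)}
```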
commit 6667c7a3ecdc56cc144a9bcf8601001af66e6839
Merge: 3f0ad6958f 4a0f4b0b38
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 10 21:32:58 2025 +0100
Bump httpfs and resume testing on Windows (#19714)
commit dff5b7f608b732a0e7c5d9a68e7e8d7db3c48478
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Mon Nov 10 21:31:46 2025 +0100
Bump the Postgres scanner extension
commit 0e3d0b5af535fcde90d272d95b1d08cb5fb12d15
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 21:26:43 2025 +0100
remove deleted file from patch
commit ffb7be7cc5f27d9945d6868f76ef769a3f8a43d4
Merge: 2142f0b10d 3f0ad6958f
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 21:13:17 2025 +0100
Merge branch 'v1.4-andium' into fix-crypto-issue
commit 2142f0b10db72b89c9101fa65ead619182f8e5d1
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 20:55:18 2025 +0100
fix duplicate job id
commit 0a225cb99a130c2b1635d6ced03bc37f01ff9436
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 20:52:40 2025 +0100
fix ci for encryption
commit 3f0ad6958f1952a083bc499fc147f69504a3c6d2
Merge: f3fb834ef7 a1eeb0df6f
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 10 20:09:11 2025 +0100
Fix #19700: correctly sort output selection vector in nested selection operations (#19718)
Fixes #19700
This probably should be maintained during the actual select - but for
now just sorting it afterwards solves the issue.
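The workaround can be sketched in Python (illustrative only, not DuckDB's vector code): a nested selection may emit row indices out of order, so the output selection vector is sorted before being handed onward.

```python
def nested_select(sel, values, predicate):
    # A nested selection operates on an already-selected set of rows and,
    # depending on evaluation order, may produce indices out of order
    # (simulated here by iterating the input selection in reverse).
    out = [i for i in reversed(sel) if predicate(values[i])]
    out.sort()  # the fix: sort the output selection vector afterwards
    return out
```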
commit f3fb834ef7153b90ef3908eb51a5b85efa580ca5
Merge: 7333a0ae84 c8ddca6f3c
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 10 20:09:03 2025 +0100
Fix #19355: correctly resolve subquery in MERGE INTO action condition (#19720)
Fixes #19355
commit 7333a0ae84d51729fffe91e67f12c3cee526af2a
Merge: 95fcb8f188 6595848a27
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 10 16:46:31 2025 +0100
Bump: delta, ducklake, httpfs (#19715)
This PR bumps the following extensions:
- `delta` from `0747c23791` to `6515bb2560`
- `ducklake` from `022cfb1373` to `77f2512a67`
- `httpfs` from `b80c680f86` to `041a782b0b`
commit 35f98411037cb0499e236d0cbe20d6b3a0dcc43f
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 14:51:29 2025 +0100
install curl
commit d4a77c801bb1a88e634c12bc64e185ef2f147d2d
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Mon Nov 10 14:37:42 2025 +0100
Add explicit Initialize(HTTPParams&) method to HTTPClient
This allows explicit re-initialization of specific parts of HTTPClient(s)
commit 6595848a27bd7fb271c63a99551d8326417320dd
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 11:30:34 2025 +0100
bump extensions
commit 7a7726214c86267d476a2edbc68656ebd6253fe8
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Mon Nov 10 11:28:32 2025 +0100
fix: ci issues
commit 4a0f4b0b38b9d5660c8a5c848d8a1c71bc3220de
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Mon Nov 10 11:07:58 2025 +0100
Bump httpfs and resume testing on Windows
commit 5e9f88863f5f519620ae01f4ff873f6a2869343f
Author: Tishj <t_b@live.nl>
Date: Mon Nov 10 10:58:03 2025 +0100
conditionally create the PersistentColumnData, if there are no segments (as could be the case for a list's child), there won't be any data pointers
commit 95fcb8f18819b1a77df079a7fcb753a8c2f52844
Merge: 396c86228b 4f3df42f20
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Mon Nov 10 10:50:38 2025 +0100
Bump: aws, ducklake, httpfs, iceberg (#19654)
This PR bumps the following extensions:
- `aws` from `18803d5e55` to `55bf3621fb`
- `ducklake` from `2554312f71` to `022cfb1373`
- `httpfs` from `8356a90174` to `b80c680f86`
- `iceberg` from `5e22d03133` to `db7c01e92`
commit c8ddca6f3c32aa0d3a9536371f9e3ca8cb00753e
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Mon Nov 10 09:19:31 2025 +0100
Fix #19355: correctly resolve subquery in MERGE INTO action condition
commit a1eeb0df6ffc2f129638a2dfaab9a70720c8db1b
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Mon Nov 10 09:00:35 2025 +0100
Fix #19700: correctly sort output selection vector in nested selection operations
commit 396c86228bda46929560affde7effdbab7d4e905
Merge: e3d242509e e501fcbd1a
Author: Mark <mark.raasveldt@gmail.com>
Date: Sat Nov 8 17:34:13 2025 +0100
Add missing query location to blob cast (#19689)
commit e3d242509e5710314921a0d7debd0bedb4d10a3e
Merge: 7ce99bc041 1ba198d711
Author: Mark <mark.raasveldt@gmail.com>
Date: Sat Nov 8 17:34:04 2025 +0100
Add request timing to HTTP log (#19691)
Demo:
```SQL
D call enable_logging('HTTP');
D from read_csv_auto('s3://duckdblabs-testing/test.csv');
D select request.type, request.url, request.start_time, request.duration_ms from duckdb_logs_parsed('HTTP');
┌─────────┬────────────────────────────────────────────────────────────────┬───────────────────────────────┬─────────────┐
│ type │ url │ start_time │ duration_ms │
│ varchar │ varchar │ timestamp with time zone │ int64 │
├─────────┼────────────────────────────────────────────────────────────────┼───────────────────────────────┼─────────────┤
│ HEAD │ https://duckdblabs-testing.s3.us-east-1.amazonaws.com/test.csv │ 2025-11-07 10:17:56.052202+00 │ 417 │
│ GET │ https://duckdblabs-testing.s3.us-east-1.amazonaws.com/test.csv │ 2025-11-07 10:17:56.478847+00 │ 104 │
└─────────┴────────────────────────────────────────────────────────────────┴───────────────────────────────┴─────────────┘
```
commit ae518d0a4e439f80c768388fab8f51d667f7e4b7
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Fri Nov 7 13:58:22 2025 +0100
minor ci fixes
commit e501fcbd1af58cf147b80051b38ddf815d5e1b8c
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Fri Nov 7 12:37:21 2025 +0100
move
commit bc1a683d10150dfe15f2f4d69e505f6337c4fc27
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Fri Nov 7 11:22:35 2025 +0100
only load httpfs if necessary
commit 1ba198d71106a851fb8234ccfb208ec66b0e1d17
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Fri Nov 7 11:16:38 2025 +0100
fix: check if logger exists
commit f22e9a06ef6e1b6c999e8c7389b05e40ae9032fc
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Fri Nov 7 11:13:11 2025 +0100
add test for http log timing
commit f474ba123485377f94e5b57600fb720733050c98
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Fri Nov 7 11:00:19 2025 +0100
add http timings to logger
commit 02bb5d19b9fc7a702184ffcf7d9688b88f54071a
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Fri Nov 7 09:30:04 2025 +0100
Add query location to blob cast
commit 7ce99bc04130615dfc3a39dfb79177a8942fefba
Merge: 1555b0488e aea843492d
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Fri Nov 7 09:22:48 2025 +0100
Fix InsertRelation on attached database (#19583)
Fixes https://github.com/duckdb/duckdb/issues/18396
Related PR in duckdb-python:
https://github.com/duckdb/duckdb-python/pull/155
commit 1555b0488e322998e6fd06cc47e1909c7bb4eba4
Merge: 783f08ffd8 98e2c4a75f
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Fri Nov 7 08:31:05 2025 +0100
Log total probe matches in hash join (#19683)
This is usually evident from the number of tuples coming out of a join,
but it can be hard to understand what's going on when doing a
`LEFT`/`RIGHT`/`OUTER` join. This PR adds one log call at the end of the
hash join to report how many probe matches there were.
```sql
D CALL enable_logging('PhysicalOperator');
┌─────────┐
│ Success │
│ boolean │
├─────────┤
│ 0 rows │
└─────────┘
D SELECT count(*)
FROM range(3_000_000) t1(i)
LEFT JOIN range(1_000_000, 2_000_000) t2(i)
USING (i);
┌────────────────┐
│ count_star() │
│ int64 │
├────────────────┤
│ 3000000 │
│ (3.00 million) │
└────────────────┘
D CALL disable_logging();
┌─────────┐
│ Success │
│ boolean │
├─────────┤
│ 0 rows │
└─────────┘
D SELECT info.total_probe_matches::BIGINT total_probe_matches
FROM duckdb_logs_parsed('PhysicalOperator')
WHERE class = 'PhysicalHashJoin' AND event = 'GetData';
┌─────────────────────┐
│ total_probe_matches │
│ int64 │
├─────────────────────┤
│ 1000000 │
│ (1.00 million) │
└─────────────────────┘
```
Here we are able to see that the hash join produced 1M matches, but
emitted 3M tuples.
commit 783f08ffd89b1d1290b2d3dec0b3ba12d8c233bf
Merge: 6c6af22ea4 1d5c9f5f3d
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Thu Nov 6 15:57:35 2025 +0100
Fixup linking for LLVM (#19668)
See conversation at https://github.com/llvm/llvm-project/issues/77653
This allows again:
```
brew install llvm
CMAKE_LLVM_PATH=/opt/homebrew/Cellar/llvm/21.1.5 GEN=ninja make
```
to just work.
Arguably a very limited use case, but it may as well be fixed.
commit 6c6af22ea45effc67dc9e76feec3fb73208750bb
Merge: 2892abafa7 f483e95d1c
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Thu Nov 6 15:56:49 2025 +0100
Categorize ParseLogMessage as CAN_THROW_RUNTIME_ERROR (#19672)
Currently we rely on the filter on query type AND the execution of the scalar
function `parse_duckdb_log_message` not being reordered.
This is somewhat brittle, and I have found cases locally where this causes
problems resulting in wrong casts, such as:
```
Conversion Error:
Type VARCHAR with value 'ColumnDataCheckpointer FinalAnalyze(COMPRESSION_UNCOMPRESSED) result for main.big.0(VALIDITY): 15360' can't be cast to the destination type STRUCT(metric VARCHAR, "value" VARCHAR)
```
Looking at the executed plan, it would look like:
```
┌─────────────┴─────────────┐
│ FILTER │
│ ──────────────────── │
│ ((type = 'Metrics') AND │
│ (struct_extract │
│ (parse_duckdb_log_message(│
│ 'Metrics', message), │
│ 'metric') = 'CPU_TIME')) │
│ │
│ ~0 rows │
└─────────────┬─────────────┘
```
Tagging `parse_duckdb_log_message` as potentially throwing on some inputs
avoids reordering, which avoids the problem while improving the usability
of logs.
An alternative solution would be to use an explicit DefaultTryCast (instead of
TryCast) at
https://github.com/duckdb/duckdb/blob/v1.4-andium/src/function/scalar/system/parse_log_message.cpp#L70;
either approach would solve the problem.
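The reordering hazard can be illustrated with a toy Python model (not DuckDB's optimizer): when a parse that can fail is hoisted ahead of the type guard, rows the guard would have excluded reach the failing expression.

```python
rows = [
    {"type": "Metrics", "message": "CPU_TIME: 12"},
    {"type": "QueryLog", "message": "SELECT 1"},  # not a metrics message
]

def parse_metrics(message: str) -> dict:
    # Fails on messages that don't have the metrics shape.
    metric, sep, value = message.partition(": ")
    if not sep:
        raise ValueError("can't be cast to the destination type")
    return {"metric": metric, "value": value}

# Guard first (the intended order): the type check short-circuits, so
# parse_metrics never sees the QueryLog row.
matches = [r for r in rows
           if r["type"] == "Metrics"
           and parse_metrics(r["message"])["metric"] == "CPU_TIME"]

# A plan that evaluated parse_metrics on every row before the guard would
# raise on the QueryLog row; marking the function as able to throw keeps
# the optimizer from moving it ahead of the guard.
```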
commit 98e2c4a75f816eae6ef2893bbb581c9913293f2a
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Thu Nov 6 15:35:25 2025 +0100
log total probe matches in hash join
commit 2892abafa772fffc4402e5125cf16a26c094cb44
Merge: ecc73b2b4b 488069ec8d
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Thu Nov 6 14:21:05 2025 +0100
duckdb_logs_parsed to do case-insensitive matching (#19669)
This is something @Tmonster and I bumped into while helping a customer
debug an issue.
I think it's more intuitive and friendly for user-facing functions to be
case-insensitive, given that is the general user expectation around SQL.
I am not sure `ILIKE` is the best way to do so (an alternative would be
filtering on `lower(1) = lower(2)`).
Note that passing `%` signs is currently checked elsewhere, for example:
```sql
SELECT message FROM duckdb_logs_parsed('query%') WHERE starts_with(message, 'SELECT 1');
```
would throw
```
Invalid Input Error: structured_log_schema: 'query%' not found
```
(while `querylog` already works, see test case, since a case-insensitive
comparison was already used there)
commit aea843492da3f40c30e6e88c12eb6da690348f2e
Author: Evert Lammerts <evert.lammerts@gmail.com>
Date: Thu Nov 6 11:40:11 2025 +0100
review feedback
commit 094a54b890a2466aad743b1c372809849cdef283
Author: Evert Lammerts <evert.lammerts@gmail.com>
Date: Sat Nov 1 11:22:34 2025 +0100
Fix InsertRelation on attached database
commit 4f3df42f208d5e6dc602d2e688911ef13758d3aa
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Thu Nov 6 11:31:58 2025 +0100
bump iceberg further
commit f483e95d1c3983c2ba5758ebba1272f7ff12cd0d
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Fri Oct 31 12:25:01 2025 +0100
Improve tests using now working FROM duckdb_logs_parsed()
commit 6554c84a73b6c7857d2ec5ebf6f2019ceb56e6dc
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Tue Nov 4 12:56:31 2025 +0100
parse_logs_message might throw
commit 488069ec8d726d3b19093e8d57101c6c6af8910b
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Thu Nov 6 09:29:49 2025 +0100
duckdb_logs_parsed to do case-insensitive matching
commit 1d5c9f5f3d18c73e27b0bc4353d549680c5c82d5
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Thu Nov 6 09:13:41 2025 +0100
Fixup linking for LLVM
See conversation at https://github.com/llvm/llvm-project/issues/77653
commit ecc73b2b4b10beb175968e55e24e69241d00df1b
Merge: 2d69f075ee 4cb677238f
Author: Mark <mark.raasveldt@gmail.com>
Date: Thu Nov 6 08:58:09 2025 +0100
Always remember extra_metadata_blocks when checkpointing (#19639)
This is a follow-up to https://github.com/duckdb/duckdb/pull/19588,
adding the following:
- Re-enables block verification in a new test configuration. It further
adds new checks to ensure that the metadata blocks that the RowGroup
references after checkpointing correspond to those that it would see if
it were to reload them from disk. This verification would have caught
the issue addressed by https://github.com/duckdb/duckdb/pull/19588
- Adds a small tweak in `MetadataWriter::SetWrittenPointers`. This
ensures that the table writer does not track an `extra_metadata_block`
that did not ever receive any writes as part of that rowgroup (as it
immediately skipped to the next block when calling
`writer.GetMetaBlockPointer()` after `writer.StartWritingColumns`). With
the added verification, not having this tweak fails e.g. the following
test:
```
test/sql/storage/compression/bitpacking/bitpacking_compression_ratio.test_slow
CREATE TABLE test_bitpacked AS SELECT i//2::INT64 AS i FROM range(0, 120000000) tbl(i);
================================================================================
TransactionContext Error: Failed to commit: Failed to create checkpoint because of error: Reloading blocks just written does not yield same blocks: Written: {block_id: 2 index: 32 offset: 0}, {block_id: 2 index: 33 offset: 8},
Read: {block_id: 2 index: 33 offset: 8},
Read Detailed: {block_id: 2 index: 33 offset: 8},
Start pointers: {block_id: 2 index: 33 offset: 8},
Metadata blocks: {block_id: 2 index: 32 offset: 0},
```
- Ensures that we always update `extra_metadata_blocks` after
checkpointing a rowgroup. This speeds up subsequent checkpoints
significantly. Right now, if you have a large legacy database, and don't
update these old rowgroups, this field is kept as is, and every
checkpoint needs to recompute it (even if the database isn't reloaded).
Making sure we always have `RowGroup::has_metadata_blocks == true` after
each checkpoint, even in case of metadata reuse, will both benefit
checkpointing for databases in old storage formats, as well as when
starting to use newer storage format on large legacy databases.
- Only tangentially related to the issue / PR, but while debugging I
noticed that the `deletes_is_loaded` variable is not correctly
initialized in all RowGroup constructors (can also be triggered with the
assertion I added in `RowGroup::HasChanges()`)
commit 46028940c8e429739e73f4d345ec3cab5eb5b01c
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Wed Nov 5 19:33:58 2025 +0100
bump extension entries
commit 2d69f075ee91c42ad4fe4208a4d1f06d0034faff
Merge: 7043621a83 e3fb2eb884
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Wed Nov 5 15:27:27 2025 +0100
Enable running all extensions tests as part of the build step (#19631)
This is enabled via
https://github.com/duckdb/extension-ci-tools/pull/278, which introduced a
way to hook into running tests for all extensions of a given
configuration (as opposed to a single one).
Also few minor fixes I bumped into:
* disable unused platforms from the external extension builds
* remove `[persistence]` tests to be always run
* enable `vortex` tests
* avoid `httpfs` tests on Windows, to be reverted in a follow up
commit 4cb677238f7f4ad4d747f1a1045396fd74765724
Merge: b48cd982e0 7043621a83
Author: Yannick Welsch <yannick@welsch.lu>
Date: Wed Nov 5 14:59:47 2025 +0100
Merge remote-tracking branch 'origin/v1.4-andium' into yw/metadata-reuse-tweaks
commit b48cd982e0c59a03cf78a37175ba7272438c2525
Author: Yannick Welsch <yannick@welsch.lu>
Date: Wed Nov 5 14:59:34 2025 +0100
newline
commit 490411ab5ae614064e3e4fa94f631dcbbeea68d8
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Wed Nov 5 13:55:19 2025 +0100
fix: add more places to securely clear key from memory
commit e3fb2eb8843f9ff90ad29fd69938ee6961b644dc
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Wed Nov 5 11:07:40 2025 +0100
Avoid testing httpfs on Windows (fix incoming)
commit e719c837851f016ea614b28380685de8794ccf39
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Wed Nov 5 11:04:57 2025 +0100
Revert "Add ducklake tests"
This reverts commit b77a9615117de845fa48463f09be20a89dea7434.
commit 4242618a8d43c2004f55b27b63535ad979302e92
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Wed Nov 5 11:03:48 2025 +0100
only autoload if crypto util is not set
commit 19232fc414dc7f861dcbad788ba5466d10c27a67
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Wed Nov 5 10:14:12 2025 +0100
bump extensions
commit 7043621a83d1be17ba6b278f0f7a3ec65df98d93
Merge: db845b80c7 3584a93938
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Wed Nov 5 09:18:39 2025 +0100
Bump MySQL scanner (#19643)
Updating the MySQL scanner to include the time zone handling fix to
duckdb/duckdb-mysql#166.
commit db845b80c76452054e26cf7a2d715769592de925
Merge: f50618b48c 7eccc643ae
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Wed Nov 5 09:15:52 2025 +0100
Remove `FlushAll` from `DETACH` (#19644)
This was initially added to reduce RSS after `DETACH`ing, but it is now
creating a large bottleneck for workloads that aggressively
`ATTACH`/`DETACH`. RSS will be freed by further allocation activity, or
when `SET allocator_background_threads=true;` is enabled.
commit 4978ccd8ec15e7631fd9ed741d338da663b0ff48
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Nov 4 16:34:16 2025 +0100
fix: add patch file
commit 6ec168d508d9395306b29c62cb0b163b6a77bafb
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Nov 4 16:13:18 2025 +0100
format
commit 67ec072c0ea6a237213f680709773e1342b11065
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Nov 4 15:59:04 2025 +0100
fix: tests
commit 7eccc643ae57a76a49e61b905f9a9a1857a00084
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Tue Nov 4 15:47:29 2025 +0100
remove flush all from detach
commit 3584a93938a4852b0510b0c3d6b3bb13861c4147
Author: Alex Kasko <alex@staticlibs.net>
Date: Tue Nov 4 14:33:21 2025 +0000
Bump MySQL scanner
Updating the MySQL scanner to include the time zone handling fix to
duckdb/duckdb-mysql#166.
commit 250b917ed6f423b56efbd855b2359a498fe2ef8d
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Nov 4 14:41:32 2025 +0100
fix: various issues with encryption
commit f50618b48c3dd04f77ae557e3bb4863f96f74a76
Merge: 66100df7ae 8257973295
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 4 14:26:16 2025 +0100
Fix #19455: correctly extract root table in merge into when running a join that contains single-sided predicates that are transformed into filters (#19637)
Fixes #19455
commit 82579732952d68dec2b2a44cc1ca04243ac57151
Merge: 6efd4a4fde 66100df7ae
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Tue Nov 4 14:25:42 2025 +0100
Merge branch 'v1.4-andium' into mergeintointernalerror
commit 66100df7aeb321d37f2434416df59dc274948987
Merge: d54d36faae c53eb7a562
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 4 14:24:10 2025 +0100
Detect invalid merge into action and throw exception (#19636)
`WHEN NOT MATCHED (BY TARGET)` cannot be combined with `DELETE` or
`UPDATE`, since there are no rows in the target table to delete or
update. This PR ensures we throw an error when this is attempted.
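A sketch of the validation in Python (hypothetical names, not DuckDB's binder): reject actions that would modify target rows in a clause where, by definition, no target row matched.

```python
def validate_merge_action(when_clause: str, action: str) -> None:
    # WHEN NOT MATCHED (BY TARGET) means no target row exists, so there is
    # nothing to UPDATE or DELETE; only INSERT (or DO NOTHING) makes sense.
    if when_clause == "WHEN NOT MATCHED BY TARGET" and action in {"UPDATE", "DELETE"}:
        raise ValueError(f"{action} is not allowed in a {when_clause} clause")

validate_merge_action("WHEN NOT MATCHED BY TARGET", "INSERT")  # accepted
```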
commit ca88f5b2cf9480ac8e57f436fbc89d327d19422a
Author: Yannick Welsch <yannick@welsch.lu>
Date: Tue Nov 4 10:57:57 2025 +0100
Use reserve instead
commit 133a15ee61a64a831de46e4407f38d8bdd7b71f5
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Tue Nov 4 10:45:20 2025 +0100
Move also [persistence] tests back under ENABLE_UNITTEST_CPP_TESTS
commit eb322ce251b5c4347650afc455171d862c51bf34
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Tue Nov 4 10:41:40 2025 +0100
Switch from running on PRs wasm_mvp to wasm_eh
commit 9c5f82fa358fcf236cff21499351c1e739ca032a
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Tue Nov 4 10:40:15 2025 +0100
Currently no external extension works on wasm or windows or musl
To be expanded once that changes
commit d54d36faae00120f548b39d1e21d93ca25f17087
Merge: 97fdeddb2b c01c994085
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Tue Nov 4 09:03:51 2025 +0100
Bump: spatial (#19620)
This PR bumps the following extensions:
- `spatial` from `61ede09bec` to `d83faf88cd`
commit 6efd4a4fde180bf7d9c433977921818e5465c92a
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Tue Nov 4 08:13:56 2025 +0100
Fix #19455: correctly extract root table in merge into when running a join that contains single-sided predicates that are transformed into filters
commit c53eb7a56266157f0e9d97bd91be0d36285ec38b
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Tue Nov 4 08:01:24 2025 +0100
Detect invalid merge into action and throw exception
commit 97fdeddb2bd5c34862afd30177c9184f51f6dccd
Merge: a0a46d6ed0 87193fd5ab
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 4 07:48:43 2025 +0100
Try to prevent overshooting of `FILE_SIZE_BYTES` by pre-emptively increasing bytes written in Parquet writer (#19622)
Helps with #19552, but doesn't fully fix the problem. We should look
into a more robust fix for v1.5.0, but not for a bugfix release
commit a0a46d6ed06dd962a4d6eeb01f3e14f8b275cec4
Merge: 73c0d0db15 3838c4a1ed
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Nov 4 07:48:27 2025 +0100
Increase cast-cost of old-style implicit cast to string (#19621)
This PR fixes https://github.com/duckdb/duckdb-python/issues/148
The issue is that `list_extract` now has two overloads, one for a
templated list `LIST<T>` and one for concrete `VARCHAR` inputs. When binding
a function we add a really high cost to selecting a templated overload
to ensure we always pick something more specific if available. With our
current casting rules, we are unable to cast `VARCHAR[]` to `VARCHAR`,
and therefore fall back to the list template as expected. But with
old-style casting rules we allow `VARCHAR[]` to `VARCHAR` by also adding
a high cost penalty, but it's still lower than the cost of casting to the
template - even though that would be the better alternative.
With old-style casting we basically always have a lower-cost "fallback"
option than selecting a template overload. While we should overhaul our
casting system to evaluate the cast cost along more axes than just
"score", this PR fixes this specific case by just cranking up the cost
of old-style implicit to-string casts.
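A toy cost model in Python (numbers and names are illustrative, not DuckDB's actual binder costs): binding picks the overload with the lowest total cast cost, and templates carry a large penalty so concrete overloads win whenever any cast path to them exists.

```python
TEMPLATE_PENALTY = 1_000_000  # high cost discourages templated overloads

def bind(arg_type, overloads, to_string_cost):
    # Pick the overload with the lowest cast cost for the argument type.
    costs = {}
    for name, param in overloads.items():
        if param == "LIST<T>":
            costs[name] = TEMPLATE_PENALTY
        elif param == "VARCHAR" and arg_type == "VARCHAR[]":
            costs[name] = to_string_cost  # old-style implicit to-string cast
    return min(costs, key=costs.get)

overloads = {"list_extract(LIST<T>, BIGINT)": "LIST<T>",
             "list_extract(VARCHAR, BIGINT)": "VARCHAR"}

# With a to-string cost below the template penalty the wrong overload wins;
# the fix cranks the to-string cost above the penalty so the template wins.
```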
commit c01c99408526b3c0d698028083481301af069824
Author: Max Gabrielsson <max@gabrielsson.com>
Date: Mon Nov 3 22:42:37 2025 +0100
extension entries
commit b77a9615117de845fa48463f09be20a89dea7434
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Mon Nov 3 17:35:39 2025 +0100
Add ducklake tests
commit bd58abcdfb4485a1a9dbb750bd0587803fd1c559
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Mon Nov 3 17:35:15 2025 +0100
Load vortex tests
commit 62fe1bff77a60fd690b9911aa7a38b7bc197f865
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Mon Nov 3 22:08:43 2025 +0100
Pass down extensions_test_selection -> complete
commit e2604e6f5259453f482e0c49ca10520e89ddf269
Author: Yannick Welsch <yannick@welsch.lu>
Date: Mon Nov 3 19:18:47 2025 +0100
Always has_metadata_blocks after checkpoint
commit 73c0d0db15621d3d1c2936816becf27e2c41e2ab
Merge: 286924e634 b518b2aa0b
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 3 18:24:26 2025 +0100
Improve error message around compression type deprecation/availability checks (#19619)
This PR fixes https://github.com/duckdblabs/duckdb-internal/issues/6436
The old code kept only a list of "deprecated" types, and returned a
boolean, losing the context whether the compression type was available
at one point and is now deprecated OR is newly introduced and not
available yet in the storage version that is currently used.
commit 0e5a33dae35aab5209a8e959cf48d7525fa7ec8d
Author: Yannick Welsch <yannick@welsch.lu>
Date: Thu Oct 30 19:28:54 2025 +0100
Verify blocks
commit 286924e6348723138ca4dfd55b749d847bce59a9
Merge: 535f905874 c248313a1d
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 3 17:12:32 2025 +0100
bump iceberg (#19618)
commit 87193fd5abf342d6ddce9d984e69007a4ccdc7d2
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Mon Nov 3 14:43:08 2025 +0100
try to prevent overshooting by pre-emptively increasing write size
commit 3838c4a1edd83dc1373b6077dc6ee478bb996e50
Author: Max Gabrielsson <max@gabrielsson.com>
Date: Mon Nov 3 13:55:53 2025 +0100
increase fallback string cast cost
commit 535f90587495e0c8f5974a0968b06b15ad01b32e
Merge: d643cefe13 06df593c60
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Mon Nov 3 13:49:57 2025 +0100
[DevEx] Improve error message when FROM clause is omitted (#18995)
This PR fixes #18954
If the "similar bindings" list is entirely empty, there are no bindings
at all, which can only happen if the FROM clause is missing entirely.
commit 9268637337a21b9c03fdc7dceb0a88fbbe001a73
Author: Max Gabrielsson <max@gabrielsson.com>
Date: Mon Nov 3 12:35:30 2025 +0100
bump extensions
commit d643cefe13de6873f6fb0ecc0bca1c14111cde11
Merge: 5f8cf7d7f8 c6434fd89a
Author: Mark <mark.raasveldt@gmail.com>
Date: Mon Nov 3 12:28:33 2025 +0100
Avoid eagerly resolving the next on-disk pointer in the MetadataReader, as that pointer might not always be valid (#19588)
When enabling the new [experimental metadata
re-use](https://github.com/duckdb/duckdb/pull/18395), it is possible for
metadata of *some* row groups to be re-used. This can cause linked lists
of metadata blocks to contain invalid references.
For example, when writing a bunch of row groups, we might get this
layout:
```
METADATA BLOCK 1
ROW GROUP 1
ROW GROUP 2 (pt 1)
NEXT BLOCK: 2
->
METADATA BLOCK 2
ROW GROUP 2 (pt 2)
ROW GROUP 3
```
Metadata is stored in a linked list (block 1 -> block 2) - but we don't
need to traverse this linked list fully. We store pointers to individual
row groups, and can start reading from their position.
Now suppose we re-use metadata of `ROW GROUP 1`, but not of the other
row groups (because e.g. they have been updated / changed). Since this
is fully contained in `METADATA BLOCK 1`, we can garbage collect
`METADATA BLOCK 2`, leaving the following metadata block:
```
METADATA BLOCK 1
ROW GROUP 1
ROW GROUP 2 (pt 1)
NEXT BLOCK: 2
```
Now we can safely read this block and read the metadata for `ROW GROUP
1`, **however**, this block contains a reference to a metadata block
that is no longer valid and might have been garbage collected. This
revealed a problem in the `MetadataReader`. In the current
implementation of the `MetadataReader` - when pointing it towards a
block, it would eagerly try to figure out the metadata location of *the
next block*. This is normally not a problem, however, with these invalid
chains, we might try to resolve a block that has been freed up already -
causing an internal exception to trigger:
```
Failed to load metadata pointer (id %llu, idx %llu, ptr %llu)
```
This PR resolves the issue by making the MetadataReader lazy. Instead of
eagerly resolving the next pointer, we only do this when it is actually
required.
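The lazy-resolution idea described above can be sketched as follows (a minimal illustration with hypothetical names, not DuckDB's actual MetadataReader):

```python
# Sketch of eager vs. lazy next-pointer resolution in a linked-list
# block reader. With lazy resolution, a dangling next pointer is only
# an error if the read actually crosses into the next block.

class Block:
    def __init__(self, payload, next_id=None):
        self.payload = payload
        self.next_id = next_id  # may point at a garbage-collected block

class LazyReader:
    """Resolves the next block only when advancing, never eagerly."""
    def __init__(self, blocks, start_id):
        self.blocks = blocks              # id -> Block; freed blocks absent
        self.current = blocks[start_id]

    def read_current(self):
        return self.current.payload

    def advance(self):
        # Only now is the next pointer dereferenced; if the data we need
        # is fully contained in the current block, we never get here.
        nid = self.current.next_id
        if nid is None or nid not in self.blocks:
            raise KeyError("next metadata block is missing or freed")
        self.current = self.blocks[nid]

# Block 2 was garbage collected, but block 1 still references it.
blocks = {1: Block("ROW GROUP 1", next_id=2)}
reader = LazyReader(blocks, start_id=1)
print(reader.read_current())  # reading within block 1 stays safe
```

An eager reader would resolve `next_id` on construction and fail immediately, even though the caller only needed block 1.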
commit b518b2aa0b06372d583fb203f5cae0011a53a87f
Author: Tishj <t_b@live.nl>
Date: Mon Nov 3 12:24:43 2025 +0100
enum util fix
commit 5f8cf7d7f81981f4b2355959257fa82982c3dd11
Merge: 407720a348 2cdc7f922b
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Mon Nov 3 12:22:52 2025 +0100
add vortex external extension (#19580)
commit 7c2353cb06d867813b7725f893a6b1092821c807
Author: Tishj <t_b@live.nl>
Date: Mon Nov 3 11:21:32 2025 +0100
differentiate between deprecated/not available yet in the check, to improve error reporting
commit c248313a1dd40f1569b608b80bdec1229de0b6b4
Author: Tmonster <tom@ebergen.com>
Date: Mon Nov 3 10:54:40 2025 +0100
bump iceberg
commit 407720a34804f0da61d5ba6645c3c44ec6ddf0d8
Merge: 7764771eaa d4fb98d454
Author: Mark <mark.raasveldt@gmail.com>
Date: Sun Nov 2 15:01:29 2025 +0100
Wal index deletes (#19477)
This adds support for buffering and replaying Index delete operations
for WAL replay. During WAL replay, index operations are buffered since
the Indexes are not bound yet. During Index binding, the buffered
operations are applied to the Index. UnboundIndex is modified to support
buffering delete operations on top of inserts.
`BoundIndex::ApplyBufferedAppends` is changed to `BoundIndex::ApplyBufferedReplays`, which supports replaying both inserts and deletes.
Documentation along relevant code paths is added which clarifies the
ordering of mapped_column_ids and the index_chunks being buffered.
Before, the mapping could be in any order since it only came from Index insert paths. Now, buffering can come from both insert and delete paths, so both need to buffer index chunks and the mappings in the same order (which is just the sorted order of the physical Index column IDs).
There is also a bug fix for buffering index data on a table with
generated columns, since the table chunk being created for replaying
buffered operations contained all column types previously, including
generated columns, whereas now it only contains the physical column
layout which is needed for index operations. (ART Index operations take
a chunk of data with only the index columns containing any data, and the
non-Indexed columns are empty).
A catch block is added to Transaction CleanupState::Flush, which was silently throwing away any failures (which is how this WAL replay bug was caught in the first place). Also, test coverage for ART duplicate rowids was added, along with a LookupInLeaf function that allows searching for a rowid in a Leaf that is either inlined or a gate node to a nested ART.
@taniabogatsch
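The buffer-then-replay scheme described in this commit can be sketched roughly as follows (hypothetical names; the real code buffers DataChunks and column-ID mappings, not plain row sets):

```python
# Sketch: during WAL replay, index operations are buffered on an
# unbound index; once the index is bound, the buffered inserts and
# deletes are replayed in their original arrival order.

class UnboundIndex:
    def __init__(self):
        self.buffered = []  # list of (op, rows), in arrival order

    def buffer_insert(self, rows):
        self.buffered.append(("insert", rows))

    def buffer_delete(self, rows):
        self.buffered.append(("delete", rows))

class BoundIndex:
    def __init__(self):
        self.rows = set()

    def apply_buffered_replays(self, buffered):
        # Inserts and deletes must replay in order; otherwise a delete
        # could be applied before the insert it targets.
        for op, rows in buffered:
            if op == "insert":
                self.rows.update(rows)
            else:
                self.rows.difference_update(rows)

unbound = UnboundIndex()
unbound.buffer_insert([1, 2, 3])
unbound.buffer_delete([2])
unbound.buffer_insert([4])

bound = BoundIndex()
bound.apply_buffered_replays(unbound.buffered)
print(sorted(bound.rows))  # [1, 3, 4]
```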
commit c6434fd89a7391e428f2cb31e6e3d676d5257b0d
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Sun Nov 2 14:54:33 2025 +0100
Fix lock order inversion
commit eb514c01e4ea4ad434fb87fde70307f64992d52a
Merge: 2f3d2db509 7764771eaa
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Sun Nov 2 09:45:34 2025 +0100
Merge branch 'v1.4-andium' into metadatareusefixes
commit 7764771eaa654cb44f5c731e99f5d989951aefb8
Merge: 9ea6e07a29 fc2bf610d0
Author: Mark <mark.raasveldt@gmail.com>
Date: Sun Nov 2 09:44:54 2025 +0100
Skip compiling remote optimizer test when TSAN Is enabled (#19590)
This test uses `fork` which seems to mess up the thread sanitizer,
causing strange errors to occur sporadically.
commit fc2bf610d0c9851d1e3f6ad273dcfb47b6ec60a6
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Sat Nov 1 23:27:43 2025 +0100
Skip compiling entirely
commit a68390e2b1a6f09b899d248881d331e5dbbab89a
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Sat Nov 1 23:23:18 2025 +0100
Skip fork test with tsan
commit 2f3d2db50968fd917f253c2c34cf488290dadfa4
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Sat Nov 1 15:27:51 2025 +0100
Avoid eagerly resolving the next on-disk pointer in the MetadataReader, as that pointer might not always be valid
commit 9ea6e07a290db878c9da097d407b3a866c43c8e0
Merge: 5f1ce8ba5c a740840f97
Author: Mark <mark.raasveldt@gmail.com>
Date: Sat Nov 1 09:25:59 2025 +0100
Fix edge case in uncompressed validity scan with offset and fix off-by-one in ArrayColumnData::Select (#19567)
This PR fixes an off-by-one in the consecutive-array-scan optimization
implemented in https://github.com/duckdb/duckdb/pull/16356 as well as an
edge case in our uncompressed validity data scan.
Fixes https://github.com/duckdb/duckdb/issues/19377
I can't figure out how to write a test for this; no matter what I do, I'm unable to replicate the same storage characteristics as the database file provided in the issue above.
In the repro we do a scan+skip+scan, where part of the first
`validity_t` in the second scan contains a bunch of zeroes at the
positions "before" the scan window that remain even after shifting. I've
solved it by setting all lower bits up to `result_idx` in the first
`validity_t` we scan, but not sure if this is the most elegant solution.
Strangely enough, if we remove all the bitwise logic and just use the same "fall-back" logic that is ifdef:ed for `VECTOR_SIZE < 128`, it all works, so the issue has to be in the bit manipulation.
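The fix described above (setting all lower bits up to `result_idx` in the first validity word) can be illustrated with plain bit manipulation; this is a sketch of the idea, not the actual DuckDB code:

```python
# Sketch: a validity word whose bits *before* the scan window hold
# stale zeroes. Setting all lower bits up to result_idx makes those
# out-of-window positions read as "valid" so they don't pollute the
# shifted scan result.

def mask_lower_bits(word, result_idx):
    # Set bits [0, result_idx) so positions before the window are ignored.
    lower = (1 << result_idx) - 1
    return word | lower

# Bits 0..7 are stale zeroes from before the window; bits 8.. are real.
word = 0xFFFF_FF00
masked = mask_lower_bits(word, 8)
assert masked & 0xFF == 0xFF      # stale positions now read as valid
assert masked >> 8 == word >> 8   # in-window bits are unchanged
```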
commit 5f1ce8ba5c0000770412b35a763af417f8fb2b90
Merge: be0142d4ee dbe272dff0
Author: Mark <mark.raasveldt@gmail.com>
Date: Sat Nov 1 09:22:00 2025 +0100
[v1.4-andium] Add Profiler output to logger interface (#19572)
This is https://github.com/duckdb/duckdb/pull/19546 backported to
`v1.4-andium` branch, see conversation there.
---
The idea is: if both the profiler and the logger are enabled, then you can access the profiler output via the logger as well.
This is on top of / independent of the current choices for where to output
the profiler (JSON / graphviz / query-tree / ...). While this might be
somewhat wasteful, it allows for an easier PR and leaves unopinionated
what the SQL interface should be. Also, given that the ToLog() call is inexpensive
(in particular if the logger is disabled), and that it's unclear whether the
logger alone can satisfy the profiler's necessities, I think going additive is
the best path here.
Demo:
```sql
ATTACH 'my_db.db';
USE my_db;
---- enable profiling to json file
PRAGMA profiling_output = 'profiling_output.json';
PRAGMA enable_profiling = 'json';
---- enable logging (to in-memory table)
call enable_logging();
----
CREATE TABLE small AS FROM range(100);
CREATE TABLE medium AS FROM range(10000);
CREATE TABLE big AS FROM range(1000000);
PRAGMA disable_profiling;
SELECT query_id, type, metric, value FROM duckdb_logs_parsed('Metrics') WHERE metric == 'CPU_TIME';
```
Will result in for example in:
```
┌──────────┬─────────┬──────────┬───────────────────────┐
│ query_id │ type │ metric │ value │
│ uint64 │ varchar │ varchar │ varchar │
├──────────┼─────────┼──────────┼───────────────────────┤
│ 10 │ Metrics │ CPU_TIME │ 8.1041e-05 │
│ 11 │ Metrics │ CPU_TIME │ 0.0002499510000000001 │
│ 12 │ Metrics │ CPU_TIME │ 0.02776677799999981 │
└──────────┴─────────┴──────────┴───────────────────────┘
```
A more complex example would be for example:
With the duckdb cli, execute:
```sql
PRAGMA profiling_output = 'metrics_folder/tmp_profiling_output.json';
PRAGMA enable_profiling = 'json';
CALL enable_logging(storage='file', storage_path='./metrics_folder');
--- arbitrary queries
CREATE TABLE small AS FROM range(100);
CREATE TABLE medium AS FROM range(10000);
CREATE TABLE big AS FROM range(1000000);
```
then close, restart duckdb cli, and query what's persisted in the
`metric_folder` folder:
```sql
PRAGMA disable_profiling;
CALL enable_logging(storage='file', storage_path='./metrics_folder');
SELECT queries.message, metrics.metric, TRY_CAST(metrics.value AS DOUBLE) as value
FROM duckdb_logs_parsed('QueryLog') queries,
duckdb_logs_parsed('Metrics') metrics
WHERE queries.query_id = metrics.query_id AND metrics.metric = 'CPU_TIME';
```
```
┌─────────────────────────────────────────────┬──────────┬─────────────────────────────────────┐
│ message │ metric │ TRY_CAST(metrics."value" AS DOUBLE) │
│ varchar │ varchar │ double │
├─────────────────────────────────────────────┼──────────┼─────────────────────────────────────┤
│ CREATE TABLE small AS FROM range(100); │ CPU_TIME │ 8.1041e-05 │
│ CREATE TABLE medium AS FROM range(10000); │ CPU_TIME │ 0.0002499510000000001 │
│ CREATE TABLE big AS FROM range(1000000); │ CPU_TIME │ 0.02776677799999981 │
└─────────────────────────────────────────────┴──────────┴─────────────────────────────────────┘
```
commit be0142d4ee0385262520ae2488e8dd11ac213735
Merge: b68a1696de 7df4151c0d
Author: Mark <mark.raasveldt@gmail.com>
Date: Sat Nov 1 09:21:19 2025 +0100
fix inconsistent behavior in remote read_file/blob, and prevent union… (#19531)
Closes https://github.com/duckdb/duckdb-fuzzer/issues/4208
Closes https://github.com/duckdb/duckdb/issues/19090
Our remote filesystem doesn't actually check that files exist when
"globbing" a non-glob pattern. Now we check that the file exists in the
read_blob/text function even if we just access the file name.
The diff is a bit bigger because I also moved a bunch of templated stuff into the cpp file.
commit 06df593c60bb22973642d776c1c3c3aca85ee0d6
Author: Tishj <t_b@live.nl>
Date: Fri Oct 31 15:26:18 2025 +0100
fix up tests
commit 2cdc7f922bde5550aa1ecd24dabf23b05fbf202b
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Fri Oct 31 15:10:31 2025 +0100
add vortex external extension
commit b68a1696de1a603b59e39efc25da7fc2826a3135
Merge: 8169d4f15c 9414882f7f
Author: Mark <mark.raasveldt@gmail.com>
Date: Fri Oct 31 13:45:16 2025 +0100
Release relevant tests to still be run on all builds (#19559)
I would propose, at least for the Linux builds, to add back a minimal
amount of tests also on release builds.
They will ensure at a minimum that:
* for a given release, the corresponding storage_version is valid
* for a minor release, that the corresponding name has been set
There are more tests that we might consider basic enough AND connected
to behaviour specific of a release that we might want to add to the
`release` tag.
Fixes https://github.com/duckdb/duckdb/issues/19354 (together with
https://github.com/duckdb/duckdb/pull/19525 that actually added the
name).
Note that given the current release process happens in advance, eventual
test failures are annoying but not fatal, though they will require changes
to code. I am not sure if it's worth having a `keep_going_in_all_cases`
option, basically turning the boolean into a set, but I think that can be
done when the need arises.
commit 8169d4f15cf556d0ca0ec68d9c876c2bb84aae09
Merge: d9028d09d5 6e2c195859
Author: Mark <mark.raasveldt@gmail.com>
Date: Fri Oct 31 13:44:30 2025 +0100
Fix race condition between `Append` and `Scan` (#19571)
Update `ColumnData::count` only after actually appending the data, to
avoid a race condition with `Scan`. See
https://github.com/duckdb/duckdb/issues/19570 for details.
commit d4fb98d45409bcaaf8c3030c7aa7e40b1f60b9d1
Merge: 0743b590d3 d9028d09d5
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Fri Oct 31 11:16:23 2025 +0100
Merge remote-tracking branch 'upstream/v1.4-andium' into wal-index-deletes
commit 0743b590d361041cc167f0634250f78c20f4d332
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Fri Oct 31 11:15:04 2025 +0100
remove C++ test, add extra interleaved index replay SQL test
commit 5ca334715faa6c871c8e96029c142aacf53969a7
Author: Tishj <t_b@live.nl>
Date: Fri Oct 31 10:43:03 2025 +0100
fix up tests
commit 0a6b5fb4919a8092b38e19051a9286eeaaeb392c
Merge: 0b1f0e320a d9028d09d5
Author: Tishj <t_b@live.nl>
Date: Fri Oct 31 10:38:56 2025 +0100
Merge branch 'v1.4-andium' into missing_from_clause_better_error
commit a740840f9772a1702a5ffeec43694c48be3526c5
Author: Max Gabrielsson <max@gabrielsson.com>
Date: Thu Oct 30 18:04:39 2025 +0100
fix consecutive array range calculation, fix validity scanning when bits before result offset are null
commit 6e2c195859a496f1f98c20fd887fac944ba0e344
Author: zhangxizhe <zhangxizhe.zxz@alibaba-inc.com>
Date: Fri Oct 31 13:43:19 2025 +0800
Update `ColumnData::count` only after actually appending the data, to
avoid a race condition with `Scan`. See issue #19570 for details.
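The ordering fix can be sketched like this (illustrative only; DuckDB's real code uses its own locking and versioning, not this hypothetical `Column` class):

```python
# Sketch: publish the new count only after the data is actually
# written, so a concurrent Scan never observes a count that exceeds
# the rows that exist.

import threading

class Column:
    def __init__(self):
        self.data = []
        self.count = 0              # readers trust this value
        self.lock = threading.Lock()

    def append(self, rows):
        with self.lock:
            self.data.extend(rows)   # 1) write the data first
            self.count += len(rows)  # 2) only then publish the count

    def scan(self):
        with self.lock:
            # Invariant: count <= len(data), so this slice is always valid.
            return self.data[: self.count]

col = Column()
col.append([10, 20, 30])
print(col.scan())  # [10, 20, 30]
```

Updating `count` before the write would open a window in which a scan reads rows that have not been appended yet.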
commit d9028d09d56640599dd8307dd9ae6c8837267e9f
Merge: 307f9b41ff 6bc51dd58e
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Fri Oct 31 08:47:10 2025 +0100
Disable jemalloc on BSD (#19560)
Fixes https://github.com/duckdb/duckdb/issues/14363
commit dbe272dff0a63d0d01269cee05945a0b016d219f
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Wed Oct 29 23:51:42 2025 +0100
Add Profiler output to logger interface
The idea is: if both the profiler and the logger are enabled, then you can access the profiler output via the logger as well.
This is on top of / independent of the current choices for where to output the profiler (JSON / graphviz / query-tree / ...).
While this might be somewhat wasteful, it allows for an easier PR and leaves unopinionated what the SQL
interface should be. Also, given that the ToLog() call is inexpensive (in particular if the logger is disabled), and that it's unclear whether the logger alone can satisfy
the profiler's necessities, I think going additive is the best path here.
Demo:
```sql
ATTACH 'my_db.db';
USE my_db;
---- enable profiling to json file
PRAGMA profiling_output = 'profiling_output.json';
PRAGMA enable_profiling = 'json';
---- enable logging (to in-memory table)
call enable_logging();
----
CREATE TABLE small AS FROM range(1000);
CREATE TABLE medium AS FROM range(1000000);
CREATE TABLE big AS FROM range(1000000000);
PRAGMA disable_profiling;
SELECT * EXCLUDE timestamp FROM duckdb_logs() WHERE type == 'Metrics' ORDER BY message.split(',')[1], context_id;
```
Will result in for example in:
```
┌────────────┬─────────┬───────────┬────────────────────────────────────────────────────────────┐
│ context_id │ type │ log_level │ message │
│ uint64 │ varchar │ varchar │ varchar │
├────────────┼─────────┼───────────┼────────────────────────────────────────────────────────────┤
│ 39 │ Metrics │ INFO │ {'metric': CHECKPOINT_LATENCY, 'value': 0.0} │
│ 44 │ Metrics │ INFO │ {'metric': CHECKPOINT_LATENCY, 'value': 0.0} │
│ 49 │ Metrics │ INFO │ {'metric': CHECKPOINT_LATENCY, 'value': 0.017832} │
│ 39 │ Metrics │ INFO │ {'metric': COMMIT_WRITE_WAL_LATENCY, 'value': 0.000305292} │
│ 44 │ Metrics │ INFO │ {'metric': COMMIT_WRITE_WAL_LATENCY, 'value': 0.003793958} │
│ 49 │ Metrics │ INFO │ {'metric': COMMIT_WRITE_WAL_LATENCY, 'value': 0.0} │
│ 39 │ Metrics │ INFO │ {'metric': CPU_TIME, 'value': 0.000110209} │
│ 44 │ Metrics │ INFO │ {'metric': CPU_TIME, 'value': 0.009471759999999997} │
│ 49 │ Metrics │ INFO │ {'metric': CPU_TIME, 'value': 8.241736770029297} │
│ · │ · │ · │ · │
│ · │ · │ · │ · │
│ · │ · │ · │ · │
│ 39 │ Metrics │ INFO │ {'metric': SYSTEM_PEAK_BUFFER_MEMORY, 'value': 36864} │
│ 44 │ Metrics │ INFO │ {'metric': SYSTEM_PEAK_BUFFER_MEMORY, 'value': 6625280} │
│ 49 │ Metrics │ INFO │ {'metric': SYSTEM_PEAK_BUFFER_MEMORY, 'value': 63510528} │
│ 39 │ Metrics │ INFO │ {'metric': TOTAL_BYTES_WRITTEN, 'value': 0} │
│ 44 │ Metrics │ INFO │ {'metric': TOTAL_BYTES_WRITTEN, 'value': 262144} │
│ 49 │ Metrics │ INFO │ {'metric': TOTAL_BYTES_WRITTEN, 'value': 12587008} │
│ · │ · │ · │ · │
│ · │ · │ · │ · │
│ · │ · │ · │ · │
├────────────┴─────────┴───────────┴────────────────────────────────────────────────────────────┤
│ 57 rows (? shown) 4 columns │
└───────────────────────────────────────────────────────────────────────────────────────────────┘
```
commit 307f9b41ff0464dba0e0f2504c75747c7ead2ecc
Merge: 1cba2e741b 08bf725300
Author: Mark <mark.raasveldt@gmail.com>
Date: Thu Oct 30 15:03:25 2025 +0100
[ported from main] Fix bug initializing std::vector for column names (#19555)
This 4-line fix was merged into main in #19444. It should be in
v1.4-andium as well so that it makes it into v1.4.2.
commit 1cba2e741b6622f5be156c061478a6fa66c0f819
Merge: ecb6bfe5b4 80554e4d59
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Thu Oct 30 14:47:58 2025 +0100
Bugfixes: Parquet JSON+DELTA_LENGTH_BYTE_ARRAY and sorting iterator (#19556)
This PR fixes an issue introduced in v1.4.1 with the Parquet reader when
combining a `JSON` column with `DELTA_LENGTH_BYTE_ARRAY` encoding. The
issue was caused by trying to validate an entire block of strings in one
go, which is OK for UTF-8, but not for JSON. This PR makes it so we validate
individual strings if the column has the `JSON` type.
Fixes https://github.com/duckdb/duckdb/issues/19366
This PR also fixes an issue with the new sorting code, which had an
error in the calculation of subtraction under modulo. I've fixed this,
and unified the code for `InMemoryBlockIteratorState` and
`ExternalBlockIteratorState` with some templating, so now the erroneous
calculation should be gone from both state types.
Fixes https://github.com/duckdb/duckdb/issues/19498
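The commit does not show the exact erroneous expression, but subtraction under modulo has a standard pitfall worth illustrating (a sketch, not the DuckDB sorting code):

```python
# Sketch: subtraction under modulo. In Python, (a - b) % m already
# yields a non-negative result, but with unsigned integers (as in C++)
# a - b wraps around when b > a; adding m before subtracting keeps the
# intermediate value non-negative in both worlds.

def sub_mod(a, b, m):
    # Safe form for unsigned arithmetic: (a + m - b) % m
    return (a + m - b) % m

assert sub_mod(2, 5, 8) == 5   # 2 - 5 is congruent to 5 (mod 8)
assert sub_mod(5, 2, 8) == 3
```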
commit 9414882f7fc81be58af0ec914cbe8c6045af3517
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Thu Oct 30 12:39:48 2025 +0100
Allow back basic tests also in release mode
commit 2987acd0d19656e583f30447a91852793ef188f7
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Thu Oct 30 12:36:32 2025 +0100
Add test on codename being registered, and tag it as release
commit 6bc51dd58edaf76725810b595a5300044749c0cf
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Thu Oct 30 13:24:45 2025 +0100
disable jemalloc BSD
commit 80554e4d592ec793676a80b180469a572a247f2a
Merge: 5974ef8c03 ecb6bfe5b4
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Thu Oct 30 09:57:58 2025 +0100
Merge branch 'v1.4-andium' into bugfixes_v1.4
commit 08bf725300335d34f05cd6f6f508f78ef57c477b
Author: Curt Hagenlocher <curt@hagenlocher.org>
Date: Fri Oct 17 14:08:52 2025 -0700
Fix bug initializing std::vector for column names
commit ecb6bfe5b483ffd1a2a490275b48ec91501680c4
Merge: 09a36d2f73 94471b8e04
Author: Hannes Mühleisen <227792+hannes@users.noreply.github.com>
Date: Thu Oct 30 09:01:41 2025 +0200
Follow up to staging move (#19551)
Follow up to #19539, CF does not like AWS regions
commit 94471b8e0472a2507623b2408808156f6ddde764
Author: Hannes Mühleisen <hannes@muehleisen.org>
Date: Thu Oct 30 07:49:34 2025 +0200
this region does not exist in cf
commit 09a36d2f73d1b2f93682e315761bb3c4973f8ac9
Merge: a23f54fb54 c2a4fc29dc
Author: Mark <mark.raasveldt@gmail.com>
Date: Wed Oct 29 21:51:05 2025 +0100
[Dev] Disable the use of `ZSTD` if the block_manager is the `InMemoryBlockManager` (#19543)
This PR fixes https://github.com/duckdblabs/duckdb-internal/issues/6319
This has to be done because the InMemoryBlockManager doesn't support
GetFreeBlockId, which is required by the zstd compression method.
I couldn't produce a test for this because I can't reproduce the problem
in the unittester, only in the CLI
(I assume the storage version prevents in-memory compression???)
commit c2a4fc29dceb617c80ab9156d84f2320add29542
Author: Tishj <t_b@live.nl>
Date: Wed Oct 29 16:37:20 2025 +0100
add test for disabled zstd compression in memory
commit 5974ef8c03afcd01df670a42dd7be0bbb2a6c6ff
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Wed Oct 29 16:34:54 2025 +0100
properly set file path in test
commit a35ba26f267eca2fb144e07b14706af2b96270a8
Author: Tishj <t_b@live.nl>
Date: Wed Oct 29 15:19:03 2025 +0100
disable the use of ZSTD if the block_manager is the InMemoryBlockManager, since it doesn't support GetFreeBlockId
commit fd85508aa0065a18180a6f9af1d4c66842b28964
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Wed Oct 29 15:08:06 2025 +0100
re-add missing initialization
commit a23f54fb54c686614cdaf547778b4c6f47bcbf5c
Merge: f2e48a73d4 ab586dfaf6
Author: Hannes Mühleisen <227792+hannes@users.noreply.github.com>
Date: Wed Oct 29 14:52:40 2025 +0200
Creating separate OSX cli binaries for each arch (#19538)
Also no longer adding the shared library three times because of symlinks
commit f2e48a73d42ce538706529e51aec54cfd9f96d84
Merge: 5a6521ca7e ccefe12386
Author: Hannes Mühleisen <227792+hannes@users.noreply.github.com>
Date: Wed Oct 29 14:51:26 2025 +0200
Moving staging to cf and uploading to install bucket (#19539)
This adds a custom endpoint for staging uploads so we can move to R2 for
this. We also add functionality to upload to the R2 bucket behind
`install.duckdb.org`. Once merged, I will update/add the following
secrets:
- `S3_DUCKDB_STAGING_ENDPOINT`
- `S3_DUCKDB_STAGING_ID`
- `S3_DUCKDB_STAGING_KEY`
- `DUCKDB_INSTALL_S3_ENDPOINT`
- `DUCKDB_INSTALL_S3_ID`
- `DUCKDB_INSTALL_S3_SECRET`
commit f5bc9796be79b602ed1892484e060f0e79083610
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Wed Oct 29 13:43:05 2025 +0100
nicer templating and less code duplication
commit ccefe12386007dd65fae1fe3ff1d65bcb45df44d
Author: Hannes Mühleisen <227792+hannes@users.noreply.github.com>
Date: Wed Oct 29 14:18:15 2025 +0200
Update .github/workflows/StagedUpload.yml
Co-authored-by: Carlo Piovesan <piovesan.carlo@gmail.com>
commit 41fc70ae3312599e425d140f7db770f56c2c5c38
Author: Hannes Mühleisen <227792+hannes@users.noreply.github.com>
Date: Wed Oct 29 14:00:41 2025 +0200
Update .github/workflows/StagedUpload.yml
Co-authored-by: Carlo Piovesan <piovesan.carlo@gmail.com>
commit e8c2d9401b580c64ef5d3cad3cb8d301375ddbd3
Author: Hannes Mühleisen <hannes@muehleisen.org>
Date: Wed Oct 29 12:35:30 2025 +0200
moving staging to cf and uploading to install bucket
commit 7df4151c0d4967e2dd33eff7f426805df3c56442
Author: Max Gabrielsson <max@gabrielsson.com>
Date: Wed Oct 29 10:58:22 2025 +0100
remove named parameters
commit ab586dfaf6bf58fa8376944e599c51efea462cb8
Author: Hannes Mühleisen <hannes@muehleisen.org>
Date: Wed Oct 29 11:46:18 2025 +0200
creating separate osx cli binaries for each arch
commit 8f30296d7c05c277771bf1fe95b73fafe7fa9d0f
Merge: 5dac9f7504 5a6521ca7e
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Wed Oct 29 09:39:30 2025 +0100
Merge remote-tracking branch 'upstream/v1.4-andium' into wal-index-deletes
commit 5a6521ca7e744205e4c3b67cab8708e2df87073b
Merge: 8c7210f9b0 601d68526c
Author: Mark <mark.raasveldt@gmail.com>
Date: Wed Oct 29 07:55:06 2025 +0100
Add test that either 'latest' or 'vX.Y.Z' are supported STORAGE_VERSIONs (#19527)
Connected to https://github.com/duckdb/duckdb/pull/19525; adds a test
that would have triggered there.
That test is not built when actually building releases, so it's not
fool-proof, but I think adding it is helpful.
Tested locally to behave as intended both on a dev commit (success) and
on a tag (fails, fixed via the linked PR).
commit 8c7210f9b0270517e1dba11502dc196a3f0cb13c
Merge: 7b5c16f2d5 99f26bde2d
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Oct 28 18:58:35 2025 +0100
add upcoming patch release to internal versions (#19525)
commit 7b5c16f2d51dda602c9ddfed58d71bb6ae3275a0
Merge: 23228babba 295603915b
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Oct 28 18:58:16 2025 +0100
Bump multiple extensions (#19522)
This PR bumps the following extensions:
- `avro` from `7b75062f63` to `93da8a19b4`
- `delta` from `03aaf0f073` to `0747c23791`
- `ducklake` from `f134ad86f2` to `2554312f71`
- `iceberg` from `4f3c5499e5` to `30a2c66f10`
- `spatial` from `a6a607fe3a` to `61ede09bec`
commit 23228babba519ec70b183b03ea6bc4457b3ed84c
Merge: 71a64b5ab4 6a38ac0f69
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Oct 28 18:58:00 2025 +0100
Bump: inet (#19526)
This PR bumps the following extensions:
- `inet` from `f6a2a14f06` to `fe7f60bb60 (patches removed: 1)`
commit 067d6eb0d5c56270f1d24951966191d9c12c3008
Author: Max Gabrielsson <max@gabrielsson.com>
Date: Tue Oct 28 17:33:43 2025 +0100
fix inconsistent behavior in remote read_file/blob, and prevent union_by_name from crashing
commit 601d68526c9e616ff08a0e08d949f00dcfb76060
Author: Carlo Piovesan <piovesan.carlo@gmail.com>
Date: Tue Oct 28 13:11:45 2025 +0100
Add test that either 'latest' or 'vX.Y.Z' are supported STORAGE_VERSIONs
commit c63c5060d01340dc11f39349bf7950fb8eaa455b
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Tue Oct 28 15:55:12 2025 +0100
fix #19498
commit 7e52dc5a75532c5413088fbb9f90e6a30f9e5d14
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Tue Oct 28 15:54:56 2025 +0100
add missing test
commit 71a64b5ab4005fd2eb63cb3912403fde29f4d7e0
Merge: 76ee047ce4 3856fa8ea8
Author: Mark <mark.raasveldt@gmail.com>
Date: Tue Oct 28 14:30:18 2025 +0100
Support non-standard NULL in Parquet again (#19523)
https://github.com/duckdb/duckdb/pull/19406 removed support for the
non-standard NULL by adding the safe enum casts.
Support for this was explicitly added in
https://github.com/duckdb/duckdb/pull/11774
We could consider removing support for this - but it shouldn't be done
as part of a bug-fix release imo. This also currently breaks merging
v1.4 -> main.
commit 05fb1249cab3404bc396ccaee0cdb1959ae11481
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Tue Oct 28 14:19:50 2025 +0100
fix #19366
commit 5dac9f750490e1ea601b03d8e3d11db7a9cc0197
Merge: 0d4a78c90f 76ee047ce4
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Tue Oct 28 13:14:30 2025 +0100
Merge remote-tracking branch 'upstream/v1.4-andium' into wal-index-deletes
commit 0d4a78c90f6288abe842afab521ba1e7a075307f
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Tue Oct 28 13:12:44 2025 +0100
remove int types
commit 6a38ac0f699f2f85adda33d61c94c6ec054d89ca
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Oct 28 13:08:40 2025 +0100
bump extensions
commit 3cd616b89657c5489844d8a76d26169554e5af96
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Tue Oct 28 12:57:05 2025 +0100
PR review fixes + more C++ test coverage
commit 0fde0c573099c317b0710ed42d87864ee4b75c00
Merge: baa522991e 76ee047ce4
Author: Laurens Kuiper <laurens.kuiper@cwi.nl>
Date: Tue Oct 28 12:32:44 2025 +0100
Merge branch 'v1.4-andium' into bugfixes_v1.4
commit 99f26bde2d03e9958ac4bd37f5f8a0ac67b2fcd3
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Oct 28 12:07:39 2025 +0100
add upcoming patch release to internal versions
commit 3856fa8ea82bd8b9c11166102aab602ddf165ee2
Author: Mytherin <mark.raasveldt@gmail.com>
Date: Tue Oct 28 11:19:35 2025 +0100
Support non-standard NULL in Parquet again
commit 295603915b0ab3a1532cbbe6cf9547f9803e3c46
Author: Sam Ansmink <samansmink@hotmail.com>
Date: Tue Oct 28 10:58:22 2025 +0100
bump extensions
commit c1d826f2523bd8454426ad7401665e8e69f9dadc
Author: Artjom Plaunov <artyemnyc@gmail.com>
Date: Tue Oct 28 08:55:00 2025 +0100
unnamed name space
commit 76ee047ce45bab9472068ea360f9894a3a456a83
Merge: b62b03c4b3 bd3eb153b1
Author: Laurens Kuiper <laurens@duckdblabs.com>
Date: Tue Oct 28 08:34:42 2025 +0100
Make `DatabaseInstance::…
Fixes duckdb/duckdb#18396
Related PR in core: duckdb/duckdb#19583
The checks of this PR can only run after duckdb/duckdb#19583 lands