[c++] Fail earlier if n_buffers is not as expected #3107

Merged
merged 1 commit into from
Oct 1, 2024

Conversation

johnkerl
Member

@johnkerl johnkerl commented Oct 1, 2024

While working on #3100 for #3057 / [sc-55679] (parent #2406) I ran into a cryptic error from Arrow. It turns out that when the number of buffers doesn't match what the column type requires (2 for non-variable-length: validity and data; 3 for variable-length: validity, offsets, and data), we can get a message like

tiledbsoma._exception.SOMAError: ArrowInvalid: Column 4: In chunk 0: Invalid: First or last binary offset out of bounds
At:
  pyarrow/error.pxi(92): pyarrow.lib.check_status

Now, the dev setup in which I was constructing the data that triggered this error is perhaps itself incorrect, but either way we should fail sooner and more transparently.
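
For illustration, here is a minimal sketch of the kind of early check described above. It is not the code merged in this PR. It assumes the column's type is available as an Arrow C data interface format string (as in the "type i" / "type U" trace lines further down), and the names expected_n_buffers and check_n_buffers are hypothetical. For simplicity it treats every format other than string/binary as a fixed-width type with two buffers, which does not hold for all Arrow types (nested types, for example).

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: how many buffers a column should carry, given its
// Arrow C data interface format string.
inline int64_t expected_n_buffers(const std::string& format) {
    // utf8 ("u"), large utf8 ("U"), binary ("z"), large binary ("Z") are
    // variable-length: validity, offsets, data.
    if (format == "u" || format == "U" || format == "z" || format == "Z") {
        return 3;
    }
    // Assumption for this sketch: anything else is fixed-width:
    // validity, data.
    return 2;
}

// Hypothetical helper: throw immediately, naming the column, instead of
// letting a mismatch surface later as an opaque Arrow offset error.
inline void check_n_buffers(
    const std::string& name, const std::string& format, int64_t actual) {
    const int64_t expected = expected_n_buffers(format);
    if (actual != expected) {
        throw std::invalid_argument(
            "column '" + name + "' (format '" + format + "'): expected " +
            std::to_string(expected) + " buffers, got " +
            std::to_string(actual));
    }
}
```

With a check like this, a column such as newattr (format U) arriving with only two buffers would be rejected at the point of ingestion with a message that names the column and both counts.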

The status quo is that the only way I found the problem was to turn on libtiledbsoma trace logging, where I saw lines like

[2024-10-01 16:02:55.054] [tiledbsoma] [Process: 5942] [Thread: 5942] [trace] [ArrowAdapter] column type i name myint nbuf 2 2 nullable true
[2024-10-01 16:02:55.054] [tiledbsoma] [Process: 5942] [Thread: 5942] [trace] [ArrowAdapter] create array name='myint' use_count=4
[2024-10-01 16:02:55.054] [tiledbsoma] [Process: 5942] [Thread: 5942] [debug] [ArrowAdapter] release_schema for myint
[2024-10-01 16:02:55.054] [tiledbsoma] [Process: 5942] [Thread: 5942] [trace] [ArrowAdapter] release_schema schema->name
[2024-10-01 16:02:55.054] [tiledbsoma] [Process: 5942] [Thread: 5942] [trace] [ArrowAdapter] release_schema schema->format
[2024-10-01 16:02:55.054] [tiledbsoma] [Process: 5942] [Thread: 5942] [trace] [ArrowAdapter] release_schema done
[2024-10-01 16:02:55.054] [tiledbsoma] [Process: 5942] [Thread: 5942] [trace] [ArrowAdapter] column type f name myfloat nbuf 2 2 nullable true
[2024-10-01 16:02:55.055] [tiledbsoma] [Process: 5942] [Thread: 5942] [trace] [ArrowAdapter] create array name='myfloat' use_count=4
[2024-10-01 16:02:55.055] [tiledbsoma] [Process: 5942] [Thread: 5942] [debug] [ArrowAdapter] release_schema for myfloat
[2024-10-01 16:02:55.055] [tiledbsoma] [Process: 5942] [Thread: 5942] [trace] [ArrowAdapter] release_schema schema->name
[2024-10-01 16:02:55.055] [tiledbsoma] [Process: 5942] [Thread: 5942] [trace] [ArrowAdapter] release_schema schema->format
[2024-10-01 16:02:55.055] [tiledbsoma] [Process: 5942] [Thread: 5942] [trace] [ArrowAdapter] release_schema done
[2024-10-01 16:02:55.055] [tiledbsoma] [Process: 5942] [Thread: 5942] [trace] [ArrowAdapter] column type U name newattr nbuf 2 3 nullable true <---- here
[2024-10-01 16:02:55.055] [tiledbsoma] [Process: 5942] [Thread: 5942] [trace] [ArrowAdapter] create array name='newattr' use_count=4
[2024-10-01 16:02:55.055] [tiledbsoma] [Process: 5942] [Thread: 5942] [debug] [ArrowAdapter] release_schema for newattr

where the mismatched "2 3" for newattr, versus "2 2" for the other columns, was the smoking gun.

@johnkerl johnkerl requested a review from nguyenv October 1, 2024 16:03
@johnkerl johnkerl changed the title [c++] Fail earlier if n_buffers is not as expeced [c++] Fail earlier if n_buffers is not as expected Oct 1, 2024

codecov bot commented Oct 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.09%. Comparing base (40e2178) to head (8fc2750).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3107      +/-   ##
==========================================
+ Coverage   88.95%   89.09%   +0.13%     
==========================================
  Files          45       45              
  Lines        4293     4293              
==========================================
+ Hits         3819     3825       +6     
+ Misses        474      468       -6     
Flag Coverage Δ
python 89.09% <ø> (+0.13%) ⬆️


Components Coverage Δ
python_api 89.09% <ø> (+0.13%) ⬆️
libtiledbsoma ∅ <ø> (∅)

@johnkerl johnkerl merged commit e2d1679 into main Oct 1, 2024
15 checks passed
@johnkerl johnkerl deleted the kerl/cpp-n-buffers-check branch October 1, 2024 23:13
@johnkerl
Member Author

johnkerl commented Oct 1, 2024

Related to #3111
