Skip to content

ARROW-4749: [Rust] Return Result for RecordBatch::new() #3800

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

nevi-me
Copy link
Contributor

@nevi-me nevi-me commented Mar 4, 2019

Adds more validation between schemas and columns,
returning an error when record types mismatch the schema

@nevi-me
Copy link
Contributor Author

nevi-me commented Mar 4, 2019

@andygrove @paddyhoran PTAL when you get a chance

@nevi-me
Copy link
Contributor Author

nevi-me commented Mar 4, 2019

@andygrove there's a failure in sql.rs re. mismatching column and batch lengths. If that's not supposed to error, I can remove the validation on RecordBatch::new()

@andygrove
Copy link
Member

andygrove commented Mar 4, 2019

@nevi-me Nice! You found a bug in aggregate schemas ... I couldn't figure out how to push the fix direct to your branch but here is the new code (for aggregate.rs around line 188):

 let mut output_fields: Vec<Field> = vec![];
                for expr in group_expr {
                    output_fields.push(expr_to_field(expr, input_schema.as_ref()));
                }
                for expr in aggr_expr {
                    output_fields.push(expr_to_field(expr, input_schema.as_ref()));
                }
                let rel = AggregateRelation::new(
                    Arc::new(Schema::new(output_fields)),
                    input_rel,
                    compiled_group_expr,
                    compiled_aggr_expr,
                );

You can also look at the ARROW-4749 branch in my repo (andygrove).

@nevi-me
Copy link
Contributor Author

nevi-me commented Mar 4, 2019

I'll fix it manually when I get home. I normally submit a PR on the author's branch

@kszucs
Copy link
Member

kszucs commented Mar 5, 2019

LGTM, but returning result from new feels a bit odd to me, especially because other types' new return Self.

Copy link
Member

@sunchao sunchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except a few nits. Also as @kszucs mentioned, maybe we can change new to try_new to reflect the fact that it returns a Result now?

@@ -58,7 +58,8 @@ fn main() {
]);

// build a record batch
let batch = RecordBatch::new(Arc::new(schema), vec![Arc::new(id), Arc::new(nested)]);
let batch =
RecordBatch::new(Arc::new(schema), vec![Arc::new(id), Arc::new(nested)]).unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can replace unwrap() with ? for better error message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function doesn't return a Result, so I can't ? there. Should I leave the unwrap?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In main you can actually use ?, see here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my main has a Result return type? I tried it before commenting, and I got an error about needing to implement TryFrom.

I've got it working with returning Result<()>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right - seems you have to change the main interface and add return Ok(()) from the main as well.. I guess it may not worth it for this purpose. We can just keep as it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:( I had already changed it, can I keep it the way it is now? If I need to change anything else on the PR I'll revert it back to its original form

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Let's keep as it is.

pub fn new(schema: Arc<Schema>, columns: Vec<ArrayRef>) -> Result<Self> {
// check that there are some columns
if columns.is_empty() {
return Err(ArrowError::ParseError(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: not sure ParseError is appropriate here. Do we have other candidate such as InvalidArgumentError?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InvalidArgumentError sounds good, I was avoiding adding another error option

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - agree with the other comments on this PR

Copy link
Member

@sunchao sunchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants