Skip to content

Conversation

@waltaskew
Copy link
Contributor

@waltaskew waltaskew commented Mar 19, 2018

I'd written some code like

with tempfile.NamedTemporaryFile() as temp:
    civis.io.civis_to_file(file_id, temp)
    df = pandas.read_csv(temp.name)

with a bit of extra logic for read_csv-ing in chunks to keep from going out of memory.
Because temp wasn't being flushed, I would occasionally get errors from unexpected end of files.

I'm not sure if that's expected behavior or not (if I'm passing the buffer to civis_to_file should I be expected to flush it myself?) but flushing them civis_to_file-side seems harmless at worst and bug-avoiding at best.

@mheilman

_civis_to_file(file_id, f, api_key=api_key, client=client)
else:
_civis_to_file(file_id, buf, api_key=api_key, client=client)
buf.flush()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like this should be in _civis_to_file, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like this should be in _civis_to_file, no?

The with open(buf, 'wb') as f: bit ensures the flush happens when the user passes a file path. It's only when they pass a buffer that we need the flush, although it wouldn't hurt to have an extra flush.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also rename _civis_to_file to _civis_to_stream or something?

Copy link
Contributor Author

@waltaskew waltaskew Mar 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also rename _civis_to_file to _civis_to_stream or something?

I think of files in the unix sense of anything you can read & write rather than filesystem files, so I don't have an issue with _civis_to_file as a name. I wouldn't object to _civis_to_stream, though.

@mheilman
Copy link
Contributor

please add a note in the changelog

Copy link
Contributor

@mheilman mheilman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks

@waltaskew waltaskew merged commit eb3e341 into civisanalytics:master Mar 20, 2018
@waltaskew waltaskew deleted the flush-buffers branch March 20, 2018 14:58
@stephen-hoover stephen-hoover added this to the Next Version milestone Apr 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants