Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
I want to write out the value arrays as JSON lines (one JSON array per line). The column metadata will be stored in a separate file. So the output file would look something like this:
["a", "b", true, 123]
["x", "y", false, 234]
Currently, combining lines=True with orient='values' in DataFrame.to_json raises a ValueError, because lines=True is only accepted together with orient='records'.
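A minimal reproduction sketch, assuming a recent pandas release. The sample frame is made up to match the example output above, and the exact error message may vary between versions:

```python
import pandas as pd

# Hypothetical sample data mirroring the desired output lines.
df = pd.DataFrame([["a", "b", True, 123], ["x", "y", False, 234]])

error = None
try:
    # lines=True is currently only accepted with orient="records".
    df.to_json(orient="values", lines=True)
except ValueError as exc:
    error = exc
    print(f"ValueError: {exc}")
```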
Feature Description
Allow DataFrame.to_json to accept lines=True together with orient='values', so that each row is written as one JSON array per line, as in the example output under Problem Description.
Alternative Solutions
The alternative right now is to call iterrows(), convert each row to a JSON array, and write it to the output file.

    import json
    from pathlib import Path

    import pandas as pd

    def write_json_lines(output: Path, data_frame: pd.DataFrame) -> None:
        # Serialize each row as one JSON array per line.
        with open(output, "w") as out:
            for _, row in data_frame.iterrows():
                out.write(json.dumps(row.tolist()))
                out.write("\n")
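Another possible workaround, sketched here as an assumption rather than a recommendation: let pandas serialize the whole frame once with orient='values', parse that string, and re-emit one array per line. The function name write_value_lines is hypothetical. This avoids iterrows() and reuses pandas' own JSON handling of NaN and timestamps, but it holds the full JSON string in memory, and I have not benchmarked it.

```python
import json
from pathlib import Path

import pandas as pd


def write_value_lines(output: Path, data_frame: pd.DataFrame) -> None:
    """Write each row of data_frame as one JSON array per line (hypothetical helper)."""
    # pandas serializes the whole frame as a single JSON array of row arrays.
    rows = json.loads(data_frame.to_json(orient="values"))
    with open(output, "w") as out:
        for row in rows:
            # Re-emit each row array on its own line.
            out.write(json.dumps(row))
            out.write("\n")
```

Usage would look like write_value_lines(Path("values.jsonl"), df), with the column metadata written separately.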
Additional Context
I assume that implementing this inside pandas would perform better than the row-by-row workaround above.