ENH: Allow DataFrame.to_json to combine orient='values' with lines=True #56304

Open
@jehugaleahsa

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

I want to write out the value arrays as JSON lines (one JSON array per line). The column metadata will be stored in a separate file. So the output file would look something like this:

["a", "b", true, 123]
["x", "y", false, 234]

Currently, combining lines=True with orient='values' raises a ValueError, because to_json accepts lines=True only with orient='records'.

Alternative Solutions

The alternative right now is to call iterrows(), convert each row to a JSON array, and write it to the output file.

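import json
from pathlib import Path

import pandas as pd


def write_json_lines(output: Path, data_frame: pd.DataFrame) -> None:
    # Write one JSON array per row; column names are not included.
    with open(output, "w") as out:
        for _, row in data_frame.iterrows():
            out.write(json.dumps(row.tolist()))
            out.write("\n")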
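
Another possible workaround, sketched here under assumptions, is to let pandas serialize the whole frame with to_json(orient='values') and then split the result into one array per line. The helper name values_json_to_lines is hypothetical and not part of pandas. The sample string stands in for the output of to_json(orient='values') so that the snippet runs without pandas installed.

```python
import json


def values_json_to_lines(values_json: str) -> str:
    """Convert a JSON array of row arrays into JSON Lines text.

    Hypothetical helper (not a pandas API). `values_json` is assumed to be
    shaped like the string returned by DataFrame.to_json(orient="values").
    """
    rows = json.loads(values_json)
    # Emit one JSON array per line, each terminated by a newline.
    return "".join(json.dumps(row) + "\n" for row in rows)


# Stand-in for: data_frame.to_json(orient="values")
sample = '[["a","b",true,123],["x","y",false,234]]'
print(values_json_to_lines(sample), end="")
```

With a real frame, the call would be values_json_to_lines(data_frame.to_json(orient='values')). This avoids iterrows() but holds the full serialized frame in memory and parses it a second time, so it is a stopgap rather than a replacement for native support. Note also that json.dumps escapes non-ASCII characters unless ensure_ascii=False is passed.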

Additional Context

I assume that implementing this inside pandas would perform better than the row-by-row loop above.
