[DISCUSS] Reducing cadence of major arrow-rs releases introducing patch releases #5368

Closed
@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

As more people use arrow, the overall burden on users from frequent major releases is increasing. Furthermore, the pace of breaking API changes is decreasing, so the burden on maintainers to avoid breaking changes is decreasing as well.

As the arrow crate becomes more widely used in the ecosystem by projects other than DataFusion and other early adopters, the frequent major releases cause several issues:

  1. Crates must match the major arrow versions. For example, if a crate uses DataFusion, that forces everything in the entire project onto exactly the version of arrow-rs that DataFusion uses.
  2. parquet and arrow releases are coupled, so releasing a new version of parquet requires releasing a new version of arrow.
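
To make the first point concrete, here is a minimal sketch of the compatibility rule Cargo applies to default (caret) version requirements: two releases are treated as compatible only if they agree on the leftmost non-zero component. The function names `parse_version` and `semver_compatible` are hypothetical helpers written for this illustration (they are not part of arrow-rs or Cargo), pre-release and build metadata are ignored, and the version numbers in the usage note are examples only.

```rust
/// Parse a plain "MAJOR.MINOR.PATCH" string into its numeric parts.
/// Returns `None` for anything else (pre-release tags are not handled here).
fn parse_version(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Whether two versions fall in the same caret-compatible range, i.e. they
/// agree on the leftmost non-zero component (major for >= 1.0.0 releases).
fn semver_compatible(a: &str, b: &str) -> Option<bool> {
    let (a_major, a_minor, a_patch) = parse_version(a)?;
    let (b_major, b_minor, b_patch) = parse_version(b)?;
    Some(if a_major != 0 || b_major != 0 {
        a_major == b_major
    } else if a_minor != 0 || b_minor != 0 {
        a_minor == b_minor
    } else {
        a_patch == b_patch
    })
}
```

Under this rule, `semver_compatible("50.0.0", "50.1.0")` is `Some(true)`, so a minor release can be picked up without touching downstream crates, while `semver_compatible("50.0.0", "51.0.0")` is `Some(false)`, so every crate that exchanges arrow types has to move to the new major version together.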

The major version bumps impose non-trivial overhead on user crates. Some crates like serde_arrow have implemented clever, though complex, workarounds such as having feature flags for each arrow version (see the recent discussion with @chmp on serde_arrow: chmp/serde_arrow#131).

Also, from what I can see, many of the recent arrow-rs changes aren't really adding new APIs; they are more like filling in feature gaps and fixing bugs, which is also reflected in the slower pace of the last few releases.

Describe the solution you'd like
I propose we set a more regular major release cadence (e.g. every 3 months) and only do minor, compatible releases between those major releases.

This would absolutely require more maintainer effort, but at this stage in the project the effort may be more manageable, as I think the APIs are in a pretty good place.

Describe alternatives you've considered
I think there are various alternatives for what triggers a release and at what cadence. I don't have a strong opinion on this matter.

Additional context
At some point in the past we actually had fewer major releases -- see #1120

There was non-trivial process overhead, so we (well, really I) abandoned it and went YOLO on major releases, as there wasn't really any maintenance bandwidth to do anything else.

Labels: arrow (changes to the arrow crate), arrow-flight (changes to the arrow-flight crate), enhancement (any new improvement worthy of an entry in the changelog), parquet (changes to the parquet crate)