Description
Sorry, long issue :-)
In issue #7750 there was some heated discussion about backwards compatibility and the fact we did or did not care enough about it. I don't want to start a new discussion on that topic again, rest assured (or not now :-)), but notwithstanding the at some point rather personal attacks in that discussion, I think we still should look at what we can learn from it.
And what I learned is that we can do a better job at documenting backward incompatible changes.
- We have no clear overview of these changes (and I am really speaking about 'changes for which users will have to change their code to keep it running/working correctly')
- In the whatsnew docs, there is the "API changes" section, which you could assume is about such kind of changes. In practice however, we made it a mix of enhancements (which are of course also api changes in se, as they 'change (add to) the api', but that fact is not really relevant) and breaking api changes.
- eg the current api changes section: http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html#api-changes. Second point (addition of
level
keyword) is an enhancement, not a breaking api change, as with a lot of the other bullet points. For this reason, the 'real' breaking api changes disappear a bit between all the others.
- eg the current api changes section: http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html#api-changes. Second point (addition of
- deprecations are in a seperate section, but are also a kind of api changes
What I propose:
- a better stuctured whatsnew, with a more strict structure
- limit the 'API changes' section to backwards incompatible changes, the rest are new features/enhancements (keep a clear distinction)
- for each backwards incompatible change, very clearly explain: what was the previous behaviour? what is the new behaviour? How can I keep the old behaviour? What should I do to update my code to keep it working?
Some examples of other libraries: Scipy speaks about "Backwards incompatible changes" (http://docs.scipy.org/doc/scipy-0.14.0/reference/release.0.14.0.html), scikit-learn about "API changes summary" (http://scikit-learn.org/stable/whats_new.html), sqlalchemy about "Behavioral changes" (http://docs.sqlalchemy.org/en/rel_0_9/changelog/migration_09.html#behavioral-changes-orm-09) and matlab about "Compatibility considerations" (http://www.mathworks.nl/help/matlab/release-notes.html).
A possible structure (in the past, in pandas there was also a more rigid structure with new features/api changes):
1. New features
1.1 ... (subsections highlighting bigger new features)
1.2 Categoricals
1.3 Timedelta(Index)
1.4 .dt accessor
1.5 ....
2. Backwards incompatible API changes
2.1 Breaking changes
2.2 Deprecations
2.3 Removal previously deprecated (prior version deprecations)
3. Other enhancements
4. Performance improvements
5. Bug fixes
For comparison, the current table of contents for 0.15 whatsnew
API changes
Memory Usage
.dt accessor
Timezone API changes
Rolling/Expanding Moments API changes
Internal Refactoring
Categoricals in Series/DataFrame
TimedeltaIndex/Scalar
Prior Version Deprecations/Changes
Deprecations
Enhancements
Performance
Bug Fixes
Some points of dicussion:
- 'api changes' section before new features instead of after?
- 'other enhancements' as subsection in 'new features'
- how to name the 'backwards incompatible API changes' (other options like behavioral changes, compatibility considerations, breaking API changes, ...)
- and of course the general idea I describe in this issue
Thoughts?