bpo-34003: Use dict instead of OrderedDict in csv.DictReader #8014

selik · 2018-06-29T19:59:32Z

DictReader can now use basic dicts instead of OrderedDict, as of version
3.7's change to specify that dict maintains keys in insertion order.
This will be more efficient and more pleasant to read at the interactive
prompt.

I also changed a list comprehension to a generator expression inside of
a ", ".join() to avoid the unnecessary list construction.

https://bugs.python.org/issue34003
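A minimal sketch of the user-visible effect, assuming an in-memory CSV (the sample data and variable names are illustrative, not taken from the patch):

```python
import csv
import io

# Hypothetical sample data; any iterable of text lines works as input.
sample = io.StringIO("name,dept\nAda,Engineering\nLin,Research\n")

rows = list(csv.DictReader(sample))

# With this change (Python 3.8+) each row is a plain dict;
# on 3.6 and 3.7 it is an OrderedDict.
print(type(rows[0]).__name__)

# Either way, the keys follow the order of the header row.
print(list(rows[0]))
```

Code that only reads rows by key or iterates over them should behave the same before and after the change.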

methane · 2018-06-30T18:55:36Z

> I also changed a list comprehension to a generator expression inside of
> a ", ".join() to avoid the unnecessary list construction.

FYI, str.join() creates a temporary list if its input is not a list or tuple, so you can't avoid the list construction.
Changing the list comprehension to a genexp makes it a little slower because of generator overhead.

$ python3 -m timeit '",".join(str(x) for x in range(10))'
100000 loops, best of 5: 3.46 usec per loop

$ python3 -m timeit '",".join([str(x) for x in range(10)])'
100000 loops, best of 5: 3.09 usec per loop
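The same comparison can be run from a script with the timeit module. This is a rough sketch; the function names and loop count are arbitrary choices, and absolute numbers will vary by machine and Python version:

```python
import timeit


def join_genexp():
    # Generator expression: str.join still materializes the items internally.
    return ",".join(str(x) for x in range(10))


def join_listcomp():
    # List comprehension: the list is handed to str.join directly.
    return ",".join([str(x) for x in range(10)])


# Both forms build the same string.
assert join_genexp() == join_listcomp()

# Take the best of several repeats, as the timeit command line does.
n = 20_000
best_gen = min(timeit.repeat(join_genexp, number=n, repeat=5))
best_list = min(timeit.repeat(join_listcomp, number=n, repeat=5))

print(f"genexp:   {best_gen / n * 1e6:.2f} usec per loop")
print(f"listcomp: {best_list / n * 1e6:.2f} usec per loop")
```

The function-call wrapper adds a little overhead to both measurements, so only the relative difference is meaningful.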

selik · 2018-06-30T18:57:04Z

Doh. I should have checked the timings before I assumed. I'll revert that.

str.join creates a list internally if the input is not a list or tuple. Passing a generator expression only adds overhead.

BoboTiG

Thanks for the patch :)

merwok · 2019-01-31T16:11:45Z

Doc/library/csv.rst

-   .. versionchanged:: 3.6
-      Returned rows are now of type :class:`OrderedDict`.
+   .. versionchanged:: 3.8
+      Returned rows are now of type :class:`dict`.


This entry should have been added without deleting the previous one

GPHemsley · 2020-06-05

Looks like this never got fixed...

merwok · 2020-06-05

PR #20657 fixes this.

nielskrijger · 2020-08-23T10:01:57Z

It's a bit disappointing that the return type of a function can suddenly change going from 3.7 to 3.8. I'll spare you the details, but my assumptions about Python versioning were clearly wrong if this is allowed.

merwok · 2020-08-23T16:39:04Z

@nielskrijger the interface is 99% the same, and this is documented in the What’s New.

Major updates avoid gratuitous breakage, but have enhancements and fixes that require developers to test before updating.

nielskrijger · 2020-08-23T17:33:40Z

I assume by Major you mean Minor (3.7 => 3.8 is not a major version).

The return value changing from OrderedDict to a normal dict is a loss of functionality; dict and OrderedDict have different characteristics (see https://stackoverflow.com/questions/50872498/will-ordereddict-become-redundant-in-python-3-7 ), and any code depending on those characteristics would break. I'm not sure how you calculated this 99%, but if it means that 1% of apps knowingly break because of this change, I would at least have pointed it out as a breaking change.

Don't get me wrong, the change in itself makes perfect sense and is for the better. But I would definitely have marked this out as a breaking change.
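For reference, a short sketch of the kind of OrderedDict-only behavior that code could depend on. The row contents are made up for illustration:

```python
from collections import OrderedDict

row = {"name": "Ada", "dept": "Engineering"}
ordered_row = OrderedDict(row)

# OrderedDict can reorder keys in place; a plain dict has no such method.
ordered_row.move_to_end("name")
print(list(ordered_row))            # ['dept', 'name']
print(hasattr(row, "move_to_end"))  # False

# OrderedDict.popitem accepts a `last` flag; dict.popitem takes no arguments
# and always removes the most recently inserted item.
first_key, _ = ordered_row.popitem(last=False)
print(first_key)                    # dept
```

Code that calls these methods on DictReader rows raises an error on 3.8+ (AttributeError for move_to_end, TypeError for the `last` argument), which at least makes the breakage visible.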

selik · 2020-08-23T17:40:38Z

@nielskrijger The mailing list is probably a better place to discuss the policy. I appreciate that you changed your argument from focusing on a type change to a change in functionality. Not everyone would agree, but much of Python's design cares more about interface than type (as evidenced by "magic" methods).

Calling it a breaking change, are you saying there should have been a deprecation cycle, or something else?

If you have a specific example of someone using OrderedDict.move_to_end in this context, I'd like to see it.

merwok · 2020-08-23T18:17:47Z

No, I meant 3.7 to 3.8. It is a major update, or a feature release if you prefer.
3.8.4 to 3.8.5 is a minor update that takes little effort to validate, but changing your runtime to 3.8 or to 3.9 requires more effort.

(We use major in the regular sense of important; for CPython it’s not very useful to refer to X.Y.Z as major.minor.micro, it’s more line (of development) . major (or feature) . micro (or patch))

nielskrijger · 2020-08-23T20:27:47Z

> If you have a specific example of someone using OrderedDict.move_to_end in this context, I'd like to see it.

I didn't find this out by accident. If an app is primarily about uploading, parsing and generating csvs with fairly advanced customization, I hope it's clear something might break.

> Calling it a breaking change, are you saying there should have been a deprecation cycle, or something else?

I could imagine a number of solutions, and in the mailing lists I do see some evidence of these being pointed out:

One of those suggestions was a configuration option; that would have been a quick fix on my end.

But I do feel silly, as this argument was obviously discussed already and, I assume, the risk was considered negligible.

> No, I meant 3.7 to 3.8. It is a major update, [...] (We use major in the regular sense of important; for CPython it’s not very useful to refer to X.Y.Z as major.minor.micro, it’s more line (of development) . major (or feature) . micro (or patch))

Sorry, I didn't know that. I guess I've got semver a bit too stuck in my mind; pointing this out certainly helped me understand Python's versioning. Thanks!
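Absent a configuration option, one way to keep OrderedDict rows on any Python version is to wrap the reader. This is only a sketch; `ordered_rows` is a hypothetical helper, not part of the csv module:

```python
import csv
import io
from collections import OrderedDict


def ordered_rows(f, **kwargs):
    """Hypothetical helper: yield each DictReader row as an OrderedDict."""
    for row in csv.DictReader(f, **kwargs):
        # Key order is preserved because dict keeps insertion order (3.7+).
        yield OrderedDict(row)


# Illustrative data only.
sample = io.StringIO("name,dept\nAda,Engineering\n")
row = next(ordered_rows(sample))

# OrderedDict-specific methods are available again.
row.move_to_end("name")
print(list(row))
```

The extra copy per row costs some time and memory, so it is probably only worth it where OrderedDict methods are actually used.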

methane · 2020-08-24T00:21:11Z

Note that we changed the returned type from dict to OrderedDict in Python 3.6 without a deprecation period. It clearly breaks == behavior, so it might have been a breaking change for some very minor use cases.

We regularly change behavior in ways that can break some people's code when:

Using a deprecation period is not simple.
The estimated number of affected users is very small.

Ideally, we should have added an OrderedDictReader instead of changing DictReader in Python 3.6.
But it is too late. An unnecessary breaking change has already happened.

merwok · 2020-08-24T00:29:45Z

> It clearly breaks == behavior

It doesn’t, comparing an ordered dict to a dict works (looks at the items as a set)

methane · 2020-08-24T00:42:42Z

> It doesn’t, comparing an ordered dict to a dict works (looks at the items as a set)

Consider comparing two CSV files. When the column order changes but the data does not, row1 == row2 in Python 3.5 and Python 3.8+, but row1 != row2 in Python 3.6 and Python 3.7.

It doesn't emit any error, so users may not notice that their scripts are broken in Python 3.6. That might be worse than the missing .move_to_end() in Python 3.8, because users do notice the missing .move_to_end() when they are affected.
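The comparison described here can be sketched as follows. The two in-memory files are illustrative, and the 3.6/3.7 behavior is reproduced by converting rows to OrderedDict explicitly:

```python
import csv
import io
from collections import OrderedDict

# Same data, different column order (illustrative sample files).
file_a = io.StringIO("name,dept\nAda,Engineering\n")
file_b = io.StringIO("dept,name\nEngineering,Ada\n")

row1 = next(csv.DictReader(file_a))
row2 = next(csv.DictReader(file_b))

# Plain dict equality ignores key order (what 3.5 and 3.8+ rows use).
print(dict(row1) == dict(row2))                # True

# OrderedDict-to-OrderedDict equality is order-sensitive
# (what 3.6 and 3.7 rows use).
print(OrderedDict(row1) == OrderedDict(row2))  # False

# Comparing an OrderedDict with a plain dict falls back to dict equality.
print(OrderedDict(row1) == row2)               # True
```

The last line covers the mixed OrderedDict-to-dict comparison; the silent difference only appears when both sides are OrderedDict.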

merwok · 2020-08-24T02:38:16Z

Sorry, I don’t follow what you are saying, and this is the wrong place for the discussion!

the-knights-who-say-ni added the CLA signed label Jun 29, 2018

bedevere-bot added the awaiting review label Jun 29, 2018

selik added 3 commits June 29, 2018 13:05

blurb add

38b7b3c

fix documentation, DictReader now yields dicts

14f2164

use rST metadata for class ref

bcea2a4

revert to list comprehension

0b4c143

str.join creates a list internally if the input is not a list or tuple. Passing a generator expression only adds overhead.

BoboTiG approved these changes Aug 25, 2018

bedevere-bot added awaiting core review and removed awaiting review labels Aug 25, 2018

rhettinger merged commit 9f3f093 into python:master Jan 31, 2019

bedevere-bot removed the awaiting core review label Jan 31, 2019

selik deleted the fix-issue-34003 branch January 31, 2019 15:30

merwok reviewed Jan 31, 2019

merwok added a commit that referenced this pull request Jun 5, 2020

Re-add versionchanged entry in csv docs

b1f7466

follow-up to GH-8014

merwok mentioned this pull request Jun 5, 2020

bpo-34003: Re-add versionchanged entry in csv docs #20657

Merged

miss-islington pushed a commit that referenced this pull request Jun 10, 2020

bpo-34003: Re-add versionchanged entry in csv docs (GH-20657)

7aed052

Follow-up to GH-8014

This was referenced Jun 10, 2020

[3.9] bpo-34003: Re-add versionchanged entry in csv docs (GH-20657) #20770

Merged

[3.8] bpo-34003: Re-add versionchanged entry in csv docs (GH-20657) #20771

Merged

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 10, 2020

bpo-34003: Re-add versionchanged entry in csv docs (pythonGH-20657)

07b8b8d

Follow-up to pythonGH-8014 (cherry picked from commit 7aed052) Co-authored-by: Éric Araujo <merwok@netwok.org>

arun-mani-j pushed a commit to arun-mani-j/cpython that referenced this pull request Jul 21, 2020

bpo-34003: Re-add versionchanged entry in csv docs (pythonGH-20657)

ab82e52

Follow-up to pythonGH-8014

andreas-stuerz pushed a commit to andreas-stuerz/plugins that referenced this pull request Aug 10, 2021

use OrderDict because csv.DictReader returns a dict after version 3.7

9bc3f75

See: python/cpython#8014

andreas-stuerz mentioned this pull request Aug 10, 2021

use OrderDict because csv.DictReader returns a dict after version 3.7 opnsense/plugins#2496

Merged
