bpo-39721: Fix constness of members of tok_state struct. #18600

petdance · 2020-02-22T00:28:34Z

The function PyTokenizer_FromUTF8 from Parser/tokenizer.c had a comment:

/* XXX: constify members. */

This patch addresses that.

In the tok_state struct:
* end and start were non-const but could be made const
* str and input were const but should have been non-const

Changes to support this include:
* decode_str() now returns a char * since it is allocated.
* PyTokenizer_FromString() and PyTokenizer_FromUTF8() each creates a
new char * for an allocate string instead of reusing the input
const char *.
* PyTokenizer_Get() and tok_get() now take const char ** arguments.
* Various local vars are const or non-const accordingly.

I was able to remove five casts that cast away constness.

https://bugs.python.org/issue39721

The function PyTokenizer_FromUTF8 from Parser/tokenizer.c had a comment: /* XXX: constify members. */ This patch addresses that. In the tok_state struct: * end and start were non-const but could be made const * str and input were const but should have been non-const Changes to support this include: * decode_str() now returns a char * since it is allocated. * PyTokenizer_FromString() and PyTokenizer_FromUTF8() each creates a new char * for an allocate string instead of reusing the input const char *. * PyTokenizer_Get() and tok_get() now take const char ** arguments. * Various local vars are const or non-const accordingly. I was able to remove five casts that cast away constness.

codecov · 2020-02-22T01:09:11Z

Codecov Report

Merging #18600 into master will decrease coverage by 2.65%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           master   #18600       +/-   ##
===========================================
- Coverage   82.11%   79.46%    -2.66%     
===========================================
  Files        1956      384     -1572     
  Lines      589413   169260   -420153     
  Branches    44458        0    -44458     
===========================================
- Hits       484015   134494   -349521     
+ Misses      95748    34766    -60982     
+ Partials     9650        0     -9650

Impacted Files	Coverage Δ
Modules/_decimal/libmpdec/umodarith.h	`80.76% <0.00%> (-19.24%)`	⬇️
Modules/clinic/spwdmodule.c.h	`58.33% <0.00%> (-3.21%)`	⬇️
Modules/clinic/resource.c.h	`78.57% <0.00%> (-2.68%)`	⬇️
Modules/clinic/_pickle.c.h	`62.40% <0.00%> (-2.33%)`	⬇️
Modules/clinic/signalmodule.c.h	`71.51% <0.00%> (-2.32%)`	⬇️
Python/clinic/marshal.c.h	`77.77% <0.00%> (-2.23%)`	⬇️
Modules/clinic/_weakref.c.h	`77.77% <0.00%> (-2.23%)`	⬇️
Modules/clinic/_queuemodule.c.h	`78.08% <0.00%> (-2.17%)`	⬇️
Modules/_multiprocessing/clinic/posixshmem.c.h	`66.66% <0.00%> (-2.09%)`	⬇️
Objects/clinic/moduleobject.c.h	`78.94% <0.00%> (-2.01%)`	⬇️
... and 1839 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a025d4c...0b7a775. Read the comment docs.

petdance · 2020-02-22T04:05:45Z

I can't for the life of me figure out why codecov thinks that the code coverage has gone down. Does anyone know why? Is there some specific file in the results that I should look in?

benjaminp · 2020-02-26T02:12:30Z

Parser/tokenizer.h

@@ -60,8 +60,8 @@ struct tok_state {
    PyObject *decoding_readline; /* open(...).readline */
    PyObject *decoding_buffer;
    const char* enc;        /* Encoding for the current str. */
-    const char* str;
-    const char* input; /* Tokenizer's newline translated copy of the string. */


What's the reason these can't be const?

tok->input can't be const because it's the allocated input, and then gets freed in PyTokenizer__Free.

tok->str can't be const because it can be returned from decode_str, which returns a non-const string. If I inlined decode_str into PyTokenizer_FromString, the only place that uses it, then I think I could make tok->str const.

decode_str only returns a non-const string in this change, though.

That's correct. I changed decode_str to return char * instead of const char * because it can return the result from translate_newlines. Also, the result from decode_str is assigned to tok->buf which is non-const.

the-knights-who-say-ni added the CLA signed label Feb 22, 2020

bedevere-bot added the awaiting review label Feb 22, 2020

benjaminp reviewed Feb 26, 2020

View reviewed changes

benjaminp added the skip news label Feb 28, 2020

benjaminp merged commit 384f3c5 into python:master Feb 28, 2020

bedevere-bot removed the awaiting review label Feb 28, 2020

petdance deleted the bpo-39721 branch February 28, 2020 04:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

bpo-39721: Fix constness of members of tok_state struct. #18600

bpo-39721: Fix constness of members of tok_state struct. #18600

Uh oh!

petdance commented Feb 22, 2020 •

edited by bedevere-bot

Loading

Uh oh!

codecov bot commented Feb 22, 2020

Uh oh!

petdance commented Feb 22, 2020

Uh oh!

benjaminp Feb 26, 2020

Uh oh!

petdance Feb 26, 2020

Uh oh!

benjaminp Feb 27, 2020

Uh oh!

petdance Feb 27, 2020

Uh oh!

Uh oh!

Uh oh!

bpo-39721: Fix constness of members of tok_state struct. #18600

bpo-39721: Fix constness of members of tok_state struct. #18600

Uh oh!

Conversation

petdance commented Feb 22, 2020 • edited by bedevere-bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Feb 22, 2020

Codecov Report

Uh oh!

petdance commented Feb 22, 2020

Uh oh!

benjaminp Feb 26, 2020

Choose a reason for hiding this comment

Uh oh!

petdance Feb 26, 2020

Choose a reason for hiding this comment

Uh oh!

benjaminp Feb 27, 2020

Choose a reason for hiding this comment

Uh oh!

petdance Feb 27, 2020

Choose a reason for hiding this comment

Uh oh!

Uh oh!

petdance commented Feb 22, 2020 •

edited by bedevere-bot

Loading