Description
(see #33)
The existing Tools/c-globals/check-c-globals.py
script is an important tool for achieving a per-interpreter GIL (and generally for the health of the CPython code base). However, it has a number of problems that keep it from being helpful in reaching that objective.
First we'll review the characteristics of the tool. Then we'll look at the deficiencies. Finally, we'll cover the specific tasks that will address those problems.
Purpose
The script has the following 2 core purposes:
- make sure there are no "unsupported" static variables in CPython
- help address failures when such static variables are found
- including making it easy to see the current related state of the CPython code base
(More or less, "unsupported" means the variable holds per-interpreter runtime state.)
Constraints
The script also has the following core constraints:
- produce correct results (identify exactly all unsupported static variables)
- be up-to-date (relative to ignored variables and to what "unsupported" means)
- have low maintenance cost (few defects, easy to understand, easy to update/extend)
Deficiencies
In context of those purposes and constraints, the script has some critical deficiencies (in rough order of priority):
- not actually used by core developers (so statics keep getting added)
- static locals are not detected
- detection relies on a platform-specific tool (
nm
on linux) - the code is a mess (impacting maintainability)
- no tests (impacting maintainability)
- relies on naive hard-coded rules to ignore many statics (impacting correctness & maintainability)
- relies on a simple large list of names (from a basic flat text file) to ignore the remaining statics (impacting correctness & maintainability)
(Note that non-critical deficiencies (and nice-to-haves) are covered in #49.)
Solution
(Reminder: we're only aiming to addresss critical deficiencies here.)
High-level Plan
- rename
Tools/c-globals
toTools/c-analyzer
- add
c-statics.py
- the "full-featured" replacement for
check-c-statics.py
- only provide minimal necessary functionality to meet core purposes (and be fast enough)
- correct the critical deficiencies
- (detailed tasks below)
- the "full-featured" replacement for
- rewrite
check-c-globals.py
as a single-purpose script namedcheck-c-statics.py
- no args
- share code with
c-statics.py
(i.e. callTools/c-analyzer/c_statics/__main__.py:main()
) - refer to the
c-statics.py
script in output
- support ignoring some "unsupported" statics
- add
Tools/c-analyzer/c_statics/ignored.tsv
- add a
--ignore
option to thec-statics.py
script
- add
- add
Lib/test/test_check_c_statics.py
- 1 test that effectively runs the
check-c-statics.py
script - ignore all statics in
ignored.tsv
for now (will still catch any new statics) - ensures check-c-statics.py is run at some point in CI
- 1 test that effectively runs the
- make sure it's clear how to resolve "unsupported" statics
At that point we can use the tool sans nearly all deficiencies. The remaining tasks would then be:
- (Validate results from
Tools/c-analyzer/c-statics.py
using other tools. #50) validatec-statics.py
script results independently - (
Tools/c-globals/check-c-statics.py
should pass without any ignored globals. #48) resolve all remaining unsupported statics - (Non-critical deficiencies in
Tools/c-globals/c-globals.py
. #49) address non-critical deficiencies
Directory Structure
Tools/c-analyzer/
c_statics/
README (including "Resolving Failures" section)
__main__.py
find.py
show.py
known.py
supported.py
ignored.tsv (temporary)
c_parser/
files.py
info.py
statics.py
declarations.py
c-statics.py
check-c-statics.py
Lib/test/
test_tools/test_tool_c_analyzer/
testdata/...
test_c_statics/...
test_c_parser/...
test_check_c_statics.py
c-statics.py CLI
The c-statics.py
script will initially have the following basic CLI:
show [--ignored FILE] [--known FILE] [DIR,...]
- print table of static variables
check [--ignored FILE] [--known FILE] [DIR,...]
- fail if unsupported static vars found (and print simple list)
Specific Tasks for c-globals.py
(Each task includes adding sufficient tests.)
-
Tools/c-globals/_cg/
: add__init__.py
-
Lib/test/test_toolcglobals.py
: frame out the tests -
_cg/__main__.py
: add a basicparse_args()
,main()
, andif __name__ == '__main__
:` -
__main__.py
: stub out a "show" command w/ file args -
__main__.py
: stub out a "check" command w/ file args -
Tools/c-globals/c-globals.py
: copyif __name__ == '__main__
:from
_cg/main.py` - ...
-
info.py
: addStaticVar
(filename, funcname, name, vartype) -
find.py
: add a fake "statics()" func (w/ hard-coded [StaticVar, supported] list) -
show.py
: addshow_basic()
(group by supported / unsupported) -
__main__.py
: implement the "show" command - ...
-
__main__.py
: implement the "check" command - ...
-
scan.py
: add "iter_statics(file)" with a fake list ofStaticVar
-
supported.py
: add "is_supported(var)" with a simple testable heuristic -
find.py
: implement "statics()" - ...
-
scan.py
: add "normalize_vartype(text)" - ...
-
scan.py
: implement "iter_statics()" - ...
-
known.py
: add "self_check()"- make sure "known" data is up-to-date with search dirs
- check "ignored" variables
- ...
-
Lib/test/test_toolcglobals.py
: add black-box system tests forc-globals.py
(see below) -
Lib/test/test_toolcglobals.py
: add a "self-check" system test
System Tests
- "self-check":
- run normal check
- ensure known macros & vartypes aren't hiding unknown local types
- compare results with
nm
output
- run
c-globals.py show
against dummy files- check exact output
- drive results via "known" and "ignored" files
- static vars are a mix of POTS, stdlib, 3rd-party, and local types
- no known macros or vartypes unless otherwise indicated
- tests (none ignored):
- all known -> supported
- only local known -> supported
- some local unknown -> unsupported
- some vartypes known but not their nested local type -> supported (fail self-check?)
- some macros known but not their nested local type -> supported (fail self-check?)
- run
c-globals.py check
against dummy files:- make sure a passing "check" really means everything is okay
- ...
dummy files:
- static vars for all POTS
- static vars for stdlib (as much as possible)
- many static vars and many non-static
- a good mix of global and local
- at least 1 macro that produces a static
Lib/test/toolcglobalsdata/
test*/
Src1/
spam.c
eggs.c
Src2/
ham.c
Src3/
beans.c
beans_extra.h
tartar.c
Include/
spam.h
beans.h
tartar.h