Fetches the latest upstream #46

Merged
merged 132 commits into from
Apr 10, 2025
132 commits
8a28e38
Switch from nose to pytest
arthurdejong Apr 9, 2022
a280d53
Upgrade to CodeQL Action v2
arthurdejong Jul 4, 2022
9f79691
Fix flake8 error
arthurdejong Aug 12, 2022
b36c0d6
Upgrade GitHub Actions
cclauss Aug 8, 2022
351be74
Add support for Python 3.10
arthurdejong Aug 13, 2022
7ee0563
Put long line flake8 ignores in files instead of globally
arthurdejong Aug 13, 2022
4d4a0b3
Fix small typo
vovavili Aug 6, 2022
c5595c7
Add Czech bank account numbers
Jun 8, 2022
eae1dd2
Use str.zfill() for padding leading zeros
arthurdejong Aug 15, 2022
ce9322c
Add extra court alias for german Handelsregisternummer
romualdr Aug 3, 2022
8aa6b5e
Remove redundant steps with tox_job
cclauss Aug 13, 2022
ed37a6a
Update ISIL download URL
arthurdejong Aug 15, 2022
975d508
Provide a timeout to all download scripts
arthurdejong Aug 15, 2022
2cf78c2
Update names of Wikipedia pages with IMSI codes
arthurdejong Aug 15, 2022
e901ac7
Ignore invalid downloaded country codes
arthurdejong Aug 15, 2022
6b39c3d
Do not print trailing space
arthurdejong Aug 15, 2022
ee9dfdf
Update database files
arthurdejong Aug 15, 2022
5bcc460
Fix German OffeneRegister company registry URL
arthurdejong Aug 15, 2022
e40c827
Update EU VAT Vies test with new number
arthurdejong Aug 15, 2022
dd70cd5
Add support for Tunisia TIN
unho Sep 6, 2022
d70549a
Add Kenyan TIN
unho Aug 8, 2022
2907676
Add support for Morocco TIN
unho Sep 3, 2022
31709fc
Add Algerian NIF number
unho Sep 4, 2022
eff3f52
Fix a couple typos found by codespell
DimitriPapadopoulos Sep 23, 2022
a261a93
Add North Macedonian ЕДБ
unho Sep 17, 2022
fbe094c
Add Faroe Islands V-number
unho Sep 10, 2022
2b6e087
Add support for Montenegro TIN
unho Sep 18, 2022
acb6934
Add CAS Registry Number
arthurdejong Oct 15, 2022
7be2291
Add support for Ghana TIN
unho Sep 11, 2022
1636045
Support running tests with PyPy 2.7
arthurdejong Oct 15, 2022
1003033
Update Fødselsnummer test case for date in future
arthurdejong Oct 19, 2022
7c2153e
Remove duplicate CAS Registry Number
arthurdejong Oct 19, 2022
09d595b
Improve validation of CAS Registry Number
arthurdejong Oct 19, 2022
8b5b07a
Remove unused import
arthurdejong Oct 19, 2022
c5d3bf4
Switch to parse_qs() from urllib.parse
arthurdejong Oct 23, 2022
f972894
Switch to escape() from html
arthurdejong Oct 23, 2022
1364e19
Support "I" and "O" in CUSIP number
arthurdejong Nov 12, 2022
a218032
Add a check_uid() function to the stdnum.ch.uid module
arthurdejong Nov 12, 2022
45f098b
Make all exceptions inherit from ValueError
arthurdejong Nov 12, 2022
8e76cd2
Pad with zeroes in a more readable manner
unho Oct 23, 2022
a03ac04
Use HTTPS in URLs where possible
arthurdejong Nov 12, 2022
74cc981
Ensure we always run flake8-bugbear
arthurdejong Nov 12, 2022
feccaff
Add support for Slovenian EMŠO (Unique Master Citizen Number)
bblaz Oct 16, 2022
fa62ea3
Add Pakistani ID card number
arthurdejong Nov 13, 2022
7348c7a
vatin: Add a few more tests for is_valid
unho Sep 6, 2022
580d6e0
Pick up custom certificate from script path
arthurdejong Nov 13, 2022
5cdef0d
Increase timeout for CN Open Data download
arthurdejong Nov 13, 2022
f691bf7
Update German OffeneRegister lookup data format
arthurdejong Nov 13, 2022
31b2694
Update database files
arthurdejong Nov 13, 2022
60a90ed
Get files ready for 1.18 release
arthurdejong Nov 12, 2022
7a91a98
Avoid newer flake8
arthurdejong Nov 28, 2022
74d854f
Fix a typo
valeriko Nov 29, 2022
4f8155c
Run Python 3.5 and 3.6 GitHub tests on older Ubuntu
arthurdejong Dec 12, 2022
df894c3
Fix typos found by codespell
DimitriPapadopoulos Dec 5, 2022
b1dc313
Add initial CONTRIBUTING.md file
arthurdejong Dec 30, 2022
6d366e3
Add support for Egypt TIN
unho Oct 9, 2022
cf22705
Extend number properties to show in online check
arthurdejong Jan 2, 2023
031a249
Fix typo in UEN docstring
alisaifee Mar 13, 2023
a09a7ce
Fix Albanian tax number validation
arthurdejong Mar 18, 2023
bf1bdfe
Update IBAN database file
DimitriPapadopoulos Mar 9, 2023
7e84c05
Extend date parsing in GS1-128
arthurdejong Mar 18, 2023
8498b37
Fix date formatting on PyPy 2.7
arthurdejong Mar 18, 2023
7af50b7
Add support for Python 3.11
arthurdejong Mar 18, 2023
a8b6573
Ensure flake8 is run on all Python files
arthurdejong Mar 19, 2023
cf14a9f
Add get_county() function to Romanian CNP
RaduBorzea Mar 6, 2023
42d2792
Add functionality to get gender from Belgian National Number
jeffh92 Jan 5, 2023
36858cc
Add support for Finland HETU new century indicating signs
mjturt Feb 13, 2023
96abcfe
Add Spanish postcode validator
Feb 24, 2023
62d15e9
Add support for Guinea TIN
unho Jan 28, 2023
90044e2
Add automated checking for correct license header
arthurdejong May 12, 2023
7d3ddab
Minor ISSN and ISBN documentation fixes
hornc Jun 1, 2023
311fd56
Handle (partially) unknown birthdate of Belgian National Number
jeffh92 Jun 13, 2023
8ce4a47
Run Python 2.7 tests in a container for GitHub Actions
arthurdejong Jun 19, 2023
be33a80
Add Belgian BIS Number
jeffh92 Jun 20, 2023
3848318
Validate first digit of Canadian SIN
arthurdejong Jul 30, 2023
ef49f49
Fix file headers
arthurdejong Aug 6, 2023
b8ee830
Extend license check to file header check
arthurdejong Aug 6, 2023
d0f4c1a
Add Slovenian Corporate Registration Number
Jun 30, 2023
f58e08d
Validate European VAT numbers with EU or IM prefix
arthurdejong Aug 13, 2023
0aa0b85
Remove EU NACE update script
arthurdejong Aug 20, 2023
6e56f3c
Update database files
arthurdejong Aug 20, 2023
88d1dca
Replace test number for German company registry
arthurdejong Aug 20, 2023
3126f96
Update Belarusian UNP online check
arthurdejong Aug 20, 2023
895f092
Rename license_file option in setup.cfg
arthurdejong Aug 20, 2023
f6edcc5
Avoid the deprecated assertRegexpMatches function
arthurdejong Aug 20, 2023
7761e42
Use importlib.resource in place of deprecated pkg_resources
arthurdejong Aug 20, 2023
3947a54
Remove obsolete intermediate certificate
arthurdejong Aug 20, 2023
3191b4c
Ensure all files are included in source archive
arthurdejong Aug 20, 2023
fa455fc
Get files ready for 1.19 release
arthurdejong Aug 20, 2023
352bbcb
Add support for Python 3.12
arthurdejong Oct 2, 2023
1a5db1f
Fix typo (thanks Александр Кизеев)
arthurdejong Nov 12, 2023
58d6283
Ensure EU VAT numbers don't accept duplicate country codes
arthurdejong Nov 12, 2023
2478483
Add British Columbia PHN
oboratav Oct 20, 2023
2535bbf
Add European Community (EC) Number
weberdak Nov 19, 2023
1e412ee
Fix vatin number compacting for "EU" VAT numbers
arthurdejong Feb 3, 2024
9c7c669
Imporve French NIF validation (checksum)
TonkWorks Dec 15, 2023
bb20121
Fix Ukrainian EDRPOU check digit calculation
arthurdejong Mar 17, 2024
7cba469
Add Indian virtual identity number
deolekar Feb 27, 2024
9230604
Use HTTPS in URLs where possible
arthurdejong Mar 17, 2024
26fd25b
Switch to using openpyxl for parsing XLSX files
arthurdejong Mar 17, 2024
97dbced
Add update-dat tox target for convenient data file updating
arthurdejong Mar 17, 2024
b454d3a
Update database files
arthurdejong Mar 17, 2024
201d4d1
Get files ready for 1.20 release
arthurdejong Mar 17, 2024
0690996
Drop support for Python 3.5
arthurdejong May 19, 2024
5aeaeff
Add support for Indonesian NIK
arthurdejong May 19, 2024
fb4d792
Fix a typo
vanderkoort Jun 13, 2024
58ecb03
Update Irish PPS validator to support new numbers
Ollymid May 31, 2024
91959bd
Update Czech database files
May 21, 2024
1da003f
Adjust Swiss uid module to accept numbers without CHE prefix
jeffh92 May 17, 2024
e951dac
Support 16 digit Indonesian NPWP numbers
arthurdejong Jun 23, 2024
0da257c
Replace use of deprecated inspect.getargspec()
arthurdejong Jul 15, 2024
af3a728
Add Belgian SSN number
jeffh92 Jul 4, 2023
6cbb9bc
Fix zeep client timeout parameter
jmak-odoo Jul 4, 2024
3fcebb2
Customise certificate validation for web services
arthurdejong Sep 14, 2024
56036d0
Add Dutch identiteitskaartnummer
jeffh92 Jul 25, 2024
6c2873c
Add Belgian eID card number
jeffh92 Jul 25, 2024
0ceb2b9
Ensure get_soap_client() caches with verify
arthurdejong Sep 15, 2024
dc850d6
Ignore deprecation warnings in flake8 target
arthurdejong Sep 21, 2024
051e63f
Add more tests for Verhoeff implementation
arthurdejong Oct 11, 2024
020f1df
Use older Github runner for Python 3.7 tests
arthurdejong Oct 11, 2024
bcd5018
Add missing music industry ISRC country codes
Vyko Sep 30, 2024
0218601
Allow Uruguay RUT number starting with 22
arthurdejong Nov 17, 2024
2b92075
Drop Python 2 support
arthurdejong Jan 11, 2025
928a09d
Add International Standard Name Identifier
arthurdejong Feb 15, 2025
0f94ca6
Support Ecuador public RUC with juridical format
arthurdejong Feb 15, 2025
8519221
Add Spanish CAE Number
quiqueporta Jul 4, 2024
1386f67
Add Russian ОГРН
nvmbrasserie Dec 10, 2024
fc766bc
Add support for Python 3.13
arthurdejong Feb 16, 2025
852515c
Fix Czech Rodné číslo check digit validation
arthurdejong Mar 27, 2025
dcd7fa6
Drop more Python 2.7 compatibility code
arthurdejong Mar 27, 2025
ae0d86c
Ignore test failures from www.dgii.gov.do
arthurdejong Mar 27, 2025
fca8f0f
Merge remote-tracking branch 'upstream/master' into fetch-upstream-ma…
EmberCraze Apr 10, 2025
56 changes: 32 additions & 24 deletions .github/workflows/test.yml
@@ -9,46 +9,54 @@ on:
     - cron: '9 0 * * 1'

 jobs:
-  test:
-    runs-on: ubuntu-latest
+  test_legacy:
+    runs-on: ubuntu-20.04
     strategy:
-      matrix:
-        python-version: [2.7, 3.5, 3.6, 3.7, 3.8, 3.9, pypy-2.7, pypy-3.6]
+      fail-fast: false
+      matrix:
+        python-version: [3.6, 3.7]
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v3
       - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v2
+        uses: actions/setup-python@v4
         with:
           python-version: ${{ matrix.python-version }}
       - name: Install dependencies
         run: python -m pip install --upgrade pip tox
       - name: Run tox
-        run: tox -e "$(echo py${{ matrix.python-version }} | sed -e 's/[.-]//g;s/pypypy/pypy/')" --skip-missing-interpreters false
-  docs:
+        run: tox -e "$(echo py${{ matrix.python-version }} | sed -e 's/[.]//g;s/pypypy/pypy/')" --skip-missing-interpreters false
+  test:
     runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: [3.8, 3.9, '3.10', 3.11, 3.12, 3.13, pypy3.9]
     steps:
-      - uses: actions/checkout@v2
-      - name: Set up Python 3.8
-        uses: actions/setup-python@v2
+      - uses: actions/checkout@v3
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4
         with:
-          python-version: 3.8
+          python-version: ${{ matrix.python-version }}
       - name: Install dependencies
         run: python -m pip install --upgrade pip tox
       - name: Run tox
-        run: tox -e docs --skip-missing-interpreters false
-  flake8:
+        run: tox -e "$(echo py${{ matrix.python-version }} | sed -e 's/[.]//g;s/pypypy/pypy/')" --skip-missing-interpreters false
+  tox_job:
     runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        tox_job: [docs, flake8, headers]
     steps:
-      - uses: actions/checkout@v2
-      - name: Set up Python 3.8
-        uses: actions/setup-python@v2
+      - uses: actions/checkout@v3
+      - name: Set up Python
+        uses: actions/setup-python@v4
         with:
-          python-version: 3.8
+          python-version: 3.x
       - name: Install dependencies
         run: python -m pip install --upgrade pip tox
-      - name: Tox
-        run: tox -e flake8 --skip-missing-interpreters false
+      - name: Run tox ${{ matrix.tox_job }}
+        run: tox -e ${{ matrix.tox_job }} --skip-missing-interpreters false
   CodeQL:
     runs-on: ubuntu-latest
     permissions:
@@ -57,12 +65,12 @@ jobs:
       security-events: write
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v2
+        uses: actions/checkout@v3
       - name: Initialize CodeQL
-        uses: github/codeql-action/init@v1
+        uses: github/codeql-action/init@v2
         with:
           languages: python
       - name: Build
-        uses: github/codeql-action/autobuild@v1
+        uses: github/codeql-action/autobuild@v2
       - name: Perform CodeQL Analysis
-        uses: github/codeql-action/analyze@v1
+        uses: github/codeql-action/analyze@v2
160 changes: 160 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,160 @@
Contributing to python-stdnum
=============================

This document describes general guidelines for contributing new formats or
other enhancements to python-stdnum.


Adding number formats
---------------------

Basically any number or code that has some validation mechanism available or
some common formatting is eligible for inclusion into this library. If the
only specification of the number is "it consists of 6 digits", implementing
validation may not be that useful.

Contributions of new formats or requests to implement validation for a format
should include the following:

* The format name and short description.
* References to (official) sources that describe the format.
* A one or two paragraph description containing more details of the number
(e.g. purpose and issuer and possibly format information that might be
useful to end users).
* If available, a link to an (official) validation service for the number,
reference implementations or similar sources that allow validating the
correctness of the implementation.
* A set of around 20 to 100 "real" valid numbers for testing (more is better
during development but only around 100 will be retained for regression
testing).
* If the validation depends on some (online) list of formats, structures or
parts of the identifier (e.g. a list of region codes that are part of the
number) a way to easily update the registry information should be
available.


Code contributions
------------------

Improvements to python-stdnum are most welcome. Integrating contributions
will be done on a best-effort basis and can be made easier if the following
are considered:

* Ideally contributions are made as GitHub pull requests, but contributions
by email (privately or through the python-stdnum-users mailing list) can
also be considered.
* Submitted contributions will often be reformatted and sometimes
restructured for consistency with other parts.
* Contributions will be acknowledged in the release notes.
* Contributions should add or update a copyright statement if you feel the
contribution is significant.
* All contributions should be made with compatible applicable copyright.
* It is not necessary to modify the NEWS, README.md or files under docs for new
formats; these files will be updated on release.
* Marking valid numbers as invalid should be avoided and is much worse than
marking invalid numbers as valid. Since the primary use case for
python-stdnum is to validate entered data, an implementation that wrongly
results in "computer says no" should be avoided.
* Number format implementations should include links to sources of
information: generally useful links (e.g. more details about the number
itself) should be in the module docstring; links that relate more to the
implementation (e.g. a pointer to a reference implementation, online API
documentation or similar) are better placed in a comment in the code.
* Country-specific numbers and codes go in a country or region package (e.g.
stdnum.eu.vat or stdnum.nl.bsn) while global numbers go in the top-level
namespace (e.g. stdnum.isbn).
* All code should be well tested and achieve 100% code coverage.
* Existing code structure conventions (e.g. see README for interface) should
be followed.
* Git commit messages should follow the usual 7 rules.
* Declarative or functional constructs are preferred over an iterative
approach, e.g.::

    s = sum(int(c) for c in number)

over::

    s = 0
    for c in number:
        s += int(c)
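
Several of the points above (a common interface, avoiding false rejections and
a declarative style) can be illustrated with a minimal sketch. The number
format, its length of 7 digits and its check digit rule are hypothetical and
exist only for this example. The exception names are defined locally to keep
the example self-contained; they are modelled on the ones in
`stdnum.exceptions` and are an assumption here, not a copy of that module.

```python
class ValidationError(ValueError):
    """Base class for validation failures in this sketch."""


class InvalidFormat(ValidationError):
    """The number contains characters that are not allowed."""


class InvalidLength(ValidationError):
    """The number has the wrong length."""


class InvalidChecksum(ValidationError):
    """The check digit does not match."""


def compact(number):
    """Convert the number to its minimal representation."""
    return number.replace(' ', '').replace('-', '').strip()


def calc_check_digit(number):
    """Calculate the check digit (hypothetical rule: digit sum modulo 10)."""
    return str(sum(int(c) for c in number) % 10)


def validate(number):
    """Return the compact number if it is valid, raise otherwise."""
    number = compact(number)
    if not number.isdigit():
        raise InvalidFormat()
    if len(number) != 7:
        raise InvalidLength()
    if calc_check_digit(number[:-1]) != number[-1]:
        raise InvalidChecksum()
    return number


def is_valid(number):
    """Return True or False instead of raising an exception."""
    try:
        return bool(validate(number))
    except ValidationError:
        return False
```

Note that `compact()` is deliberately lenient about separators: rejecting
`123-456 1` only because of its formatting would be a "computer says no"
result for a number that is otherwise fine.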


Testing
-------

Tests can be run with `tox`. Some basic code style tests can be run with `tox
-e flake8` and most other targets run the test suite with various supported
Python interpreters.

Module implementations have a couple of smaller test cases that also serve as
basic documentation of the happy flow.

More extensive tests are available, per module, in the tests directory. These
tests (also doctests) cover more corner cases and should include a set of
valid numbers that demonstrate that the module works correctly for real
numbers.
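
As a rough illustration of such in-module test cases, the sketch below puts
doctests in the docstrings of a small Luhn checksum implementation. The
function names are chosen for this example only and the code is not copied
from the library.

```python
def checksum(number):
    """Calculate the Luhn checksum over the provided number.

    >>> checksum('7894')
    6
    >>> checksum('78949')
    0
    """
    digits = [int(c) for c in reversed(number)]
    # every second digit (counted from the right) is doubled and, if the
    # result has two digits, those digits are added together
    return sum(
        d if i % 2 == 0 else (2 * d - 9 if d > 4 else 2 * d)
        for i, d in enumerate(digits)) % 10


def is_valid(number):
    """Check whether the number has a valid Luhn checksum.

    >>> is_valid('78949')
    True
    >>> is_valid('7894')
    False
    """
    return number.isdigit() and checksum(number) == 0
```

If this is saved as a module, the examples in the docstrings can be run with
`python -m doctest -v` followed by the file name.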

The normal tests should never require online sources for execution. All
functions that deal with online lookups (e.g. the EU VIES service for VAT
validation) should only be tested using conditional unittests.
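
A conditional unittest could look roughly like the sketch below. The
`ONLINE_TESTS` environment variable name and the `check_online()` function are
assumptions made for this example; the function is a stand-in that does not
contact any service.

```python
import os
import unittest

# Assumed opt-in switch: online tests only run if this variable is set.
ONLINE = bool(os.environ.get('ONLINE_TESTS'))


def check_online(number):
    """Hypothetical stand-in for a function that queries an online service."""
    return {'number': number, 'valid': True}


class OnlineLookupTest(unittest.TestCase):

    @unittest.skipIf(not ONLINE, 'set ONLINE_TESTS=1 to run online tests')
    def test_check_online(self):
        # This test is skipped during normal (offline) test runs.
        result = check_online('NL4495445B01')
        self.assertTrue(result['valid'])
```

With this structure a normal test run reports the test as skipped, so the
suite does not depend on the availability of an external service.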


Finding test numbers
--------------------

Some company numbers are commonly published on a company's website contact
page (e.g. VAT or other registration numbers, bank account numbers). Doing a
web search limited to a country and some key words generally turns up a lot of
pages with this information.

Another approach is to search for spreadsheet-type documents with some
keywords that match the number. This sometimes turns up lists of companies
(also occasionally works for personal identifiers).

For information that is displayed on ID cards or passports it is sometimes
useful to do an image search.

When dealing with numbers that point to individuals it is important to:

* Only keep the data that is needed to test the implementation.
* Ensure that no other data relating to a person or other personal
information is kept or can be inferred from the kept data.
* Ensure that the presence of a number in the test set does not provide any
information about the person (other than that there is a person with the
number or information that is present in the number itself).

Sometimes numbers are part of a data leak. If this data is used to pick a few
sample numbers, the selection should be random and the leak should not be
identifiable from the picked numbers. For example, if the leaked numbers
pertain only to people with a certain medical condition, membership of some
organisation or other specific property, the leaked data should not be used.


Reverse engineering
-------------------

Sometimes a number format clearly has a check digit but the algorithm is not
publicly documented. It is sometimes possible to reverse engineer the used
check digit algorithm from a large set of numbers.

For example, comparing numbers that, apart from the check digit, differ in
only one digit will often expose the weights used. This works reasonably well
if the algorithm takes a weighted sum over the digits modulo 11.

See https://github.com/arthurdejong/python-stdnum/pull/203#issuecomment-623188812
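
The idea can be sketched as follows, under the assumption that the check digit
is simply the weighted sum of the digits modulo 11 (real formats often
transform the result further, for example by subtracting it from 11). The
weights in `SECRET_WEIGHTS` are hypothetical and are only used to generate
sample numbers; `infer_weight()` does not look at them.

```python
MODULUS = 11
# Hypothetical "unknown" weights, used here only to generate sample numbers.
SECRET_WEIGHTS = (7, 3, 5, 2, 9, 4)


def make_number(payload):
    """Append a check digit, or return None if it is not a single digit."""
    check = sum(w * int(d) for w, d in zip(SECRET_WEIGHTS, payload)) % MODULUS
    return None if check == 10 else payload + str(check)


def infer_weight(number_a, number_b):
    """Return (position, weight) from numbers differing in one digit."""
    payload_a, payload_b = number_a[:-1], number_b[:-1]
    positions = [
        i for i, (a, b) in enumerate(zip(payload_a, payload_b)) if a != b]
    if len(payload_a) != len(payload_b) or len(positions) != 1:
        raise ValueError('numbers must differ in exactly one payload digit')
    i = positions[0]
    digit_delta = (int(payload_a[i]) - int(payload_b[i])) % MODULUS
    check_delta = (int(number_a[-1]) - int(number_b[-1])) % MODULUS
    # weight * digit_delta == check_delta (mod 11), so multiply by the
    # modular inverse of digit_delta (three-argument pow needs Python 3.8+)
    return i, check_delta * pow(digit_delta, -1, MODULUS) % MODULUS


def sample_pairs(base_payload):
    """Yield (base, variant) pairs that differ in exactly one payload digit."""
    base = make_number(base_payload)
    for i in range(len(base_payload)):
        for digit in '0123456789':
            variant = make_number(
                base_payload[:i] + digit + base_payload[i + 1:])
            if digit != base_payload[i] and variant is not None:
                yield base, variant
                break


recovered = dict(infer_weight(a, b) for a, b in sample_pairs('123456'))
weights = tuple(recovered[i] for i in sorted(recovered))
```

Because 11 is prime, every non-zero digit difference has a modular inverse,
which is why a single pair of numbers is enough to determine the weight of one
position in this setting.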


Registries
----------

Some numbers or parts of numbers use validation based on a registry of known
good prefixes, ranges or formats. It is only useful to fully base validation
on these registries if the update frequency to these registries is very low.

If there is a registry that is used (a list of known values, ranges or
otherwise) the downloaded information should be stored in a data file (see
the stdnum.numdb module). Only the minimal amount of data should be kept (for
validation or identification).

The data files should be able to be created and updated using a script in the
`update` directory.
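
As a simplified illustration of keeping only minimal registry data, the sketch
below loads a small prefix list and looks up the longest matching prefix. The
file format, the sample data and the function names are hypothetical; this is
not the format or the API of the stdnum.numdb module.

```python
import io

# Hypothetical sample data: one prefix and one label per line.
REGISTRY_TEXT = """\
# prefix label
01 North
02 South
021 South Harbour
"""


def load_registry(fp):
    """Read prefix/label pairs from a file-like object into a dict."""
    registry = {}
    for line in fp:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        prefix, label = line.split(None, 1)
        registry[prefix] = label
    return registry


def lookup(number, registry):
    """Return the label of the longest matching prefix, or None."""
    for length in range(len(number), 0, -1):
        label = registry.get(number[:length])
        if label is not None:
            return label
    return None


registry = load_registry(io.StringIO(REGISTRY_TEXT))
```

In this sketch an update script would only need to regenerate the text data;
the lookup code stays unchanged when the registry contents change.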