Adding support for Slovenian EMŠO (Unique Master Citizen Number) #338

Closed
wants to merge 5 commits into from
1 change: 1 addition & 0 deletions README.md
@@ -212,6 +212,7 @@ Currently this package supports the following formats:
* MST (Mã số thuế, Vietnam tax number)
* ID number (South African Identity Document number)
* TIN (South African Tax Identification Number)
* EMŠO (Slovenian Unique Master Citizen Number)

Furthermore a number of generic check digit algorithms are available:

1 change: 1 addition & 0 deletions docs/index.rst
@@ -284,6 +284,7 @@ Available formats
se.postnummer
se.vat
sg.uen
si.ddv
si.emso
sk.dph
sk.rc
5 changes: 5 additions & 0 deletions docs/stdnum.si.emso.rst
@@ -0,0 +1,5 @@
stdnum.si.emso
==============

.. automodule:: stdnum.si.emso
   :members:
2 changes: 2 additions & 0 deletions stdnum/si/__init__.py
@@ -2,6 +2,7 @@
# coding: utf-8
#
# Copyright (C) 2012 Arthur de Jong
# Copyright (C) 2022 Blaž Bregar
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
@@ -22,3 +23,4 @@

# provide vat as an alias
from stdnum.si import ddv as vat # noqa: F401
from stdnum.si import emso as personalid # noqa: F401
168 changes: 168 additions & 0 deletions stdnum/si/emso.py
@@ -0,0 +1,168 @@
# emso.py - functions for handling Slovenian Unique Master Citizen Numbers
# coding: utf-8
#
# Copyright (C) 2022 Blaž Bregar
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
# 02110-1301 USA

"""Enotna matična številka občana (Unique Master Citizen Number).

The EMŠO is used to uniquely identify each natural person, including foreign
citizens living in Slovenia. It is prescribed by the Centralni Register
Prebivalstva, CRP (Central Citizen Registry).

* https://sl.wikipedia.org/wiki/Enotna_mati%C4%8Dna_%C5%A1tevilka_ob%C4%8Dana
* https://en.wikipedia.org/wiki/Unique_Master_Citizen_Number

The EMŠO contains some personal data, namely the date of birth and gender.

It is composed of 13 digits in the following pattern: DDMMYYY RR BBB K

#. Date of birth (DDMMYYY)
   The date of birth with a 3-digit year (the millennium digit is omitted),
   e.g. 1 January 2006 is represented as 0101006. Since the EMŠO was
   introduced in 1977, a YYY below 800 is taken to be in the 3rd millennium
   (year 2000 or later) and a YYY of 800 or above in the 2nd millennium
   (before year 2000).

#. Political region (RR)
   For Slovenia 50-59 are reserved (only 50 is used).

#. Unique number within the particular RR (BBB)
   * 000-499 - male
   * 500-999 - female

#. Checksum (K)
   With the digits DDMMYYYRRBBBK mapped to the letters abcdefghijklm, the
   checksum is calculated using the formula:

   m = 11 − ((7×(a + g) + 6×(b + h) + 5×(c + i) + 4×(d + j) + 3×(e + k)
             + 2×(f + l)) mod 11)

   * If m is between 1 and 9, K is equal to m.
   * If m is 10 or 11, K is 0 (zero).

Source: Wikipedia - https://en.wikipedia.org/wiki/Unique_Master_Citizen_Number

>>> validate('0101006500006')
'0101006500006'
>>> validate('0101006500007')  # invalid check digit
Traceback (most recent call last):
    ...
InvalidChecksum: ...
"""
import datetime

from stdnum.exceptions import *
from stdnum.util import clean, isdigits


def calc_check_digit(number):
    """Calculate the check digit from the first 12 digits of the number."""
    weights = (7, 6, 5, 4, 3, 2, 7, 6, 5, 4, 3, 2)
    # zip() stops after 12 digits, so a full 13-digit number is also accepted
    total = sum(int(n) * w for n, w in zip(number, weights))
    control_digit = 11 - (total % 11)
    # both 10 and 11 map to 0, as described in the module docstring
    if control_digit >= 10:
        control_digit = 0
    return str(control_digit)


def compact(number):
    """Convert the number to the minimal representation. This strips the
    number of any valid separators and removes surrounding whitespace."""
    return clean(number, ' ').strip()


def get_date(number):
    """Return the date of birth from an EMŠO. Returns False if the first
    seven digits do not form a valid calendar date."""
    day = int(number[:2])
    month = int(number[2:4])
    year = int(number[4:7])
    # a YYY below 800 is read as 2YYY, otherwise as 1YYY
    year += 2000 if year < 800 else 1000
    try:
        return datetime.date(year, month, day)
    except ValueError:
        return False


def get_gender(number):
    """Return the gender from a valid EMŠO: 'M' for male, 'F' for female."""
    if int(number[9:12]) < 500:
        return 'M'
    else:
        return 'F'


def get_region(number):
    """Return the (political) region code from a valid EMŠO.

    Source: Wikipedia - https://en.wikipedia.org/wiki/Unique_Master_Citizen_Number
    """
    return number[7:9]


def validate(number):
    """Check if the number is a valid EMŠO number. This checks the length,
    formatting, date of birth and check digit."""
    number = compact(number)
    if len(number) != 13:
        raise InvalidLength()
    if not isdigits(number):
        raise InvalidFormat()
    # the first seven digits must form a valid date of birth
    if not get_date(number):
        raise InvalidFormat()
    if calc_check_digit(number) != number[-1]:
        raise InvalidChecksum()
    return number


def is_valid(number):
    """Check if the number provided is a valid EMŠO number. This checks the
    length, formatting, date of birth and check digit."""
    try:
        return bool(validate(number))
    except ValidationError:
        return False


def extract(number):
    """Extract the data from a valid EMŠO as a tuple of the compact number,
    the date of birth, the region code and the gender."""
    number = validate(number)
    return (number, get_date(number), get_region(number), get_gender(number))


def format(number):
    """Reformat the number to the standard presentation format."""
    return compact(number)
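
The checksum rule from the module docstring can be sketched on its own, without importing stdnum. This is only an illustration, not the code from this pull request: the names `WEIGHTS` and `emso_check_digit` are hypothetical, and the sketch assumes that both m = 10 and m = 11 map to 0, as the docstring states.

```python
# Standalone sketch of the EMŠO check digit (does not import stdnum).
# WEIGHTS and emso_check_digit are hypothetical names for illustration.
WEIGHTS = (7, 6, 5, 4, 3, 2, 7, 6, 5, 4, 3, 2)


def emso_check_digit(number):
    """Return the check digit for the first 12 digits of `number`."""
    total = sum(int(d) * w for d, w in zip(number[:12], WEIGHTS))
    m = 11 - (total % 11)
    # m is in the range 1..11; 10 and 11 both map to 0 (assumed from the docstring)
    return str(0 if m >= 10 else m)


# For 010100650000 the weighted sum is 82 and 82 % 11 == 5, so the digit is 6.
print(emso_check_digit('010100650000'))
```

The third test number in the doctest file (2001939010010) exercises the m = 11 branch: its weighted sum is 121, which is divisible by 11, so the check digit is 0.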
78 changes: 78 additions & 0 deletions tests/test_si_emso.doctest
@@ -0,0 +1,78 @@
test_si_emso.doctest - more detailed doctests for the stdnum.si.emso module

Copyright (C) 2015 Arthur de Jong
Copyright (C) 2022 Blaž Bregar

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA


This file contains more detailed doctests for the stdnum.si.emso module. It
tries to validate a number of numbers that have been found online.

>>> from stdnum.si import emso
>>> from stdnum.exceptions import *


Tests for some corner cases.

>>> emso.validate('0101006500006')
'0101006500006'
>>> emso.format(' 0101006 50 000 6 ')
'0101006500006'
>>> emso.validate('12345')
Traceback (most recent call last):
    ...
InvalidLength: ...
>>> emso.validate('3202006500008')
Traceback (most recent call last):
    ...
InvalidFormat: ...
>>> emso.validate('0101006500007')
Traceback (most recent call last):
    ...
InvalidChecksum: ...
>>> emso.validate('010100650A007')
Traceback (most recent call last):
    ...
InvalidFormat: ...


Tests of helper functions.

>>> emso.calc_check_digit('0101006500006')
'6'
>>> emso.get_gender('0101006500006')
'M'
>>> emso.get_gender('2902932505526')
'F'
>>> emso.get_region('0101006500006')
'50'
>>> emso.extract('0101006500006')
('0101006500006', datetime.date(2006, 1, 1), '50', 'M')

These have been found online and should all be valid numbers.

>>> numbers = '''
...
... 0101006500006
... 2902932505526
... 2001939010010
... 1508995500237
... 1211981500126
...
... '''
>>> [x for x in numbers.splitlines() if x and not emso.is_valid(x)]
[]
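
The field layout these tests rely on (DDMMYYY RR BBB K) can also be decoded by hand. The following is a minimal sketch under stated assumptions, not part of the module: `decode_emso_fields` is a hypothetical helper, it expects an already compacted 13-digit string, and unlike `get_date` in the diff it lets `datetime.date` raise `ValueError` for impossible dates instead of returning False.

```python
import datetime


def decode_emso_fields(number):
    """Split a 13-digit EMŠO string into birth date, region and gender.

    Hypothetical helper for illustration; it does not verify the check digit.
    """
    day, month, yyy = int(number[0:2]), int(number[2:4]), int(number[4:7])
    # YYY below 800 is read as 2YYY, otherwise as 1YYY (rule from the docstring)
    year = (2000 if yyy < 800 else 1000) + yyy
    birth_date = datetime.date(year, month, day)  # ValueError on impossible dates
    region = number[7:9]
    gender = 'M' if int(number[9:12]) < 500 else 'F'
    return birth_date, region, gender


print(decode_emso_fields('2902932505526'))
```

For 2902932505526 the year digits 932 are read as 1932, a leap year, so 29 February is accepted, and the serial 552 falls in the female range.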