R package to validate New Zealand NHI numbers

nzbri/nhiValidator

nhiValidator

Each person who has contact with the New Zealand health system is issued a unique seven-character National Health Index (NHI) number. The unique identification is provided by the first six characters. The seventh is a check character, calculated from the first six, which allows an internal validity check. This package checks NHIs for a valid checksum, allowing most typographical and other data errors to be detected.

Installation

nhiValidator is not available from CRAN, so you should install it from GitHub (note that it is safest to refer explicitly to the main branch in the repository, as older versions of install_github() default to attempting to access a branch called master):

# install.packages('devtools')
devtools::install_github('nzbri/nhiValidator', ref = 'main')

NHI format

NHIs can be in one of two formats: the original AAANNNC (an identifier of three letters and three digits, followed by a numerical check digit) or the revised AAANNAC (three letters, two digits, and one letter, followed by an alphabetic check character). The final check digit or character is calculated as a checksum of the first six characters, so it provides an internal validity check that guards against data entry errors. This package contains functions that check for the correct sequence of letters and digits. It can also conduct the internal validity check, by calculating what the check character should be and reporting whether the calculated value matches the entered value.
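
The two layouts can be told apart by shape alone. The following base R sketch illustrates the idea with regular expressions. `nhi_shape()` is a hypothetical helper, not a function from this package, and it assumes upper-case input and accepts any letter A-Z, which may be more permissive than the package's own checks.

```r
# Hypothetical helper (not part of nhiValidator): classify strings by
# their sequence of letters and digits only. No checksum is calculated.
# Assumption: upper-case input, any letter A-Z accepted.
nhi_shape <- function(x) {
  out <- rep("invalid format", length(x))
  out[grepl("^[A-Z]{3}[0-9]{4}$", x)] <- "original format"        # AAANNNC
  out[grepl("^[A-Z]{3}[0-9]{2}[A-Z]{2}$", x)] <- "revised format"  # AAANNAC
  out[is.na(x)] <- NA_character_                                   # keep missing values missing
  out
}

shapes <- nhi_shape(c("JBX3656", "ZZZ00AX", "HELLO", NA))
print(shapes)
```

Note that this sketch does not treat values starting with Z specially.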

Revised NHI format

As the pool of original NHIs will soon be exhausted, the new format is about to be introduced. It is the same length as the original, but the final two digits are replaced by letters. This provides more possible unique values and also improves the strength of the checksum. This package can deal with either format.

Details on the new format can be found at the Ministry of Health.
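
As a rough illustration of why the revised format yields more identifiers, the counts can be worked out directly. This is only a sketch: it assumes a 24-letter alphabet (I and O excluded), which is not stated above, and it ignores any combinations ruled out by the checksum or reserved for testing.

```r
# Assumption: 24 usable letters (A-Z without I and O).
n_letters <- 24

# Identifier part of the original format: three letters, three digits.
original_count <- n_letters^3 * 10^3

# Identifier part of the revised format: three letters, two digits, one letter.
revised_count <- n_letters^3 * 10^2 * n_letters

# How many times larger the revised pool is under these assumptions.
pool_ratio <- revised_count / original_count
print(c(original = original_count, revised = revised_count, ratio = pool_ratio))
```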

Example

JBX3656 (a test value, not issued to a real person) is an example of the original NHI format (3 letters followed by 4 digits). The nhi_valid() function shows that the internal checksum is valid:

library(nhiValidator)

nhi_valid('JBX3656')
#> [1] TRUE

That is, the final digit, 6, is calculated based on the preceding 6 characters. Therefore any other final character would yield an invalid result:

nhi_valid(c('JBX3650', 'JBX3651', 'JBX3652', 'JBX3653'))
#> [1] FALSE FALSE FALSE FALSE

Similarly, transpositions or substitutions among any of the other characters are very unlikely to remain consistent with the final check digit of 6:

nhi_valid(c('BJX3656', 'JBX3696', 'JBX6356', 'JBX3566'))
#> [1] FALSE FALSE FALSE FALSE
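
The arithmetic behind these results can be sketched in base R for the original format. This is an illustration of the publicly documented modulus-11 scheme as we understand it, not the package's own code: it assumes letters map to 1 to 24 with I and O skipped, that the six characters are weighted 7 down to 2, and that a result of 10 is written as 0. `check_digit_original()` is a hypothetical name.

```r
# Sketch of the modulus-11 check for ORIGINAL-format NHIs (AAANNNC) only.
# Assumptions: letters map to 1..24 (I and O skipped); weights are 7..2.
# check_digit_original() is hypothetical, not a nhiValidator function.
check_digit_original <- function(nhi) {
  alphabet <- setdiff(LETTERS, c("I", "O"))
  chars <- strsplit(substr(nhi, 1, 6), "")[[1]]
  values <- c(match(chars[1:3], alphabet), as.integer(chars[4:6]))
  remainder <- sum(values * 7:2) %% 11
  # A remainder of 0 (or an unparseable input) has no valid check digit.
  if (is.na(remainder) || remainder == 0) return(NA_integer_)
  (11 - remainder) %% 10  # a computed value of 10 is written as 0
}

check_digit_original("JBX3656")  # 6, matching the final character
check_digit_original("BJX3656")  # 2, so a final 6 fails after the transposition
```

For JBX3656 the weighted sum is 225, which leaves a remainder of 5 on division by 11, giving a check digit of 11 - 5 = 6.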

The other function the package provides is nhi_format(). This does not do the internal validity check of the NHI, but merely reports whether its sequence of letters and digits is consistent with the original or revised NHI format, or is in another, invalid format:

# the second entry below would fail the internal validity check,
# but is still in the expected format of an original NHI
nhi_format(c('JBX3656', 'JBX3657', 'ZZZ00AX', 'HELLO', NA),
           allow_test_cases = TRUE)
#> [1] "original format" "original format" "revised format"  "invalid format" 
#> [5] NA

Test cases

NHIs (of either format) that start with Z are reserved for testing purposes (i.e. they will never be assigned to real people). These are likely to be of interest only to software developers and system testers. This package defaults to regarding such NHIs as invalid, because they are only likely to be encountered in the wild as the result of a typo or other data error. If, however, you would like to process such values, you can override that behaviour by setting allow_test_cases = TRUE in either the nhi_format() or nhi_valid() function.
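
If you want to see which of your values fall in the reserved test range before deciding how to handle them, a one-line base R check is enough. `is_test_nhi()` below is a hypothetical helper for illustration, not something the package exports.

```r
# Hypothetical helper (not part of nhiValidator): TRUE where a value
# starts with Z, i.e. lies in the range reserved for testing.
is_test_nhi <- function(x) startsWith(toupper(x), "Z")

flags <- is_test_nhi(c("ZZZ00AX", "JBX3656", "zzz0016"))
print(flags)
```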

Performance and future development

No further feature development is envisaged but issue reports and pull requests are certainly welcome.

The package functions are vectorised only to the extent that they can be passed a vector of values and return a corresponding vector of results. Under the hood, however, the functions iterate over each entry in the vector and process them sequentially. This can lead to slow performance when processing a lot of values. If for some reason you need to validate a million NHIs, for example, expect to wait for half an hour. Performance optimisation is not a priority for the original developer but pull requests in that regard will be gratefully received.
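
When a large vector contains many repeated values, one workaround that needs no change to the package is to validate each distinct value once and map the results back. The sketch below shows the pattern with a toy validator standing in for the real one. `validate_unique()` and `toy_validator()` are hypothetical names, and the speed-up depends entirely on how many duplicates your data contains.

```r
# Hypothetical wrapper: run any element-wise validator on the distinct
# values only, then expand the results back to the original length.
validate_unique <- function(x, validator) {
  unique_values <- unique(x)
  unique_results <- validator(unique_values)
  unique_results[match(x, unique_values)]
}

# Toy stand-in for a slow validator; it counts how many values it processes.
calls <- 0
toy_validator <- function(v) {
  calls <<- calls + length(v)
  nchar(v) == 7
}

ids <- rep(c("JBX3656", "HELLO"), times = 1000)
result <- validate_unique(ids, toy_validator)
print(calls)  # only the two distinct values were processed
```

With the package loaded, the same wrapper could presumably be applied as validate_unique(ids, nhi_valid).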

Licence

This package was developed by Michael MacAskill at the New Zealand Brain Research Institute. It is released as open-source software under an MIT licence. Note the provisions under that licence about warranties and liability.

Please report any issues or suggestions using the GitHub issues page for this project.
