This is a Python library for working with ICD9 codes. ICD9 is version 9 of the "International Statistical Classification of Diseases and Related Health Problems". See http://en.wikipedia.org/wiki/ICD for more information. The library is heavily influenced by, and borrows data from the R icd9 package.
The package includes lists of ICD9 codes associated with specific conditions, and information about the ICD9 codes themselves.
>>> import icd9 >>> codes1 = icd9.ahrqComorbidAll >>> codes2 = icd9.elixComorbid >>> codes3 = icd9.icd9Hierarchy >>> codes4 = icd9.ahrqComorbid >>> codes5 = icd9.icd9Chapters >>> codes6 = icd9.quanElixComorbid
The package contains functions for manipulating the codes.
>>> icd9.decimal_to_short("81.23") "08123" >>> icd9.short_to_decimal("08123") "81.23" >>> icd9.decimal_to_parts("9.9") ("009", "9") >>> icd9.parts_to_decimal("088", "88") "88.88" >>> icd9.short_to_parts("E0123") ("E012", "3") >>> icd9.parts_to_short("V1", "0") "V010" >>> icd9.parts_to_short("E012", "3") "E0123"
The package contains a Counter
class for accumulating statistics
about the matches between codes associated to a particular subject and
a given set of code classes. The Counter
class is designed for
use with data streams, in which a large file is read in chunks and the
statistics are incrementally updated according to the contents of each
chunk. The number of matches for each code class, and optionally the
first and last date at which a code class is matched are computed.
To illustrate, first we define a class that contains the codes '12345' and '54321', and a second class that contains all codes beginning with '44' and all codes beginning with '323':
>>> full = {"group1": ["12345", "54321"]} >>> init = {"group2": ["44", "323"]} >>> counter = icd9.Counter(codes_full=full, codes_initial=init)
When we want to add codes that match exactly we put them in the codes_full argument, and when we want to add codes that match as an initial substring we put them in the codes_initial argument. Each ICD9 code class can appear in either or both of these arguments.
Now we can take a Pandas.DataFrame
object codes whose index is
interpreted as subject identifiers (repeats are allowed), and whose
values are ICD9 codes, and update the counter:
>>> counter.update(codes, 'id')
The counter.table
attribute contains the number of occurrences of
codes within each of the groups, for each subject. The unique subject
identifier for a given chunk is given as the id argument.
Counters can be used to calculate comorbidity indices like the Elixhauser index.
>>> counter = icd9.Counter(codes_full=icd9.elixComorbid) >>> counter.update(chunk1, 'id') >>> counter.update(chunk2, 'id') >>> elix = (counter.table > 0).sum(1)
Now suppose that there is a service date associated with each code,
and it is contained in a column of the data called date. We can set
up the counter class as follows. After update
is called,
counter.table will contain columns corresponding to the first and
last occurrence of a code in each category, and the number of codes in
each category.
>>> counter = icd9.Counter(codes_full=icd9.elixComorbid, calculate_dates=True) >>> counter.update(chunk, 'id', 'date')
All of the data components of the package were obtained from the R
icd9 package. See COPYRIGHT.txt for relevant copyright information.
The R script used to prepare the JSON files distributed with this
package can be found in the resources
subdirectory.
See also https://github.com/sirrice/icd9.