Maisa Clinical Data Parser

Note

🇫🇮 Finnish instructions: read the guide in Finnish in README_fi.md.

A Python tool to parse and consolidate HL7 CDA (Clinical Document Architecture) XML files exported from the Maisa patient portal (used by Apotti in Finland).

It extracts key health information into a structured, machine-readable JSON format (patient_history.json).
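To give a feel for the input, here is a minimal sketch of reading a patient name out of a CDA document. It is not the parser's actual code: the tool itself depends on lxml, while this sketch uses the standard-library `xml.etree.ElementTree` so it runs without extra installs. The `read_patient` helper and the trimmed sample document are hypothetical; the `urn:hl7-org:v3` namespace and the `recordTarget/patientRole/patient` path come from the CDA standard.

```python
import xml.etree.ElementTree as ET

# HL7 CDA documents place their elements in the HL7 v3 namespace.
NS = {"cda": "urn:hl7-org:v3"}

# Hypothetical, heavily trimmed CDA fragment for illustration only.
SAMPLE_CDA = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <recordTarget>
    <patientRole>
      <patient>
        <name><given>Matti</given><family>Virtanen</family></name>
        <birthTime value="19900115"/>
      </patient>
    </patientRole>
  </recordTarget>
</ClinicalDocument>"""


def read_patient(xml_text):
    """Return the patient's name and raw birth time, or None if absent."""
    root = ET.fromstring(xml_text)
    patient = root.find("cda:recordTarget/cda:patientRole/cda:patient", NS)
    if patient is None:
        return None
    given = patient.findtext("cda:name/cda:given", default="", namespaces=NS)
    family = patient.findtext("cda:name/cda:family", default="", namespaces=NS)
    birth = patient.find("cda:birthTime", NS)
    return {
        "full_name": f"{given} {family}".strip(),
        # CDA stores timestamps as compact strings such as YYYYMMDD.
        "birth_time_raw": birth.get("value") if birth is not None else None,
    }


patient_info = read_patient(SAMPLE_CDA)
```

Real Maisa exports are considerably larger and may contain several name parts or missing fields, so a production parser needs more defensive handling than this.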

🚀 Features

Consolidated Patient History: Merges data from multiple DOC*.XML files into a single chronological timeline.
Narrative Extraction: Extracts free-text clinical notes ("Päivittäismerkinnät", daily notes; "Hoidon tarpeen arviointi", assessment of care needs) and filters out structured lists (medications, labs) that are already captured elsewhere, to reduce noise.
Structured Data Parsing:
- Patient Profile: Demographics, contact info.
- Medications: Active list and history with dates and dosage.
- Lab Results: Test names, values, units, and timestamps.
- Diagnoses: Active problems with ICD-10/SNOMED codes (from Problem List section).
- Procedures: Medical procedures with Finnish national codes (lumbar puncture, ENMG, OCT, etc.).
- Immunizations: Vaccination records with ATC codes and dates.
- Social History: Tobacco use, alcohol consumption status.
- Allergies: Status and substances.
Deduplication: Removes entries that are repeated across multiple documents.
Clean Output: Produces a clean patient_history.json file.
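The merging and deduplication described above can be sketched roughly as follows. This is an illustration of the general idea, not the parser's implementation: the `merge_entries` function, the choice of key fields, and the entry field names (`name`, `date`, `value`) are all assumptions made for the example.

```python
def merge_entries(documents, key_fields=("name", "date", "value")):
    """Merge per-document entry lists, dropping repeats and sorting by date.

    `documents` is a list of lists of dicts, one inner list per source file.
    Two entries count as duplicates when all `key_fields` match.
    """
    seen = set()
    merged = []
    for entries in documents:
        for entry in entries:
            key = tuple(entry.get(field) for field in key_fields)
            if key in seen:
                continue  # same entry already collected from another file
            seen.add(key)
            merged.append(entry)
    # ISO 8601 strings sort chronologically as long as they share one format.
    return sorted(merged, key=lambda e: e.get("date") or "")


# Hypothetical lab entries from two documents; the first one is repeated.
doc_a = [{"name": "Hb", "date": "2024-01-02T00:00:00", "value": "140"}]
doc_b = [
    {"name": "Hb", "date": "2024-01-02T00:00:00", "value": "140"},
    {"name": "CRP", "date": "2023-12-01T00:00:00", "value": "3"},
]
timeline = merge_entries([doc_a, doc_b])
```

Keying on a tuple of fields keeps the first occurrence and makes the duplicate check a constant-time set lookup; which fields identify a duplicate is a judgment call that depends on the data category.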

🛠️ Prerequisites

Python 3.8 or higher
pip (Python package installer)

📦 Installation

Clone this repository or download the script.
Install the required dependencies:
```
pip install -r requirements.txt
```
(The primary dependency is lxml for efficient XML parsing)

📖 Usage

Export Data: Download your health data dump from Maisa ("Tilanneyhteenveto" or similar export). After extracting the ZIP file, you'll see a folder structure like this:

Tilanneyhteenveto_DD_Month_YYYY/
├── HTML/
│   ├── IMAGES/
│   └── STYLE/
├── IHE_XDM/
│   └── <PatientFolder>/     ← This folder contains the XML files!
│       ├── DOC0001.XML
│       ├── DOC0002.XML
│       ├── ...
│       ├── METADATA.XML
│       └── STYLE.XSL
├── INDEX.HTM
└── README - Open for Instructions.TXT

Important: Point the parser to the IHE_XDM/<PatientFolder>/ directory that contains the DOC*.XML files, not the root extracted folder.

Run the Parser:

python src/maisa_parser.py /path/to/IHE_XDM/<PatientFolder>/

For example:

python src/maisa_parser.py ~/Downloads/Tilanneyhteenveto_16_joulu_2025/IHE_XDM/Ilias1/

If you run the script from inside the data folder, you can omit the path argument:

cd ~/Downloads/Tilanneyhteenveto_16_joulu_2025/IHE_XDM/Ilias1/
python /path/to/maisa-parser/src/maisa_parser.py

View Output: The script generates a patient_history.json file in your current working directory.
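If you are unsure which directory to pass, the folder layout shown earlier can be searched programmatically. The sketch below is a hypothetical helper (`find_patient_folders` is not part of the tool) that takes the root of the extracted export and returns every subfolder of IHE_XDM holding at least one DOC*.XML file.

```python
from pathlib import Path


def find_patient_folders(export_root):
    """Return subfolders of <export_root>/IHE_XDM that contain DOC*.XML files."""
    ihe_dir = Path(export_root) / "IHE_XDM"
    if not ihe_dir.is_dir():
        return []
    return sorted(
        child
        for child in ihe_dir.iterdir()
        # any() stops at the first matching file, so large folders stay cheap.
        if child.is_dir() and any(child.glob("DOC*.XML"))
    )
```

Each returned path could then be passed as the argument to src/maisa_parser.py. The pattern matches the upper-case file names shown in the folder tree; on case-sensitive file systems a differently cased export would need an adjusted pattern.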

📂 Output Structure

The generated JSON contains:

{
  "patient_profile": {
    "full_name": "...",
    "dob": "1990-01-15T00:00:00",
    "gender": "...",
    "address": "...",
    "phone": "...",
    "email": "..."
  },
  "clinical_summary": {
    "allergies": [ ... ],
    "active_medications": [ ... ],
    "medication_history": [ ... ]
  },
  "diagnoses": [
    { "code": "G35", "code_system": "ICD10", "display_name": "Multiple sclerosis", "status": "active" }
  ],
  "procedures": [
    { "code": "TAB00", "name": "Lumbar puncture", "date": "2023-05-10T00:00:00" }
  ],
  "immunizations": [
    { "vaccine_name": "COVID-19 Pfizer", "vaccine_code": "J07BN01", "date": "2021-08-13T00:00:00" }
  ],
  "social_history": {
    "tobacco_smoking": "Ex-smoker",
    "alcohol": "Current drinker"
  },
  "lab_results": [ ... ],
  "encounters": [
    {
      "date": "2024-10-10T12:00:00",
      "type": "Hoito- ja palveluyhteenveto",
      "provider": "Dr. Name",
      "notes": "Narrative text of the visit...",
      "source_file": "DOC0018.XML"
    }
  ]
}
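Once generated, the file can be consumed like any other JSON document. As a small sketch of working with the structure above, the hypothetical helper below filters encounters by date. It assumes every encounter has a `date` in the timezone-free ISO format shown in the example; the inline sample stands in for the result of `json.load()` on patient_history.json.

```python
import json
from datetime import datetime


def encounters_since(history, cutoff):
    """Return encounters dated on or after `cutoff`, oldest first."""
    selected = []
    for encounter in history.get("encounters", []):
        when = datetime.fromisoformat(encounter["date"])
        if when >= cutoff:
            selected.append((when, encounter))
    selected.sort(key=lambda pair: pair[0])
    return [encounter for _, encounter in selected]


# Inline sample mirroring the documented structure (values are made up).
sample_history = json.loads("""
{
  "encounters": [
    {"date": "2024-10-10T12:00:00", "type": "Hoito- ja palveluyhteenveto",
     "provider": "Dr. Name", "notes": "...", "source_file": "DOC0018.XML"},
    {"date": "2022-03-01T09:30:00", "type": "Hoito- ja palveluyhteenveto",
     "provider": "Dr. Name", "notes": "...", "source_file": "DOC0004.XML"}
  ]
}
""")

recent = encounters_since(sample_history, datetime(2024, 1, 1))
```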

⚠️ Important Note on Privacy

This tool processes sensitive personal health information.

Do not commit your XML data files or the generated JSON output to GitHub or any public repository.
A .gitignore file is included to help prevent accidental commits of .XML and .json files.
Always handle your medical data with care.

📥 How to export your data from Maisa

Log in to Maisa.fi.
Go to Menu > Sharing > Download My Record (Lataa tietoni).
Select "Lucy XML" (or "Everything").
Download the ZIP file and unzip it.
The extracted archive contains an IHE_XDM folder. The DOC*.XML files are inside its <PatientFolder> subfolder, and that subfolder is the directory you pass to the parser.
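Before unzipping, you can check that a downloaded archive actually contains CDA documents. The following sketch uses the standard-library `zipfile` module to list matching members without extracting anything; `list_cda_members` is a hypothetical helper, and the assumption that members sit under an IHE_XDM directory is taken from the folder layout described in this README.

```python
import io
import zipfile
from pathlib import PurePosixPath


def list_cda_members(zip_source):
    """Return archive member names that look like .../IHE_XDM/<folder>/DOC*.XML.

    `zip_source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        names = archive.namelist()
    matches = []
    for name in names:
        path = PurePosixPath(name)  # ZIP member names use forward slashes
        if (
            "IHE_XDM" in path.parts
            and path.name.upper().startswith("DOC")
            and path.suffix.upper() == ".XML"
        ):
            matches.append(name)
    return sorted(matches)
```

An empty result suggests the wrong export type was selected or the archive layout differs from the one shown here.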

⚠️ Legal & Liability Disclaimer

Disclaimer: This software is for educational and informational purposes only. It is not a medical device and should not be used for diagnosis or treatment. Always consult a professional for medical advice. The authors are not responsible for any errors in parsing or data representation.

By using this tool, you agree that you are solely responsible for safeguarding your own medical data.

🤝 Contributing

Feel free to submit issues or pull requests if you find bugs or want to improve the parsing logic for different types of Maisa documents.

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.
