cda_to_dict.py is a Python script that converts a Clinical Document Architecture (CDA) XML file into a Python dictionary. This makes CDA documents easier to manipulate, analyze, and process in Python applications.
- Converts a CDA XML file to a nested dictionary representation.
- Strips XML namespaces for simpler access.
- Handles XML attributes and text content.
- Ignores comment nodes in the XML.
Requirements:
- Python 3.x
- lxml library
Install the required dependencies using pip:
pip install lxml
To use the script, run it from the command line with the path to the CDA XML file as an argument:
python cda_to_dict.py <cda_file>
Suppose you have a CDA XML file named example_cda.xml. You can convert it to a dictionary by running:
python cda_to_dict.py example_cda.xml
The script will output the dictionary representation of the CDA file.
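For larger documents the printed dictionary can be hard to read. One option is to serialize it as indented JSON. The following is a minimal sketch, assuming the dictionary contains only strings, lists, and nested dictionaries; the helper name `to_pretty_json` and the `sample` value are hypothetical and are not part of the script.

```python
import json


def to_pretty_json(data, indent=2):
    """Serialize a nested dictionary as indented JSON text."""
    # ensure_ascii=False keeps non-ASCII characters (for example in names) readable.
    return json.dumps(data, indent=indent, ensure_ascii=False)


# Made-up stand-in for a converted document; real output will differ.
sample = {"ClinicalDocument": {"title": "Example"}}
print(to_pretty_json(sample))
```

The same text can be written to a file if the result needs to be inspected or shared later.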
You can also use the cda_to_dict function in your own Python projects:
from cda_to_dict import cda_to_dict
cda_file = "example_cda.xml"
cda_dict = cda_to_dict(cda_file)
print(cda_dict)
Parameters:
- cda_file (str): The path to the CDA XML file.
Returns:
- A Python dictionary representing the structure and content of the CDA XML file.
Notes:
- Strips XML namespaces for easier access to tags.
- Converts attributes and text nodes into dictionary entries.
- Collapses child elements into nested dictionaries.
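The behavior described above can be illustrated with a rough sketch. This is not the script's actual implementation. It uses the standard library's `xml.etree.ElementTree` instead of lxml so that it runs without extra dependencies. The `@` prefix for attributes, the `#text` key for text content, the grouping of repeated tags into lists, and the function names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def strip_namespace(name):
    """Drop a leading '{namespace-uri}' from a tag or attribute name."""
    return name.split("}", 1)[1] if name.startswith("{") else name


def element_to_dict(element):
    """Recursively convert an element into a nested dictionary."""
    node = {}

    # Attributes become "@name" entries (the prefix is an assumption).
    for name, value in element.attrib.items():
        node["@" + strip_namespace(name)] = value

    # Non-empty text content is stored under "#text" (also an assumption).
    text = (element.text or "").strip()
    if text:
        node["#text"] = text

    for child in element:
        # Comment nodes have a non-string tag. The standard parser already
        # drops them; the check matters with parsers that keep comments.
        if not isinstance(child.tag, str):
            continue
        key = strip_namespace(child.tag)
        value = element_to_dict(child)
        if key in node:
            # Repeated tags are gathered into a list so none are overwritten.
            if not isinstance(node[key], list):
                node[key] = [node[key]]
            node[key].append(value)
        else:
            node[key] = value
    return node


def xml_string_to_dict(xml_text):
    """Parse XML text and return {root_tag: nested_dict}."""
    root = ET.fromstring(xml_text)
    return {strip_namespace(root.tag): element_to_dict(root)}
```

With a layout like this, a namespaced tag such as `{urn:hl7-org:v3}title` would be reachable under the plain key `title`. The real script may organize attributes and text differently, so check its output before relying on specific keys.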
Contributions are welcome! If you encounter any issues or have suggestions for improvement, please open an issue or submit a pull request on the project repository.
Ensure you have test CDA XML files available and run the script to verify its output:
python cda_to_dict.py test_file.xml
This project is licensed under the MIT License. See the LICENSE file for details.