fix: avoid KeyError when getting type converter #132

myd7349 · 2026-01-07T15:46:26Z

When I attempted to read the patient metadata using the following code, an exception was raised:

from pprint import pprint

import mffpy

reader = mffpy.Reader(r"./examples/example_2.mff") 
with reader.directory.filepointer("subject") as fp:
    subject_xml = mffpy.XML.from_file(fp)
    fields = subject_xml.get_content()
    pprint(fields)

This results in the following traceback:

Traceback (most recent call last):
  File "d:\mffpy\mffpy\cached_property.py", line 34, in _cached_property
    return getattr(self, cached_name)
AttributeError: 'Patient' object has no attribute '_cached_fields'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\mffpy\test.py", line 6, in <module>
    fields = subject_xml.get_content()
  File "d:\mffpy\mffpy\xml_files.py", line 462, in get_content
    'fields': self.fields
              ^^^^^^^^^^^
Traceback (most recent call last):
  File "d:\mffpy\test.py", line 6, in <module>
    fields = subject_xml.get_content()
  File "d:\mffpy\mffpy\xml_files.py", line 462, in get_content
    'fields': self.fields
              ^^^^^^^^^^^
  File "d:\mffpy\test.py", line 6, in <module>
    fields = subject_xml.get_content()
  File "d:\mffpy\mffpy\xml_files.py", line 462, in get_content
    'fields': self.fields
              ^^^^^^^^^^^
    fields = subject_xml.get_content()
  File "d:\mffpy\mffpy\xml_files.py", line 462, in get_content
    'fields': self.fields
              ^^^^^^^^^^^
  File "d:\mffpy\mffpy\xml_files.py", line 462, in get_content
    'fields': self.fields
              ^^^^^^^^^^^
    'fields': self.fields
              ^^^^^^^^^^^
  File "d:\mffpy\mffpy\cached_property.py", line 36, in _cached_property
    ans = fn(self)
              ^^^^^^^^^^^
  File "d:\mffpy\mffpy\cached_property.py", line 36, in _cached_property
    ans = fn(self)
  File "d:\mffpy\mffpy\cached_property.py", line 36, in _cached_property
    ans = fn(self)
  File "d:\mffpy\mffpy\xml_files.py", line 439, in fields
    ans = fn(self)
  File "d:\mffpy\mffpy\xml_files.py", line 439, in fields
  File "d:\mffpy\mffpy\xml_files.py", line 439, in fields
    data = self._type_converter[data.get('dataType')](data.text)
           ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'date'

The root cause of the issue can be found in the following section of the code:

mffpy/mffpy/xml_files.py

Lines 420 to 441 in 2509423

    
           class Patient(XML): 
        
               _xmlns = r'{http://www.egi.com/subject_mff}' 
        
               _xmlroottag = r'patient' 
        
               _default_filename = 'subject.xml' 
        
               _type_converter = { 
        
                   'string': str, 
        
                   None: lambda x: x 
        
               } 
        
               @cached_property 
        
               def fields(self): 
        
                   ans = {} 
        
                   for field in self.find('fields'): 
        
                       assert self.nsstrip(field.tag) == 'field', f""" 
        
                       Unknown field with tag {self.nsstrip(field.tag)}""" 
        
                       name = self.find('name', field).text 
        
                       data = self.find('data', field) 
        
                       data = self._type_converter[data.get('dataType')](data.text) 
        
                       ans[name] = data 
        
                   return ans

As shown there, _type_converter only defines converters for the keys 'string' and None. However, patient metadata fields are not limited to these two data types. As a result, when encountering a field with a dataType such as 'date', the following line:

data = self._type_converter[data.get('dataType')](data.text)

attempts to access a non-existent key in _type_converter, which leads to a KeyError.

In this PR, dict.get(key, default) is used to make the conversion more robust. When the specified dataType is not found in _type_converter, the logic now falls back to the default converter associated with None (i.e., lambda x: x), thereby avoiding the exception and preserving the original value.

myd7349 · 2026-01-07T15:47:34Z

In addition, I noticed another related issue in mffpy regarding the handling of empty string values in XML.

Currently, when a <string> field in the XML is empty, mffpy converts it to the literal string "None" rather than the Python value None. This behavior appears to originate from the following converter definition:

'string': str,

Since str(None) evaluates to "None", an empty or missing XML value that is represented as None ends up being converted into the string "None".

I am wondering whether one of the following approaches would be more appropriate:

'string': lambda x: x if x is None else str(x),

or even simply:

'string': lambda x: x,

Both options preserve None values as None, instead of converting them into the string "None". The latter approach also avoids unnecessary type coercion, since XML text content is already a string when present.

I have observed the same issue in other XML string fields as well, not just in patient metadata. Unifying the string handling logic in this way may therefore improve overall consistency and reduce surprising behavior across the XML parsing layer.

fix: avoid KeyError when reading subject.xml

a95b212

myd7349 force-pushed the fix-type-converter branch from 4a7eea3 to a95b212 Compare January 7, 2026 15:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: avoid KeyError when getting type converter #132

fix: avoid KeyError when getting type converter #132

Uh oh!

myd7349 commented Jan 7, 2026

Uh oh!

myd7349 commented Jan 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	class Patient(XML):

	_xmlns = r'{http://www.egi.com/subject_mff}'
	_xmlroottag = r'patient'
	_default_filename = 'subject.xml'

	_type_converter = {
	'string': str,
	None: lambda x: x
	}

	@cached_property
	def fields(self):
	ans = {}
	for field in self.find('fields'):
	assert self.nsstrip(field.tag) == 'field', f"""
	Unknown field with tag {self.nsstrip(field.tag)}"""
	name = self.find('name', field).text
	data = self.find('data', field)
	data = self._type_converter[data.get('dataType')](data.text)
	ans[name] = data
	return ans

fix: avoid KeyError when getting type converter #132

Are you sure you want to change the base?

fix: avoid KeyError when getting type converter #132

Uh oh!

Conversation

myd7349 commented Jan 7, 2026

Uh oh!

myd7349 commented Jan 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant