This is a first attempt at a Data Format Description Language (DFDL) schema for parsing and unparsing pextract paradigm files. The DFDL schema makes it possible to read and modify extracted paradigm files using XML technology. This allows lossless archiving of the original pextract files and simplifies their transformation to the Lexical Markup Framework.
More importantly, it makes possible the automatic generation of a morphology module in the Grammatical Framework using the pextract2gf tool.
The XML schema is not final yet. Comments are welcome.
The DFDL schema has been developed and tested using the open-source tool Daffodil.
Parsing the example pextract file:
$ ../bin/daffodil parse --schema ./pextract.dfdl.xsd ./examples/vot_noun.p
Unparsing (that is, serializing) the parsed example XML infoset back to a textual pextract file:
$ ../bin/daffodil unparse --schema ./pextract.dfdl.xsd ./examples/vot_noun.tdml
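The two commands can be chained into a round-trip check of the lossless-archiving claim. The following is only a sketch: the `roundtrip` function name and the `DAFFODIL` and `SCHEMA` variables are illustrative and not part of this project, and it assumes the Daffodil command-line tool accepts `-o` to write its output to a file. It parses a pextract file to an XML infoset, unparses that infoset again, and compares the result byte for byte with the original.

```shell
#!/bin/sh
# Hypothetical round-trip check; names and defaults are assumptions
# taken from the example commands above.
DAFFODIL="${DAFFODIL:-../bin/daffodil}"
SCHEMA="${SCHEMA:-./pextract.dfdl.xsd}"

roundtrip() {
    input="$1"
    tmpdir=$(mktemp -d) || return 2

    # Step 1: parse the text file into an XML infoset.
    "$DAFFODIL" parse --schema "$SCHEMA" -o "$tmpdir/infoset.xml" "$input" &&
    # Step 2: unparse the infoset back into a text file.
    "$DAFFODIL" unparse --schema "$SCHEMA" -o "$tmpdir/roundtrip.p" "$tmpdir/infoset.xml" &&
    # Step 3: compare silently; exit status 0 means byte-identical.
    cmp -s "$input" "$tmpdir/roundtrip.p"
    status=$?

    rm -rf "$tmpdir"
    return "$status"
}
```

After sourcing the script, `roundtrip ./examples/vot_noun.p && echo "round trip is lossless"` would report whether the example file survives parsing and unparsing unchanged. A non-zero status means either that one of the Daffodil steps failed or that the regenerated file differs from the input.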