Description
When writing SAS XPORT (XPT) files, the writer currently assigns the same format name/width/decimals to both the variable FORMAT (nform, nfl, nfd) and INFORMAT (niform, nifl, nifd) whenever a format is provided.
In SAS, formats and informats are distinct concepts. Many formats do not have a valid informat counterpart (e.g., WEEKDAY., WORDSC., ROMAN., Z.), and using a format as an informat can be semantically wrong or unsupported. This behavior produces XPT metadata that can create interoperability problems when importing into SAS and other tools.
This is observable in XPT files generated indirectly via haven::write_xpt() (which uses readstat).
Where it happens in code
In xport_write_variables():
    if (variable->format[0]) {
        xport_format_t format;
        retval = xport_parse_format(variable->format, strlen(variable->format),
                &format, NULL, NULL);
        copypad(namestr.nform, sizeof(namestr.nform), format.name);
        namestr.nfl = format.width;
        namestr.nfd = format.decimals;
        copypad(namestr.niform, sizeof(namestr.niform), format.name);
        namestr.nifl = format.width;
        namestr.nifd = format.decimals;
    }
So niform/nifl/nifd are always set to the same values as nform/nfl/nfd.
Why this is a problem
In SAS:
- FORMAT controls how values are displayed.
- INFORMAT controls how raw text is read/parsed into a value.
They are not interchangeable, and not every format has an informat counterpart. Examples of formats that don’t have a meaningful informat equivalent include (non-exhaustive): WEEKDAY., WORDSC., ROMAN., Z..
Expected behavior
If only a format is provided, the writer should populate FORMAT fields (nform/nfl/nfd) only, and leave INFORMAT fields unset/blank unless an informat is explicitly provided.
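A minimal, self-contained sketch of the expected behavior follows. It is not a patch against readstat: the `sketch_*` types and helpers are hypothetical stand-ins for the writer's NAMESTR record, `copypad()`, and `xport_parse_format()`. The separate `informat` argument is also an assumption, since the snippet above only shows a `format` field on the variable. The parser is deliberately simplified and only handles names without embedded digits, such as `Z8.` or `BEST12.2`. Blank fields are assumed to be space padded with zero width and decimals.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the NAMESTR format fields. */
typedef struct {
    char  nform[8];   /* FORMAT name, space padded */
    short nfl;        /* FORMAT width */
    short nfd;        /* FORMAT decimals */
    char  niform[8];  /* INFORMAT name, space padded */
    short nifl;       /* INFORMAT width */
    short nifd;       /* INFORMAT decimals */
} sketch_namestr_t;

/* Hypothetical parsed representation of a string such as "BEST12.2". */
typedef struct {
    char name[9];
    int  width;
    int  decimals;
} sketch_format_t;

/* Copy src into a fixed-width field and pad with spaces (no NUL terminator). */
static void sketch_copypad(char *dst, size_t dst_len, const char *src) {
    assert(dst != NULL && src != NULL);
    size_t n = strlen(src);
    if (n > dst_len)
        n = dst_len;
    memcpy(dst, src, n);
    memset(dst + n, ' ', dst_len - n);
}

/* Simplified parser: NAME[width][.[decimals]]. Returns 0 on success, -1 on error. */
static int sketch_parse_format(const char *s, sketch_format_t *out) {
    size_t i = 0, n = 0;
    memset(out, 0, sizeof(*out));
    while (s[i] && !isdigit((unsigned char)s[i]) && s[i] != '.') {
        if (n == sizeof(out->name) - 1)
            return -1;                       /* name longer than 8 characters */
        out->name[n++] = s[i++];
    }
    while (isdigit((unsigned char)s[i]))
        out->width = out->width * 10 + (s[i++] - '0');
    if (s[i] == '.')
        i++;
    while (isdigit((unsigned char)s[i]))
        out->decimals = out->decimals * 10 + (s[i++] - '0');
    return s[i] == '\0' ? 0 : -1;            /* reject trailing garbage */
}

/* Fill FORMAT and INFORMAT independently; either argument may be NULL or "". */
static int sketch_fill_formats(sketch_namestr_t *ns,
                               const char *format, const char *informat) {
    sketch_format_t parsed;

    /* Start with both sets of fields blank. */
    sketch_copypad(ns->nform, sizeof(ns->nform), "");
    sketch_copypad(ns->niform, sizeof(ns->niform), "");
    ns->nfl = ns->nfd = ns->nifl = ns->nifd = 0;

    if (format && format[0]) {
        if (sketch_parse_format(format, &parsed) != 0)
            return -1;
        sketch_copypad(ns->nform, sizeof(ns->nform), parsed.name);
        ns->nfl = (short)parsed.width;
        ns->nfd = (short)parsed.decimals;
    }
    /* The INFORMAT fields are only touched when an informat is given explicitly. */
    if (informat && informat[0]) {
        if (sketch_parse_format(informat, &parsed) != 0)
            return -1;
        sketch_copypad(ns->niform, sizeof(ns->niform), parsed.name);
        ns->nifl = (short)parsed.width;
        ns->nifd = (short)parsed.decimals;
    }
    return 0;
}
```

With this shape, calling `sketch_fill_formats(&ns, "Z8.", NULL)` sets `nform`/`nfl`/`nfd` and leaves `niform`/`nifl`/`nifd` blank, while passing an informat as the third argument populates the INFORMAT fields from that string rather than from the format.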