C-API: add geographic and projection header keywords as no-op#63
C-API: add geographic and projection header keywords as no-op#63CarawaySeed42 wants to merge 2 commits intomasterfrom
Conversation
When reading a crg file containing refline lat/lon/alt or mpro keywords, the c-api prints warnings that they are unknown keywords and will be ignored. This suppresses the warnings. Encountering the keywords leads to no-op for now. Signed-off-by: Patrick Kuemmerle <patrick.kuemmerle@3d-mapping.de>
tbleher
left a comment
There was a problem hiding this comment.
Thanks for fixing this! Most keywords look reasonable, only one concern about proj_nm.
c-api/baselib/src/crgLoader.c
Outdated
| { "reference_line_end_lon", decodeHdrDouble, dOpcodeNone }, | ||
| { "reference_line_end_lat", decodeHdrDouble, dOpcodeNone }, | ||
| { "reference_line_end_alt", decodeHdrDouble, dOpcodeNone }, | ||
| { "proj_nm", decodeHdrDouble, dOpcodeNone }, |
There was a problem hiding this comment.
Should we add a new decodeHdr function for this? decodeHdrDouble tries to decode the value as a double, but if I read the examples correctly, the values to proj_nm are strings.
Or maybe use decodeHdrOpMod? That would I think do the correct thing together with dOpcodeNone.
My concern with using decodeHdrDouble would be that someone later might add error checking to decodeHdrDouble and abort if the value was not parseable (which I think would make sense - atof used there is famous for being to lenient). Then the current code would break.
There was a problem hiding this comment.
Good catch, since encountering proj_nm triggers no operation (aside from converting it to a temporary double) and testing it worked fine, I forgot that its value is not a double.
We could also create a new callback structure sLoaderCallbacksMpro (or similar) since it is a different kind of data section than the road parameters. But this one is debatable, we could keep it inside the road structure.
Then replacing decodeHdrDouble with decodeHdrOpMod as you suggested.
What do you think?
There was a problem hiding this comment.
I suggest that we take some time as soon as possible to discuss our future design philosophy. Some of the current issues (including this one) can be attributed to the fact that different assumptions are being made here than in the past.
Here is a brief summary of how OpenCRG was developed in the past:
- The Matlab base set (crg_read, crg_write, crg_check, crg_eval_xy2z) is the reference for the standard and its interpretation.
- Additional sample implementations (currently there is the c-api) implement the functionality in such a way that (as far as possible) identical behavior is achieved. For this purpose, there was a FORTRAN reference implementation as a template at the start of the OpenCRG project.
For the topics in this issue, I suggest continuing to
- ignore unused or unknown blocks (like $MPRO)
- ignore unused or unknown keywords in known blocks (like reference_line_end_lon)
Otherwise, future (including proprietary) extensions (such as the information for Adams in the $ROAD_CRG_VISUALIZATION block in #16) will be hindered.
So I'd recommend to not implement anything in the c-api-reader that will not be used afterwards, as here with the LON/LAT or the MPRO information, since there is nothing implemented to use that data. This will need of course a clear documentation
- of that behavior
- of the implemented (and non-implemented functionality of any sample implementation (here: the c-api)
There is also a lot to say about the topics of double/single and double/float in various other issues. The (mainly former) limitations in memory space have motivated us to use double only where absolutely necessary. However, this requires an understanding of the associated mechanisms and consideration of them when preparing the data. In a nutshell: whenever small changes in large values are relevant (such as the absolute position at a distant origin of the coordinate system), only the relative distances to a suitable local origin are described. Or, in the case of elevation information: here, the “large” values are handled in the gradient so that sufficient resolution is maintained for “small” unevenness. Example: Without separation, a route with a 1000 m difference in elevation would mean that with float (approx. 6 digits of accuracy), elevation resolution could only be achieved in the cm range.
There was a problem hiding this comment.
For the topics in this issue, I suggest continuing to
- ignore unused or unknown blocks (like $MPRO)
- ignore unused or unknown keywords in known blocks (like reference_line_end_lon)
Otherwise, future (including proprietary) extensions (such as the information for Adams in the $ROAD_CRG_VISUALIZATION block in inconsistent road positioning comparing base and header file #16) will be hindered.So I'd recommend to not implement anything in the c-api-reader that will not be used afterwards, as here with the LON/LAT or the MPRO information, since there is nothing implemented to use that data.
This is what will be achieved by these changes. We continue to ignore unused keywords. We just prevent warnings that leads the user of the API to believe, that there is something wrong with their CRG file. Otherwise the offset fields should have also produced warnings in the past.
From experience, there were multiple occasions where users got (rightfully) confused about a valid CRG file producing warnings using the C-API.
There was a problem hiding this comment.
By the proposed changes the c-api will no more complain about the extra keywords, which are currently in the standard, but not used/supported by the c-api.
My proposal is different: the C API should not complain about any unknown or unused keywords or blocks.
This is the standard defined by the Matlab implementation, and it ensures that any additional official or unofficial additions work smoothly without any warnings.
There was a problem hiding this comment.
I see, thank you for the clarification and sorry for the misunderstanding.
There was a problem hiding this comment.
My proposal is different: the C API should not complain about any unknown or unused keywords or blocks. This is the standard defined by the Matlab implementation, and it ensures that any additional official or unofficial additions work smoothly without any warnings.
But this also means that users will not notice if e.g. a keyword is misspelled. So I disagree that the C code should not warn about unknown keywords. I think it should recognize all keywords defined by the standard (even if it does nothing with them) and at least output a NOTICE on all keywords it does not know. I think that makes the system more robust for users (by recognizing errors earlier), at the cost of making extensions slightly harder. But hopefully this will encourage users to incorporate extensions into the standard.
There was a problem hiding this comment.
That's the reason I wrote
I suggest that we take some time as soon as possible to discuss our future design philosophy.
Upto now
- matlab implements the reference
- the silent "tolerance" to unknown keywords and blocks always was an explicit feature of the underlying SDF data format (historic Daimler/Mercedes internal ...)
First and foremost, warnings are just annoying, mostly ignored, and I doubt that they will ever persuade anyone to contribute anything.
The proposed approach is also problematic: although there is no more warning for keywords known by the standard, the user has no indication that the information of this keyword will not be used.
I am open to discuss, but today this change would just brake the current design rules of OpenCRG.
There was a problem hiding this comment.
Yes, I think we should discuss the future design philosophy soon 😃
It's good to hear about the previous design decisions, since I did not know about them (as far as I can see they are not documented anywhere in the standard?).
Some notes from my side:
- As far as I can see the standard does not say anything about how users can extend the file. I think we should change that and make it explicit which extensions are allowed.
- OpenCRG does not seem to have a version number. I think adding one would be important, so one can see (and check) whether a file complies with a particular version of the standard.
- I think we should not allow users adding new keywords or sections in the main namespace. If we allow arbitrary proprietary extensions, then it becomes effectively impossible to safely extend the standard with new keywords or sections, since anything we propose might already be in use as a proprietary extension with a different semantic. So I think if we allow extensions, they should be required in a namespace. Example from inconsistent road positioning comparing base and header file #16: we should not allow users to add a proprietary section like
$ROAD_CRG_VISUALIZATION(since that will collide with any standardization of a similar feature). Instead this should be a namespace. E.g. we could reserve$EXT_<domainname>_as a prefix in the standard (and promise to never add section names with this prefix). Then users could write$EXT_hexagon.de_ROAD_CRG_VISUALIZATION. - If we add this extension mechanism (and I think we really need something like this - allowing users to add arbitrary keywords is just unmaintainable), then the C API can hard-reject any files with unknown keywords and sections (and which are not extensions matching the above scheme). That will make the ecosystem more robust in the long-term, since errors are found earlier. (Side note: OpenDRIVE does something similar: users can add
<userData>tags under all elements, but may not add other tags or modify them). - Eventually I'd like to get away from Matlab being the main reference for such things. Matlab is a proprietary tool, not a standard, and at least at my workplace I am strongly discouraged from using Matlab due to the high license costs.
Now, these are not things to change in a maintenance release, so we should probably discuss separately:
- what to add now for the maintenance release
- what the future direction looks like
There was a problem hiding this comment.
Thank you very much for your highly valued thoughts on these issues!
Just some comments from my side:
Standard
- The matlab base routines to read, check, and write crg alwas were a kind of living specbook for the file format. The only available previous user manual was very limited on its description - so the standard was effectively guarateed by all of us using tha matlab base routines to generate and modify crg files.
- the current documentation was writen without any contribution of the former contributers to OpenCRG - this explains inaccuracies, blurring, and even genuine errors, such as in Fix fileref documentation #59
- version numbering never was necessary using the original approach just ignoring extra keywords or blocks. OpenCRG was strictly upward compatible, so using the most recent version of routines worked will all previously generated files.
- As long as the matlab base routines are used as check and write, no illegal extensions or misspelled keywords will find their way in a data file.
- As I have already mentioned, the opencrg format is based on an old, proven Daimler/Mercedes measurement data format, which deliberately made use of this flexibility in order to avoid major compromises for new or extended applications and to allow mechanisms such as those only hinted at in the fileref feature. Of course, we don't have to stop there, but I just want to try to avoid creating obstacles where none are needed.
- To address just one more possible issue that is not currently mentioned in the standard and is not checked anywhere: how are duplicate keywords and duplicate blocks handled? As long as you create the crg file with crg_write, this cannot happen. But if you don't use this method alone to create crg files, this ambiguity also exists.
Matlab
- Matlab was an early discussion. In the beginning we tried to be consistent to octave, too. But since then, all user had/have matlab, no complaints about that, so octave wasn't further important.
- For short term answer to the matlab topic we could have a look to the base routines to read, check and write crg files to become octave compatibe again if they are currently not.
- There is also the way to generate a respective runtime, which is runnable w/o a matlab license. But from my previous experience from other projects, the installation of the free matlab runtime was quite painful in managed professional CAE environments.
- For a longer term answer, we could think about python, at least for the base routines.
The existing map projection keyword was erroneously part of the ROAD_CRG block. Signed-off-by: Patrick Kuemmerle <patrick.kuemmerle@3d-mapping.de>
|
I suggest that we focus our current work on malfunctions until we have agreed on a new direction. Unfortunately, the currently proposed changes distract us from this. The c-api is (according to our current understanding) not a reference implementation, but rather a sample implementation of a limited scope of OpenCRG. The first requirement is therefore that a crg file generated with the Matlab reference implementation can be read. Unknown header blocks and keywords in them should be ignored, regardless of whether they are defined in the full OpenCRG definition or not. (We could discuss for a later release whether corresponding notes and warnings should be generated for this. This would first be a good addition to the Matlab reference implementation, which would thus represent a further expanded checker function.) My current concern relates to genuine malfunctions in all delivered code and documentation. Currently, files that correspond to the established concept of the crg file format are sometimes misinterpreted by the c-api. One example is a crg file with an additional header block with an unknown name—many of our users have used this approach for simple manual variant generation. The unknown block should be ignored, but this is obviously not the case; the evaluation simply uses the last assignment of the corresponding keyword—even though it is located in an unknown block. Example: The resulting value should be 7.0 but is 5.0 This is really an issue, since the results will differ from the matlab reference implementation. |
When reading a crg file containing refline lat/lon/alt or mpro keywords, the C-API prints warnings that they are unknown keywords and will be ignored. These keywords are well defined by the specification and should not be considered unknown.
Encountering the keywords leads to no-op after merging this PR, which suppresses the warnings.
Then it exhibits the same behaviour as it used to for the channel offsets.
Resolves: #4