Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ATOM dataset feed - Separate entries for each format/CRS combination #377

Open
magneb opened this issue Aug 27, 2020 · 9 comments
Open

ATOM dataset feed - Separate entries for each format/CRS combination #377

magneb opened this issue Aug 27, 2020 · 9 comments
Assignees

Comments

@magneb
Copy link

magneb commented Aug 27, 2020

Hello,

For the assertion about separate entries for each format/CRS combination, specifically the test with assertion id 81bdd847-687d-4597-ba58-99963ff3635a starting on this particular line: https://github.com/inspire-eu-validation/ets-repository/blob/master/service/ds-atom-pre-defined-soapui-project.xml#L786

The following part of the xquery is listing entries and trying to order them by their category term and the link type of links which has the rel = 'alternate' attribute.

    let $sortedValues := for $entry in //*:entry
    order by $entry/*:category[1]/@term, $entry/*:link[@rel = 'alternate'][1]/@type
    return concat('CRS ', $entry/*:category[1]/@term, ' with type ', $entry/*:link[@rel = 'alternate'][1]/@type, '   ')

However, it does not test if the entries actually have the rel = 'alternate' attribute! The result is a list of category, link-type concatenations that start with blank link-types. The effect of this is that the test will fail any dataset that has more than one entries with only rel="section" in its links but no alternate links.

I do not believe this is intentional. I firmly believe the intention is to list only those entries where the rel="alternate" attribute exists. Therefore i propose to introduce a filter in the listing of entries, by changing line 786 to the following:

    let $sortedValues := for $entry in //*:entry/*:link[@rel = 'alternate']

This narrows the test down to what the title of the test says: "Alternate link types and CRS combinations are unique"

@carlospzurita
Copy link
Contributor

Dear @magneb

Thank you for checking this case, we are marking this as "under analysis" until we determine if the interpretation is correct and compliant with the implementing rules on ATOM feeds.

In the meantime, can you provide us with some example endpoints to illustrate your issue?

@magneb
Copy link
Author

magneb commented Aug 31, 2020

Hi, @carlospzurita

Sure thing! I'm working on this feed: https://nedlasting.geonorge.no/geonorge/ATOM/INSPIRE/Geonorge_INSPIRE_ServiceFeed.xml And the only mark left on the dataset feed is the one that checks that alternate link types and CRS combinations are unique. Take the following as an example (This is one of the triggering dataset-feeds):
https://nedlasting.geonorge.no/geonorge/ATOM/INSPIRE/dataset/vbase.xml

In this feed, there are two entries, and each entry only has links that have the rel="section" attribute. This is in accordance with TG Requirement 32, as described on page 53 in the technical implementation guide for ATOM

Another example is this dataset-feed: https://nedlasting.geonorge.no/geonorge/ATOM/INSPIRE/dataset/dtm50.xml
This one is an elevation model, so it makes sense that the data is distributed in multiple physical files.

I hope this is enough, for my testing i used an xpath/xquery evaluation tool and made some mock datasets by copying entries for multiple sources.

@magneb
Copy link
Author

magneb commented Sep 17, 2020

Hello again,

Is there anything more I can do to help this along? I feel I have answered the what, why and how of this issue. Should I create a pull request with my proposed solution?

I have not tested the change in a working instance of the validator yet, but I am very certain that it wont break anything. As I understand it, @jonherrmann seems to be the one that committed the code initially, in 2017 with this commit. To be clear, I'm not blaming anyone. This case is very particular, so you would have to test a feed where a dataset has:

  1. An entry with one link where rel='alternate'
  2. An entry with multiple links where rel='section'
  3. An entry with multiple links where one has rel='alternate' and the rest has rel='section'
  4. An entry with at least two links where all have rel='alternate'

In order to cover all the possible combinations. The three latter examples will fail this test as it stands now, and again, I don't think that is the intended purpose of the test.

Case 4 should fail. There is no reason to have duplicate data like that. It may be that case 3 is supposed to fail because it technically contains duplicate data. But, if a datasett is large, users might want either the complete dataset or just the data for one particular municipality (or cell). Whether that should be allowed or not is not for me to decide, but i have removed files from our atom feed where this is the case.

Looking at the TG requirement 32 in v.3.1 again, this test, which only checks for duplicate, format/CRS combinations should properly exclude links where rel='section' or, somehow aggregate them (on type and rel) before checking for duplicates with the alternate links, in order to fit with the requirement.

@carlospzurita
Copy link
Contributor

Dear @magneb

Thank you very much for your input on this, is is very much appreciated. We are still reviewing your proposal for fixing the issue, and we will get back to you with a response and the needed ETS modifications.

Best

@carlospzurita
Copy link
Contributor

Dear @magneb

We tested your proposed solution, and it seems that the results in validation is still the same for the results tested. We uploaded this modification on https://github.com/inspire-eu-validation/ets-repository/tree/issue-377 , so you can download this version of the ETS and try it on your own deployment.

Please let us know if everything is in order, or another modification is needed.

@carlospzurita carlospzurita added this to the v2021.0 milestone Nov 3, 2020
@carlospzurita carlospzurita removed this from the v2021.0 milestone Feb 9, 2021
@carlospzurita
Copy link
Contributor

Dear @magneb

Did you have the opportunity to test the development uploaded here https://github.com/inspire-eu-validation/ets-repository/tree/issue-377 to check the fix for the ATOM tests?

@fabiovinci
Copy link
Collaborator

Dear @magneb,

do you still have the reported errors?

Were you able to test the proposed solution?

@fabiovinci fabiovinci added question Further information is requested and removed under analysis labels May 23, 2022
@fabiovinci fabiovinci assigned magneb and unassigned carlospzurita May 23, 2022
@mwjsanders
Copy link

Hi @magneb and @fabiovinci ,

Is there any progress booked on this issue?

To my opinion the proposed solution in https://github.com/inspire-eu-validation/ets-repository/tree/issue-377 is not completly correct. When a datafeed entry has multiple download files with rel='section' then a link with rel 'alternate' is permitted which references to a descritpion which describes the relation between the mutiple download files in the datafeed entry (as an alternative to the content element).

Our service feed: https://service.pdok.nl/brt/topnl/atom/index.xml
with datafeed: https://service.pdok.nl/brt/topnl/atom/top10nl.xml
has the same validation error

@fabiovinci
Copy link
Collaborator

fabiovinci commented May 15, 2023

Dear @mwjsanders,

thank you for your feedback and for the sample service.

Looking at your service, it should pass the test "Separate entries for each format/CRS combination" (related ATS available here) since the dataset feed contains two different entries each of them providing the dataset with the same CRS but in different formats.

image

It seems the validator is looking for link elements with the rel attribute equal to "alternate", so it reports an error since in your dataset feed there are only link elements with rel="section".
Since you are correctly providing the dataset in multiple physical files according to TG Requirement 32, the error should not be present.

We will try to improve the description of the test in the ATS and implement the fix in staging.
We will notify you when the fix is ready for testing.

@fabiovinci fabiovinci added under development and removed question Further information is requested labels May 15, 2023
fabiovinci added a commit to inspire-eu-validation/download-atom that referenced this issue May 15, 2023
Improvement of the ATS according to feedback provided in issue 377 (INSPIRE-MIF/helpdesk-validator#377).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants