As Open Data Hub I would like to have a new OTP demo instance that imports public transport data through NeTEx / SIRI instead of GTFS / GTFS-RT #191

Open
1 of 4 tasks
rcavaliere opened this issue Jul 3, 2024 · 43 comments
@rcavaliere
Member

rcavaliere commented Jul 3, 2024

Tasks agreed on 30.8.

Reference sources for NeTEx and SIRI:

NeTEx
https://web01.sta.bz.it/netex/api/v4/downloadVersion?level=4&agencyCode=IT-ITH1
username = rapuser
password = rappass

SIRI

SIRI ET (XML): https://efa.sta.bz.it/siri-lite/estimated-timetable/xml
SIRI ET (JSON): https://efa.sta.bz.it/siri-lite/estimated-timetable
SIRI SX (XML): https://efa.sta.bz.it/siri-lite/situation-exchange/xml
SIRI SX (JSON): https://efa.sta.bz.it/siri-lite/situation-exchange

@rcavaliere rcavaliere converted this from a draft issue Jul 3, 2024
@rcavaliere
Member Author

Relevant also for @leonardehrenfried

@leonardehrenfried
Contributor

As soon as you have the feeds, can you post the URLs here?

@leonardehrenfried
Contributor

Is there an update? Are the NeTEx feeds available somewhere?

@rcavaliere
Member Author

@leonardehrenfried for testing purposes you can start working with this NeTEx file.

GE16614_01_DIVA_apb_ALL_1_20240717011758.xml.zip

@rcavaliere
Member Author

@leonardehrenfried as discussed today, please consider this NeTEx export and not the one provided 2 days ago. This is the one provided to the NAP of Italy, compliant with EPIP.

NX-PI_01_it_apb_LINE_apb__20240621.xml.zip

@rcavaliere rcavaliere moved this from Todo to In Progress in Open Data Hub OTP Jul 22, 2024
@leonardehrenfried
Contributor

leonardehrenfried commented Jul 23, 2024

I took a look at this today and I am happy to report that importing the feed is going to be possible (with some caveats).

This is what it looks like:

Screenshot from 2024-07-22 17-54-39

Features currently not implemented by OTP

  • Using any as the version of a NeTEx entity
  • The style of the service links (shapes) that are used by the feed

I spoke to the upstream developers and it's going to be possible to implement those two.

Validation errors

The last file you posted successfully validates against both the NeTEx and EPIP XSDs. Very good!

However, OTP has picked up on quite a few validation errors, some of which are cosmetic but also a few serious ones.

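Smaller errors

  • The Line entities do not have an Authority, which is required in the Nordic profile, so a dummy one is created. These lines have an Operator but that is a separate entity in OTP, which is only available in the Transmodel API. We would have to discuss if the Operator is really what GTFS calls the Agency in EPIP.
  • No timezone is configured in FrameDefaults, which OTP expects to be set like this: https://github.com/entur/profile-examples/blob/272ed7e9f1fe8b60ed1bddefd04c782d35c0917b/netex/network/Line61A.xml#L33-L43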

Suspicious data

  • 2061 service journeys repeat the same stop right after each other, often with the exact same time. This is quite suspicious and indicates a data error.
  • There are 956 ServiceJourneyPatterns that are not referenced by a ServiceJourney. This means that they are not imported into OTP. It has no consequences but also probably indicates an error somewhere in the chain.

Serious errors

956 ServiceJourneys have a different number of stops from the ServiceJourneyPattern. This means that these journeys are not imported into OTP at all. It's the most serious error in the feed. An example of this would be the following error message:

Mismatch in stop points between ServiceJourney and JourneyPattern. ServiceJourney will be skipped. ServiceJourney=it:apb:ServiceJourney:031001T-TI-63-5-43500:sonn:, JourneyPattern= it:apb:ServiceJourneyPattern:03100T.24a5100166:

I would speak to Mentz to ask them how this discrepancy can be explained.
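
For anyone who wants to reproduce these two findings outside OTP, here is a minimal sketch. It assumes that a ServiceJourney points to its pattern through a ServiceJourneyPatternRef (or JourneyPatternRef) element and that the "number of stops" of a journey is the number of its TimetabledPassingTime elements. Both are assumptions about the feed layout, not a description of OTP's own validation code, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return the tag name without its XML namespace."""
    return tag.rsplit("}", 1)[-1]


def descendants(elem, name):
    """All descendants of elem whose local tag name equals name."""
    return [e for e in elem.iter() if local(e.tag) == name]


def check_journeys(xml_text):
    """Return (mismatches, unreferenced_pattern_ids) for a NeTEx document.

    mismatches: (journey id, pattern id, passing times, stops in pattern)
    """
    root = ET.fromstring(xml_text)

    # Number of StopPointInJourneyPattern elements per pattern id.
    pattern_sizes = {
        p.get("id"): len(descendants(p, "StopPointInJourneyPattern"))
        for p in descendants(root, "ServiceJourneyPattern")
    }

    referenced = set()
    mismatches = []
    for journey in descendants(root, "ServiceJourney"):
        # Assumption: one of these two reference elements is present.
        refs = (descendants(journey, "ServiceJourneyPatternRef")
                or descendants(journey, "JourneyPatternRef"))
        if not refs:
            continue
        pattern_id = refs[0].get("ref")
        referenced.add(pattern_id)
        n_times = len(descendants(journey, "TimetabledPassingTime"))
        expected = pattern_sizes.get(pattern_id)
        if expected is not None and n_times != expected:
            mismatches.append((journey.get("id"), pattern_id, n_times, expected))

    unreferenced = sorted(set(pattern_sizes) - referenced)
    return mismatches, unreferenced
```

Running it on an unzipped export would be `check_journeys(open(path, encoding="utf-8").read())`; for very large files a streaming parser would be a better fit than loading the whole tree.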

Summary

All in all I am pleasantly surprised how well all of this works given that you're the first organisation that tries to import EPIP into OTP.

What is difficult to say is if there are hidden problems with the feed. The best way to find out is to actually use the data, which we currently do.

We can also discuss using a more structured approach to find errors but for now I'm pretty pleased with the progress.

@leonardehrenfried
Contributor

leonardehrenfried commented Jul 24, 2024

I have to correct myself about the service links (shapes).

EPIP does structure them a bit differently from the Nordic profile but the problem is that in the latest data feed the ServiceLink elements are not referenced by the StopPointInJourneyPattern. In other words: the shapes are present but not used.

According to the profile specification, the StopPointInJourneyPattern should look like this: https://github.com/5Tsrl/netex-italian-profile/blob/main/Examples/Netex_ITA_1.10_EPIP_with_versioning.xml#L16990

but in the supplied data file they look like this:

<StopPointInJourneyPattern id="it:apb:StopPointInJourneyPattern:01B06_.24a-1-0071303:" version="1" order="3">
  <ScheduledStopPointRef ref="it:apb:ScheduledStopPoint:it-22021-2167-0-5267:" version="any" />
  <ForAlighting>true</ForAlighting>
  <ForBoarding>true</ForBoarding>
  <RequestStop>false</RequestStop>
  <StopUse>access</StopUse>
</StopPointInJourneyPattern>

Note the element OnwardServiceLinkRef is missing.
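
A rough way to see how widespread this is would be to list every StopPointInJourneyPattern that has no OnwardServiceLinkRef child. The sketch below is only an illustration with a made-up function name. It assumes the `order` attribute gives the position in the pattern and skips the last stop of each pattern, since nothing follows it.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return the tag name without its XML namespace."""
    return tag.rsplit("}", 1)[-1]


def missing_onward_links(xml_text):
    """Ids of StopPointInJourneyPattern elements lacking OnwardServiceLinkRef."""
    root = ET.fromstring(xml_text)
    missing = []
    for pattern in root.iter():
        if local(pattern.tag) != "ServiceJourneyPattern":
            continue
        points = [e for e in pattern.iter()
                  if local(e.tag) == "StopPointInJourneyPattern"]
        # Assumption: the order attribute defines the stop sequence.
        points.sort(key=lambda e: int(e.get("order", "0")))
        # The last stop has no onward link by definition, so skip it.
        for point in points[:-1]:
            has_link = any(local(c.tag) == "OnwardServiceLinkRef" for c in point)
            if not has_link:
                missing.append(point.get("id"))
    return missing
```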

@rcavaliere
Member Author

rcavaliere commented Jul 24, 2024

@leonardehrenfried thanks for the notification. So in the end only the field OnwardServiceLinkRef is missing? In our data we have the reference to the service link in the structure linksInSequence (field serviceLinkRef). Let's work out whether we should change our data accordingly...

@leonardehrenfried
Contributor

linksInSequence is how the Nordic profile expects it (but not EPIP, apparently), but that field is also absent from the latest data file.

If you have the time, I'm available for a call today. Maybe that is quicker than comment ping-pong.

@rcavaliere
Member Author

OK, I need to check the data I provided. Typically we support this (linksInSequence). Let me look into this first, then we can decide how to handle it.

@rcavaliere
Member Author

@leonardehrenfried I checked this. For the Italian NAP, we removed the structure linksInSequence since it was not supported by the Italian profile, but did not add the alternative way of mapping the match with the service link. I will ask the team of developers working on our NeTEx export for the Italian NAP to adjust this. In the end we will just take the value in linksInSequence/serviceLinkRef (available in our NeTEx German profile) and put it in pointsInSequence/OnwardServiceLinkRef (to be used in the NeTEx Italian profile). I will let you know when we have a corrected NeTEx export available for your activities here.

@leonardehrenfried
Contributor

There are also other problems of varying severity (see post above). Do you have any information about that?

@rcavaliere
Member Author

@leonardehrenfried not yet, I will give you feedback on all points.

@leonardehrenfried
Contributor

If you have a stable URL from where to download the regularly updated NeTEx feed I can add it to the OTP test instance.

@rcavaliere
Member Author

@dulvui we would like the testing environment of the OTP back-end (https://otp.opendatahub.testingmachine.eu/) to be fed with NeTEx data instead of GTFS data for public transport. Please work with @leonardehrenfried to set this up.

Relevant also for @clezag

@leonardehrenfried
Contributor

@dulvui All I need from you is a permanent URL where I can download the latest version of the NeTEx feed. The rest I can do myself.

@rcavaliere
Member Author

@leonardehrenfried this is something I can do. At present these NeTEx files are stored on an FTP server owned by another organization, i.e. ftp://ftp01.sta.bz.it/netex/2024/plan/All/ Here you find the daily exports; you should always use the latest one. Can you tell me if you can access it?

@leonardehrenfried
Contributor

Yes, I can access it. HTTP with a stable URL pointing towards the newest version would be the best but I can work around it with some scripting.

The full path is then this one? ftp01.sta.bz.it/netex/2024/plan/EU_Profil/NX-PI_01_it_apb_LINE_apb__20240807.xml.zip

@rcavaliere
Member Author

@leonardehrenfried good! Yes, unfortunately they want to use this FTP system... and yes, the current file is the one you indicated. But as I said, we generate a new export every day, so you should always import the newest one into OTP: take the current date and use it to select the file by its name.

@leonardehrenfried
Contributor

Yes, I will compute the file name from the current date.
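
Roughly along these lines, as a sketch only. The file name pattern and the EU_Profil directory are taken from the path above. That the year directory simply follows the calendar year of the export is an assumption, and the function name is made up.

```python
from datetime import date

FTP_HOST = "ftp01.sta.bz.it"


def netex_export_url(day: date) -> str:
    """Build the FTP URL of the daily NeTEx export for the given day.

    Assumption: the year directory matches the year of the export date.
    """
    file_name = f"NX-PI_01_it_apb_LINE_apb__{day:%Y%m%d}.xml.zip"
    return f"ftp://{FTP_HOST}/netex/{day:%Y}/plan/EU_Profil/{file_name}"
```

In a scheduled job this would be called with `date.today()`, probably with a fallback to the previous day in case the export for the current day has not been uploaded yet.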

@leonardehrenfried
Contributor

Do you happen to know if the path 2024 will stay the same or change in 2025?

@rcavaliere
Member Author

@leonardehrenfried this will change...

@leonardehrenfried
Contributor

leonardehrenfried commented Aug 7, 2024

I would like to increase the severity of one problem because of something I noticed today. Previously I said:

No timezone is configured in FrameDefaults which OTP expects to be set like this: https://github.com/entur/profile-examples/blob/272ed7e9f1fe8b60ed1bddefd04c782d35c0917b/netex/network/Line61A.xml#L33-L43

At first I thought this is just cosmetic, but I believe that all times are off by 1 or 2 hours depending on whether it's summer or winter. It would be very good if you could set the time zone in the feed as I suggested in my comment.
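
To make the 1 or 2 hours concrete, here is a small sketch of the arithmetic. It assumes that the times in the feed are local times in Europe/Rome and that a feed without a time zone ends up being read as UTC. Both are assumptions made for illustration, not confirmed OTP behaviour.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def misread_offset_hours(year, month, day, hour, minute, tz="Europe/Rome"):
    """Hours by which a local wall-clock time shifts if it is read as UTC."""
    intended = datetime(year, month, day, hour, minute, tzinfo=ZoneInfo(tz))
    misread = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
    return (misread - intended).total_seconds() / 3600


summer = misread_offset_hours(2024, 8, 7, 9, 36)   # during daylight saving time
winter = misread_offset_hours(2024, 1, 15, 9, 36)  # during standard time
```

With these inputs `summer` is 2.0 and `winter` is 1.0, which matches the seasonal difference described above.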

@leonardehrenfried
Contributor

And since @dulvui just merged my PR, here we have a fresh OTP instance with NeTEx data: https://tinyurl.com/27f8o653

@leonardehrenfried
Contributor

I noticed another problem with the NeTEx data: I see no bus stops or bus routes in the city of Trento while there are plenty in Merano, Bolzano and Bressanone.

Let me give you an example: Piazza Dante near Trento railway station has several bus stops and they are all called a variation of "Piazza Dante". I would expect at least one of them to be present in the data but I see zero stops called "Piazza Dante".

I just checked and it's the same with the GTFS feed.

Is this expected?

@rcavaliere
Member Author

@leonardehrenfried this is correct; the NeTEx is just related to the Province of Bolzano, not the Province of Trento. In the dataset there are some bus stops in other regions, but these are used just for the railway services.

@leonardehrenfried
Contributor

Good to know! I thought this feed covers all of South Tyrol.

@rcavaliere
Member Author

Yes, it does. Trento is not in South Tyrol, it is in Trentino :-)

@leonardehrenfried
Contributor

How embarrassing - I must read up on the difference between an Italian province and a region again! https://en.wikipedia.org/wiki/Trentino-Alto_Adige/S%C3%BCdtirol

@rcavaliere
Member Author

I took a look at this today and I am happy to report that importing the feed is going to be possible (with some caveats).

This is what it looks like:

Screenshot from 2024-07-22 17-54-39

Features currently not implemented by OTP

* Using `any` as the version of a NeTEx entity

* The style of the service links (shapes) that are used by the feed

I spoke to the upstream developers and it's going to be possible to implement those two.

Validation errors

The last file you posted successfully validates against both the NeTEx and EPIP XSDs. Very good!

However, OTP has picked up on quite a few validation errors, some of which are cosmetic but also a few serious ones.

Smaller errors

* The `Line` entities do not have an `Authority` which is required in the Nordic profile, so a dummy one is created. These lines have an `Operator` but that is a separate entity in OTP, which is only available in the Transmodel API. We would have to discuss if the Operator is really what GTFS calls the `Agency` in EPIP.

* No timezone is configured in `FrameDefaults` which OTP expects to be set like this: https://github.com/entur/profile-examples/blob/272ed7e9f1fe8b60ed1bddefd04c782d35c0917b/netex/network/Line61A.xml#L33-L43

Suspicious data

* 2061 service journeys repeat the same stop right after each other, often with the exact same time. This is quite suspicious and indicates a data error.

* There are 956 `ServiceJourneyPatterns` that are not referenced by a `ServiceJourney`. This means that they are not imported into OTP. It has no consequences but also probably indicates an error somewhere in the chain.

Serious errors

956 ServiceJourneys have a different number of stops from the ServiceJourneyPattern. This means that these journeys are not imported into OTP at all. It's the most serious error in the feed. An example of this would be the following error message:

Mismatch in stop points between ServiceJourney and JourneyPattern. ServiceJourney will be skipped. ServiceJourney=it:apb:ServiceJourney:031001T-TI-63-5-43500:sonn:, JourneyPattern= it:apb:ServiceJourneyPattern:03100T.24a5100166:

I would speak to Mentz to ask them how this discrepancy can be explained.

Summary

All in all I am pleasantly surprised how well all of this works given that you're the first organisation that tries to import EPIP into OTP.

What is difficult to say is if there are hidden problems with the feed. The best way to find out is to actually use the data, which we currently do.

We can also discuss using a more structured approach to find errors but for now I'm pretty pleased with the progress.

Regarding all these open points:

  • Yes, currently we don't have the organization type "Authority" in the resourceFrame, since this was not strictly required. However, we do have it in a new version of the NeTEx export, which has several CompositeFrames and also includes static data on parking and sharing mobility. If you are interested, you can find it here (export still under consolidation): https://cloud.opendatahub.com/index.php/s/dHXsK9KsFWdKXPC
  • I am checking the timezone topic. I think it can be added without much effort at the beginning of the NeTEx export, as you already indicated.
  • Regarding the most critical point, which raises many doubts for me: I don't know if this can be related to the fact that the export contains different line versions with different validity periods. It could be that these line versions are present and reference a ServiceJourneyPattern, but then have no trips associated with them. Can you provide me with a couple of examples so that we can better understand these errors?

@leonardehrenfried
Contributor

Since this issue is getting quite large, I took the liberty to open separate tickets for the NeTEx problems.

@rcavaliere
Member Author

@leonardehrenfried yes please, I wanted to do the same once the issues are clarified.

@leonardehrenfried
Contributor

Here they are: https://github.com/noi-techpark/odh-mentor-otp/issues/created_by/leonardehrenfried

You may want to add a label to give readers a bit of context.

@rcavaliere
Member Author

@leonardehrenfried thanks! I have created a new label and labelled the issues. I will give you feedback in the next weeks.

@rcavaliere
Member Author

For SIRI: current end-point is https://efa.sta.bz.it/sirilite (in JSON).

@rcavaliere
Member Author

@leonardehrenfried we finally have stable end-points for the NeTEx / SIRI data:

There is now also a SIRI-SX interface (Situation Exchange), implemented according to the German / Swiss profile (VDV-736):

Can you test these end-points in the next few days and try to integrate them into OTP, especially the SIRI end-point?

For the NeTEx data, there has been some update by 5T in relation to compliance with NeTEx EPIP in the Italian profile, more details on Friday. For sure there is still something to fix in the data we provide...

@leonardehrenfried
Contributor

I can take a look at this in the next few days.

Is SIRI only available as SIRI light, where you download everything at once, or also in the Request/Response flow where you create a subscription and get only the latest updates? Request/response is the only one supported by OTP. However, SIRI light is such a simple protocol that it would also not be very hard to implement it.

@rcavaliere
Member Author

@leonardehrenfried we have both. At the moment I have shared just the SIRI light end-points with you, but if needed we can also go in the direction of subscriptions. Wouldn't it be easier for a first attempt to work with SIRI light, as is typically done in the Nordics?

@leonardehrenfried
Contributor

The Nordics use request/response.

@rcavaliere
Member Author

But they also use SIRI light, don't they? As I said, if you prefer the more complex request/response approach, this can easily be activated.

@leonardehrenfried
Contributor

leonardehrenfried commented Nov 13, 2024

They offer SIRI light but don't actually use it. I guess they have it because it's easy to consume. On a country-level request/response has much better performance because you only need to retrieve the latest updates rather than every update for the entire country every minute (even those, where nothing changed).

I would be fine with either.

If the turnaround on activating request/response is as slow as making SIRI available at all, I think I will be faster adding support for SIRI light to OTP. :)
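
To illustrate the trade-off: with SIRI light the consumer receives a full snapshot on every poll and has to work out itself what changed. Below is a minimal sketch of that client-side diff. It assumes, hypothetically, that each snapshot has already been parsed into a dictionary keyed by a journey identifier. The keys and values are made up and are not taken from the STA feed.

```python
_MISSING = object()  # sentinel so that a stored None still counts as a value


def changed_entries(previous: dict, snapshot: dict) -> dict:
    """Return the entries of snapshot that are new or differ from previous."""
    return {
        key: value
        for key, value in snapshot.items()
        if previous.get(key, _MISSING) != value
    }


# Two consecutive polls of a (hypothetical) estimated timetable.
first_poll = {"journey-1": "09:36", "journey-2": "10:00"}
second_poll = {"journey-1": "09:36", "journey-2": "10:03", "journey-3": "11:00"}
updates = changed_entries(first_poll, second_poll)
```

Here `updates` contains only journey-2 and journey-3. Journeys that disappear between two polls are not reported by this sketch and would need separate handling.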

@rcavaliere
Member Author

@leonardehrenfried the activities around importing NeTEx data according to the Italian profile are becoming more and more intense at national level. I had some contact last week with 5T, and Brede was there too. At national level they published a new version of the profile (unfortunately in Italian, see attachment) with an annex on the specific aspects to be considered in order to ensure a smooth import into OTP v2 (see chapter "Appendice A –NeTEx e OTP v.2+"). I would like to discuss this shortly with you, in order to really consolidate what we need to improve in our NeTEx data and possibly provide additional input to these national discussions (e.g. the topic of parking). As far as I have been told, the current stable OTP version available on GitHub can fully handle the import of NeTEx data according to these recommendations. Is this something you can confirm? Let's discuss this today...

241104_Linee guida compilazione NeTEx IT v.4.1.0.pdf

An additional point: we are also in close contact with the team at SBB / SKI+ on various topics. They have also had a look at our NeTEx data, in particular Matthias Guenther and Stefan de Konink, whom you probably know. They provided us with the following input:

TimetabledPassingTime must have an id

It is a bad idea not to give an id to TimetabledPassingTime, especially when a version is added.


<TimetabledPassingTime version="any">
  <StopPointInJourneyPatternRef ref="it:apb:StopPointInJourneyPattern:01B10A.24a-3-0030801:" version="3"/>
  <DepartureTime>09:36:00</DepartureTime>
</TimetabledPassingTime>
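
A quick way to gauge how many elements are affected is to count the TimetabledPassingTime elements without an id per ServiceJourney. This is only a sketch with a made-up function name. It assumes the passing times are nested somewhere below their ServiceJourney.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return the tag name without its XML namespace."""
    return tag.rsplit("}", 1)[-1]


def passing_times_without_id(xml_text):
    """List (ServiceJourney id, count) for journeys with id-less passing times."""
    root = ET.fromstring(xml_text)
    result = []
    for journey in root.iter():
        if local(journey.tag) != "ServiceJourney":
            continue
        count = sum(
            1
            for e in journey.iter()
            if local(e.tag) == "TimetabledPassingTime" and not e.get("id")
        )
        if count:
            result.append((journey.get("id"), count))
    return result
```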

Using ScheduledStopPoints as RoutePointRef is not allowed

ScheduledStopPoints are not RoutePoints


 

<routes>
  <Route id="it:apb:Route:1-110-24a-2-1/H:" version="any">
    <LineRef ref="it:apb:Line:01110_.24a:" version="2" />
    <DirectionRef ref="it:apb:Direction:H:" version="any" />
    <pointsInSequence>
      <PointOnRoute id="it:apb:PointOnRoute:1-110-24a-2-1/H_1:" version="any" order="1">
        <RoutePointRef ref="it:apb:ScheduledStopPoint:it-22021-468-2-3086:" version="any" />
      </PointOnRoute>
      <PointOnRoute id="it:apb:PointOnRoute:1-110-24a-2-1/H_2:" version="any" order="2">
        <RoutePointRef ref="it:apb:ScheduledStopPoint:it-22021-468-3-5106:" version="any" />
      </PointOnRoute>
      <PointOnRoute id="it:apb:PointOnRoute:1-110-24a-2-1/H_3:" version="any" order="3">
        <RoutePointRef ref="it:apb:ScheduledStopPoint:it-22021-2084-0-5029:" version="any" />
      </PointOnRoute>

We will also have a look at this, but it is probably not so relevant for the import into OTP, is it?
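
For reference, the RoutePointRef finding can be located with a short script. This sketch relies on the id convention visible in the snippet above, where the third colon-separated field of an id names the entity type. That convention is an assumption drawn from these examples only, and the function name is made up.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return the tag name without its XML namespace."""
    return tag.rsplit("}", 1)[-1]


def route_point_refs_to_stop_points(xml_text):
    """Return the ref values of RoutePointRef elements that name a ScheduledStopPoint."""
    root = ET.fromstring(xml_text)
    suspicious = []
    for elem in root.iter():
        if local(elem.tag) != "RoutePointRef":
            continue
        ref = elem.get("ref", "")
        parts = ref.split(":")
        # Assumption: ids look like codespace:provider:EntityType:local-id:
        if len(parts) > 2 and parts[2] == "ScheduledStopPoint":
            suspicious.append(ref)
    return suspicious
```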

@rcavaliere
Member Author

Next steps defined on 15.11:

  • @leonardehrenfried will focus on integrating the NeTEx data from the new web service provided (see main user story description)
  • @leonardehrenfried will focus on integrating the SIRI ET data (XML). Decision to test the SIRI-Lite approach first. Attempt to match the SIRI data with the NeTEx data via the journey pattern details, since the IDs are not the same ( :-( )
  • @rcavaliere will work on improving certain aspects of the data provided, starting with the issues in the NeTEx data (timeZone + link reference in journeyPatterns)
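
One possible shape for the matching mentioned in the second point, purely as a sketch: index the NeTEx journeys by their stop sequence plus first departure time and look SIRI journeys up by the same key. The data layout, the function names and the choice of key are all hypothetical. It also assumes that stop identifiers are shared between the SIRI and NeTEx data, which has not been verified here.

```python
from typing import Dict, Optional, Sequence, Tuple

Key = Tuple[Tuple[str, ...], str]


def journey_key(stop_ids: Sequence[str], first_departure: str) -> Key:
    """Key made of the ordered stop ids and the first departure time."""
    return (tuple(stop_ids), first_departure)


def build_index(netex_journeys: Dict[str, Tuple[Sequence[str], str]]) -> Dict[Key, str]:
    """Map journey keys to NeTEx ServiceJourney ids.

    If two journeys share a key, the later one overwrites the earlier one.
    """
    return {
        journey_key(stops, departure): journey_id
        for journey_id, (stops, departure) in netex_journeys.items()
    }


def match_siri_journey(index: Dict[Key, str],
                       siri_stop_ids: Sequence[str],
                       siri_first_departure: str) -> Optional[str]:
    """Return the matching NeTEx journey id, or None if there is no match."""
    return index.get(journey_key(siri_stop_ids, siri_first_departure))
```

Journeys that run on the same pattern at the same time on different operating days would collide under this key, so a real implementation would also need the service date.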

Labels
None yet
Projects
Status: In Progress
Development

No branches or pull requests

3 participants