gtfs-validator
is composed of a number of modules, as shown in the following dependency diagram:
graph BT;
model
core-->model;
processor-->core;
main-->model;
main-->core;
main-->processor;
cli-->main;
app:gui-->main;
app:pkg-->app:gui;
Depends on: processor
, core
, and model
If you're looking to add new GTFS fields or rules, you'll want to look at this module and read the section on adding new tables and fields.
Contains:
- GTFS table schemas - Defines how GTFS files (e.g.,
trips.txt
) and the fields contained within that file (e.g.,trip_id
) are represented in the validator. You can add new GTFS files and fields here. - Business logic validation rules - Code that validates GTFS field values. You can add new validation rules here.
- Error notices - Containers for information about errors discovered during validation. You can add a new notice as a nested
static
class annotated with@GtfsValidationNotice
in the corresponding validation rule class. - Report generators - Create reports and summaries of the notices generated by the rule validators.
The architecture leverages AutoValue
and annotations in Gtfs[File]Schema.java
classes (such as GtfsTripSchema.java
) in main
to auto-generate the following classes used for loading and validating each File. A class is only generated if the Schema
class is annotated to generate that type of class.
Gtfs[File].java
- Internally represents an entity in a GTFS File
- Example:
GtfsStopTime.java
Gtfs[Field].java
- Enumerates the possible values for an enumerated Field
- Example:
GtfsFrequencyExactTimes.java
Gtfs[Field]Enum.java
- An interface for each Field with enumerated values, implemented by each
Gtfs[Field].java
- Example:
GtfsFrequencyExactTimesEnum.java
- An interface for each Field with enumerated values, implemented by each
Gtfs[File]TableContainer.java
- Represents an instance of a GTFS File as a table, including its entities
- Example:
GtfsRouteTableContainer.java
Gtfs[File]TableDescriptor.java
- Describes the specification of the table representation of a GTFS File
- Example:
GtfsStopTableDescriptor.java
Gtfs[File][Annotation]Validator.java
- Validates all fields with a particular validating Annotation in a Gtfs File
- Example:
GtfsAgencyMixedCaseValidator.java
Gtfs[File][Field][Annotation]Validator.java
- Validates a particular Field with a particular validating Annotation in a Gtfs File
- Example:
GtfsAttributionAgencyIdForeignKeyValidator.java
AutoValue_Gtfs[File]TableContainer_CompositeKey.java
- Builder for composite keys for entities in a File where multiple fields comprise the primary key
- Example:
AutoValue_GtfsTransferTableContainer_CompositeKey.java
To generate these classes with AutoValue, build the project. When annotations are modified, the project must be rebuilt for the changes to propagate.
Depends on: core
Contains:
- A file analyser to analyse annotations on Java interfaces that define GTFS schema and translate them to descriptors
- Descriptors of annotations fields (
ForeignKey
,GtfsEnum
,GtfsField
,GtfsFile
) - A processor to auto-generate data classes, loaders and validators based on annotations on GTFS schema interfaces
- GTFS entity classes to generate class names for a given GTFS table
- Code generators to generate code from annotations found by file analyser (e.g.
EnumGenerator
)
Depends on: model
Contains:
- Code to read zipped and unzipped file input
- CSV file and row parsers
- Notice to be generated when checking data type validation rules such as
EmptyFileNotice
- A notice container (
NoticeContainer
) - GTFS data type definitions such as
GtfsTime
,GtfsDate
, orGtfsColor
GtfsFeedLoader
to load for a whole GTFS feed with all its CSV files- GTFS feed's name
Depends on: nothing
Contains:
- root interfaces and annotations for modeling a GTFS schema table
Business logic should generally not be added to this module.
Depends on: main
A command-line-based application for running the validator.
Depends on: main
A GUI-based application for running the validator as a desktop application.
Depends on: app:gui
A minimal wrapper around app:gui
designed to facilitate packaging the GUI application as a Java Module and producing standalone executables and installers for various platforms.
1️⃣ Inputs
- A local GTFS archive (zip file) or fully qualified URL from which to download a GTFS archive
- Command line arguments
2️⃣ Validator loading
- Locate all validators annotated with
@GtfsValidator
and load them
3️⃣ Feed loading
- Read GTFS files
- Create
GtfsTableContainer
from data - Invoke and execute all
SingleEntityValidators
to validate data types, etc.
4️⃣ Validators execution
- Invoke and execute all
FileValidators
in parallel to validate GTFS semantic rules
5️⃣ Notice export
- Creates path to export notices as specified by command line input
--output
(or-o
). - Export notices from
NoticeContainer
to two JSON files in the specified directory -report.json
for validator results andsystem_errors.json
for any software errors that occurred during validation. Notices are alphabetically sorted in the.json
files.
Let's say that you are an agency which for some reason uses other_file.txt
as an additional table to represent GTFS information, and your goal is to implement validation rule related to this new table.
To do so, you would have to:
- add the new table to the validator as an annotated
Gtfs[OtherFile]Schema.java
class in main; - implement the new semantic validation rules and notices in main.
This section details how existing table are defined and gives information on annotation usage. One can then transpose these explanations to add a new table or field. Let's take a look at GtfsCalendarSchema
:
package org.mobilitydata.gtfsvalidator.table;
import org.mobilitydata.gtfsvalidator.annotation.ConditionallyRequired;
import org.mobilitydata.gtfsvalidator.annotation.EndRange;
import org.mobilitydata.gtfsvalidator.annotation.FieldType;
import org.mobilitydata.gtfsvalidator.annotation.FieldTypeEnum;
import org.mobilitydata.gtfsvalidator.annotation.GtfsTable;
import org.mobilitydata.gtfsvalidator.annotation.PrimaryKey;
import org.mobilitydata.gtfsvalidator.annotation.Required;
import org.mobilitydata.gtfsvalidator.type.GtfsDate;
@GtfsTable("calendar.txt")
@ConditionallyRequired
public interface GtfsCalendarSchema extends GtfsEntity {
@FieldType(FieldTypeEnum.ID)
@PrimaryKey
@Required
String serviceId();
@Required
GtfsCalendarService monday();
@Required
GtfsCalendarService tuesday();
@Required
GtfsCalendarService wednesday();
@Required
GtfsCalendarService thursday();
@Required
GtfsCalendarService friday();
@Required
GtfsCalendarService saturday();
@Required
GtfsCalendarService sunday();
@Required
@EndRange(field = "end_date", allowEqual = true)
GtfsDate startDate();
@Required
GtfsDate endDate();
}
By order of appearance in the interface definition:
@GtfsTable
: annotates the interface that defines schema forcalendar.txt
- Theprocessor
will generates data classes, loaders and validators based on annotations on this GTFS schema interface.@ConditionallyRequired
: hints that this file is conditionally required.@FieldType
: specifiescalendar_service_id
is defined as an ID by the GTFS specification.@PrimaryKey
: specifiescalendar_service_id
is the primary key of this table.@Required
: specifies a value forcalendar_service_id
is required - A notice will be issued at the parsing stage.@EndRange
: specifiesendDate
is the end point for the date range defined bycalendar.start_date
andcalendar.end_time
- A validator will be generated and check ifcalendar.start_date
is before or equal tocalendar.end_date
.
Annotation | Definition |
---|---|
@CachedField |
Enables caching of values for a given field to optimize memory usage. |
@ConditionallyRequired |
A hint that a field or a file is conditionally required and custom validators should validate the requirement. |
@CurrencyAmount |
Specifies a field represents a currency and should be interpreted according to currencyField . |
@DefaultValue |
Specifies a default value for a particular GTFS field. |
@EndRange |
Specifies a field for the end point of a date or time range. |
@FieldType |
Specifies type of a GTFS field, e.g., COLOR or LATITUDE . |
@ForeignKey |
Specifies a reference to a foreign key. |
@Generated |
Marker for all classes generated by annotation processor. |
@GtfsEnumValue |
Specifies a value for a GTFS enum. |
@GtfsEnumValues |
It is necessary for making GtfsEnumValue annotation repeatable. |
@GtfsTable |
Annotates an interface that defines schema for a single GTFS table, such as stops.txt . |
@GtfsValidationNotice |
Identifies, configures, and documents a validation notice. |
@GtfsValidator |
Annotates both custom and automatically generated validators to make them discoverable on the fly. |
@Index |
Asks annotation processor to create an index for quick search on a given field. The field does not need to have unique values. |
@MixedCase |
Specifies a string field should have a mixed-case value. |
@NonNegative |
Generates a validation that an integer or a double (float) field is not negative. |
@NonZero |
Generates a validation that an integer or a double (float) field is not zero. |
@Positive |
Generates a validation that an integer or a double (float) field is positive. |
@PrimaryKey |
Specifies a field is the or part of the primary key in a GTFS table. Adds validation that each entry's set of primary key fields is unique in the table. |
@Recommended |
For a file, specifies the file is recommended. For a field, specifies the column and all non-empty values are recommended. |
@RecommendedColumn |
For a field, specifies the column is recommended, but empty values in that field do not violate the recommendation. |
@Required |
For a file, specifies the file is recommended. For a field, specifies the column and all non-empty values are required. |
@RequiredColumn |
For a field, specifies the column is required, but empty values in that field do not violate the requirement. |