csv geo au
csv-geo-au is a specification for publishing point or region-mapped Australian geospatial data in CSV format to data.gov.au and other open data portals. Datasets in this format are supported by TerriaJS (and hence the National Map) and are intended to be as reusable as possible. For example, a `State` column in a CSV file with a resource format of `csv-geo-au` can unambiguously be understood to refer to an Australian state.
Datasets with line features or explicit polygons (instead of references to standard polygon boundaries) are not covered by this standard, and should be provided as GeoJSON.
Document Status: initial use. This document will evolve, but it is unlikely that field names currently recommended will become deprecated.
- Recommended: The best field name for maximum reusability. High priority for support in TerriaJS (the software that runs the National Map). Sometimes several options are recommended, depending on your need for precision.
- Accepted: A field name which is reasonably reusable. Generally supported by TerriaJS.
- Discouraged: A field name which is ambiguous or not intuitive to a wide audience, but may be commonly used due to existing software. Possibly supported by TerriaJS but may be discontinued.
It is generally acceptable to include "discouraged" fields if recommended or accepted fields are also present, as the recommended or accepted fields will be used.
In designing this specification, we have tried to balance these goals:
- Maximising the chance that existing CSV files may accidentally conform, correctly.
- Allowing motivated dataset publishers to be very precise about the exact boundaries their data relates to.
- Making column names guessable without consulting the specification.
- Encouraging the production of datasets which are easy to use by consumers who are unaware of this specification.
- Aligning with attribute names already used by authorities such as the ASGS.
The CSV format MUST:
- Consist only of one header row followed by data rows (no other metadata within the file)
- Use `,` as field delimiter
- Use `\r\n` (Windows) or `\n` (Linux, OSX) as end of line character
- Use double quotes around any value containing a comma, and double-double quotes to represent double quotes: `"like ""this"""`

It MUST NOT contain a column with header name `id`. Note: an uppercase `ID` column is allowed.
It SHOULD be encoded in UTF-8. Headers are not considered to be case-sensitive.
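
The formatting rules above map closely onto Python's standard `csv` module. The following is a minimal sketch, not part of the specification: the function name `write_csv_geo_au` and the sample row are hypothetical and only illustrate the delimiter, line-ending, quoting and `id` rules.

```python
import csv
import io


def write_csv_geo_au(header, rows):
    """Serialise a header and data rows as CSV text following the rules above.

    Hypothetical helper for illustration; it checks only the rules stated
    in this section and nothing about the field names themselves.
    """
    # Rule: no column may be named lowercase "id" (uppercase "ID" is allowed).
    if "id" in header:
        raise ValueError('header must not contain a column named "id"')

    buffer = io.StringIO(newline="")
    writer = csv.writer(
        buffer,
        delimiter=",",               # comma as field delimiter
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,   # quote only values that need it
        doublequote=True,            # embedded quotes become ""
        lineterminator="\r\n",       # "\n" would also be acceptable
    )
    writer.writerow(header)          # exactly one header row
    writer.writerows(rows)           # followed by data rows only
    return buffer.getvalue()


text = write_csv_geo_au(
    ["ID", "Name", "Lat", "Lon"],
    [[1, 'Airport, "Bacchus Marsh"', -37.7313, 144.4212]],
)
print(text)
```

To write the result to disk in UTF-8, one option is `open(path, "w", encoding="utf-8", newline="")` so that Python does not translate the line endings a second time.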
In data.gov.au and other CKAN-based portals, resources (individual files) that conform to this standard SHOULD be given a resource type of `csv-geo-au`. This is required for National Map to locate and display them. Resources with format set to `csv-geo-au` can also be previewed on data.gov.au like other CSV files.
Tables should look like one of these:
ID,Population,LGA_code_2015,State
1,100600,24600,VIC
or
ID,Population,Postcode,State
1,28000,3000,VIC
or
ID,Name,Lat,Lon
1,Bacchus Marsh Airport,-37.7313,144.4212
EITHER a latitude/longitude pair, OR one or more region fields should be provided.
To encode individual points with a latitude and longitude, two fields are required. Each MUST be a number in decimal degrees. Numbers SHOULD NOT be enclosed in double quotes.
- Recommended: `Lat`, `Lon`
- Accepted: `Latitude`, `Longitude`; `Lat`, `Lng`, `Long`
- Discouraged: `x`, `y`; `WKT` (single column with data in `POINT(-37.8 144.9)` format); `easting`, `northing`; combined format: `(-37.8, 144.9)`; GeoJSON
Locations SHOULD be given in the GDA94 datum (EPSG:4283), but WGS84 is acceptable (EPSG:4326). The difference is generally less than one metre. The datum chosen SHOULD be indicated in the metadata for the dataset. (There is currently no standard for this.)
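
As a rough illustration of how a consumer might locate the point columns, the sketch below looks up the `Lat`/`Lon`, `Latitude`/`Longitude` and `Lng`/`Long` names case-insensitively (headers are not case-sensitive). The function names are hypothetical, the range check is an added assumption rather than a rule of this specification, and the sketch accepts any latitude name with any longitude name rather than only the listed pairings.

```python
# Header names for point data, lower-cased, in order of preference.
LAT_NAMES = ("lat", "latitude")
LON_NAMES = ("lon", "longitude", "lng", "long")


def find_point_columns(header):
    """Return (lat_index, lon_index), or None if no lat/lon pair is present."""
    lowered = [name.strip().lower() for name in header]
    lat = next((lowered.index(n) for n in LAT_NAMES if n in lowered), None)
    lon = next((lowered.index(n) for n in LON_NAMES if n in lowered), None)
    if lat is None or lon is None:
        return None
    return lat, lon


def parse_point(row, columns):
    """Read one row's latitude and longitude as decimal degrees."""
    lat = float(row[columns[0]])
    lon = float(row[columns[1]])
    # Assumption: reject values outside the valid range for decimal degrees.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"coordinates out of range: {lat}, {lon}")
    return lat, lon


header = ["ID", "Name", "Lat", "Lon"]
row = ["1", "Bacchus Marsh Airport", "-37.7313", "144.4212"]
columns = find_point_columns(header)
print(parse_point(row, columns))
```

A file whose header yields `None` here would be expected to carry one or more region fields instead.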
For each boundary type, there are usually three field names that can be used for matching on codes:

- "Field with year" (eg `sa4_code_2011`). This is the most precise, and recommended, particularly for boundaries which change frequently. Certain boundaries move significantly every year (eg LGA), and some are completely renumbered in each reissue (eg, Tourism Regions). (TerriaJS does not currently support different versions.)
- "Field without year" (eg `sa4_code`). This is acceptable when the year is not known.

These field names generally match those used by the ABS. In addition, we define:

- "Synonym" (eg `sa4`). This unofficial shorthand is useful for matching spreadsheets in this form, but it is not recommended due to ambiguity: does the field contain codes or names? (TerriaJS always assumes codes.)

In addition, we define field names for matching on names (eg `sa4_name_2011`).
Please note that there is currently no support for matching any ASGS regions other than LGA by name.
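
The naming scheme above is regular enough to be decomposed mechanically. The sketch below is one possible way to do so, under the assumption that fields follow the `<region>_<code|name>[_<year>]` pattern (including variants such as `sa1_maincode_2011`). The function name and the returned dictionary keys are hypothetical, and names that do not fit the pattern (bare synonyms like `sa4`, but also fields like `postcode_2015`) all fall into the same "synonym" branch.

```python
import re

# <region>_<kind>[_<year>], eg "sa4_code_2011" or "sa2_5digitcode".
FIELD_PATTERN = re.compile(
    r"^(?P<region>[a-z0-9_]+?)"
    r"_(?P<kind>(?:main|\d+digit)?code|name)"
    r"(?:_(?P<year>\d{4}))?$"
)


def classify_region_field(field_name):
    """Split a region field name into region type, kind and optional year."""
    name = field_name.strip().lower()  # headers are not case-sensitive
    match = FIELD_PATTERN.match(name)
    if match is None:
        # A bare synonym such as "sa4", or a name this sketch cannot split.
        return {"region": name, "kind": None, "year": None, "form": "synonym"}
    year = match.group("year")
    return {
        "region": match.group("region"),
        "kind": match.group("kind"),
        "year": int(year) if year else None,
        "form": "with year" if year else "without year",
    }


for field in ("sa4_code_2011", "SA4_Code", "sa4", "com_elb_name_2016"):
    print(field, classify_region_field(field))
```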
State/Territory (STE)
Name or code | Field with year | Field without year | Synonyms |
---|---|---|---|
Full name (New South Wales) | `ste_name_2011` | `ste_name` | `state` |
1 digit code (3=Queensland) | `ste_code_2011` | `ste_code` | `ste` |
Note: TerriaJS may in the future support state abbreviations (TAS etc)
Statistical area 1 (SA1)
Note: The use of "maincode" here follows the ABS' convention.
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
11-digit code | `sa1_maincode_2011` | `sa1_maincode` | `sa1`, `sa1_code` |
7-digit code | `sa1_7digitcode_2011` | `sa1_7digitcode` | |
Statistical area 2 (SA2)
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
9-digit code (1 digit state + 2 digit SA4 + 2 digit SA3 + 4) | `sa2_code_2011` | `sa2_code` | `sa2` |
5-digit code (1 digit state + 4) | `sa2_5digitcode_2011` | `sa2_5digitcode` | |
Name (eg "O'Connor (WA)") | `sa2_name_2011` | `sa2_name` | |
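
The digit layouts given for the SA2, SA3 and SA4 codes imply that the shorter codes can be derived from a 9-digit SA2 code by slicing. The sketch below illustrates this; the function name is hypothetical, and the code `206041122` is an illustrative value that has not been checked against the ABS lists.

```python
def split_sa2_code(code):
    """Derive the related hierarchical codes from a 9-digit SA2 code.

    Layout assumed from the table above:
    1 digit state + 2 digit SA4 + 2 digit SA3 + 4 digit SA2.
    """
    code = str(code).strip()
    if len(code) != 9 or not code.isdigit():
        raise ValueError("expected a 9-digit SA2 code")

    state, sa4, sa3, sa2 = code[0], code[1:3], code[3:5], code[5:9]
    return {
        "ste_code": state,               # 1 digit state code
        "sa4_code": state + sa4,         # 3 digits: state + 2
        "sa3_code": state + sa4 + sa3,   # 5 digits: state + SA4 + 2
        "sa2_code": code,                # full 9-digit code
        "sa2_5digitcode": state + sa2,   # 5 digits: state + 4
    }


# Illustrative value only.
print(split_sa2_code("206041122"))
```

Codes are kept as strings throughout so that they can be sliced and compared without numeric formatting getting in the way.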
Statistical area 3 (SA3)
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
5-digit code (1 digit state + 2 digit SA4 + 2 digits) | `sa3_code_2011` | `sa3_code` | `sa3` |
Name (eg "North Sydney - Mosman") | `sa3_name_2011` | `sa3_name` | |
Statistical area 4 (SA4)
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
3-digit code (1-digit state code + 2) | `sa4_code_2011` | `sa4_code` | `sa4` |
Name (eg "Melbourne - Inner South") | `sa4_name_2011` | `sa4_name` | |
Greater Capital City Statistical Areas (GCCSA)
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
5-character alphanumeric code (1-digit state code + 4, eg 1GSYD) | `gccsa_code_2011` | `gccsa_code` | `gccsa` |
Name (eg "Greater Sydney") | `gccsa_name_2011` | `gccsa_name` | |
Significant Urban Areas (SUA)
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
4-digit code (non-hierarchical, eg 5009) | `sua_code_2011` | `sua_code` | `sua` |
Name (eg "Warragul - Drouin") | `sua_name_2011` | `sua_name` | |
Australia (AUS)
Name or code | Field without year | Synonyms |
---|---|---|
1 digit code (0=Australia) | `aus_code` | `aus` |
Note: As of June 2015, no decisions have been made about future support of these structures.
Structure | Name or code | With year | Without year | Synonym |
---|---|---|---|---|
Mesh block | 11-digit code | `mb_code_2011` | `mb_code` | `mb` |
Section of state | 2-digit code | `sos_code_2011` | `sos_code` | `sos` |
Section of state range | 3-digit code | `sosr_code_2011` | `sosr_code` | `sosr` |
Urban Centres and Localities | 6-digit code | `ucl_code_2011` | `ucl_code` | `ucl` |
Indigenous Regions | 3-digit code | `ireg_code_2011` | `ireg_code` | `ireg` |
Indigenous Locations | 8-digit code | `iloc_code_2011` | `iloc_code` | `iloc` |
Indigenous Areas | 6-digit code | `iare_code_2011` | `iare_code` | `iare` |
Remoteness Areas | 2-digit code | `ra_code_2011` | `ra_code` | `ra` |
Postcode / postal area
A four digit Australian postcode.
Authority | Region name | Name or code | Field with year | Field without year |
---|---|---|---|---|
PSMA | Postcode | 4-digit code | `postcode_2015` | `postcode` |
ABS | Postal area (ABS approximation) | 4-digit code | `poa_2011`, `poa_code_2011` | `poa`, `poa_code` |
Note: PSMA's boundaries are not open data, and TerriaJS hence uses the ABS Postal areas to display postcodes. They are not quite the same.
For greater precision, additional fields `Suburb` and `State` MAY be provided. For example: `Postcode` 3068, `Suburb` Clifton Hill, `State` VIC.
Local Government Area (LGA)
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
5-digit code (eg "31000") | `lga_code_2014` | `lga_code` | `lga` |
Name (eg "Brisbane") | `lga_name_2014` | `lga_name` | `adm2` (see note) |
Complete lists of 5-digit codes are available here.
The `lga_name` field SHOULD be used only as a human-readable addition to `lga_code`. It is NOT recommended as the primary lookup (and is not currently supported as such by TerriaJS).
The `adm2` field (not currently supported by TerriaJS) must contain the short form of the LGA name, with no "City of", "Council" etc. For example: "Melbourne", "Greater Geelong". It SHOULD be capitalised like this.
A separate `State` column (and/or `lga_code` column) MUST be provided, as LGA names are not unique across states.
Commonwealth Electoral Divisions (CED)
ABS approximations of Commonwealth electoral divisions.
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
3-digit code (1-digit state code + 2, e.g. "402") | `ced_code_2011` | `ced_code` | `ced` |
Name (eg "Barker") | `ced_name_2011` | `ced_name` | |
To explicitly map against actual AEC boundaries instead of ABS approximations:
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
3-digit "division ID" (eg "180") | `com_elb_id_2016` | `com_elb_id` | `com_elb` |
Name (eg "Barker") | `com_elb_name_2016` | `com_elb_name` | |
Note: the year here is the year of the AEC release, not ASGS. AEC division IDs are not the same as ABS codes, although they look similar. (ABS codes beginning 1xx are all in NSW, AEC ones can be in any state.)
State Electoral Divisions (SED)
ABS approximations of state electoral districts.
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
5-digit code (1-digit state code + 4, e.g. "20106") | `sed_code_2011` | `sed_code` | `sed` |
Name (eg "Albert Park (Southern Metropolitan)") | `sed_name_2011` | `sed_name` | |
To explicitly map against actual AEC boundaries instead of ABS approximations:
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
5-digit code (1-digit state code + 4, e.g. "20106") | `sed_aec_code_2011` | `sed_aec_code` | `sed_aec` |
Name (eg "Albert Park (Southern Metropolitan)") | `sed_aec_name_2011` | `sed_aec_name` | |
Note: the year here is the year of the AEC release, not ASGS.
State Suburbs (SSC)
ABS approximations of suburbs.
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
5-digit code (1-digit state code + 4, e.g. "10002") | `ssc_code_2011` | `ssc_code` | `ssc` |
Name (eg "Abbotsford (NSW)") | `ssc_name_2011` | `ssc_name` | |
The field name `suburb` is currently treated as a synonym for `ssc` but may change.
Structure | Name or code | With year | Without year | Without year (Synonym) |
---|---|---|---|---|
Australian Drainage Divisions | 3-character code: `D__` | `add_code_2011` | `add_code` | |
Natural Resource Management Regions | 3-digit code | `nrmr_code_2011` | `nrmr_code` | `nrmr` |
Tourism Regions | 5-character code: `_R___` | `tr_code_2011` | `tr_code` | `tr` |
Name or code | Field | Synonyms |
---|---|---|
Two letter country code (ISO 3166-1 Alpha 2) (eg AU) | `cnt2` | `iso2` |
Three letter country code (ISO 3166-1 Alpha 3) (eg AUS) | `cnt3` | `iso3` |
Primary Health Network (Department of Health)
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
6-character `PHN___` code (eg PHN101) | `phn_code_2015` | `phn_code` | `phn` |
Name (eg "Central and Eastern Sydney") | `phn_name_2015` | `phn_name` | |
Statistical Local Area (SLA)
An obsolete ABS structure roughly equivalent to an LGA.
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
9-digit code | `sla_code_2006` | `sla_code` | `sla`, `sla_9digitcode_2006` |
5-digit code | `sla_5digitcode_2006` | `sla_5digitcode` | |
Name | `sla_name_2006` | `sla_name` | |
Note: As this structure is obsolete, we recommend using the full field name `sla_code_2006` in case the short form `sla` clashes with something else in the future.
Census Collection District (CD)
An obsolete ABS structure similar in size to an SA1.
Name or code | Field with year | Field without year | Synonym |
---|---|---|---|
Code | `cd_code_2006` | `cd_code` | `cd`, `cd_7digitcode_2006` |
Note: As this structure is obsolete, we recommend using the full field name `cd_code_2006` in case the short form `cd` clashes with something else in the future.
These are included here to support standardisation and future support by TerriaJS. As of June 2015, no decisions have been made about future support.
An optional `date` field MAY be used to indicate a date (and, optionally, time) associated with a row. Similarly, `end_date`, `end date`, `end_time` and `end time` store when an event associated with a row ended; alternatively, a specific field may be selected by using `isEndDate` in `tableStyle.columns`.
These formats of ISO8601 are acceptable:
Format | Example | Description |
---|---|---|
`yyyy` | 2004 | |
`yyyy-mm` | 2004-05 | |
`yyyy-mm-dd` | 2004-05-01 | |
`yyyy-mm-ddThh:mm:ss` | 2004-05-01T19:43:16 | recommended format without timezone (literal "T") |
`yyyy-mm-dd hh:mm:ss` | 2004-05-01 19:43:16 | alternative format without timezone |
`yyyy-mm-ddThh:mm:ssZ` | 2004-05-01T19:43:16Z | with UTC timezone (literal "Z") |
`yyyy-mm-ddThh:mm:ss+zz` | 2004-05-01T19:43:16+11 | with timezone specified as + or - |
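
A consumer could try each of the formats above in turn. The following is a minimal sketch using Python's standard `datetime` module, assuming Python 3.7 or later (where `%z` accepts a literal `Z`). The function name `parse_date` is hypothetical, and the sketch is slightly more lenient than the table, for example in accepting single-digit months.

```python
import re
from datetime import datetime

# strptime patterns for the formats in the table, most specific last.
DATE_FORMATS = (
    "%Y",
    "%Y-%m",
    "%Y-%m-%d",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%dT%H:%M:%S%z",  # covers "Z" and numeric offsets
)


def parse_date(value):
    """Parse a date or date-time string in one of the accepted formats."""
    text = value.strip()
    # %z needs hours and minutes, so expand an hour-only offset
    # such as "+11" to "+1100" before parsing.
    text = re.sub(r"(T\d{2}:\d{2}:\d{2}[+-]\d{2})$", r"\g<1>00", text)
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {value!r}")


for example in ("2004", "2004-05-01", "2004-05-01T19:43:16+11"):
    print(example, "->", parse_date(example).isoformat())
```

Values without a timezone come back as naive `datetime` objects, so how they are interpreted is left to the caller.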