2 changes: 1 addition & 1 deletion README.md
@@ -10,6 +10,6 @@ Before you can use KnowWhereGraph's data, you must first understand part of the

Obtaining data can be a difficult process - if you're using SPARQL (which you will be), it's even more so. Learn a few tips and tricks for getting the data you need from the database [Here](./sparql-download.md).

## Running your own
## Running Your Own

It's possible to run your own instance of KnowWhereGraph! Do you *need* to? Do you want to? Read on [Here](./self-hosted.md) to make that call.
62 changes: 31 additions & 31 deletions data-download-walkthrough.md
@@ -12,41 +12,41 @@ I want to download the following information for all wildfire events in Santa Ba

### Step 1: Your First Query

When navigating the graph for the data that you want, you need your first query to land you *somewhere* close to what you're after. Because we've already browsed through the [ontology](./ontology.md) (you've done this, right?) at this point and confirmed we have this data for wildfires... We know that there's a type called `kwg-ont:Wildfire`, so let's assume the nodes we're after are going to be that type. Let's take a look at a single wildfire and see how the data connects to it.
When navigating the graph for the data that you want, you need your first query to land you *somewhere* close to what you're after. Because we've already browsed through the [ontology](./ontology.md) (you've done this, right?) at this point and confirmed we have this data for wildfires... We know that there's a type called `kwg-ont:Wildfire`, so let's assume the nodes we're after are going to be of that type. Let's take a look at a single wildfire and see how the data connects to it.

In the image below, we did a simple query for a *single* wildfire.

<img src="../images/walkthrough/step1.png" alt="drawing" width="900"/>
<img src="./images/walkthrough/step1.png" alt="drawing" width="900"/>
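
The query behind the screenshot is not reproduced in the text. A minimal sketch of such a first query, assuming the `kwg-ont:` prefix used later in this walkthrough and an illustrative variable name, might look like this:

```SPARQL
PREFIX kwg-ont: <http://stko-kwg.geog.ucsb.edu/lod/ontology/>

# Return one arbitrary node typed as a wildfire, just to get a foothold in the graph
SELECT ?wildfire WHERE {
    ?wildfire a kwg-ont:Wildfire .
}
LIMIT 1
```

The `LIMIT 1` keeps the result to a single node so that it can be opened and explored by hand.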

### Step 2: Exploring the Node

Click on the link for [kwgr:hazard.1180930.5434012](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2Fhazard.1180930.5434012&role=subject) in the query results table to bring up the following page.

<img src="../images/walkthrough/step2.png" alt="drawing" width="900"/>
<img src="./images/walkthrough/step2.png" alt="drawing" width="900"/>

On this page, we see all the predicates that this wildfire is connected to. **This is incredibly powerful when figuring out what is connected to what, and how to write your SPARQL**.

The predicates that should jump out are

1. kwg-ont:hasTemporalScope: This is the relation that connects the fire to temporal information
2. kwg-ont:sfWithin: This is the relation that connects the fire to spatial information
3. sosa:isFeatureofInterestOf: This is the relation that connects to fire to numeric and categorical data
1. `kwg-ont:hasTemporalScope`: This is the relation that connects the fire to temporal information.
2. `kwg-ont:sfWithin`: This is the relation that connects the fire to spatial information.
3. `sosa:isFeatureOfInterestOf`: This is the relation that connects the fire to numeric and categorical data.

These predicates show up time and time again in KnowWhereGraph and are important to recognize.
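
The same list of predicates can also be pulled with a query instead of the resource page. This is only a sketch: the `kwgr:` prefix is taken from the resource URI in the link above, and the variable names are illustrative.

```SPARQL
PREFIX kwgr: <http://stko-kwg.geog.ucsb.edu/lod/resource/>

# List every outgoing predicate and its object for the single wildfire node
SELECT ?predicate ?object WHERE {
    kwgr:hazard.1180930.5434012 ?predicate ?object .
}
```

Swapping in a different node URI gives the same overview for any other node you land on.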

### Step 3: Getting Numeric & Categorical Data

When thinking about data values, the SOSA ontology should be at the forefront of your mind. If you haven't already, check out the [ontology](./ontology.md) page for the gist on this ontology.
When thinking about data values, the SOSA ontology should be at the forefront of your mind. If you haven't already, check out the [ontology](./ontology.md) page for the gist of this ontology.

From our main Wildfire node view, navigate to the node that's in the range of sosa:isFeatureOfInterestOf ([kwgr:impact.1180930.5434012](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2Fimpact.1180930.5434012&role=subject)). On this page (shown below), we see that familiar structure of an observation collection.

<img src="../images/walkthrough/step3.png" alt="drawing" width="900"/>
<img src="./images/walkthrough/step3.png" alt="drawing" width="900"/>

Looking at the different types of observations in this collection, the one named `kwgr:deathDirectObs.1180930.5434012` should stick out. Sort of sounds like this node might be related to deaths from the wildfire, right?

Let's click on that observation to see what data lies inside.

<img src="../images/walkthrough/step4.png" alt="drawing" width="900"/>
<img src="./images/walkthrough/step4.png" alt="drawing" width="900"/>

Bingo. We see that there's a numeric value that represents the number of people that have died from this wildfire, completing our path from a node of type `kwg-ont:Wildfire` to the actual data value.

@@ -68,7 +68,7 @@ SELECT ?fire ?direct_deaths WHERE {

### Step 4: Handling Spatial Data

In this step, we have to find the node that represents Santa Barbara County. Using SPARQL. From the [ontology](./ontology.md), we know that counties are going to be kwg-ont:AdministrativeRegion_ .
In this step, we have to find the node that represents Santa Barbara County using SPARQL. From the [ontology](./ontology.md), we know that counties are going to be `kwg-ont:AdministrativeRegion_3`.

Building our query, we can start with

@@ -78,7 +78,7 @@ SELECT * WHERE {
}
```

We also know that the words "Santa Barbara" should be in the `rdfs:label` - so let's add some REGEX to our query.
We also know that the words "Santa Barbara" should be in the `rdfs:label` &mdash; so let's add some REGEX to our query.

```SPARQL
SELECT * WHERE {
@@ -88,7 +88,7 @@ SELECT * WHERE {
}
```

<img src="../images/walkthrough/step5.png" alt="drawing" width="900"/>
<img src="./images/walkthrough/step5.png" alt="drawing" width="900"/>

From the results, we see that there are *several* counties whose names include "Santa Barbara". By process of elimination, we can make a good assumption that the node we're after is the first one, [Earth.North_America.United_States.USA.5.42_1](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2FEarth.North_America.United_States.USA.5.42_1&role=subject).
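
A label filter of this kind could be sketched as follows. The variable names `?county` and `?label` are illustrative, and returning the label next to each node is an assumption made here to ease the comparison between candidates.

```SPARQL
PREFIX kwg-ont: <http://stko-kwg.geog.ucsb.edu/lod/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# List county-level regions whose label contains "Santa Barbara"
SELECT ?county ?label WHERE {
    ?county a kwg-ont:AdministrativeRegion_3 .
    ?county rdfs:label ?label .
    # STR() drops any language tag or datatype before matching
    FILTER regex(STR(?label), "Santa Barbara")
}
```

Seeing the full labels side by side makes it easier to rule out the candidates that are not the county itself.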

@@ -107,31 +107,31 @@ SELECT * WHERE {

Running this yields the following results

<img src="../images/walkthrough/step6.png" alt="drawing" width="900"/>
<img src="./images/walkthrough/step6.png" alt="drawing" width="900"/>

### Step 5: Getting Temporal Data

Looking back at our initial [Hazard node](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2Fhazard.1180930.5434012&role=subject) from Step 2, we can see that there's temporal information attached to it. We can come to that conclusion by realizing that there's a relation (kwg-ont:hasTemporalScope) that has words related to time in it. Now this isn't a technical approach to navigating the graph - but it's practical. If you have the ontology memorized (which honestly no one does), then you'd know that `kwg-ont:hasTemporalScope` links nodes to temporal data.
Looking back at our initial [Hazard node](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2Fhazard.1180930.5434012&role=subject) from Step 2, we can see that there's temporal information attached to it. We can come to that conclusion by realizing that there's a relation (`kwg-ont:hasTemporalScope`) that has words related to time in it. Now this isn't a technical approach to navigating the graph &mdash; but it's practical. If you had the ontology memorized (which honestly no one does), then you'd know that `kwg-ont:hasTemporalScope` links nodes to temporal data.

Based on the name `interval` - we can probably guess that this node is going to have some sort of start and end date. Let's take a look.
Based on the name `interval` &mdash; we can probably guess that this node is going to have some sort of start and end date. Let's take a look.

Clicking on the [kwgr:interval.200504221530_200504221930EST](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2Finterval.200504221530_200504221930EST&role=subject) object - we're brought to the node that holds the temporal relations for this Wildfire.
Clicking on the [kwgr:interval.200504221530_200504221930EST](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2Finterval.200504221530_200504221930EST&role=subject) object &mdash; we're brought to the node that holds the temporal relations for this Wildfire.

<img src="../images/walkthrough/step9.png" alt="drawing" width="900"/>
<img src="./images/walkthrough/step9.png" alt="drawing" width="900"/>

Again, just by looking at the relations we can mostly tell what we're looking at. We see two relations of interest - one that represents the beginning of the wildfire, and one that represents the end. Let's click on the node that represents the beginning.
Again, just by looking at the relations we can mostly tell what we're looking at. We see two relations of interest &mdash; one that represents the beginning of the wildfire, and one that represents the end. Let's click on the node that represents the beginning.

<img src="../images/walkthrough/step8.png" alt="drawing" width="900"/>
<img src="./images/walkthrough/step8.png" alt="drawing" width="900"/>

Recall that we want the year that each wildfire happened. Without referencing the ontology, we should be able to tell that `time:inXSDgYear` is the predicate that we want.
Recall that we want the year that each wildfire began. Without referencing the ontology, we should be able to tell that `time:inXSDgYear` is the predicate that we want.
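
The hop from the interval node to its beginning can also be checked with a query. This is a sketch using the single interval node from above; the `kwgr:` prefix comes from the resource URI in the link, and the variable names are illustrative.

```SPARQL
PREFIX kwgr: <http://stko-kwg.geog.ucsb.edu/lod/resource/>
PREFIX time: <http://www.w3.org/2006/time#>

# Follow the interval to its beginning, then list everything attached to that node
SELECT ?start ?predicate ?object WHERE {
    kwgr:interval.200504221530_200504221930EST time:hasBeginning ?start .
    ?start ?predicate ?object .
}
```

If `time:inXSDgYear` shows up among the returned predicates, the path to the year is confirmed.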

Using the path that we just followed, we can write a small proof of concept query that gets the years of wildfires.
Using the path that we just followed, we can write a small proof-of-concept query that gets the years of wildfires.

```
```SPARQL
PREFIX kwg-ont: <http://stko-kwg.geog.ucsb.edu/lod/ontology/>
PREFIX time: <http://www.w3.org/2006/time#>
select ?year where {
?wildfire a kwg-ont:Wildfire .
select ?year where {
?wildfire a kwg-ont:Wildfire .
?wildfire kwg-ont:hasTemporalScope ?temporal_scope .
?temporal_scope time:hasBeginning ?wildfire_start .
?wildfire_start time:inXSDgYear ?year .
@@ -145,7 +145,7 @@ Now that we have separate queries to get
1. Wildfires
2. Wildfires within Santa Barbara County
3. Number of deaths per wildfire
4. They year of the wildfire
4. Year each wildfire began

We'll combine them together to form our final query to get the data we want.

@@ -160,7 +160,7 @@ SELECT ?fire_name ?direct_deaths ?year WHERE {
?fire a kwg-ont:Wildfire .
?fire rdfs:label ?fire_name .
?fire kwg-ont:sfWithin kwgr:Earth.North_America.United_States.USA.5.42_1 .
?wildfire kwg-ont:hasTemporalScope ?temporal_scope .
?fire kwg-ont:hasTemporalScope ?temporal_scope .
?temporal_scope time:hasBeginning ?wildfire_start .
?wildfire_start time:inXSDgYear ?year .
?fire sosa:isFeatureOfInterestOf ?observation_collection .
@@ -174,8 +174,8 @@ SELECT ?fire_name ?direct_deaths ?year WHERE {

To summarize the steps and key points above...

1. Start small, build big
1. Build paths to each thing that you want
2. Combine the smaller queries and logic together to form a larger query
2. Use the SPARQL editor to find relevant nodes and explore them in GraphDB's interface
3. The temporal, spatial, and data representations are similar throughout the database. Learn each and be able use the same pattern everywhere
1. Start small, build big.
    1. Build paths to each thing that you want.
    2. Combine the smaller queries and logic together to form a larger query.
2. Use the SPARQL editor to find relevant nodes and explore them in GraphDB's interface.
3. The temporal, spatial, and data representations are similar throughout the database. Learn each, and you'll be able to use the same pattern everywhere.