Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding the fit_model function #277

Draft
wants to merge 41 commits into
base: devel
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
41 commits
Select commit Hold shift + click to select a range
13df5ae
WIP: Adding a basic outline of the fit_model function
gareth-j Nov 20, 2023
f29068a
Adding fit function files
gareth-j Nov 20, 2023
0fe8695
Basic templates for the functions
gareth-j Nov 20, 2023
9274e1f
renamed fit_model
mnky9800n Nov 21, 2023
dc0bd9a
added TODOs
mnky9800n Nov 21, 2023
65a3bd6
added fixed effects string generator
mnky9800n Nov 21, 2023
73c4c0c
added full formula creator
mnky9800n Nov 21, 2023
afa57cc
Merge pull request #280 from 4DModeller/Iss273/full_formula_creator
gareth-j Nov 21, 2023
7c9cbdb
WIP: Adding basics of spde
gareth-j Nov 21, 2023
b0789d3
WIP: Adding the loop to add processes
gareth-j Nov 21, 2023
27e1383
Merge pull request #279 from 4DModeller/Iss273/fixed_effects
gareth-j Nov 21, 2023
1d253c4
WIP: Found reason inla delay
gareth-j Nov 21, 2023
0bd600c
More customisation enabled
gareth-j Nov 21, 2023
db70688
Tidy up formula
gareth-j Nov 21, 2023
87f6359
Working without fe_formula
gareth-j Nov 21, 2023
d84cc56
Adding back in fe_formula
gareth-j Nov 22, 2023
0580efd
Working on new model type
gareth-j Nov 22, 2023
02b785b
Add testing file
gareth-j Nov 22, 2023
efc4023
fixed gareth induced bugs
mnky9800n Nov 23, 2023
8068495
what is qexp?
mnky9800n Nov 23, 2023
7f9edec
fix model run errors
XueqingYin Nov 23, 2023
762f89e
Add xueqing code
gareth-j Nov 23, 2023
3204f99
Merge pull request #282 from 4DModeller/Fix_constrainedModel
gareth-j Nov 23, 2023
04f7ebd
Remove my addition
gareth-j Nov 23, 2023
ffe4527
Remove uneeded code
gareth-j Nov 23, 2023
b17c9af
Updating fit_model
gareth-j Nov 23, 2023
5fed129
Update fit_model to return mesh as well
gareth-j Nov 23, 2023
1b5589b
added NCAR rain data
mnky9800n Nov 23, 2023
c6def3b
added readme for link for data source
mnky9800n Nov 23, 2023
d401bf3
delete big stuff, make INLA stuff
mnky9800n Nov 23, 2023
737268b
Merge pull request #284 from 4DModeller/Iss273/newDataSets
mnky9800n Nov 23, 2023
7b6f160
added stuff to fm
mnky9800n Nov 23, 2023
97c64a6
Don't read initial_range from data for now
gareth-j Nov 23, 2023
9d5eb71
barbara streisand
mnky9800n Nov 23, 2023
8d8d9b2
Add extra params to control model run
gareth-j Nov 23, 2023
2d39652
Fix by removing bad mesh params
gareth-j Nov 23, 2023
ee38716
Adding ability to select specific columns in the data for space and s…
gareth-j Nov 23, 2023
afaf70f
TEST TEST TEST
mnky9800n Nov 23, 2023
245a532
MORE TESTS
mnky9800n Nov 23, 2023
327b98f
Ensure y is in the data.frame
gareth-j Nov 23, 2023
cd109d7
the method doesnt work on any dataset
mnky9800n Nov 23, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

export(clear_caches)
export(create_prediction_field)
export(fit_model)
export(get_tutorial_datapath)
export(latlong_to_utm)
export(load_tutorial_data)
Expand Down
72 changes: 72 additions & 0 deletions R/fit_model.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
#' Fit a model using inlabru::bru
#'
#' @param y outcome variable name
#' @param data Spatial data
#' @param locations list of coordinates associated with process
#' @param space_variables list of spatial variables
#' @param spacetime_variables list of spacetime variables
#' @param family gaussian or poisson
#' @param latitude_col name of latitude column
#' @param longitude_col name of longitude column
#' @param time_variable name of time variable in data
#' @param initial_range_mesh Initial range in mesh creation
#' @param max_edge_mesh Max edge value used in mesh creation
#'
#' @return list of model output and mesh
#' @export
fit_model <- function(
y,
data,
locations,
space_variables,
spacetime_variables,
time_variable = "time",
family = "gaussian",
latitude_col = "LAT",
longitude_col = "LONG",
initial_range_mesh = 0.1,
max_edge_mesh = 0.05,
verbose = FALSE) {
library("inlabru")
library("INLA")

print("Creating mesh...")
mesh <- fmesher::fm_mesh_2d_inla(
loc = locations
)

print("Creating SPDE...")
spde <- INLA::inla.spde2.pcmatern(
mesh = mesh,
prior.range = c(initial_range_mesh, 0.5),
prior.sigma = c(1, 0.01)
)

space_variables <- append(space_variables, y)
spacetime_variables <- append(spacetime_variables, y)

rhoprior <- base::list(theta = list(prior = "pccor1", param = c(0, 0.9)))

print("Running model...")
model_output <- inlabru::bru(
components = ~ space(coordinates, model = spde) +
spacetime(
coordinates,
model = spde,
group = time,
control.group = list(
model = "ar1",
hyper = rhoprior
)
) +
beta_u(1, prec.linear = 1),
inlabru::like(formula = data[[y]] ~ space, family = "gaussian", data = data[space_variables]),
inlabru::like(formula = data[[y]] ~ beta_u * space + spacetime, family = "gaussian", data = data[spacetime_variables]),
options = list(
verbose = verbose
)
)

print("Model run complete.")
list(model_output = model_output, mesh = mesh)
}
5 changes: 4 additions & 1 deletion man/plot_map.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 5 additions & 1 deletion man/plot_mesh.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

74 changes: 74 additions & 0 deletions vignettes/TEST_fitmodel.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
---
title: "TESTING FITMODEL"
output:
bookdown::html_document2:
base_format: rmarkdown::html_vignette
fig_caption: yes
bibliography: "covid.bib"
link-citations: yes
vignette: >
%\VignetteIndexEntry{Fitting a model}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

this file tests all of the different datasets using `fit_model` function


```{r}
devtools::load_all()
```

# COVID-19
```{r}
# dont run because it takes forever
# fdmr::retrieve_tutorial_data(dataset = "covid")
# covid19_data <- fdmr::load_tutorial_data(dataset = "covid", filename = "covid19_data.rds")
# covid19_data = covid19_data %>% dplyr::filter(week < 5)
# sp::coordinates(covid19_data) <- c("LONG", "LAT")
# fdmr::fit_model(y='cases'
# , locations = covid19_data
# ,time_variable = 'week'
# ,latitude_col = 'LAT'
# ,longitude_col = 'LONG'
# ,data = covid19_data
# ,space_variables = c('Population','carebeds.ratio')
# ,spacetime_variables = c('pm25', 'no2'))
```

# COVID19 Bristol
```{r}
fdmr::retrieve_tutorial_data(dataset = "priors")
covid19_data <- fdmr::load_tutorial_data(dataset = "priors", filename = "covid19_dataBris.rds")
sp::coordinates(covid19_data) <- c("LONG", "LAT")
run_output <- fdmr::fit_model(y = 'cases'
,locations=covid19_data
,time_variable = 'week'
,latitude_col = 'LAT'
,longitude_col = 'LONG'
,data = covid19_data
,space_variables = c('Population')
,spacetime_variables = c('prevalence'))
summary(run_output$model_output)
```


# hydro
```{r}
fdmr::retrieve_tutorial_data(dataset = "tibetan_lakes")
lakes_data_filepath <- fdmr::get_tutorial_datapath(dataset = "tibetan_lakes", filename = "example_lake_data.csv")
lakes_data <- read.csv(lakes_data_filepath)
head(lakes_data)
sp::coordinates(lakes_data) <- c("centroid_lon", "centroid_lat")
run_output <- fdmr::fit_model(y = 'water_balance_m3'
,locations=lakes_data
,time_variable = 'year'
,latitude_col = 'centroid_lat'
,longitude_col = 'centroid_lon'
,data = lakes_data
,space_variables = c('SUB_AREA')
,spacetime_variables = c('precip'))
summary(run_output$model_output)
```


55 changes: 55 additions & 0 deletions vignettes/fit_model.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
---
title: "Fitting a model"
output:
bookdown::html_document2:
base_format: rmarkdown::html_vignette
fig_caption: yes
bibliography: "covid.bib"
link-citations: yes
vignette: >
%\VignetteIndexEntry{Fitting a model}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r}
library(magrittr)
```

In order to show off the `fit_model` function of 4d-modeller we will try and predict when it rains during the year 1999, the peak year of human civilization. First we will import the data.

```{r error=true}
raindata <- read.csv('fitmodel_data/rain_data.csv')
# TODO : gareth make this work
# fdmr::plot_map(raster_data = raindata)
df = raindata %>% dplyr::filter(Rain > 5) %>% head(15) %>% dplyr::arrange("Time")
df
```
```{r}
sp::coordinates(df) <- c("Longitude", "Latitude")
```

## Fit the model

Now we're ready to fit the model


# ```{r}
# fdmr::fit_model
# ```

```{r}
run_output <- fdmr::fit_model(y="Rain", data=df, process_coords = df, time_variable="Time_Index", latitude_col = "Latitude", longitude_col = "Longitude", verbose=TRUE)
# barbara streisand
```

Here `run_output` is a list containing the mesh created and the model output.

```{r}
summary(run_output$model_output)
```

We can also view the output using the `model_viewer` Shiny app.

```{r}
fdmr::model_viewer(model_output = run_output$model_output, mesh = run_output$mesh, measurement_data = data, data_distribution = "Gaussian")
```
5 changes: 5 additions & 0 deletions vignettes/fitmodel_data/datasets.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@

# Data sets for fitmodel functions

# NCAR data
https://www.kaggle.com/datasets/shiratorizawa/ncarcsv2
21 changes: 21 additions & 0 deletions vignettes/fitmodel_data/make_final_data.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt

def main():
df = pd.read_csv('NCAR_raw.csv')
final_data = df[['lat', 'lon', 'time', 'rain', 'slp', 'tmp']].replace(-9999.0, 0.0).fillna(0).copy()
final_data.columns = ['Latitude', 'Longitude', 'Time', 'Rain', 'Sea_Level_Pressure', 'Temperature']
final_data.sort_values(by='Time', inplace=True)
final_data.reset_index(inplace=True)
final_data['Time'] = pd.to_datetime(final_data['Time'])
final_data = final_data[final_data.Time.between('1999-01-01', '1999-12-31')]
# convert timestamps to time index for inlabru
time_index = final_data.Time.apply(lambda t: t.timestamp())
final_data['Time_Index'] = ((time_index - time_index.min())/86400).astype(int)
final_data['Sea_Level_Pressure'] = final_data.Sea_Level_Pressure.replace(0, np.nan).fillna(method='ffill')
final_data.fillna(final_data['Sea_Level_Pressure'].mean(), inplace=True)
final_data.to_csv('rain_data.csv', index=False)

if __name__=='__main__':
main()
Loading
Loading