`rstan` fails to read `cmdstanr` output .csv files #1133

santikka · 2024-07-10T16:54:43Z

Summary:

rstan::read_stan_csv() no longer works for for .csv files produced by cmdstanr

Description:

It seems that rstan expects the save_warmup field to be an integer value, but cmdstanr instead writes this as a logical value (e.g., false) to the conversion to integer fails in rstan:::parse_stancsv_comments() resulting in an NA value for save_warmup. This used to work with earlier versions of cmdstanr/CmdStan.

Reproducible Steps:

library("cmdstanr")
#> This is cmdstanr version 0.8.1
#> - CmdStanR documentation and vignettes: mc-stan.org/cmdstanr
#> - CmdStan path: C:/Users/Santtu/.cmdstan/cmdstan-2.35.0
#> - CmdStan version: 2.35.0
file <- file.path(cmdstan_path(), "examples", "bernoulli", "bernoulli.stan")
mod <- cmdstan_model(file)
data_list <- list(N = 10, y = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1))
samples <- mod$sample(
  data = data_list,
  refresh = 0,
  chains = 1,
  iter_sampling = 10,
  iter_warmup = 10
)
#> Running MCMC with 1 chain...
#> 
#> Chain 1 WARNING: No variance estimation is 
#> Chain 1          performed for num_warmup < 20 
#> Chain 1 finished in 0.0 seconds.
rstan::read_stan_csv(samples$output_files())
#> Warning in parse_stancsv_comments(comments): NAs introduced by coercion
#> Error in if (max(save_warmup) == 0L) {: missing value where TRUE/FALSE needed

^{Created on 2024-07-10 with reprex v2.1.1}

RStan Version:

2.32.6

R Version:

R version 4.4.0 (2024-04-24 ucrt)

Operating System:

Windows 11 Pro 23H2

The text was updated successfully, but these errors were encountered:

martinmodrak · 2024-07-30T10:21:26Z

I can reproduce the issue - outputs from CmdStan 2.34.1 are read fine, outputs from 2.35.0 cause problems.

Although I get a different error message (maybe because of using R 4.3.2)

Error in rstan::read_stan_csv(samples$output_files()) : 
  object 'n_kept' not found

I'll also note that the problem is made worse by the code at:

rstan/rstan/rstan/R/stan_csv.R

Line 220 in 9b8d8fe

if (max(save_warmup) == 0L) { # all equal to 0L

where if save_warmup is not all 0 or all 1, n_kept never gets assigned.

jgabry · 2024-07-30T17:29:26Z

Thanks. I think this might be fixed by #1131. Or at least the save_warmup issue, not sure about the n_kept issue @martinmodrak ran into.

martinmodrak · 2024-07-31T07:21:47Z

@jgabry Yes, this doesn't resolve the problamatic logic around n_kept. Also #1131 seems likely to break the old .CSVs, though. I didn't do a full checkout, but notice:

as.integer(as.logical("0"))
# [1] NA
as.integer(as.logical("false"))
# [1] 0

jgabry · 2024-07-31T17:31:13Z

Also #1131 seems likely to break the old .CSVs, though.

Ah good point. Perhaps we should revert #1131 and figure out a more general approach? @bgoodri @andrjohns we probably need to decide whether we want to keep supporting this functionality in RStan. If there continue to be changes in the CSV files created by CmdStan I don't think RStan can keep up with releases in order to make sure read_stan_csv keeps working. Do we even need it anymore given that CmdStanR exists?

andrjohns · 2024-07-31T17:36:58Z

The approach I took for cmdstanr for compatibility with both was to convert any true/false to 0/1, so that the parsing logic for the rest stayed the same

martinmodrak · 2024-07-31T17:47:07Z

Note that using cmdstanr with brms relies on read_stan_csv. We probably should add a test with fixed CSVs from s couple CmdStan versions and it should be OK.... Dne st 31. 7. 2024 19:37 uživatel Andrew Johnson ***@***.***> napsal:

…

The approach I took for cmdstanr for compatibility with both was to convert any true/false to 0/1 <https://github.com/stan-dev/cmdstanr/blob/0e3a99a52bc5ad15d9e386e9993c10ad6905818b/R/csv.R#L843>, so that the parsing logic for the rest stayed the same — Reply to this email directly, view it on GitHub <#1133 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ACILKU3PJQLPI4IAEB7ONHDZPEOFFAVCNFSM6AAAAABKVJVGVGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENRRGAZDMNZXGE> . You are receiving this because you were mentioned.Message ID: ***@***.***>

andrjohns · 2024-07-31T17:51:14Z

Note that using cmdstanr with brms relies on read_stan_csv.

It doesn't look like that's the case anymore (which makes sense, otherwise there would have been a ton of bug reports in brms once 2.35 was released)

jgabry · 2024-07-31T18:43:06Z

otherwise there would have been a ton of bug reports in brms once 2.35 was released

Yeah that would have been a mess!

katrinabrock · 2024-11-08T15:34:07Z

I was getting the if (max(save_warmup) == 0L) error and installed the latest version of rstan (from r-universe) and I'm now getting object 'n_kept' not found error.

Is there a magical version I can use where I won't get either of these errors? Or is there a way I need to modify the input data to make it work?

I got the code+data combination I'm attempting from someone else and it was working at one point. I'm not sure what package version were used then. :-/

jgabry · 2024-11-08T18:46:07Z

Sorry for the hassle! I'm honestly not sure. Originally rstan::read_stan_csv only existed because there was no such thing as cmdstanr. So users needed a way to import CmdStan output into R. But now that cmdstanr has been around for years, it hasn't been a priority to keep the CmdStan csv files (which change sometimes) backwards compatible with rstan::read_stan_csv nor has there been enough bandwidth among developers to keep rstan::read_stan_csv up to date with changes to CmdStan.

Is this for a model fit with cmdstanr? Here's some code from @andrjohns to produce a named list of arrays of posterior samples, similar to what rstan would do:

https://discourse.mc-stan.org/t/rstan-read-stan-csv-throwing-error-with-cmdstan-models-versions-2-35/35665/7

That will hopefully allow you to avoid rstan::read_stan_csv entirely.

jgabry · 2024-11-08T18:48:14Z

@bgoodri In the next RStan release read_stan_csv should probably be formally deprecated unless the plan is to keep updating it when CmdStan csv files change, but I don't think that's been the plan so far.

jgabry · 2024-11-08T21:44:27Z

@katrinabrock in addition to my previous comment, another option could be to run an older version of cmdstan. I think it might just be as of 2.35.0 that this is an issue, so you could try installing e.g. 2.34.0. cmdstanr::install_cmdstan() will let you specify version = "2.34.0".

katrinabrock · 2024-11-09T10:31:20Z

@jgabry

Thanks for the quick response! switching cmdstanr to version 2.34.0 did not appear to resolve the issue.
I'm a bit puzzled because the code I'm working with does not appear to be using cmdstanr. It appears to be using rstan only and passing an r object in as data. I will dig a bit more an post back if I discover anything useful.

jgabry · 2024-11-09T19:45:42Z

Yeah changing the cmdstanr version would only work if the csv files were being produced by cmdstanr. It’s weird if this is only using rstan because read_stan_csv() shouldn’t be necessary to use at all in that case. So yeah, quite strange!

katrinabrock · 2024-11-10T11:34:33Z

ok! I did some more digging and discovered a couple mistakes I made in my initial debugging: 1) I initially thought the script I was working on was calling rstan::stan when it was in fact calling rethinking::stan (https://github.com/rmcelreath/rethinking/blob/master/R/cmdstan_support.r#L104). 2) I switched my cmdstan versions, but forgot that cmdstan version mismatch will not automatically trigger recompilation of the model. Therefore, I was still running my code with models produced with 2.35.0 so of course this did not do anything to resolve the issue.

With this in mind, here is my advice to others (@santikka ) to workaround this issue:

Workarounds

Workarounds specific to rethinking package

For those hitting this issue from using the rethinking package (traceback includes this line https://github.com/rmcelreath/rethinking/blob/master/R/cmdstan_support.r#L97). This is happening because rethinking is trying to convert the cmdstanr output into a rstan output which is no longer supporter (per @jgabry above).

To resolve this, either

add rstan_out=FALSE to your rethinking::stan call which will result in the call returning a cmdstanr output instead of attempting to convert

OR

switch to set_ulam_cmdstan(FALSE) which will take cmdstanr out of the picture, use pure rstan and thus the result format will be unchanged

OR

use the general solution below

General workaround

Use cmdstan 3.34.0 or earlier.

Steps:
Step 1) (one time per machine) Install earlier version of cmdstanr. cmdstanr::install_cmdstan(version = "2.34.0").

Step 2) (every time run your legacy code) Activate the older version of cmdstan like cmdstanr::set_cmdstan_path(file.path(dirname(cmdstanr::cmdstan_path()), 'cmdstan-2.34.0')). (Should work cross platform, but only tested on Linux.). You can check that this worked by running cmdstanr::cmdstan_version(). Result should be "2.34.0". It would make sense to put this cmdstanr::set_cmdstan_path(file... at the top of any script where you're having this issue.

Step 3) (one time per model) Recompile any models that you compiled with an incompatible version of cmdstanr. There are a few ways you can do this:

Option A) Delete the exe file. This file will usually be in the same directory as your stan code. e.g. if you have a my-model.stan file, you will see another file called my-model in the same directory. When you delete this and re-run your model, it will automatically be re-created with the version of cmdstan that is currently activated.
Option B) If you are calling cmdstanr directly, add force_recompile = TRUE to either your cmdstanr::cmdstan_model(...) call or your $compile call.
Option C) Run options("cmdstanr_force_recompile" = TRUE) before running your model. This works even if you are not calling cmdstanr directly, but via some other library such as rethinking.

Note: With Option B or C, once you've done all this recompiling once for each model with version 2.34.0, you probably don't need to recompile every single time you run your code. To go back to not recompiling by default, remove force_recompile = TRUE from your arguments or run options("cmdstanr_force_recompile" = FALSE).

Real Solution

Re-write your code to use only cmdstanr since it's better supported than rstan and much better supported than a hybrid.

jgabry · 2024-11-11T17:14:09Z

Thanks so much for the detailed report! Hopefully this is useful to anyone else running into the same problem.

martinmodrak mentioned this issue Jul 30, 2024

Unable to replicate the 'SBC for brms models' hyunjimoon/SBC#104

Closed

michaldanaj mentioned this issue Nov 5, 2024

'n_kept not found' error during rstan::read_stan_csv(fit$output_files()) #1143

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`rstan` fails to read `cmdstanr` output .csv files #1133

`rstan` fails to read `cmdstanr` output .csv files #1133

santikka commented Jul 10, 2024

martinmodrak commented Jul 30, 2024 •

edited

Loading

jgabry commented Jul 30, 2024 •

edited

Loading

martinmodrak commented Jul 31, 2024

jgabry commented Jul 31, 2024

andrjohns commented Jul 31, 2024

martinmodrak commented Jul 31, 2024 via email

andrjohns commented Jul 31, 2024

jgabry commented Jul 31, 2024

katrinabrock commented Nov 8, 2024

jgabry commented Nov 8, 2024

jgabry commented Nov 8, 2024

jgabry commented Nov 8, 2024

katrinabrock commented Nov 9, 2024

jgabry commented Nov 9, 2024

katrinabrock commented Nov 10, 2024 •

edited

Loading

jgabry commented Nov 11, 2024

rstan fails to read cmdstanr output .csv files #1133

rstan fails to read cmdstanr output .csv files #1133

Comments

santikka commented Jul 10, 2024

Summary:

Description:

Reproducible Steps:

RStan Version:

R Version:

Operating System:

martinmodrak commented Jul 30, 2024 • edited Loading

jgabry commented Jul 30, 2024 • edited Loading

martinmodrak commented Jul 31, 2024

jgabry commented Jul 31, 2024

andrjohns commented Jul 31, 2024

martinmodrak commented Jul 31, 2024 via email

andrjohns commented Jul 31, 2024

jgabry commented Jul 31, 2024

katrinabrock commented Nov 8, 2024

jgabry commented Nov 8, 2024

jgabry commented Nov 8, 2024

jgabry commented Nov 8, 2024

katrinabrock commented Nov 9, 2024

jgabry commented Nov 9, 2024

katrinabrock commented Nov 10, 2024 • edited Loading

Workarounds

Workarounds specific to rethinking package

General workaround

Real Solution

jgabry commented Nov 11, 2024

`rstan` fails to read `cmdstanr` output .csv files #1133

`rstan` fails to read `cmdstanr` output .csv files #1133

martinmodrak commented Jul 30, 2024 •

edited

Loading

jgabry commented Jul 30, 2024 •

edited

Loading

katrinabrock commented Nov 10, 2024 •

edited

Loading