## Bug description
Thanks a lot for your work, quarto is awesome! My goal is to use one single .qmd file and render it n times in parallel on an HPC cluster (via a SLURM job array or MPI process) to create n output files. The renders could be parameterized, or (as in my case) simply execute the same .qmd n times.
However, only one output file is created:

```
Output created: test1.html
```

For the rest (e.g. test2.html) I get an error:

```
Error in readLines(con, warn = FALSE) : cannot open the connection
Calls: .main ... partition_yaml_front_matter -> grep -> is.factor -> read_utf8 -> readLines
In addition: Warning message:
In readLines(con, warn = FALSE) :
  cannot open file 'test.qmd': No such file or directory
Execution halted
```
It seems that the first (i.e. quickest) run renders correctly, and the remaining processes then conflict with each other when reading/writing the .qmd file.
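If this really is a race between concurrent renders of the same source file, it should also be reproducible on a single machine without SLURM, e.g. with something like:

```bash
# hypothetical local reproduction: two concurrent renders of the same .qmd
quarto render test.qmd --output test1.html &
quarto render test.qmd --output test2.html &
wait
```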
## Steps to reproduce
The .qmd content doesn't really matter at this point, but here is an example:
````qmd
---
title: "Test"
author: "Test"
date: now
format:
  html:
    toc: true
---

# Session Information

```{r}
# display session info
sessionInfo()
```
````
My shell script (`exe.sh`) looks like this:

```bash
#!/bin/bash
#SBATCH --job-name=quarto_test
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=4096
#SBATCH --time=00:01:00
#SBATCH --qos=standard

module add R/4.2.2-foss-2022b
cd Scripts
quarto render test.qmd --output test${SLURM_ARRAY_TASK_ID}.html
```
The idea is to pass `${SLURM_ARRAY_TASK_ID}` to the `--output` option to create n individual output files. On the cluster I then call

```
[xxx@curta 01_exe]$ sbatch --array=1-2 exe.sh
```

to execute the same .qmd file 2 times to create test1.html and test2.html.
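With the array indices expanded, the two tasks effectively run:

```bash
quarto render test.qmd --output test1.html   # array task 1
quarto render test.qmd --output test2.html   # array task 2
```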
## Expected behavior
I would expect the .qmd file to remain untouched during the rendering process, so that no reading/writing conflicts can occur.

For the code example above, I would expect the same .qmd file to be executed 2 times, creating test1.html and test2.html. It would be nice to have support for rendering quarto documents in parallel without needing to create n copies of the .qmd file just to avoid this error.
## Actual behavior
The first job within the array completes successfully; the other jobs throw the error shown in the bug description. My workaround so far is to create n copies of the .qmd file, name them test1.qmd, test2.qmd, ..., and then use within the shell script:

```bash
quarto render test${SLURM_ARRAY_TASK_ID}.qmd
```

This works fine but is not ideal, especially for a large number of runs.
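A less wasteful variant of this workaround (assuming the conflict is indeed on the shared source file) would be to let each task copy the .qmd into its own scratch directory, render there, and move the output back; the `mktemp` call is just one way to get a unique per-task path:

```bash
# inside exe.sh, replacing the plain quarto render call
workdir=$(mktemp -d)              # unique scratch directory for this task
cp test.qmd "$workdir"            # private copy of the source file
(
  cd "$workdir"
  quarto render test.qmd --output test${SLURM_ARRAY_TASK_ID}.html
)
mv "$workdir/test${SLURM_ARRAY_TASK_ID}.html" .   # collect the output
rm -rf "$workdir"                 # clean up
```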
## Your environment
- Platform: x86_64-pc-linux-gnu (64-bit), running on an HPC cluster
- Running under: CentOS Linux 7 (Core)
## Quarto check output

```
[✓] Checking versions of quarto binary dependencies...
      Pandoc version 3.1.1: OK
      Dart Sass version 1.55.0: OK
[✓] Checking versions of quarto dependencies......OK
[✓] Checking Quarto installation......OK
      Version: 1.3.361
      Path: /home/mklose/opt/quarto-1.3.361/bin

[✓] Checking basic markdown render....OK

[✓] Checking Python 3 installation....OK
      Version: 3.10.8
      Path: /trinity/shared/easybuild/software/Python/3.10.8-GCCcore-12.2.0/bin/python3
      Jupyter: (None)

      Jupyter is not available in this Python installation.
      Install with python3 -m pip install jupyter

[✓] Checking R installation...........OK
      Version: 4.2.2
      Path: /trinity/shared/easybuild/software/R/4.2.2-foss-2022b/lib64/R
      LibPaths:
        - /home/mklose/R/x86_64-pc-linux-gnu-library/4.2
        - /trinity/shared/easybuild/software/R/4.2.2-foss-2022b/lib64/R/library
      knitr: 1.42
      rmarkdown: 2.20

[✓] Checking Knitr engine render......OK
```