Links in epub and epubcheck #766

Closed
@muschellij2

By filing an issue to this repo, I promise that

  • I have fully read the issue guide at https://yihui.name/issue/.
  • I have provided the necessary information about my issue.
    • If I'm asking a question, I have already asked it on Stack Overflow or RStudio Community, waited for at least 24 hours, and included a link to my question there.
    • If I'm filing a bug report, I have included a minimal, self-contained, and reproducible example, and have also included xfun::session_info('bookdown'). I have upgraded all my packages to their latest versions (e.g., R, RStudio, and R packages), and also tried the development version: remotes::install_github('rstudio/bookdown').
    • If I have posted the same issue elsewhere, I have also mentioned it in this issue.
  • I have learned the Github Markdown syntax, and formatted my issue correctly.

I understand that my issue may be closed if I don't fulfill my promises.

This was first posted at rstudio/bookdown-demo#42, but it probably belongs here. I will look into pandoc and bookdown to see if I can diagnose the problem.

Clone the Repo

library(git2r)
library(bookdown)
local_path = "bookdown-demo"
git2r::clone("https://github.com/rstudio/bookdown-demo.git",
             local_path = local_path)
#> cloning into 'bookdown-demo'...
#> Receiving objects:   1% (6/530),    9 kb
#> Receiving objects:  11% (59/530),   17 kb
#> Receiving objects:  21% (112/530),  121 kb
#> Receiving objects:  31% (165/530),  321 kb
#> Receiving objects:  41% (218/530),  409 kb
#> Receiving objects:  51% (271/530),  473 kb
#> Receiving objects:  61% (324/530),  545 kb
#> Receiving objects:  71% (377/530),  577 kb
#> Receiving objects:  81% (430/530),  585 kb
#> Receiving objects:  91% (483/530),  593 kb
#> Receiving objects: 100% (530/530),  723 kb, done.
#> Local:    master /private/var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T/Rtmpe46Igu/reprex95b433c7d4f8/bookdown-demo
#> Remote:   master @ origin (https://github.com/rstudio/bookdown-demo.git)
#> Head:     [4e34630] 2018-10-22: Add now.json and Dockerfile for building HTML book and deploy to now.sh (#36)
setwd(local_path)
epub_file = bookdown::render_book(
  "index.Rmd",
  bookdown::epub_book())
#> processing file: bookdown-demo.Rmd
#> output file: bookdown-demo.knit.md
#> /usr/local/bin/pandoc +RTS -K512m -RTS bookdown-demo.utf8.md --to epub3 --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash --output bookdown-demo.epub --number-sections --filter /usr/local/bin/pandoc-citeproc
#> 
#> Output created: _book/bookdown-demo.epub
epub_file = normalizePath(epub_file)

This function fixes a single missing id. Both the file it patches and the id it inserts are hard-coded.

fix_one_id = function(epub_file) {
  # unpack the EPUB (a zip archive) into a temporary directory
  epub_dir = tempfile()
  dir.create(epub_dir, recursive = TRUE)
  # list the archive contents so the same files can be re-zipped later
  epub_files = unzip(epub_file, list = TRUE)
  epub_files = epub_files$Name
  res = unzip(epub_file, exdir = epub_dir)
  
  all_xhtml = list.files(
    pattern = "\\.xhtml$", 
    path = file.path(epub_dir, "EPUB", "text"),
    recursive = FALSE, full.names = TRUE)
  
  # hard-coded: only the second XHTML file is patched
  ifile = all_xhtml[2]
  # for (ifile in all_xhtml) {
  x = readLines(ifile)
  # replace the line before the one matching "file0" with a
  # figure div that carries the missing id
  x[grep("file0", x)-1] = paste0(
    '<div class="figure" style="text-align: center" ', 
    'id="fig:nice-fig">')
  writeLines(x, ifile)
  # }
  # zip from inside the unpacked directory, then restore the working directory
  owd = getwd()
  on.exit({
    setwd(owd)
  })
  setwd(epub_dir)
  new_epub = tempfile(fileext = ".epub")
  zip(new_epub, files = epub_files)
  # file.copy(new_epub, epub_file, overwrite = TRUE)
  return(new_epub)
}
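
To decide which ids are missing instead of hard-coding one, the patched files could be scanned for links whose target id is never defined. The sketch below is only an illustration: `find_undefined_fragments` is a hypothetical helper (not part of bookdown, pandoc, or epubcheck), it uses plain regular expressions rather than an XML parser, and it assumes links use the bare `#id` form. Links written as `file.xhtml#id` would need the file part resolved first.

```r
# Hypothetical helper: report same-file fragment links ("#some-id")
# that have no matching id attribute in the given XHTML lines.
find_undefined_fragments = function(xhtml_lines) {
  txt = paste(xhtml_lines, collapse = "\n")
  # all href="#..." targets that point inside the same file
  hrefs = regmatches(txt, gregexpr('href="#[^"]+"', txt))[[1]]
  targets = sub('^href="#', "", sub('"$', "", hrefs))
  # all id="..." attributes defined in the file
  ids = regmatches(txt, gregexpr('[[:space:]]id="[^"]+"', txt))[[1]]
  ids = sub('^[[:space:]]id="', "", sub('"$', "", ids))
  # targets that are linked to but never defined
  setdiff(unique(targets), ids)
}

# made-up input lines, for illustration only
x = c('<a href="#fig:nice-fig">Figure 1</a>',
      '<div class="figure" id="fig:other">',
      '<a href="#fig:other">Figure 2</a>')
find_undefined_fragments(x)
```

With this made-up input, only `fig:nice-fig` is reported, because `fig:other` has a matching `id` attribute.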

Simple epub checker function

The epubcheck() R function runs the epubcheck command-line tool and captures its output (stdout and stderr) as a character vector.

epubcheck = function(epub_file) {
  res = system2("epubcheck", epub_file, stdout = TRUE, stderr = TRUE)
  res
}
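
The wrapper above assumes the `epubcheck` executable is on the PATH. A slightly more defensive variant might look like the sketch below. `tool_available` and `epubcheck_safe` are hypothetical names introduced here. The sketch also suppresses the warning that `system2()` raises whenever the command exits with a non-zero status, which happens when epubcheck reports errors.

```r
# Hypothetical helper: TRUE if an executable is found on the PATH.
tool_available = function(tool) {
  nzchar(unname(Sys.which(tool)))
}

# Hypothetical variant of epubcheck(): fail early with a clear message,
# and quote the path so that file names containing spaces still work.
epubcheck_safe = function(epub_file) {
  if (!tool_available("epubcheck")) {
    stop("epubcheck was not found on the PATH")
  }
  # system2() warns on any non-zero exit status; the status is still
  # available afterwards as attr(result, "status")
  suppressWarnings(
    system2("epubcheck", shQuote(epub_file), stdout = TRUE, stderr = TRUE)
  )
}
```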

num_errors() then extracts the error count from the "Messages:" summary line of that output.

num_errors = function(out) {
  out = grep("Messages", out, value = TRUE)
  out = sub(".* (.*) errors.*", "\\1", out)
  as.numeric(out)
}
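
If the other counts on the summary line are also of interest, the same idea can be extended to return all of them. This is a sketch only: `message_counts` is a hypothetical helper, and it assumes the summary line keeps the `N label / N label` layout seen in this report.

```r
# Hypothetical helper: parse every count from the epubcheck summary line
# into a named integer vector.
message_counts = function(out) {
  line = grep("^Messages:", out, value = TRUE)
  if (length(line) != 1) {
    stop("expected exactly one 'Messages:' summary line")
  }
  # pairs such as "5 errors"
  pairs = regmatches(line, gregexpr("[0-9]+ [a-z]+", line))[[1]]
  counts = as.integer(sub(" .*", "", pairs))
  names(counts) = sub(".* ", "", pairs)
  counts
}

counts = message_counts("Messages: 0 fatals / 5 errors / 0 warnings / 0 infos")
counts[["errors"]]
```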

Test output

The unmodified EPUB produces 5 errors: two RSC-005 parsing errors and three RSC-012 "Fragment identifier is not defined" errors.

result = epubcheck(epub_file)
#> Warning in system2("epubcheck", epub_file, stdout = TRUE, stderr
#> = TRUE): running command ''epubcheck' /private/var/folders/1s/
#> wrtqcpxn685_zk570bnx9_rr0000gr/T/Rtmpe46Igu/reprex95b433c7d4f8/bookdown-
#> demo/_book/bookdown-demo.epub 2>&1' had status 1
result
#>  [1] "Validating using EPUB version 3.2 rules."                                                                                                                                    
#>  [2] "ERROR(RSC-005): ./bookdown-demo/_book/bookdown-demo.epub/EPUB/nav.xhtml(19,9): Error while parsing file: element \"ol\" incomplete; missing required element \"li\""         
#>  [3] "ERROR(RSC-005): ./bookdown-demo/_book/bookdown-demo.epub/EPUB/text/ch002.xhtml(87,74): Error while parsing file: value of attribute \"width\" is invalid; must be an integer"
#>  [4] "ERROR(RSC-012): ./bookdown-demo/_book/bookdown-demo.epub/EPUB/text/ch002.xhtml(82,247): Fragment identifier is not defined."                                                 
#>  [5] "ERROR(RSC-012): ./bookdown-demo/_book/bookdown-demo.epub/EPUB/text/ch002.xhtml(92,123): Fragment identifier is not defined."                                                 
#>  [6] "ERROR(RSC-012): ./bookdown-demo/_book/bookdown-demo.epub/EPUB/text/ch002.xhtml(92,252): Fragment identifier is not defined."                                                 
#>  [7] ""                                                                                                                                                                            
#>  [8] "Check finished with errors"                                                                                                                                                  
#>  [9] "Messages: 0 fatals / 5 errors / 0 warnings / 0 infos"                                                                                                                        
#> [10] ""                                                                                                                                                                            
#> [11] "EPUBCheck completed"                                                                                                                                                         
#> attr(,"status")
#> [1] 1
num_errors(result)
#> [1] 5

After adding the id, only 4 errors remain. The RSC-012 error at ch002.xhtml(92,123) is gone.

fixed = fix_one_id(epub_file)
new_result = epubcheck(fixed)
#> Warning in system2("epubcheck", epub_file, stdout = TRUE,
#> stderr = TRUE): running command ''epubcheck' /var/folders/1s/
#> wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmpRWHDVK/file993116f24b23.epub 2>&1'
#> had status 1
new_result
#>  [1] "Validating using EPUB version 3.2 rules."                                                                                                                                                                              
#>  [2] "ERROR(RSC-005): /var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmpRWHDVK/file993116f24b23.epub/EPUB/nav.xhtml(19,9): Error while parsing file: element \"ol\" incomplete; missing required element \"li\""         
#>  [3] "ERROR(RSC-005): /var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmpRWHDVK/file993116f24b23.epub/EPUB/text/ch002.xhtml(87,74): Error while parsing file: value of attribute \"width\" is invalid; must be an integer"
#>  [4] "ERROR(RSC-012): /var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmpRWHDVK/file993116f24b23.epub/EPUB/text/ch002.xhtml(82,247): Fragment identifier is not defined."                                                 
#>  [5] "ERROR(RSC-012): /var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmpRWHDVK/file993116f24b23.epub/EPUB/text/ch002.xhtml(92,252): Fragment identifier is not defined."                                                 
#>  [6] ""                                                                                                                                                                                                                      
#>  [7] "Check finished with errors"                                                                                                                                                                                            
#>  [8] "Messages: 0 fatals / 4 errors / 0 warnings / 0 infos"                                                                                                                                                                  
#>  [9] ""                                                                                                                                                                                                                      
#> [10] "EPUBCheck completed"                                                                                                                                                                                                   
#> attr(,"status")
#> [1] 1
num_errors(new_result)
#> [1] 4
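
To see which message disappeared without comparing the two outputs by eye, the results can be diffed once the differing `.epub` path prefix is removed. The sketch below assumes the message layout shown in this report. `strip_epub_path` and `resolved_messages` are hypothetical helpers introduced for illustration.

```r
# Hypothetical helper: keep only the message lines and drop everything
# up to ".epub", since that path differs between the two runs.
strip_epub_path = function(out) {
  msgs = grep("^(ERROR|WARNING|FATAL)", out, value = TRUE)
  sub("^([A-Z]+\\([A-Z]+-[0-9]+\\)): .*\\.epub", "\\1: ", msgs)
}

# Hypothetical helper: messages present before the fix but not after it.
resolved_messages = function(before, after) {
  setdiff(strip_epub_path(before), strip_epub_path(after))
}

# usage with the objects from this report:
# resolved_messages(result, new_result)
```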

Created on 2019-08-28 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.0 (2019-04-26)
#>  os       macOS Mojave 10.14.6        
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2019-08-28                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version     date       lib
#>  assertthat    0.2.1       2019-03-21 [1]
#>  backports     1.1.4       2019-04-10 [1]
#>  bookdown    * 0.11        2019-05-28 [1]
#>  callr         3.3.1       2019-07-18 [1]
#>  cli           1.1.0       2019-03-19 [1]
#>  crayon        1.3.4       2017-09-16 [1]
#>  curl          4.0         2019-07-22 [1]
#>  desc          1.2.0       2019-07-10 [1]
#>  devtools      2.1.0       2019-07-06 [1]
#>  digest        0.6.20      2019-07-04 [1]
#>  evaluate      0.14        2019-05-28 [1]
#>  fs            1.3.1       2019-05-06 [1]
#>  git2r       * 0.26.1      2019-06-29 [1]
#>  glue          1.3.1       2019-03-12 [1]
#>  highr         0.8         2019-03-20 [1]
#>  htmltools     0.3.6       2017-04-28 [1]
#>  httr          1.4.1       2019-08-05 [1]
#>  knitr         1.24        2019-08-08 [1]
#>  magrittr      1.5         2014-11-22 [1]
#>  memoise       1.1.0       2017-04-21 [1]
#>  mime          0.7         2019-06-11 [1]
#>  pkgbuild      1.0.3       2019-03-20 [1]
#>  pkgload       1.0.2       2018-10-29 [1]
#>  prettyunits   1.0.2       2015-07-13 [1]
#>  processx      3.4.1       2019-07-18 [1]
#>  ps            1.3.0       2018-12-21 [1]
#>  R6            2.4.0       2019-02-14 [1]
#>  Rcpp          1.0.2       2019-07-25 [1]
#>  remotes       2.1.0       2019-06-24 [1]
#>  rlang         0.4.0       2019-06-25 [1]
#>  rmarkdown     1.14        2019-07-12 [1]
#>  rprojroot     1.3-2       2018-01-03 [1]
#>  rstudioapi    0.10.0-9000 2019-07-30 [1]
#>  sessioninfo   1.1.1       2018-11-05 [1]
#>  stringi       1.4.3       2019-03-12 [1]
#>  stringr       1.4.0       2019-02-10 [1]
#>  testthat      2.1.1       2019-04-23 [1]
#>  usethis       1.5.1.9000  2019-08-15 [1]
#>  withr         2.1.2       2018-03-15 [1]
#>  xfun          0.8         2019-06-25 [1]
#>  xml2          1.2.1       2019-07-29 [1]
#>  yaml          2.2.0       2018-07-25 [1]
#>  source                             
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  Github (muschellij2/desc@b0c374f)  
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  Github (rstudio/rstudioapi@31d1afa)
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  local                              
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#>  CRAN (R 3.6.0)                     
#> 
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
