Commit

add CRAN changes to repo
pachadotdev committed Jan 3, 2025
1 parent 7325a8d commit 03cabea
Showing 14 changed files with 52 additions and 24 deletions.
6 changes: 3 additions & 3 deletions CRAN-SUBMISSION
@@ -1,3 +1,3 @@
Version: 1.0.5-3
Date: 2024-05-20 21:23:46 UTC
SHA: afc71d378ab49fa29ae1c6075d499f8249a80110
Version: 1.0.5-5
Date: 2024-11-15 04:06:48 UTC
SHA: 7325a8d2b58e7fb0dc6097572b4e20bd469e8dc5
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -7,7 +7,7 @@ Description: Bindings for the 'Tabula' <https://tabula.technology/> 'Java'
journalism. It allows for automatic and manual table extraction, the latter
facilitated through a 'Shiny' interface, enabling manual areas selection\
with a computer mouse for data retrieval.
Version: 1.0.5-4
Version: 1.0.5-5
Authors@R: c(
person("Thomas J.", "Leeper",
role = "aut",
@@ -57,4 +57,4 @@ SystemRequirements: Java (>= 7.0):
openjdk@11 (brew)
VignetteBuilder: knitr
Encoding: UTF-8
RoxygenNote: 7.3.1
RoxygenNote: 7.3.2
4 changes: 4 additions & 0 deletions NEWS.md
@@ -1,3 +1,7 @@
# CHANGES TO tabulapdf 1.0.5-5

* Updated tests to use offline files.

# CHANGES TO tabulapdf 1.0.5-4

* Faster Shiny interface (parts of PR #56, @jkeuskamp)
1 change: 0 additions & 1 deletion R/extract_metadata.R
@@ -11,7 +11,6 @@
#' @examples
#' # simple demo file
#' f <- system.file("examples", "mtcars.pdf", package = "tabulapdf")
#'
#' extract_metadata(f)
#' @seealso \code{\link{extract_tables}}, \code{\link{extract_areas}}, \code{\link{extract_text}}, \code{\link{split_pdf}}
#' @importFrom rJava J new
1 change: 0 additions & 1 deletion R/extract_tables.R
@@ -36,7 +36,6 @@
#' @examples
#' # simple demo file
#' f <- system.file("examples", "mtcars.pdf", package = "tabulapdf")
#'
#' # extract tables from only second page
#' extract_tables(f, pages = 2)
#' @seealso \code{\link{extract_areas}}, \code{\link{get_page_dims}}, \code{\link{make_thumbnails}}, \code{\link{split_pdf}}
7 changes: 2 additions & 5 deletions R/extract_text.R
@@ -14,13 +14,10 @@
#' @examples
#' # simple demo file
#' f <- system.file("examples", "fortytwo.pdf", package = "tabulapdf")
#'
#' # extract all text
#' extract_text(f)
#'
#'
#' # extract all text from page 1 only
#' extract_text(f, pages = 1)
#'
#'
#' # extract text from selected area only
#' extract_text(f, area = list(c(209.4, 140.5, 304.2, 500.8)))
#' @seealso \code{\link{extract_tables}}, \code{\link{extract_areas}}, \code{\link{split_pdf}}
1 change: 0 additions & 1 deletion R/get_page_dims.R
@@ -15,7 +15,6 @@
#' @examples
#' # simple demo file
#' f <- system.file("examples", "mtcars.pdf", package = "tabulapdf")
#'
#' get_n_pages(file = f)
#' get_page_dims(f)
#' @importFrom tools file_path_sans_ext
5 changes: 3 additions & 2 deletions R/make_thumbnails.R
@@ -25,8 +25,9 @@
#' @examples
#' # simple demo file
#' f <- system.file("examples", "mtcars.pdf", package = "tabulapdf")
#'
#' make_thumbnails(f)
#'
#' # extract thumbnails from the first page
#' make_thumbnails(f, page = 1)
#' @importFrom tools file_path_sans_ext
#' @importFrom rJava J new .jfloat
#' @seealso \code{\link{extract_tables}}, \code{\link{extract_text}},
6 changes: 5 additions & 1 deletion README.Rmd
@@ -29,9 +29,13 @@ You need do this before installing rJava or attempting to use tabulapdf. More on
[this](#installing-java-on-windows-with-chocolatey) and
[troubleshooting](#troubleshooting) below.

tabulapdf is not available on CRAN, but it can be installed from rOpenSci's
tabulapdf is available on CRAN, and it can also be installed from rOpenSci's
R-Universe:
```r
# either
install.packages("tabulapdf")

# or
install.packages("tabulapdf", repos = c("https://ropensci.r-universe.dev", "https://cloud.r-project.org"))
```

18 changes: 17 additions & 1 deletion README.md
@@ -21,10 +21,14 @@ Java. You need do this before installing rJava or attempting to use
tabulapdf. More on [this](#installing-java-on-windows-with-chocolatey)
and [troubleshooting](#troubleshooting) below.

tabulapdf is not available on CRAN, but it can be installed from
tabulapdf is available on CRAN, and it can also be installed from
rOpenSci’s R-Universe:

``` r
# either
install.packages("tabulapdf")

# or
install.packages("tabulapdf", repos = c("https://ropensci.r-universe.dev", "https://cloud.r-project.org"))
```

@@ -113,6 +117,18 @@ directory before trying to install the package. This can be changed from
ensure write permission by choosing “Run as administrator” when
launching R (again, from the right-click context menu).

## Debugging

Load the package like this:

``` r
devtools::load_all()
libname = "/home/pacha/R/x86_64-pc-linux-gnu-library/4.4"
pkgname = "tabulapdf"
rJava::.jpackage(pkgname, jars = "*", lib.loc = libname)
rJava::J("java.lang.System")$setProperty("java.awt.headless", "true")
```
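
The `libname` above is specific to the author's machine. One way to derive it on another system, sketched here under the assumption that the package is installed in one of the libraries listed by `.libPaths()`, is to ask `find.package()` for the installed location. `installed_lib()` is a hypothetical helper for illustration, not part of tabulapdf or rJava.

```r
# Hypothetical helper: return the library directory that contains an
# installed package. Passing lib.loc restricts the search to the library
# trees, so a development copy loaded as a namespace is not picked up.
installed_lib <- function(pkgname) {
  dirname(find.package(pkgname, lib.loc = .libPaths()))
}

# Assumed usage in place of the hard-coded path:
# libname <- installed_lib("tabulapdf")
```

`find.package()` raises an error if the package is not installed in any of those libraries, which is arguably preferable here to continuing with a wrong path.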

## Meta

- Please [report any issues or
17 changes: 12 additions & 5 deletions cran-comments.md
@@ -1,8 +1,15 @@
## R CMD check results

0 errors | 0 warnings | 1 note
0 errors | 0 warnings | 2 notes

tabulapdf 1.0.5-3
* This is a new release that replaces the archived tabulizer package.
* Moves examples with CPU time > 2.5 times elapsed time on Debian.
* Most of the R functions were designed to make it sure this works with Java 11 or later.
tabulapdf 1.0.5-5
* Updated tests to use offline files.
* The rest remains the same.

❯ checking installed package size ... NOTE
installed size is 13.7Mb
sub-directories of 1Mb or more:
java 12.7Mb

❯ checking for future file timestamps ... NOTE
unable to verify current time
Binary file added inst/examples/argentina.pdf
Binary file not shown.
Binary file added inst/examples/quebec.pdf
Binary file not shown.
6 changes: 4 additions & 2 deletions tests/testthat/test_non-latin.R
@@ -1,7 +1,8 @@
context("Non-latin character tests")

test_that("Read Spanish language PDF", {
f1 <- "https://github.com/tabulapdf/tabula-java/raw/98957221950af4b90620b51a29e0bbe502eea9ad/src/test/resources/technology/tabula/argentina_diputados_voting_record.pdf"
# file from "https://github.com/tabulapdf/tabula-java/raw/98957221950af4b90620b51a29e0bbe502eea9ad/src/test/resources/technology/tabula/argentina_diputados_voting_record.pdf"
f1 <- system.file("examples", "argentina.pdf", package = "tabulapdf")
t1 <- extract_tables(f1, pages = 1, area = list(c(269.875, 12.75, 790.5, 561)), guess = FALSE)
t1a <- extract_tables(f1, pages = 1, area = list(c(269.875, 12.75, 790.5, 561)), guess = FALSE, output = "tibble", encoding = "latin1")
t1b <- extract_tables(f1, pages = 1, area = list(c(269.875, 12.75, 790.5, 561)), guess = FALSE, output = "tibble", encoding = "UTF-8")
@@ -11,7 +12,8 @@ test_that("Read Spanish language PDF", {
})

test_that("Read French language PDF w/correct encoding", {
f2 <- "http://www.europarl.europa.eu/oeil/popups/printfichetechnical.pdf?id=673511&lang=fr"
# file from https://cdn-contenu.quebec.ca/cdn-contenu/adm/min/finances/publications-adm/Comptes-publics/FR/CPFR_Devancement_Preparation.pdf
f2 <- system.file("examples", "quebec.pdf", package = "tabulapdf")
t2a <- extract_text(f2, page = 1, encoding = "latin1")
t2b <- extract_text(f2, page = 1, encoding = "UTF-8")
expect_false(t2a == t2b)
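
The switch from remote URLs to bundled files relies on `system.file()`, which returns an empty string when the file or the package cannot be found. A minimal sketch of a guard for that case follows. `bundled_example()` is a hypothetical helper, not part of tabulapdf, and its default package name is only illustrative.

```r
# Hypothetical helper (not part of tabulapdf): resolve a file shipped under
# inst/examples of an installed package and fail loudly if it is missing.
bundled_example <- function(name, package = "tabulapdf") {
  path <- system.file("examples", name, package = package)
  # system.file() returns "" when the file or the package is not found
  if (!nzchar(path)) {
    stop("example file not found: ", name, " (package: ", package, ")")
  }
  path
}
```

In a test, `f1 <- bundled_example("argentina.pdf")` would then stop with a clear message instead of passing an empty path on to `extract_tables()`.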