make CRAN happy
Signed-off-by: Yitao Li <yitao@rstudio.com>
yitao-li committed Dec 15, 2020
1 parent 0837e60 commit fd8378d
Showing 10 changed files with 45 additions and 212 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -11,14 +11,14 @@ Maintainer: Yitao Li <yitao@rstudio.com>
Description: Load WARC (Web ARChive) files into Apache Spark using 'sparklyr'. This
allows to read files from the Common Crawl project <http://commoncrawl.org/>.
License: Apache License 2.0
-BugReports: https://github.com/javierluraschi/sparkwarc
+BugReports: https://github.com/r-spark/sparkwarc
Encoding: UTF-8
LazyData: true
Imports:
DBI,
sparklyr,
Rcpp
-RoxygenNote: 6.0.1
+RoxygenNote: 7.1.1
LinkingTo:
Rcpp,
SystemRequirements: C++11
201 changes: 0 additions & 201 deletions LICENSE

This file was deleted.

2 changes: 2 additions & 0 deletions NAMESPACE
@@ -7,6 +7,8 @@ export(spark_read_warc)
export(spark_read_warc_sample)
export(spark_warc_sample_path)
import(DBI)
import(Rcpp)
import(sparklyr)
importFrom(utils,download.file)
importFrom(utils,read.table)
useDynLib(sparkwarc, .registration = TRUE)
8 changes: 8 additions & 0 deletions R/package.R
@@ -0,0 +1,8 @@
#' sparkwarc
#'
#' Sparklyr extension for loading WARC Files into Apache Spark
#'
#' @docType package
#' @import Rcpp
#' @name sparkwarc
NULL
4 changes: 2 additions & 2 deletions R/sample.R
@@ -16,12 +16,12 @@ spark_warc_sample_path <- function() {
rcpp_read_warc_sample <- function(filter = "", include = "") {
sample_warc <- spark_warc_sample_path()

-sparkwarc:::rcpp_read_warc(sample_warc, filter, include)
+rcpp_read_warc(sample_warc, filter, include)
}

#' Loads the sample warc file in Spark
#'
-#' @param An active \code{spark_connection}.
+#' @param sc An active \code{spark_connection}.
#' @param filter A regular expression used to filter to each warc entry
#' efficiently by running native code using \code{Rcpp}.
#' @param include A regular expression used to keep only matching lines
7 changes: 5 additions & 2 deletions R/sparkwarc.R
@@ -20,6 +20,7 @@
#'
#' @examples
#'
#' \dontrun{
#' library(sparklyr)
#' sc <- spark_connect(master = "spark://HOST:PORT")
#' df <- spark_read_warc(
@@ -31,9 +32,11 @@
#' )
#'
#' spark_disconnect(sc)
#'}
#'
#' @export
#' @import DBI
#' @importFrom utils download.file
#' @export
spark_read_warc <- function(sc,
name,
path,
@@ -87,7 +90,7 @@ spark_read_warc <- function(sc,
spark_apply_log("finished downloading warc file")
}

-result <- sparkwarc::spark_rcpp_read_warc(path, match_warc, match_line)
+result <- spark_rcpp_read_warc(path, match_warc, match_line)

if (!is.null(temp_warc)) unlink(temp_warc)

17 changes: 14 additions & 3 deletions man/spark_read_warc.Rd

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions man/spark_read_warc_sample.Rd

Some generated files are not rendered by default.

9 changes: 9 additions & 0 deletions man/sparkwarc.Rd

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions src/Makevars
@@ -0,0 +1 @@
PKG_LIBS=-lz
