fixed badge
schochastics committed Sep 23, 2024
1 parent d46c700 commit 7737a05
Showing 2 changed files with 48 additions and 50 deletions.
2 changes: 1 addition & 1 deletion README.Rmd
@@ -17,7 +17,7 @@ knitr::opts_chunk$set(

<!-- badges: start -->
[![CRAN status](https://www.r-pkg.org/badges/version/webtrackR)](https://CRAN.R-project.org/package=webtrackR)
[![CRAN Downloads](http://cranlogs.r-pkg.org/badges/webtrackR)](https://CRAN.R-project.org/package=webtrackR)
[![CRAN Downloads](https://cranlogs.r-pkg.org/badges/webtrackR)](https://CRAN.R-project.org/package=webtrackR)
[![R-CMD-check](https://github.com/schochastics/webtrackR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/schochastics/webtrackR/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/schochastics/webtrackR/branch/main/graph/badge.svg)](https://app.codecov.io/gh/schochastics/webtrackR?branch=main)
<!-- badges: end -->
96 changes: 47 additions & 49 deletions README.md
@@ -8,7 +8,7 @@
[![CRAN
status](https://www.r-pkg.org/badges/version/webtrackR)](https://CRAN.R-project.org/package=webtrackR)
[![CRAN
Downloads](http://cranlogs.r-pkg.org/badges/webtrackR)](https://CRAN.R-project.org/package=webtrackR)
Downloads](https://cranlogs.r-pkg.org/badges/webtrackR)](https://CRAN.R-project.org/package=webtrackR)
[![R-CMD-check](https://github.com/schochastics/webtrackR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/schochastics/webtrackR/actions/workflows/R-CMD-check.yaml)
[![Codecov test
coverage](https://codecov.io/gh/schochastics/webtrackR/branch/main/graph/badge.svg)](https://app.codecov.io/gh/schochastics/webtrackR?branch=main)
@@ -22,11 +22,11 @@ survey data of the same participants.
`webtrackR` is part of a series of R packages to analyse webtracking
data:

- [webtrackR](https://github.com/schochastics/webtrackR): preprocess
raw webtracking data
- [domainator](https://github.com/schochastics/domainator): classify
domains
- [adaR](https://github.com/gesistsa/adaR): parse urls
- [webtrackR](https://github.com/schochastics/webtrackR): preprocess raw
webtracking data
- [domainator](https://github.com/schochastics/domainator): classify
domains
- [adaR](https://github.com/gesistsa/adaR): parse urls

## Installation

@@ -54,9 +54,9 @@ method are included in the package.
Each row in a web tracking data set represents a visit. Raw data need to
have at least the following variables:

- `panelist_id`: the individual from which the data was collected
- `url`: the URL of the visit
- `timestamp`: the time of the URL visit
- `panelist_id`: the individual from which the data was collected
- `url`: the URL of the visit
- `timestamp`: the time of the URL visit

The function `as.wt_dt` assigns the class `wt_dt` to a raw web tracking
data set. It also allows you to specify the name of the raw variables
@@ -71,52 +71,50 @@ Otherwise an error is thrown.
Several other variables can be derived from the raw data with the
following functions:

- `add_duration()` adds a variable called `duration` based on the
sequence of timestamps. The basic logic is that the duration of a
visit is set to the time difference to the subsequent visit, unless
this difference exceeds a certain value (defined by argument
`cutoff`), in which case the duration will be replaced by `NA` or
some user-defined value (defined by `replace_by`).
- `add_session()` adds a variable called `session`, which groups
subsequent visits into a session until the difference to the next
visit exceeds a certain value (defined by `cutoff`).
- `extract_host()`, `extract_domain()`, `extract_path()` extracts the
host, domain and path of the raw URL and adds variables named
accordingly. See function descriptions for definitions of these
terms. `drop_query()` lets you drop the query and fragment
components of the raw URL.
- `add_next_visit()` and `add_previous_visit()` adds the previous or
the next URL, domain, or host (defined by `level`) as a new
variable.
- `add_referral()` adds a new variable indicating whether a visit was
referred by a social media platform. Follows the logic of Schmidt et
al., [(2023)](https://doi.org/10.31235/osf.io/cks68).
- `add_title()` downloads the title of a website (the text within the
`<title>` tag of a web site’s `<head>`) and adds it as a new
variable.
- `add_panelist_data()`. Joins a data set containing information about
participants such as a survey.
- `add_duration()` adds a variable called `duration` based on the
sequence of timestamps. The basic logic is that the duration of a
visit is set to the time difference to the subsequent visit, unless
this difference exceeds a certain value (defined by argument
`cutoff`), in which case the duration will be replaced by `NA` or some
user-defined value (defined by `replace_by`).
- `add_session()` adds a variable called `session`, which groups
subsequent visits into a session until the difference to the next
visit exceeds a certain value (defined by `cutoff`).
- `extract_host()`, `extract_domain()`, `extract_path()` extracts the
host, domain and path of the raw URL and adds variables named
accordingly. See function descriptions for definitions of these terms.
`drop_query()` lets you drop the query and fragment components of the
raw URL.
- `add_next_visit()` and `add_previous_visit()` adds the previous or the
next URL, domain, or host (defined by `level`) as a new variable.
- `add_referral()` adds a new variable indicating whether a visit was
referred by a social media platform. Follows the logic of Schmidt et
al., [(2023)](https://doi.org/10.31235/osf.io/cks68).
- `add_title()` downloads the title of a website (the text within the
`<title>` tag of a web site’s `<head>`) and adds it as a new variable.
- `add_panelist_data()`. Joins a data set containing information about
participants such as a survey.
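
The duration and session rules described in the list above can be sketched in a few lines. The following is only an illustration of that logic in plain Python, not the package's implementation: the function names `visit_durations` and `session_ids`, the use of seconds as the unit, and the choice to leave the last visit's duration empty are all assumptions made for this example.

```python
from datetime import datetime


def visit_durations(timestamps, cutoff, replace_by=None):
    """Durations in seconds for one participant's visits (sorted ascending).

    A visit's duration is the gap to the next visit, unless the gap exceeds
    `cutoff`, in which case `replace_by` is used instead.
    Assumption: the last visit has no successor, so it gets None.
    """
    durations = []
    for current, following in zip(timestamps, timestamps[1:]):
        gap = (following - current).total_seconds()
        durations.append(gap if gap <= cutoff else replace_by)
    durations.append(None)
    return durations


def session_ids(timestamps, cutoff):
    """Session number per visit; a new session starts after a gap > cutoff."""
    ids = []
    current_id = 1
    for i, ts in enumerate(timestamps):
        if i > 0 and (ts - timestamps[i - 1]).total_seconds() > cutoff:
            current_id += 1
        ids.append(current_id)
    return ids


# Hypothetical visits of a single participant.
visits = [
    datetime(2024, 1, 1, 10, 0, 0),
    datetime(2024, 1, 1, 10, 0, 30),
    datetime(2024, 1, 1, 10, 20, 0),
    datetime(2024, 1, 1, 10, 21, 0),
]
durations = visit_durations(visits, cutoff=300)
sessions = session_ids(visits, cutoff=600)
```

With these made-up timestamps, the 19.5-minute gap between the second and third visit exceeds both cutoffs, so the second visit's duration is replaced and the third visit opens a new session.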

## Classification

- `classify_visits()` categorizes website visits by either extracting
the URL’s domain or host and matching them to a list of domains or
hosts, or by matching a list of regular expressions against the
visit URL.
- `classify_visits()` categorizes website visits by either extracting
the URL’s domain or host and matching them to a list of domains or
hosts, or by matching a list of regular expressions against the visit
URL.
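
The two matching strategies described above (lookup by host, or regular expressions against the full URL) might look roughly like this. This is a hedged sketch in plain Python rather than the package's code: `classify_by_host`, `classify_by_regex`, the lookup table, and the pattern list are hypothetical names and data introduced only for illustration. Domain-level matching is left out because it needs a public suffix list.

```python
import re
from urllib.parse import urlsplit


def classify_by_host(url, host_classes):
    """Look up the URL's host in a dict mapping host -> class label."""
    host = urlsplit(url).hostname  # lowercased host, or None if absent
    return host_classes.get(host)


def classify_by_regex(url, patterns):
    """Return the label of the first (pattern, label) pair matching the URL."""
    for pattern, label in patterns:
        if re.search(pattern, url):
            return label
    return None


# Hypothetical classification tables.
host_classes = {"www.example.com": "news"}
patterns = [(r"/sports/", "sports"), (r"/politics/", "politics")]

by_host = classify_by_host("https://www.example.com/article?id=1", host_classes)
by_regex = classify_by_regex("https://example.org/sports/results", patterns)
unmatched = classify_by_host("https://other.example.net/", host_classes)
```

Visits that match nothing come back as `None` here; how unmatched visits are represented in the package itself is not specified in the text above.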

## Summarizing and aggregating

- `deduplicate()` flags or drops (as defined by argument `method`)
consecutive visits to the same URL within a user-defined time frame
(as set by argument `within`). Alternatively to dropping or flagging
visits, the function aggregates the durations of such duplicate
visits.
- `sum_visits()` and `sum_durations()` aggregate the number or the
durations of visits, by participant and by a time period (as set by
argument `timeframe`). Optionally, the function aggregates the
number / duration of visits to a certain class of visits.
- `sum_activity()` counts the number of active time periods (defined
by `timeframe`) by participant.
- `deduplicate()` flags or drops (as defined by argument `method`)
consecutive visits to the same URL within a user-defined time frame
(as set by argument `within`). Alternatively to dropping or flagging
visits, the function aggregates the durations of such duplicate
visits.
- `sum_visits()` and `sum_durations()` aggregate the number or the
durations of visits, by participant and by a time period (as set by
argument `timeframe`). Optionally, the function aggregates the number
/ duration of visits to a certain class of visits.
- `sum_activity()` counts the number of active time periods (defined by
`timeframe`) by participant.
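
As a rough illustration of the deduplication and aggregation ideas above, here is a minimal sketch in plain Python. It is not the package's implementation: the function names, the `(panelist_id, timestamp, url)` tuple layout, the fixed "day" time period, and the choice to compare each visit with the participant's immediately preceding visit are assumptions made for this example.

```python
from collections import Counter, defaultdict
from datetime import datetime


def drop_consecutive_duplicates(rows, within):
    """Drop a visit if the same participant's previous visit had the same
    URL and happened at most `within` seconds earlier.

    `rows` are (panelist_id, timestamp, url) tuples sorted by time
    within each participant.
    """
    kept = []
    last_seen = {}  # panelist_id -> (timestamp, url) of the previous visit
    for panelist_id, ts, url in rows:
        prev = last_seen.get(panelist_id)
        is_duplicate = (
            prev is not None
            and prev[1] == url
            and (ts - prev[0]).total_seconds() <= within
        )
        if not is_duplicate:
            kept.append((panelist_id, ts, url))
        last_seen[panelist_id] = (ts, url)
    return kept


def visits_per_day(rows):
    """Number of visits by participant and calendar day."""
    return Counter((pid, ts.date().isoformat()) for pid, ts, _ in rows)


def active_days(rows):
    """Number of distinct days with at least one visit, by participant."""
    days = defaultdict(set)
    for pid, ts, _ in rows:
        days[pid].add(ts.date())
    return {pid: len(d) for pid, d in days.items()}


# Hypothetical tracking data.
rows = [
    ("a", datetime(2024, 1, 1, 10, 0, 0), "https://x.org"),
    ("a", datetime(2024, 1, 1, 10, 0, 1), "https://x.org"),  # repeat after 1 s
    ("a", datetime(2024, 1, 2, 9, 0, 0), "https://x.org"),
    ("b", datetime(2024, 1, 1, 8, 0, 0), "https://y.org"),
]
deduplicated = drop_consecutive_duplicates(rows, within=2)
daily_counts = visits_per_day(deduplicated)
activity = active_days(deduplicated)
```

Only the "drop" variant is shown; flagging duplicates or summing their durations would follow the same comparison but keep the row.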

## Example code
