From 7737a05b9f1155ab1ef8fa95042030b915830f53 Mon Sep 17 00:00:00 2001
From: schochastics
Date: Mon, 23 Sep 2024 20:46:06 +0200
Subject: [PATCH] fixed badge

---
 README.Rmd |  2 +-
 README.md  | 96 ++++++++++++++++++++++++++----------------------------
 2 files changed, 48 insertions(+), 50 deletions(-)

diff --git a/README.Rmd b/README.Rmd
index 68e028b..54ec926 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -17,7 +17,7 @@ knitr::opts_chunk$set(
 [![CRAN status](https://www.r-pkg.org/badges/version/webtrackR)](https://CRAN.R-project.org/package=webtrackR)
-[![CRAN Downloads](http://cranlogs.r-pkg.org/badges/webtrackR)](https://CRAN.R-project.org/package=webtrackR)
+[![CRAN Downloads](https://cranlogs.r-pkg.org/badges/webtrackR)](https://CRAN.R-project.org/package=webtrackR)
 [![R-CMD-check](https://github.com/schochastics/webtrackR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/schochastics/webtrackR/actions/workflows/R-CMD-check.yaml)
 [![Codecov test coverage](https://codecov.io/gh/schochastics/webtrackR/branch/main/graph/badge.svg)](https://app.codecov.io/gh/schochastics/webtrackR?branch=main)
diff --git a/README.md b/README.md
index 4cf4020..31dea77 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@
 [![CRAN
 status](https://www.r-pkg.org/badges/version/webtrackR)](https://CRAN.R-project.org/package=webtrackR)
 [![CRAN
-Downloads](http://cranlogs.r-pkg.org/badges/webtrackR)](https://CRAN.R-project.org/package=webtrackR)
+Downloads](https://cranlogs.r-pkg.org/badges/webtrackR)](https://CRAN.R-project.org/package=webtrackR)
 [![R-CMD-check](https://github.com/schochastics/webtrackR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/schochastics/webtrackR/actions/workflows/R-CMD-check.yaml)
 [![Codecov test
 coverage](https://codecov.io/gh/schochastics/webtrackR/branch/main/graph/badge.svg)](https://app.codecov.io/gh/schochastics/webtrackR?branch=main)
@@ -22,11 +22,11 @@ survey data of the same participants.
 `webtrackR` is part of a series of R packages to analyse webtracking
 data:
 
--   [webtrackR](https://github.com/schochastics/webtrackR): preprocess
-    raw webtracking data
--   [domainator](https://github.com/schochastics/domainator): classify
-    domains
--   [adaR](https://github.com/gesistsa/adaR): parse urls
+- [webtrackR](https://github.com/schochastics/webtrackR): preprocess raw
+  webtracking data
+- [domainator](https://github.com/schochastics/domainator): classify
+  domains
+- [adaR](https://github.com/gesistsa/adaR): parse urls
 
 ## Installation
 
@@ -54,9 +54,9 @@ method are included in the package.
 Each row in a web tracking data set represents a visit. Raw data need to
 have at least the following variables:
 
--   `panelist_id`: the individual from which the data was collected
--   `url`: the URL of the visit
--   `timestamp`: the time of the URL visit
+- `panelist_id`: the individual from which the data was collected
+- `url`: the URL of the visit
+- `timestamp`: the time of the URL visit
 
 The function `as.wt_dt` assigns the class `wt_dt` to a raw web tracking
 data set. It also allows you to specify the name of the raw variables
@@ -71,52 +71,50 @@ Otherwise an error is thrown.
 Several other variables can be derived from the raw data with the
 following functions:
 
--   `add_duration()` adds a variable called `duration` based on the
-    sequence of timestamps. The basic logic is that the duration of a
-    visit is set to the time difference to the subsequent visit, unless
-    this difference exceeds a certain value (defined by argument
-    `cutoff`), in which case the duration will be replaced by `NA` or
-    some user-defined value (defined by `replace_by`).
--   `add_session()` adds a variable called `session`, which groups
-    subsequent visits into a session until the difference to the next
-    visit exceeds a certain value (defined by `cutoff`).
--   `extract_host()`, `extract_domain()`, `extract_path()` extracts the
-    host, domain and path of the raw URL and adds variables named
-    accordingly. See function descriptions for definitions of these
-    terms. `drop_query()` lets you drop the query and fragment
-    components of the raw URL.
--   `add_next_visit()` and `add_previous_visit()` adds the previous or
-    the next URL, domain, or host (defined by `level`) as a new
-    variable.
--   `add_referral()` adds a new variable indicating whether a visit was
-    referred by a social media platform. Follows the logic of Schmidt et
-    al., [(2023)](https://doi.org/10.31235/osf.io/cks68).
--   `add_title()` downloads the title of a website (the text within the
-    `` tag of a web site’s `<head>`) and adds it as a new
-    variable.
--   `add_panelist_data()`. Joins a data set containing information about
-    participants such as a survey.
+- `add_duration()` adds a variable called `duration` based on the
+  sequence of timestamps. The basic logic is that the duration of a
+  visit is set to the time difference to the subsequent visit, unless
+  this difference exceeds a certain value (defined by argument
+  `cutoff`), in which case the duration will be replaced by `NA` or some
+  user-defined value (defined by `replace_by`).
+- `add_session()` adds a variable called `session`, which groups
+  subsequent visits into a session until the difference to the next
+  visit exceeds a certain value (defined by `cutoff`).
+- `extract_host()`, `extract_domain()`, `extract_path()` extracts the
+  host, domain and path of the raw URL and adds variables named
+  accordingly. See function descriptions for definitions of these terms.
+  `drop_query()` lets you drop the query and fragment components of the
+  raw URL.
+- `add_next_visit()` and `add_previous_visit()` adds the previous or the
+  next URL, domain, or host (defined by `level`) as a new variable.
+- `add_referral()` adds a new variable indicating whether a visit was
+  referred by a social media platform. Follows the logic of Schmidt et
+  al., [(2023)](https://doi.org/10.31235/osf.io/cks68).
+- `add_title()` downloads the title of a website (the text within the
+  `<title>` tag of a web site’s `<head>`) and adds it as a new variable.
+- `add_panelist_data()`. Joins a data set containing information about
+  participants such as a survey.
 
 ## Classification
 
--   `classify_visits()` categorizes website visits by either extracting
-    the URL’s domain or host and matching them to a list of domains or
-    hosts, or by matching a list of regular expressions against the
-    visit URL.
+- `classify_visits()` categorizes website visits by either extracting
+  the URL’s domain or host and matching them to a list of domains or
+  hosts, or by matching a list of regular expressions against the visit
+  URL.
 
 ## Summarizing and aggregating
 
--   `deduplicate()` flags or drops (as defined by argument `method`)
-    consecutive visits to the same URL within a user-defined time frame
-    (as set by argument `within`). Alternatively to dropping or flagging
-    visits, the function aggregates the durations of such duplicate
-    visits.
--   `sum_visits()` and `sum_durations()` aggregate the number or the
-    durations of visits, by participant and by a time period (as set by
-    argument `timeframe`). Optionally, the function aggregates the
-    number / duration of visits to a certain class of visits.
--   `sum_activity()` counts the number of active time periods (defined
-    by `timeframe`) by participant.
+- `deduplicate()` flags or drops (as defined by argument `method`)
+  consecutive visits to the same URL within a user-defined time frame
+  (as set by argument `within`). Alternatively to dropping or flagging
+  visits, the function aggregates the durations of such duplicate
+  visits.
+- `sum_visits()` and `sum_durations()` aggregate the number or the
+  durations of visits, by participant and by a time period (as set by
+  argument `timeframe`). Optionally, the function aggregates the number
+  / duration of visits to a certain class of visits.
+- `sum_activity()` counts the number of active time periods (defined by
+  `timeframe`) by participant.
 
 ## Example code