diff --git a/DESCRIPTION b/DESCRIPTION
index 6f2ac48..6b5f743 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,14 +1,14 @@
 Package: grafzahl
 Title: Supervised Machine Learning for Textual Data Using Transformers and 'Quanteda'
-Version: 0.0.7.9999
+Version: 0.0.8
 Authors@R: person("Chung-hong", "Chan", , "chainsawtiney@gmail.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0002-6232-7530"))
-Description: Duct tape the 'quanteda' ecosystem (Benoit et al., 2018) to modern Transformer-based text classification models (Wolf et al., 2020) , in order to facilitate supervised machine learning for textual data. This package mimics the behaviors of 'quanteda.textmodels' and provides a function to setup the 'Python' environment to use the pretrained models from 'Hugging Face' .
+Description: Duct tape the 'quanteda' ecosystem (Benoit et al., 2018) to modern Transformer-based text classification models (Wolf et al., 2020) , in order to facilitate supervised machine learning for textual data. This package mimics the behaviors of 'quanteda.textmodels' and provides a function to setup the 'Python' environment to use the pretrained models from 'Hugging Face' . More information: .
 License: GPL (>= 3)
 Encoding: UTF-8
 Roxygen: list(markdown = TRUE)
-RoxygenNote: 7.2.1
+RoxygenNote: 7.2.3
 URL: https://github.com/chainsawriot/grafzahl
 BugReports: https://github.com/chainsawriot/grafzahl/issues
 Suggests:
diff --git a/README.Rmd b/README.Rmd
index 1d40566..5a4ba84 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -25,7 +25,7 @@ If you don't know what I am talking about, don't worry, this package is gracious
 
 Please cite this software as:
 
-Chan, C., (2023). [grafzahl: fine-tuning Transformers for text data from within R](paper/grafzahl_sp.pdf). *Computational Communication Research* (Accepted)
+Chan, C., (2023). [grafzahl: fine-tuning Transformers for text data from within R](paper/grafzahl_sp.pdf). *Computational Communication Research* 5(1): 76-84. [https://doi.org/10.5117/CCR2023.1.003.CHAN](https://doi.org/10.5117/CCR2023.1.003.CHAN)
 
 ## Installation
 
diff --git a/README.md b/README.md
index facb1e1..9c3e83c 100644
--- a/README.md
+++ b/README.md
@@ -26,7 +26,7 @@ Please cite this software as:
 
 Chan, C., (2023). [grafzahl: fine-tuning Transformers for text data from
 within R](paper/grafzahl_sp.pdf). *Computational Communication Research*
-(Accepted)
+5(1): 76-84.
 
 ## Installation
 
diff --git a/inst/CITATION b/inst/CITATION
index 30e5284..16f1435 100644
--- a/inst/CITATION
+++ b/inst/CITATION
@@ -5,6 +5,9 @@ bibentry(bibtype = "article",
 title = "grafzahl: fine-tuning Transformers for text data from within R.",
 journal = "Computational Communication Research",
 author = c(person("Chung-hong", "Chan")),
-url = "https://github.com/chainsawriot/grafzahl",
+doi = "10.5117/CCR2023.1.003.CHAN",
+volume = 5,
+number = 1,
+pages = "76-84",
 year = 2023
 )
diff --git a/paper/grafzahl_sp.qmd b/paper/grafzahl_sp.qmd
index fd866ea..049f682 100644
--- a/paper/grafzahl_sp.qmd
+++ b/paper/grafzahl_sp.qmd
@@ -152,7 +152,7 @@ The out-of-sample F1 measures of the fine-tuned model are .76, .67, and .72 (vs
 #| echo: false
 #| fig.cap: Learning curve of machine learning algorithms
 #| label: fig-fig1
-readRDS(here::here("learning.RDS"))
+readRDS(here::here("paper/learning.RDS"))
 ```
 
 [^caret]: The function `confusionMatrix()` can accept the predicted values and ground truth directly, without using `table()` first. But the predicted values and ground truth must be `factor`: `confusionMatrix(as.factor(predicted_sentiment), as.factor(docvars(test_corpus, "value")), mode = "prec_recall")`.