% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/doc-pal-testthat.R
\name{pal_testthat}
\alias{pal_testthat}
\title{The testthat pal}
\description{
testthat 3.0.0 was released in 2020, bringing with it numerous changes
that were both huge quality of life improvements for package developers
and also highly breaking changes.
While some of the work of converting legacy unit testing code to testthat
3e is pretty straightforward, other parts can be quite tedious.
The testthat pal helps you transition your R package's unit tests to
the third edition of testthat, namely via:
\itemize{
\item Converting to snapshot tests
\item Disentangling nested expectations
\item Transitioning from deprecated functions like \verb{expect_known_*()}
}
}
\section{Cost}{
The system prompt for the testthat pal contains something like 1,000 tokens.
Add in (a generous) 100 tokens for the highlighted code that's actually
sent off to the model and you're looking at 1,100 input tokens.
The model returns roughly as many tokens as the code it's sent, so call
that 100 output tokens per refactor.
As of the time of writing (October 2024), the default pal model, Claude
3.5 Sonnet, costs $3 per million input tokens and $15 per million output
tokens. So, using the default model,
\strong{testthat pals cost around $5 for every 1,000 refactored pieces of code}. GPT-4o
Mini, by contrast, doesn't tend to get many pieces of formatting right and
often fails to line-break properly, but \emph{does} usually return syntactically
valid calls to testthat functions, and it would cost around
20 cents per 1,000 refactored pieces of code.
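The arithmetic above can be sanity-checked with a quick R sketch. The token
counts and prices are the rough assumptions stated in this section, not
measured values:

```r
# Back-of-the-envelope cost estimate using the figures from this section.
input_tokens  <- 1000 + 100  # system prompt (~1,000) + highlighted code (~100)
output_tokens <- 100         # output roughly mirrors the code sent
price_in  <- 3 / 1e6         # dollars per input token (Claude 3.5 Sonnet)
price_out <- 15 / 1e6        # dollars per output token

cost_per_1000 <- 1000 * (input_tokens * price_in + output_tokens * price_out)
cost_per_1000
#> [1] 4.8
```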
}
\section{Gallery}{
This section includes a handful of examples
"\href{https://github.com/tidymodels/broom/tree/7fa26488ab522bf577092e99aad1f2003f21b327/tests}{from}
the \href{https://github.com/tidymodels/tune/tree/f8d734ac0fa981fae3a87ed2871a46e9c40d509d/tests}{wild}",
generated with the default model, Claude 3.5 Sonnet.
Testthat pals convert \code{expect_error()} (and \verb{*_warning()} and \verb{*_message()}
and \verb{*_condition()}) calls to use \code{expect_snapshot()} when there's a
regular expression present:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_warning(
check_ellipses("exponentiate", "tidy", "boop", exponentiate = TRUE, quick = FALSE),
"\\\\`exponentiate\\\\` argument is not supported in the \\\\`tidy\\\\(\\\\)\\\\` method for \\\\`boop\\\\` objects"
)
}\if{html}{\out{</div>}}
Returns:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_snapshot(
.res <- check_ellipses(
"exponentiate", "tidy", "boop", exponentiate = TRUE, quick = FALSE
)
)
}\if{html}{\out{</div>}}
Note, as well, that intermediate results are assigned to an object so that
they aren't snapshotted when their contents weren't previously tested.
Another example, with multiple redundant calls:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{augment_error <- "augment is only supported for fixest models estimated with feols, feglm, or femlm"
expect_error(augment(res_fenegbin, df), augment_error)
expect_error(augment(res_feNmlm, df), augment_error)
expect_error(augment(res_fepois, df), augment_error)
}\if{html}{\out{</div>}}
Returns:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_snapshot(error = TRUE, augment(res_fenegbin, df))
expect_snapshot(error = TRUE, augment(res_feNmlm, df))
expect_snapshot(error = TRUE, augment(res_fepois, df))
}\if{html}{\out{</div>}}
They know about \code{regexp = NA}, which means "no error" (or warning, or message):
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_error(
p4_b <- check_parameters(w4, p4_a, data = mtcars),
regexp = NA
)
}\if{html}{\out{</div>}}
Returns:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_no_error(p4_b <- check_parameters(w4, p4_a, data = mtcars))
}\if{html}{\out{</div>}}
They also know not to adjust calls to those condition expectations when
there's a \code{class} argument present (which usually means that one is
testing a condition from another package, which should be able to change
the wording of the message without consequence):
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_error(tidy(pca, matrix = "u"), class = "pca_error")
}\if{html}{\out{</div>}}
Returns:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_error(tidy(pca, matrix = "u"), class = "pca_error")
}\if{html}{\out{</div>}}
When converting non-erroring code, testthat pals will assign intermediate
results so as not to snapshot both the result and the warning:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_warning(
tidy(fit, robust = TRUE),
'"robust" argument has been deprecated'
)
}\if{html}{\out{</div>}}
Returns:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_snapshot(
.res <- tidy(fit, robust = TRUE)
)
}\if{html}{\out{</div>}}
Nested expectations can generally be disentangled without issue:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_equal(
fit_resamples(decision_tree(cost_complexity = 1), bootstraps(mtcars)),
expect_warning(tune_grid(decision_tree(cost_complexity = 1), bootstraps(mtcars)))
)
}\if{html}{\out{</div>}}
Returns:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_snapshot(\{
fit_resamples_result <- fit_resamples(decision_tree(cost_complexity = 1),
bootstraps(mtcars))
tune_grid_result <- tune_grid(decision_tree(cost_complexity = 1),
bootstraps(mtcars))
\})
expect_equal(fit_resamples_result, tune_grid_result)
}\if{html}{\out{</div>}}
There are also a few edits the pal knows to make to third-edition code.
For example, it transitions \code{expect_snapshot_error()} and friends to
use \code{expect_snapshot(error = TRUE)} so that the error context is snapshotted
in addition to the message itself:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_snapshot_error(
fit_best(knn_pca_res, parameters = tibble(neighbors = 2))
)
}\if{html}{\out{</div>}}
Returns:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{expect_snapshot(
error = TRUE,
fit_best(knn_pca_res, parameters = tibble(neighbors = 2))
)
}\if{html}{\out{</div>}}
}
\section{Interfacing manually with the testthat pal}{
Pals are typically used via the pal addin. To call the testthat
pal directly, use:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{pal_testthat <- .init_pal("testthat")
}\if{html}{\out{</div>}}
Then, to submit a query, run:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{pal_testthat$chat(\{x\})
}\if{html}{\out{</div>}}
}