% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/createTimePoints.R
\name{createTimePoints}
\alias{createTimePoints}
\title{Create an object of class TP}
\usage{
createTimePoints(
dat,
experimentName,
genotype,
timePoint,
timeFormat = NULL,
plotId,
repId = NULL,
rowNum = NULL,
colNum = NULL,
addCheck = FALSE,
checkGenotypes = NULL
)
}
\arguments{
\item{dat}{A data.frame.}
\item{experimentName}{A character string, the name of the experiment. Stored
with the data and used in default plot titles.}
\item{genotype}{A character string indicating the column in dat containing
the genotypes.}
\item{timePoint}{A character string indicating the column in dat containing
the time points.}
\item{timeFormat}{A character string indicating the input format of the time
points. E.g. for a date/time input of the form day/month/year hour:minute,
use "\%d/\%m/\%Y \%H:\%M". For a full list of abbreviations see
\code{\link[base]{strptime}}. If \code{NULL}, a best guess is made based on
the input.}
\item{plotId}{A character string indicating the column in dat containing
the plotId. This has to be a unique identifier per plot/plant per time point.}
\item{repId}{A character string indicating the column in dat containing
the replicates.}
\item{rowNum}{A character string indicating the column in dat containing
the row number of the plot.}
\item{colNum}{A character string indicating the column in dat containing
the column number of the plot.}
\item{addCheck}{Should a column named check be added to the output? If
\code{TRUE}, \code{checkGenotypes} cannot be \code{NULL}.}
\item{checkGenotypes}{A character vector containing the genotypes used as
checks in the experiment.}
}
\value{
An object of class \code{TP}. A list with, per time point in the
input, a data.frame containing the data for that time point. A data.frame
with columns timeNumber and timePoint is added as attribute timePoints to
the data. This data.frame can be used for referencing timePoints by their
number.
}
\description{
Convert a data.frame to an object of class TP (Time Points).
The conversion consists of the following steps:
\itemize{
\item{Quality control on the input data. For example, warnings will be given
when more than 50\% of observations are missing for a plant.}
\item{Rename columns to default column names used by the functions in the
statgenHTP package. For example, the column in the data containing
variety/accession/genotype is renamed to \code{genotype}. Original column names
are stored as an attribute of the individual data.frames in the TP object.}
\item{Convert column types to the default column types. For example, the
column \code{genotype} is converted to a factor and \code{rowNum} to a numeric column.}
\item{Convert the column containing time into time format. If needed, the
time format can be provided in \code{timeFormat}. For example, for a
date/time input of the form day/month/year hour:minute, use
"\%d/\%m/\%Y \%H:\%M". For a full list of abbreviations see
\code{\link[base]{strptime}}. When the input time is numeric, the function
converts it to a time counted from 01-01-1970.}
\item{If \code{addCheck = TRUE}, the genotypes listed in
\code{checkGenotypes} are treated as reference (check) genotypes. A column
check is added, containing the genotype name for genotypes in
\code{checkGenotypes} and "noCheck" for all other genotypes. A column
genoCheck is also added, containing the genotype name for genotypes not in
\code{checkGenotypes} and \code{NA} for the check genotypes. These columns
are needed for fitting models on data that include check genotypes, e.g.
reference genotypes that are highly replicated or used in an augmented
design.}
\item{Split the data into separate data.frames by time point. A TP object is
a list of data.frames where each data.frame contains the data for a single
time point. If there is only one time point the output will be a list with
only one item.}
\item{Add a data.frame with columns timeNumber and timePoint as attribute
\code{timePoints} to the TP object. This data.frame can be used for referencing
time points by a unique number.}
}
Note that \code{plotId} needs to be a unique identifier for a plot or a
plant. It cannot occur more than once per time point.
}
\examples{
## Create a TP object containing the data from the Phenovator.
phenoTP <- createTimePoints(dat = PhenovatorDat1,
                            experimentName = "Phenovator",
                            genotype = "Genotype",
                            timePoint = "timepoints",
                            repId = "Replicate",
                            plotId = "pos",
                            rowNum = "y", colNum = "x",
                            addCheck = TRUE,
                            checkGenotypes = c("check1", "check2",
                                               "check3", "check4"))
summary(phenoTP)
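
## Illustration only, in base R with made-up data: a minimal sketch of the
## check and genoCheck columns and of the split by time point described
## above. This is not the package's implementation; toyDat, checkGenos and
## all values are hypothetical.
toyDat <- data.frame(genotype = c("check1", "G1", "G2", "check1", "G1", "G2"),
                     timePoint = rep(c("2018-06-01", "2018-06-02"), each = 3),
                     stringsAsFactors = FALSE)
checkGenos <- "check1"
isCheck <- is.element(toyDat$genotype, checkGenos)
## check: genotype name for check genotypes, "noCheck" for all others.
toyDat$check <- ifelse(isCheck, toyDat$genotype, "noCheck")
## genoCheck: genotype name for non-check genotypes, NA for check genotypes.
toyDat$genoCheck <- ifelse(isCheck, NA_character_, toyDat$genotype)
## One data.frame per time point, analogous to the list structure of a TP
## object.
toyList <- split(toyDat, toyDat$timePoint)
names(toyList)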
}
\seealso{
Other functions for data preparation:
\code{\link{as.data.frame.TP}()},
\code{\link{getTimePoints}()},
\code{\link{plot.TP}()},
\code{\link{removeTimePoints}()},
\code{\link{summary.TP}()}
}
\concept{functions for data preparation}