########################
version 1.1 (2016-01-24)
########################
Initial release, the sun shines and sommer has arrived!
########################
version 1.2 (2016-02-07)
########################
+ 'fdr' function bug was fixed
+ addition of the 'randef' function
+ addition of the converter 'atcg1234' function
+ names added to the BLUPs (random effects)
+ zero-boundary constraint added to Average Information algorithm
- it finds which var.comps are constantly pushed to zero
- recalculates the variance components after removing such components
- fixes those values and calculates the most likely value for
the problematic var.comp
+ now 'mmer2' can handle missing data in explanatory variables, as lmer does
+ now summary of 'mmer2' has names in the variance components
+ A.mat, D.mat and E.mat supported for polyploids
+ mmer can run GWAS for polyploid organisms
- the models implemented are the same as in Rosyara (2016):
- "additive","1-dom-alt","1-dom-ref","2-dom-alt","2-dom-ref"
+ eigen decomposition to accelerate genomic prediction based on Lee (2015)
has been added through the argument 'MTG2' of the AI, mmer and mmer2 algorithms
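To illustrate the kind of letter-to-number conversion a function like 'atcg1234' performs, here is a minimal Python sketch. It is not the package's implementation: the function name letters_to_numeric, the diploid -1/0/1 coding, and the choice of the most frequent allele as reference are all assumptions made for illustration.

```python
from collections import Counter

def letters_to_numeric(calls):
    """Code biallelic diploid calls for one marker (e.g. "AA", "AG", "GG").

    Assumptions of this sketch (hypothetical, not sommer's rules):
    - the most frequent allele is the reference (ties broken alphabetically)
    - coding is (number of reference alleles) - 1, giving -1, 0 or 1
    - missing calls (None or "") are returned as None
    """
    counts = Counter(allele for call in calls if call for allele in call)
    if not counts:
        return [None] * len(calls)
    # Most frequent allele first; alphabetical order makes ties deterministic.
    ref = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0][0]
    coded = []
    for call in calls:
        if not call:
            coded.append(None)
        else:
            n_ref = sum(1 for allele in call if allele == ref)
            coded.append(n_ref - 1)
    return coded

marker = ["AA", "AG", "GG", "AA", None]
print(letters_to_numeric(marker))
```

For polyploids the dosage would range over more values, so the fixed "- 1" shift above only makes sense for the diploid case.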
########################
version 1.3 (2016-03-09)
########################
+ The 'bag' function for bagging-GBLUP from Abdollahi-Arpanahi et al. (2015)
has been added:
- The function takes a model fitted and creates a bag matrix with
the top markers (most significant) and creates a design matrix
to be used as fixed effects in the GBLUP model to increase
prediction accuracy.
+ 'bag' function has been equipped with stepwise selection to make sure that markers
selected by "clustering" or "maximum" p.values methods provide at least a minimum
increase in the prediction accuracy.
+ The Fisher Information matrix can be returned from the mmer function when
the AI is used (default) but the argument 'Fishers' needs to be set to TRUE.
+ The bug for the AI algorithm when one var.comp and K and Z are diagonal has been
fixed by changing to EMMA in this naive situation.
+ AI algorithm has been debugged to return the most likely variance components when the
likelihood takes values around the maximum in a zig zag pattern. It just takes the values
where the maximum likelihood was found. When the likelihood follows a rise-and-drop pattern
the program will do the same. A warning message is emitted.
+ GWAS modality of 'mmer' now adds the marker names to each score to keep track of
the value for each marker.
########################
version 1.4 (2016-03-24)
########################
+ The AI algorithm will take 5 EM steps if after 10 iterations (AI) the likelihood drops
suddenly, indicating that initial values were too far from real values causing a bad
behavior of the likelihood. The EM steps aim to provide initial values for those
problematic variance components. This applies only to mmer2.
+ The AI likelihood behaving in a zig zag pattern is detected only after 10 iterations
and we opted for returning the ML estimators.
+ Minor bugs have been fixed (names in random terms). In addition, ordering random effects
based on their degrees of freedom has been implemented to provide more stability to the
AI algorithm.
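The zig zag handling described above (detect an oscillating likelihood, then return the estimates from the iteration with the highest likelihood) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the package's code: the function names, the sign-alternation test, and the window of 4 changes are hypothetical choices.

```python
def is_zigzag(loglik_trace, window=4):
    """Return True when the last `window` likelihood changes alternate in sign.

    The window length is an assumption of this sketch.
    """
    diffs = [b - a for a, b in zip(loglik_trace, loglik_trace[1:])][-window:]
    if len(diffs) < window:
        return False
    return all(d1 * d2 < 0 for d1, d2 in zip(diffs, diffs[1:]))

def best_iteration(loglik_trace, estimates_trace):
    """Return (index, estimates) of the iteration with the highest likelihood."""
    best = max(range(len(loglik_trace)), key=loglik_trace.__getitem__)
    return best, estimates_trace[best]

# Toy trace: the log-likelihood climbs, then oscillates around its maximum.
logliks = [-120.0, -110.0, -104.0, -106.0, -103.5, -105.0, -104.2]
estimates = [(5.0, 1.0), (4.0, 1.5), (3.2, 1.9), (3.6, 1.7),
             (3.1, 2.0), (3.5, 1.8), (3.3, 1.9)]

if is_zigzag(logliks):
    index, varcomps = best_iteration(logliks, estimates)
    print("zig zag detected; returning iteration", index, varcomps)
```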
########################
version 1.5 (2016-05-03)
########################
+ Problem of "not-full rank X matrix" has been solved by reducing the X matrix column by
column until the problem is solved.
+ A new allele code for deletions, "-", has been added to the atcg1234 function
+ More examples have been added to the software, including maize Technow et al. (2015) data
and rice Zhao et al. (2011)
+ We have opted for using the "EM" algorithm as default for the mmer2 function given the fact
that most users using the mmer2 function would not use covariance structures. The mmer keeps
using the AI algorithm as default method.
+ Implementation of the initial version of TP.prep function designed to select the best
training population for genomic selection.
########################
version 1.6 (2016-05-19)
########################
+ Addition of the Newton-Raphson algorithm to sommer.
+ Vignettes with several examples available by typing vignette("sommer").
+ Bug in Principal Components within the function TP.prep has been fixed.
+ Example of GWAS for single cross hybrids has been added to the Technow data examples.
+ Bug on map.plot function for genetic maps fixed when having a column "Locus" as factor
+ Addition of winter bean data "FDdata" to exemplify the full diallel design calculation
+ mmer2 function now is fully functional as mmer but using data frames. All parameters
can be manipulated as mmer except for the hdm function for half diallels.
+ The program will subset W if some markers are not polymorphic
+ beeper has been added to the program to choose what sound to play when the program
is done with the calculations.
+ Now genetic maps can be added for a better display of the manhattan plots.
+ QQ plots are now displayed when doing GWAS for different models.
########################
version 1.7 (2016-05-23)
########################
+ Now the AI algorithm checks the situation when there are random effects with n < p
(fewer observations than parameters to estimate) to take the right values in AI
when extreme variance components values are found.
+ Bug in eigen decomposition of AI has been fixed by providing smaller starting values for
the variance components
+ In addition, when n < p is present the initial values of the variance components are
reduced significantly to provide better starting values.
+ New bug in the FDR function fixed. Now GWAS with or without map displays the FDR line.
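As background for the FDR line drawn on GWAS plots, the sketch below computes a Benjamini-Hochberg significance threshold and its height on the -log10(p) axis. It is an illustration only: the changelog does not state which FDR procedure the package's function uses, and the names bh_threshold and fdr_line are hypothetical.

```python
import math

def bh_threshold(pvalues, alpha=0.05):
    """Largest sorted p-value p_(k) satisfying p_(k) <= (k / m) * alpha.

    Returns None when no p-value passes (Benjamini-Hochberg step-up rule).
    """
    m = len(pvalues)
    threshold = None
    for k, p in enumerate(sorted(pvalues), start=1):
        if p <= k / m * alpha:
            threshold = p
    return threshold

def fdr_line(pvalues, alpha=0.05):
    """Height of the threshold on a Manhattan plot's -log10(p) axis."""
    t = bh_threshold(pvalues, alpha)
    return None if t is None else -math.log10(t)

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(bh_threshold(pvals), fdr_line(pvals))
```

Markers whose -log10(p) lies on or above the line would be declared significant at the chosen FDR level.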
########################
version 1.8 (2016-06-06)
########################
+ Minor changes, mainly in plotting defaults, warning messages, etc.
+ Addition of the phase.F1 function to create parental maps for F1(CP) crosses.
+ Addition of F1geno data accompanying phase.F1 function.
+ Argument MTG2 has been renamed as EIGEND.
+ EIGEND feature added to EMMA algorithm.
+ Bugs for errors when V(e) tends to zero have been fixed for all algorithms, warning
messages have changed.
+ The function for finding biggest peaks "maxi.qtl" has been added.
+ Multiple responses can be fit using parallelization with the 'n.cores' argument
########################
version 1.9 (2016-07-01)
########################
+ lmerHELP has been slightly modified. Now only uses lmer initial values when required
and only used for non-square random effects.
+ AI has been modified to diminish update values when they escalate too quickly.
+ zig zag detection in AI2 has been implemented, now even when boundary constraint is
applied there should not be any problem.
+ Now 'bag' function respects if a map is available and does the bagging based on the map.
+ 'manhattan' function added to the package.
+ 'eigenGWAS' function included based on Chen et al. (2016) in Heredity.
+ Bug in AI failing when only one variance component was present and was zero, making the
program collapse when trying to recalculate, has been fixed.
+ Bugs in 'TP.prep' function fixed.
+ Now 'EMMA' algorithm handles multiple random effects.
+ The first multivariate algorithm "EMMAM" has been added to sommer (a single random effect).
+ Vignettes have examples for multi-response model and parallel univariate models.
+ 'bag' function has changed name to 'hits' to avoid confusion of selecting top 10 hits with
Bootstrap aggregation (bagging).
########################
version 2.0 (2016-08-01)
########################
+ Multivariate algorithms "M-EMMA", "M-AI", "M-NR" have been implemented.
+ Minor bugs fixed.
+ Multivariate GWAS implemented for any of the methods.
+ The function "mmer2" can run multivariate and parallel models as well.
+ Bug for parallel models fixed.
+ Eigen decomposition of additive relationship matrix implemented for
multivariate models in "NR" and "AI".
########################
version 2.1 (2016-09-01)
########################
+ Standard errors and Z ratios added to the summary of mixed models.
+ Addition of functionality for residual structures, only AR1, CS, and ARMA
supported. The most flexible functions "MNR" and "NRR" are not able to deal
with missing data so you should be careful because the program imputes with
the mean.
+ atcg1234 has a new functionality to get a presence/absence matrix for each
allele at each marker.
########################
version 2.2 (2016-10-01)
########################
+ The LD.decay function has been added to sommer which takes a marker matrix
and a map and calculates the LD decay.
+ In addition, the LD.decay function has the "unlinked" and "gamma" arguments
that estimate the interchromosomal threshold for the gamma percentile to
determine the real LD decay combined with the loess regression.
+ Bugs in GWAS models for univariate and multivariate forms when missing data
exist have been fixed.
########################
version 2.3 (2016-11-01)
########################
+ The nearest neighbor function 'nna' has been added to adjust for neighbouring
plots based on Lado et al. (2013).
+ The dataset 'ExpDesigns' with several datasets to teach users how to analyze
certain experimental designs relevant to plant breeding has been added.
+ A fatal bug in mmer2 and mmer of imputing data when missing data existed has
been corrected
########################
version 2.4 (2016-12-01)
########################
+ The IMP argument has been added to the function to allow the user to decide if the
Y matrix should be imputed or get rid of missing values when estimating the variance
components in the multivariate mixed models (only). The default is FALSE which means
that the missing values are removed.
+ The gryphon dataset was included in the package to provide some help to the users
that want to see how pedigree data is used
########################
version 2.5 (2017-01-01)
########################
+ Bugs in fdr function fixed
+ Q+K model univariate fixed
+ Q+K model multivariate enabled
+ EIGEND feature added to NR method
########################
version 2.6 (2017-03-01)
########################
+ 'imp' argument added to the atcg1234() function to allow users to avoid imputation
+ Covariance structures for the residual component are no longer supported.
########################
version 2.7 (2017-05-01)
########################
+ at(.), diag(.), and(.), g(.) functions added to be used in mmer2.
+ NR algorithm updated to deal with multiple variance components equal to zero
########################
version 2.8 (2017-06-01)
########################
+ addition of the pin function to the package
+ issue with Year:g(id) type of arguments solved
+ more efficient EM algorithm; covariance matrices are inverted only once
+ EM fixed so that it does not calculate statistics at the end by direct inversion
+ rcov argument with options 'units' and 'at(.):units' enabled
+ AI MME-based added to sommer
+ new argument DI=TRUE/FALSE for deciding between MME-based and
Direct inversion algorithms
########################
version 2.9 (2017-07-01)
########################
+ good versions of blocker and fill.design implemented
+ bug fixed in PEV for D-AI algorithm
+ minor bug fixes from summary functions in parallel models
+ going back to zero constraint where a model is fitted again without the random
effect close to the boundary.
########################
version 3.0 (2017-09-01)
########################
+ D.mat function now provides 3 different calculation methods
+ cleaner version of NR and AI algorithms
+ fixed EIGEND feature
+ AI algorithms now take EM steps for better initial var.comp values
+ MME-based algorithms removed from mmer functions
+ removal of several functions
+ GWAS functionality moved to separate functions called GWAS and GWAS2
+ change of examples to show the flexibility
+ implementation of us(trait) and diag(trait) functionalities in mmer2
########################
version 3.1 (2017-11-01)
########################
+ pin function improved so that it no longer works with scaled parameters
+ dominance relationship matrices corrected
+ general maintenance to several algorithms for minor bugs
+ constrained parameters in multivariate models not included in the summary as
expected
+ better documentation about multivariate models
########################
version 3.2 (2018-01-01)
########################
+ variogram and plot.variogram functions enabled to visualize the residuals or the
spatial model fitted (in case of 2D spline models)
########################
version 3.3 (2018-03-01)
########################
+ spl2D function to fit 2 dimensional spline model has been added and documented
+ spatPlots function added to visualize the fitted values and residuals
+ now the mmer and mmer2 function can use weights
########################
version 3.4 (2018-06-01)
########################
+ and() function has been replaced by the overlay() function effectively.
+ spl2D() can be used directly in the formula solver mmer2().
########################
version 3.5 (2018-07-01)
########################
+ h2.fun returns now corrected heritabilities for Oakey (2006) and Cullis (2006) formulas.
########################
version 3.6 (2018-09-01)
########################
+ some documentation changes
+ almost finished c++ version, just wait for it.
########################
version 3.7 (2018-09-01)
########################
+ C++ (Armadillo library) implementation done.
+ now the vs() function is the main function to create variance models
+ ds(),us(),cs(),at() for complex variance models added
+ constraint matrices can be easily indicated with unsm(), fixm(), uncm()
in the Gtc argument of vs()
+ fatal bug in the multivariate BLUPs has been fixed
+ implementation of predict() function for lsmeans
+ multivariate GWAS implemented again, fixed and ready to rock
+ random regression implemented
+ anova() function enabled for single model to get sum of squares and MS
+ EIGEN functionality not available anymore, same with EMMA called from mmer
+ mmer2 and GWAS2 have been deprecated, now mmer and GWAS functions can do all
########################
version 3.8 (2019-01-01)
########################
+ stable version of c++ implementation, all bugs reported by users in 3.7 have been fixed
+ implement overlayed and leg in fixed effects
########################
version 3.9 (2019-04-01)
########################
+ documentation on the changes from old to new versions of sommer added to vignettes, along with a better
structure of the documentation
+ atcg1234() function now can take a marker matrix and a matrix with reference alleles and do the
conversion with the customized reference alleles.
+ bug in the fixed effects depending on the order has been fixed. It was due to a strange behavior of the
model.matrix() function.
+ Now Gtc and Gt arguments can take a list of matrices to apply different constraints when using the
ds() and us() functions, although the cs() can provide the same results.
+ Now unsm(), uncm(), fixm(), fcm() have the rep argument to repeat the constraint matrix multiple times
+ Now DF for
########################
version 4.0 (2019-07-01)
########################
+ No updates for now other than documentation
+ Bug in the GWAS function for P3D=TRUE/FALSE fixed
########################
version 4.1 (2020-06-01)
########################
+ Bug for removing the intercept in models has been fixed
+ Better documentation for GxE models
+ Addition of the unsBLUP function to extract right BLUPs for unstructured models
+ Change of the vs() argument from Gt to Gti to specify initial values
########################
version 4.1.1 (2020-10-01)
########################
+ Redesign of the predict function to predict full grids
+ Better structure of the documentation
+ Bug in predict for models that include spl2D() or overlay() functions fixed
########################
version 4.1.2 (2021-02-01)
########################
+ Addition of the fitted() function to return Xb + Zu.1 + ... + Zu.n instead of just Xb
+ Bugs reported for the predict() function addressed
+ The function residuals() now returns e = y - Xb - Zu instead of e = y - Xb
+ A.mat, D.mat and E.mat functions are now C++ Armadillo implementations
+ Addition of the H.mat function in C++ Armadillo
+ When providing a Gu argument in the vs() function, missing levels in the data vector are added
to provide all BLUPs
+ The predict function now provides the correct standard errors
+ GWAS C++ implementation
+ Progress bar for GWAS function added
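The change to fitted() and residuals() in this version can be made concrete with a small numeric sketch. The Python below only illustrates the two formulas stated above (fitted = Xb + Zu.1 + ... + Zu.n and e = y - Xb - Zu); the helper names and the toy data are hypothetical and do not come from the package.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def fitted_values(X, b, Zs, us):
    """Xb plus the sum of Z_k u_k over all random effects."""
    fit = matvec(X, b)
    for Z, u in zip(Zs, us):
        fit = [f + zu for f, zu in zip(fit, matvec(Z, u))]
    return fit

def residuals(y, X, b, Zs, us):
    """e = y - Xb - Zu, i.e. the response minus the full fitted values."""
    return [yi - fi for yi, fi in zip(y, fitted_values(X, b, Zs, us))]

# Toy data: an intercept and one random effect with two levels.
X = [[1], [1], [1]]
b = [10.0]
Z = [[1, 0], [0, 1], [1, 0]]
u = [0.5, -0.5]
y = [11.0, 9.0, 10.0]

print(fitted_values(X, b, [Z], [u]))
print(residuals(y, X, b, [Z], [u]))
```

With the older definition (fitted = Xb only) every fitted value in this toy example would equal 10.0, and the random effects would be absorbed into the residuals.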
########################
version 4.1.3 (2021-04-01)
########################
+ Addition of the gvs() function to fit indirect genetic effects and other competition models
+ Addition of an example of indirect genetic effects to vignettes.
+ Bug in predict() when missing data in covariates (x) exist was fixed.
########################
version 4.1.4 (2021-07-01)
########################
+ Addition of the stepweight argument to mmer() to control the magnitude of the update of the AI and NR
when the information matrix goes out of boundaries too quickly.
+ Addition of emupdate argument to mmer() to control the use of expectation-maximization updates as
alternative to the use of second derivative methods (NR, and AI).
+ Addition of the percChange output to the mmer() function to check the % change in the variance comp.
+ Argument buildGu for the vs() function added to prevent rrBLUP models from becoming slow. This implied modifying
the MNR.cpp function to understand when the matrix multiplications with K (covariance matrix) should be
avoided.
########################
version 4.1.5 (2021-11-01)
########################
+ Addition of the spl2Db and spl2Dmats() functions to fit all models that the SpATS package can fit.
+ Rename of spl2D function to spl2Da plus change of names for the arguments.
+ Fixes to a*b fixed model issues and predict improvements thanks to Sam Rogers and Julian Taylor.
+ GWAS by GBLUP examples added to the QG vignette. In addition new vignette for spatial modeling.
########################
version 4.1.7 (2022-07-01)
########################
+ Addition of the W argument to allow users to input a weights matrix instead of only a vector.
+ Henderson mixed model equation version of the AI algorithm based on Jensen, Madsen and Thompson (1997) finally available with the mmec function. Please try it.
########################
version 4.2.0.1 (2023-01-01)
########################
+ P3D argument enabled again
+ Bug in mmec() for using weights with simple residual structures has been fixed. Now stage-wise analysis should work well. The problem still exists in the mmer() function though, where variance components are restricted in a different way.
########################
version 4.2.1 (2023-04-01)
########################
+ Predict function for mmec() now works well and returns the same results as asreml.
+ The predict function for mmer(), on the other hand, has been modified to calculate C = t(W) Vi W and Ci = solve(C), where W=[X Z], but when Vi is singular it is difficult to obtain standard errors identical to those of the MME-based methods.
+ The overlay() function now works in the fixed part of the formula.
+ The predict function now works with models using the overlay function.
+ A bug in the atcg1234 function that was causing failure when few markers were used has been fixed.
+ Weights can be used now in the GWAS function.
+ The emWeight argument now uses a decreasing logarithmic scale to assign weights to each iteration of the AI algorithm to guarantee convergence as best as we can.
+ Bug in the spl2Dc function in mmec has been fixed
########################
version 4.3.0 (2023-05-11)
########################
+ Bug in the rrc() covariance structure function when using a relationship matrix has been solved.
+ The new redmm() function to create a reduced model matrix to fit huge models has been added
########################
version 4.3.1 (2023-06-11)
########################
+ There was a bug where the A matrix was not being reordered when provided in the mmec() function. It has been fixed.
+ A function to calculate effective population size based on allele coverage has been added.
########################
version 4.3.2 (2023-08-01)
########################
+ Bug in isc function fixed when matrices have a single column.
+ redmm() function improved calling the RSpectra package
+ now the mmec() function can use the different covariance structures in the fixed effect part so overlay and at models can be fitted as fixed.
+ Bug in atc() function fixed when a single level was used.
+ The covc() function has been added to calculate covariance between random effects as long as the incidence matrices of the random effects have the same levels.
########################
version 4.3.4 (2023-04-01)
########################
+ thetaC and theta arguments in special functions in mmec are better documented and have some examples
+ the atcg1234backTransform function has been added to bring back from numbers to letters.
########################
version 4.3.5 (2023-08-01)
########################
+ we have added the tps function to produce tensor product splines from the TPSbits package in a more straightforward way to produce incidence matrices that can be used in the mmec solver more easily.
+ keep improving the documentation of parameters and examples in different functions
+ rrc function for reduced rank models changed the specification and a good example is provided.
+ mmec function now ensures that data is transformed to as.data.frame() to avoid issues with tibble objects as reported in github.
######################
## PENDINGS AND PRIORITIES
+ we need to find out how to solve the message 'Error: Mat::init(): requested size is too large; suggest to enable ARMA_64BIT_WORD' that shows up when we have very big models. In practice doesn't work in the compilation step. Still to be figured out.
+ we need to recode how the C matrix is constructed, adopt linear sum instead of [W y]' R [W y], otherwise the solver is extremely slow for complex models with many effects to be estimated since we multiply every time. Build the coefficient matrix only once and never modify. Then the Gi matrix should be a linear sum of Ai * vc (* being kronecker). If R will always be a diagonal with ones we should only ignore it and multiply the C + Gi by the factor 1/Ve. Still not sure how we avoid multiplication R for heterogeneous models.
+ internally make the ai_mme function to scale the trait first and then bring back to the original units to avoid modifying the tolParinv argument.
+ we need to find how to do symbolic cholesky instead of computing cholesky every iteration in mmec
+ we need to find how to avoid inverting the coefficient matrix for the calculation of first derivatives (Meyer 1995 may have the solution but I can't understand that paper fully.).
+ arma::inv, arma::solve, arma::chol are slower than Eigen, we need to find how to use both libraries in the same function
+ Add correlation models
+ Generalized linear models...