-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Codebook and other minor changes #103
Conversation
update GGIR
Thanks Jairo, Includewindowcrit: Would it not be easier to re-use the same windowcriteria arguments as use in part2, 4 or 5? I think that would help to keep the number of input arguments to a minimum. Cokebook: As I mentioned before I am concerned that a codebook could be a nightmare for us to maintain, because we will have three separate variable descriptions all of which need to be consistent and up to date: The vignette, the standard function documentation and now this codebook. I thought we agreed to optimize the vignette and the standard function documentation before building something new? It would have been helpful if you had first created an issue to clarify and discuss the topic before doing all the work. In that way we would not have worked past each other.... Another problem I see with the codebook solution is that you included the functionality inside existing functions g.part5 and g.report.part5. Having long functions is a bad habit because it makes the code harder to navigate, especially when documentation and code are mixed in the same file. Therefore, I have been trying the last couple of years to split up long functions as much as possible. If you insist on making a codebook then I think it is better to build this as a separate function that reads the variable names produced by part5, and then generates a csv file with the variable documentation. Advantages of such an approach are:
All checks have failed: The checks (tests) fail as indicated by the red markers. It would safe me a lot of time if you could fix this. It is probably best if you learn how to use R CMD CHECK to check the code before making a pull request. Merging a branch with failing tests will break GGIR, so the checks always need to pass. General comments: Cheers, Vincent |
Hi Vincent!
Sorry for the delay, I've been some days off before start planning my
travel to US.
*Includewindowcrit: *I thought this in a separate way, because it is not
related to wearing time, but total window time, but maybe we can think this
better for the next time and re-use the other arguments.
*Number of days:* g.report.part5 function. Lines 308-329 in my pull request
*Numeric variables identification:* Ok, should I create an issue with this?
Currently, the code is looking at the first 30 rows of data, in case there
is no values in these 30 days, then the variable is considered as character
and excluded from the mean calculations.
*Cokebook:* No problem, I though that you agreed on woring in a codebook
for the part5 to start with something. But it was just a misunderstanding,
it didn't take too much time for me, so don't worry, we can leave this out.
*Checks: *yes, I have to work on this. I'm sorry for this, now during my
stay is going to be difficult to find the time to do this, but as soon I'm
back in Spain I will enroll one of these courses to learn how to use git
and R CMD CHECK.
So... should I prepare a pull request including only the changes about the
number of days and report an issue about the numeric variables
identification?
Thanks,
Jairo
2018-07-22 15:17 GMT+02:00 Vincent van Hees <notifications@github.com>:
… Thanks Jairo,
*Includewindowcrit:* Would it not be easier to re-use the same
windowcriteria arguments as use in part2, 4 or 5? I think that would help
to keep the number of input arguments to a minimum.
*Number of days:* Thanks, can you clarify what part of the code you had
to change for this?
*Numeric variables identification:* I believe I inserted this at some
point to address problems in specific datasets, but I cannot remember why.
It is good that you commented it out such that we can find it back if this
causes problems again, and then look for an alternative solution.
*Cokebook:* As I mentioned before I am concerned that a codebook could be
a nightmare for us to maintain, because we will have three separate
variable descriptions all of which need to be consistent and up to date:
The vignette, the standard function documentation and now this codebook. I
thought we agreed to optimize the vignette and the standard function
documentation before building something new? It would have been helpful if
you had first created an issue <https://github.com/wadpac/GGIR/issues> to
clarify and discuss the topic before doing all the work. In that way we
would not have worked past each other.... Another problem I see with the
codebook solution is that you included the functionality inside existing
functions g.part5 and g.report.part5. Having long functions is a bad habit
because it makes the code harder to navigate, especially when documentation
and code are mixed in the same file. Therefore, I have been trying the last
couple of years to split up long functions as much as possible. If you
insist on making a codebook then I think it is better to build this as a
separate function that reads the variable names produced by part5, and then
generates a csv file with the variable documentation. Advantages of such an
approach are:
- Makes it easier to add more elaborate documentation, which is stored
separate from the code to ease updating.
- It would allow GGIR users to generate a codebook for analysis they
did in the past (your current solution requires full re-analysis).
- It would create a framework for doing the same for part2 and 4.
*All checks have failed:* The checks (tests) fail as indicated by the red
markers. It would safe me a lot of time if you could fix this. It is
probably best if you learn how to use R CMD CHECK to check the code before
making a pull request. Merging a branch with failing tests will break GGIR,
so the checks always need to pass.
*General comments:*
As we discussed it would help a lot if you could try to learn how to use
git, it is one of the most important skills if you want to become more
proficient in software development. This
<http://swcarpentry.github.io/git-novice/> is a good course on how git
works I would recommend. You can do it on your own or try to attend one of
the Software Carpentry courses in Spain, which are typically organised by
volunteers. In that way you could work on one repository branch per issue
and make a pull request per issue. Pull requests will then be more
oversee-able, and it would be easier to stepwise improve GGIR.
Cheers, Vincent
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#103 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AbZzgJqFlR3NgnOHMN19mKUm9zd-9_zFks5uJHuGgaJpZM4VZqbd>
.
|
Hi Vincent,
I've been working in the codebook report for part5, maybe we can improve some of the definitions, but I think this is a good start, have a look on it when you find the time. This is reporting a codebook for each database exported.
Also, while working on the codebook, I've seen some minor things to change in g.report.part5, here a summary:
includewindowcrit argument: I've included this argument in g.report.part5 and g.shell. Its function is to give the user the chance to select how many hours should have a window to be considered in the analyses. In the previous version it was hard-coded in 2 hours, so I have maintained this as default parameter. This is specially important if the user is interested in 24-hours days. For example, if the accelerometer is initialized at 8pm, then the first day would have 4 hours, if the user select an includewindowcrit of 24 hours, then this day would be excluded from analyses, otherwise it would be included in the daily means.
Number of valid days: I was getting inconsistent results in these variables (for example: Nvaliddays = 8, Nvaliddays_WE = 2, Nvaliddays_WD = 5). I've also summarized the code, now it is working correctly and considering only the days included in the daily mean calculation.
Numeric variables identification: Some the numeric variables were being considered as character, and then they were not averaged. I found the problem in this line of code 'if(nr > 30) nr = 30)'. Removing this part, it analyses the whole variable and it works correctly.
I think this is all so far, let me know if you need any extra detail about these changes.
Best,
Jairo