Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Codebook and other minor changes #103

Closed
wants to merge 13 commits into from
Closed

Codebook and other minor changes #103

wants to merge 13 commits into from

Conversation

jhmigueles
Copy link
Collaborator

Hi Vincent,
I've been working in the codebook report for part5, maybe we can improve some of the definitions, but I think this is a good start, have a look on it when you find the time. This is reporting a codebook for each database exported.

Also, while working on the codebook, I've seen some minor things to change in g.report.part5, here a summary:
includewindowcrit argument: I've included this argument in g.report.part5 and g.shell. Its function is to give the user the chance to select how many hours should have a window to be considered in the analyses. In the previous version it was hard-coded in 2 hours, so I have maintained this as default parameter. This is specially important if the user is interested in 24-hours days. For example, if the accelerometer is initialized at 8pm, then the first day would have 4 hours, if the user select an includewindowcrit of 24 hours, then this day would be excluded from analyses, otherwise it would be included in the daily means.

Number of valid days: I was getting inconsistent results in these variables (for example: Nvaliddays = 8, Nvaliddays_WE = 2, Nvaliddays_WD = 5). I've also summarized the code, now it is working correctly and considering only the days included in the daily mean calculation.

Numeric variables identification: Some the numeric variables were being considered as character, and then they were not averaged. I found the problem in this line of code 'if(nr > 30) nr = 30)'. Removing this part, it analyses the whole variable and it works correctly.

I think this is all so far, let me know if you need any extra detail about these changes.

Best,
Jairo

@vincentvanhees
Copy link
Member

Thanks Jairo,

Includewindowcrit: Would it not be easier to re-use the same windowcriteria arguments as use in part2, 4 or 5? I think that would help to keep the number of input arguments to a minimum.
Number of days: Thanks, can you clarify what part of the code you had to change for this?
Numeric variables identification: I believe I inserted this at some point to address problems in specific datasets, but I cannot remember why. It is good that you commented it out such that we can find it back if this causes problems again, and then look for an alternative solution.

Cokebook: As I mentioned before I am concerned that a codebook could be a nightmare for us to maintain, because we will have three separate variable descriptions all of which need to be consistent and up to date: The vignette, the standard function documentation and now this codebook. I thought we agreed to optimize the vignette and the standard function documentation before building something new? It would have been helpful if you had first created an issue to clarify and discuss the topic before doing all the work. In that way we would not have worked past each other.... Another problem I see with the codebook solution is that you included the functionality inside existing functions g.part5 and g.report.part5. Having long functions is a bad habit because it makes the code harder to navigate, especially when documentation and code are mixed in the same file. Therefore, I have been trying the last couple of years to split up long functions as much as possible. If you insist on making a codebook then I think it is better to build this as a separate function that reads the variable names produced by part5, and then generates a csv file with the variable documentation. Advantages of such an approach are:

  • Makes it easier to add more elaborate documentation, which is stored separate from the code to ease updating.
  • It would allow GGIR users to generate a codebook for analysis they did in the past (your current solution requires full re-analysis).
  • It would create a framework for doing the same for part2 and 4.

All checks have failed: The checks (tests) fail as indicated by the red markers. It would safe me a lot of time if you could fix this. It is probably best if you learn how to use R CMD CHECK to check the code before making a pull request. Merging a branch with failing tests will break GGIR, so the checks always need to pass.

General comments:
As we discussed it would help a lot if you could try to learn how to use git, it is one of the most important skills if you want to become more proficient in software development. This is a good course on how git works I would recommend. You can do it on your own or try to attend one of the Software Carpentry courses in Spain, which are typically organised by volunteers. In that way you could work on one repository branch per issue and make a pull request per issue. Pull requests will then be more oversee-able, and it would be easier to stepwise improve GGIR.

Cheers, Vincent

@jhmigueles
Copy link
Collaborator Author

jhmigueles commented Jul 28, 2018 via email

@jhmigueles jhmigueles closed this Oct 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants