Update stats plots, add longitudinal sample size calculation #98
base: master
- update plots: sample_size, error_function_of_csa
- update legends: boxplot_atrophy, boxplot_csa
- expected trend and mean values on CSA boxplot
- mean values on atrophy boxplot
- add function for adding Pearson's r and p-value stats
- add diff and std diff column in dataframe for sample size computation
- add longitudinal study sample size change:
- correct difference between means (atrophy_% * CSA) on sample size plot
- correct x_label on error_function_of_CSA
- correct x_label on error_function_of_intra_cov
- automate the detection of outliers on error_function_of_intra_cov_outlier
# Conflicts: # csa_rescale_stat.py
…ore this diff was computed from mean CSA across transformations)
Up to now, the computed longitudinal sample sizes were small (< 1). After investigation, we observed that the SD of the difference in measured CSA across subjects was computed using the mean CSA across transformations, ex:
Implemented in 7874439: contrary to formula (1), the program does not average CSA across transformations but randomly samples a CSA value for each subject, ex:
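To make the effect described above concrete, here is a minimal sketch on synthetic data, comparing the SD of the paired difference when CSA is first averaged across transformations versus when one CSA value is randomly sampled per subject. Everything in it is an assumption for illustration (the 80 mm² mean CSA, the noise levels, the 0.95 rescale, and the helper names `simulate_subject` and `sd_of_difference`); it is not the code from `csa_rescale_stat.py`.

```python
import random
import statistics


def simulate_subject(rng, n_transfo=10, rescale=0.95, noise_sd=2.0):
    """Return (baseline, followup) CSA lists, one value per transformation.

    Hypothetical model: the rescaled CSA is the true CSA times rescale**2,
    and each transformation adds independent Gaussian measurement noise.
    """
    true_csa = rng.gauss(80.0, 8.0)  # assumed between-subject spread
    baseline = [true_csa + rng.gauss(0, noise_sd) for _ in range(n_transfo)]
    followup = [true_csa * rescale**2 + rng.gauss(0, noise_sd)
                for _ in range(n_transfo)]
    return baseline, followup


def sd_of_difference(subjects, rng, use_mean):
    """SD across subjects of (followup - baseline) CSA."""
    diffs = []
    for baseline, followup in subjects:
        if use_mean:
            # Averaging across transformations removes most measurement noise
            d = statistics.mean(followup) - statistics.mean(baseline)
        else:
            # One randomly sampled transformation per time point
            d = rng.choice(followup) - rng.choice(baseline)
        diffs.append(d)
    return statistics.stdev(diffs)


rng = random.Random(0)
subjects = [simulate_subject(rng) for _ in range(500)]
sd_mean = sd_of_difference(subjects, rng, use_mean=True)
sd_sampled = sd_of_difference(subjects, rng, use_mean=False)
print(f"SD of diff (mean across transfo): {sd_mean:.2f}")
print(f"SD of diff (random sample):       {sd_sampled:.2f}")
```

Under these assumptions the averaged version yields a clearly smaller SD, which would in turn produce the implausibly small sample sizes mentioned above.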
df_sub['perc_error'] = 100 * (df_sub['mean'] - df_sub['theoretic_csa']).div(df_sub['theoretic_csa'])
diff = []
for rescale, group in df.groupby('rescale'):
    for sub, subgroup in group.groupby('subject'):
add comment/explanation
I've added explanations for the sample size function in commit f50f18a
With commit 05d7fcd, using the difference formula (tY and tZ are two different transforms):
the longitudinal sample size variability is relatively large.
Also note that, when looking at between-group differences (vs. paired differences as described above), the formula was also updated as follows:
Results seem to vary much less: the SD of the sample size is no more than 3% of the sample size (between groups). Therefore, my best guess is that the large variability found when computing longitudinal sample sizes is mostly due to the variability of CSA measures between scalings (which has already been shown in the article). @jcohenadad, should we continue with these results? My idea is now to keep the Monte Carlo simulations for both sample size computations.
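For reference, a minimal sketch of what a Monte Carlo sample size estimate could look like for the paired (longitudinal) case. It uses the textbook normal-approximation formulas, n = (z_α + z_β)² σ_d² / δ² for paired differences and twice that per group for between-group comparisons, which are an assumption here and not necessarily the formulas used in this PR. The helper names, the synthetic data, and the choice of δ as the expected mean CSA change are all hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def sample_size(sd, delta, alpha=0.05, power=0.8, paired=True):
    """Textbook normal-approximation sample size for a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    factor = 1 if paired else 2  # between-group: subjects per group
    return factor * ((z_alpha + z_beta) * sd / delta) ** 2


def monte_carlo_paired(subjects, delta, n_iter=200, seed=0):
    """Mean and SD of the paired sample size over random transformation draws."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_iter):
        # One random transformation per subject and per time point
        diffs = [rng.choice(followup) - rng.choice(baseline)
                 for baseline, followup in subjects]
        sizes.append(sample_size(statistics.stdev(diffs), delta, paired=True))
    return statistics.mean(sizes), statistics.stdev(sizes)


# Synthetic subjects (assumed values, for illustration only)
data_rng = random.Random(1)
rescale = 0.95
subjects = []
for _ in range(50):
    true_csa = data_rng.gauss(80.0, 8.0)
    baseline = [true_csa + data_rng.gauss(0, 2.0) for _ in range(10)]
    followup = [true_csa * rescale**2 + data_rng.gauss(0, 2.0)
                for _ in range(10)]
    subjects.append((baseline, followup))

delta = 80.0 * (1 - rescale**2)  # assumed expected mean CSA change
mean_n, sd_n = monte_carlo_paired(subjects, delta)
print(f"paired sample size: {mean_n:.2f} +/- {sd_n:.2f}")
```

Reporting the SD across iterations alongside the mean makes the variability discussed above directly visible for either formula.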
@PaulBautin This is an interesting investigation but I need more guidance to understand the formula described in #98 (comment). Without the context of the code I cannot advise on what is the most appropriate solution. I suggest we discuss it in a meeting.
- normalize by the square of rescale for df['Normalized CSA in mm²']
- use poly1d when plotting trends in plots
- append fake values for pearson and p_value for rescale = 1
- un-comment concatenating csv files
- change iteration number for computing sample size and print message
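As a rough illustration of two of the changes listed in the commits above (normalizing CSA by the square of the rescale factor, and using `poly1d` for trend lines), here is a small sketch on noise-free synthetic values. The 80 mm² reference CSA, the rescale grid, and the polynomial degree are assumptions; the degree used in the actual plots is not stated in this PR.

```python
import numpy as np

# Synthetic, noise-free data (assumed values for illustration)
rescale = np.linspace(0.90, 1.00, 5)
reference_csa = 80.0                  # hypothetical CSA at rescale = 1, in mm²
csa = reference_csa * rescale**2      # an area scales with the square of rescale

# Normalizing by rescale**2 brings every scaling back to the reference CSA
normalized_csa = csa / rescale**2

# Fit a degree-2 trend and wrap it in poly1d so it can be evaluated like a function
coeffs = np.polyfit(rescale, csa, deg=2)
trend = np.poly1d(coeffs)

x_plot = np.linspace(rescale.min(), rescale.max(), 50)
y_plot = trend(x_plot)                # values to pass to a plotting call
```

The `trend` object can be evaluated at any rescale value, which is convenient when drawing a smooth line over the boxplots.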
@jcohenadad, could you review? I think this PR is ready to be merged (plots in PR match plots in article).
@jcohenadad, could you review? This PR should be merged into master because plots and stats for the article are based on this PR.
sorry-- realistically i will not have time to review
This PR intends to homogenize notations and conventions between the graphs presented in the manuscript and the "csa_atrophy" repo.
Done:
FIX #95, FIX #92, FIX #100, FIX #80, FIX #103