Version 2 ideas of what ml can do #80
This reverts commit c83a8e5.
…d the config, and updated the test_ml.py{
My latest commit didn't run the automated tests because config.yaml has a merge conflict. (I was also trying to see whether the error I'm getting on my end for summary.py occurs in these tests.) I looked, and it was just a space issue, but I wasn't sure if I should go ahead and fix it.
In my final testing, with data0 and the cosine metric I received this error:
RuleException:
ValueError in line 265 of C:\Users\agitter\Desktop\madison\collaborators\Ritz\spras\Snakefile:
The condensed distance matrix must contain only finite values.
File "C:\Users\agitter\Desktop\madison\collaborators\Ritz\spras\Snakefile", line 265, in __rule_ml_analysis
File "C:\Users\agitter\Desktop\madison\collaborators\Ritz\spras\src\analysis\ml.py", line 219, in hac_vertical
File "C:\Users\agitter\.conda\envs\spras\lib\site-packages\seaborn\matrix.py", line 1258, in clustermap
File "C:\Users\agitter\.conda\envs\spras\lib\site-packages\seaborn\matrix.py", line 1129, in plot
File "C:\Users\agitter\.conda\envs\spras\lib\site-packages\seaborn\matrix.py", line 974, in plot_dendrograms
File "C:\Users\agitter\.conda\envs\spras\lib\site-packages\seaborn\matrix.py", line 687, in dendrogram
File "C:\Users\agitter\.conda\envs\spras\lib\site-packages\seaborn\matrix.py", line 495, in __init__
File "C:\Users\agitter\.conda\envs\spras\lib\site-packages\seaborn\matrix.py", line 562, in calculated_linkage
File "C:\Users\agitter\.conda\envs\spras\lib\site-packages\seaborn\matrix.py", line 530, in _calculate_linkage_scipy
File "C:\Users\agitter\.conda\envs\spras\lib\site-packages\scipy\cluster\hierarchy.py", line 1065, in linkage
File "C:\Users\agitter\.conda\envs\spras\lib\concurrent\futures\thread.py", line 57, in run
Removing output files of failed job ml_analysis since they might be corrupted:
output/data0-pca.png, output/data0-pca-components.txt, output/data0-pca-coordinates.txt
I believe this is caused by all the identical graphs in this dataset. We can leave the behavior for now but may want a more informative error message in the future.
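A more informative error could come from checking the pairwise distances before the clustering call. The following is only a sketch under assumptions: `checked_pdist` is a hypothetical helper that does not exist in ml.py, and it assumes the data can be passed as a numeric matrix with one row per item being clustered. It does not diagnose why the distances are non-finite. It only reports that they are, using the same metric name that would be given to the clustering step.

```python
import numpy as np
from scipy.spatial.distance import pdist


def checked_pdist(matrix, metric="cosine", what="rows"):
    """Hypothetical pre-check: compute pairwise distances and fail with a clear message."""
    data = np.asarray(matrix, dtype=float)
    # Condensed distance vector, the same form scipy's linkage validates
    distances = pdist(data, metric=metric)
    bad = ~np.isfinite(distances)
    if bad.any():
        raise ValueError(
            f"{int(bad.sum())} of {distances.size} pairwise {metric} distances "
            f"between {what} are not finite, so hierarchical clustering cannot run. "
            "Check the input for missing values or for vectors on which the "
            "metric is undefined."
        )
    return distances
```

Calling something like this on the matrix (and on its transpose, if both rows and columns are clustered) before plotting would turn the SciPy ValueError above into a message that names the metric and the number of affected pairs.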
I renamed the PCA output file names. data1-pca-components.txt describes the variance explained, not the principal components.
I tested running with multiple cores, and somehow the wrong figure is sometimes saved to data1-pca.png. Once it was a horizontal hierarchical clustering plot. Once it was the data0 PCA plot. I'm not sure how that happens when the rules run in parallel, but we'll need to debug that. It's fine when I run with 1 core. @ntalluri can you reproduce that behavior? Here's an example where it looks like the same figure object was being manipulated in both parallel rules.
data1-pca.png:
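One possible explanation, not confirmed here, is that the parallel rules run in threads of one process (the traceback above ends in concurrent/futures/thread.py) and share pyplot's global "current figure". A minimal sketch of one way to avoid that shared state is to draw on an explicitly created `Figure` and save that object directly. The helper name `save_scatter` and its arguments are hypothetical and are not taken from ml.py.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

from matplotlib.figure import Figure


def save_scatter(points, title, out_path):
    """Hypothetical helper: draw on a private Figure, never on pyplot's current figure."""
    fig = Figure(figsize=(4, 3))  # created without pyplot, so it is not registered globally
    ax = fig.subplots()
    xs, ys = zip(*points)
    ax.scatter(xs, ys)
    ax.set_title(title)
    fig.savefig(out_path)  # saves this figure only
    return ax.get_title()


def save_all(jobs):
    """Run several save_scatter calls in threads; jobs is a list of (points, title, path)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(lambda job: save_scatter(*job), jobs))
```

For seaborn's clustermap, which creates its own figure, the returned ClusterGrid has a `savefig` method that saves that grid's figure, so the same idea of saving through the object rather than through `plt.savefig` may apply there as well.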

Great work with these final updates. This is ready to merge. I noticed the algorithm name parsing didn't work in the EGFR dataset because the dataset is called |
