update PM results #375

Open: wants to merge 1 commit into base: main
update results
rcannood committed Jan 10, 2025
commit bd85c8b2fe3e21bea18df85bcffef2c8006d85ae
68 changes: 45 additions & 23 deletions results/predict_modality/data/dataset_info.json
@@ -1,25 +1,14 @@
[
{
"dataset_id": "openproblems_neurips2021/bmmc_cite/normal",
"dataset_name": "NeurIPS2021 CITE-Seq (GEX2ADT)",
"dataset_summary": "Single-cell CITE-Seq (GEX+ADT) data collected from bone marrow mononuclear cells of 12 healthy human donors.",
"dataset_description": "Single-cell CITE-Seq data collected from bone marrow mononuclear cells of 12 healthy human donors using the 10X 3 prime Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site.",
"data_reference": "luecken2021neurips",
"data_url": "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122",
"date_created": "25-11-2024",
"file_size": 704994,
"common_dataset_id": "openproblems_neurips2021/bmmc_cite"
},
{
"dataset_id": "openproblems_neurips2021/bmmc_multiome/normal",
"dataset_name": "NeurIPS2021 Multiome (GEX2ATAC)",
"dataset_id": "openproblems_neurips2022/pbmc_multiome/swap",
"dataset_name": "OpenProblems NeurIPS2022 Multiome (ATAC2GEX)",
"dataset_summary": "Single-cell Multiome (GEX+ATAC) data collected from bone marrow mononuclear cells of 12 healthy human donors.",
"dataset_description": "Single-cell CITE-Seq data collected from bone marrow mononuclear cells of 12 healthy human donors using the 10X Multiome Gene Expression and Chromatin Accessibility kit. The dataset was generated to support Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site.",
"data_reference": "luecken2021neurips",
"data_url": "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122",
"date_created": "25-11-2024",
"file_size": 31080807,
"common_dataset_id": "openproblems_neurips2021/bmmc_multiome"
"dataset_description": "Single-cell CITE-Seq data collected from bone marrow mononuclear cells of 12 healthy human donors using the 10X Multiome Gene Expression and Chromatin Accessibility kit. The dataset was generated to support Multimodal Single-Cell Data Integration Challenge at NeurIPS 2022. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site.",
"data_reference": "lance2024predicting",
"data_url": "https://www.kaggle.com/competitions/open-problems-multimodal/data",
"date_created": "09-01-2025",
"file_size": 18717069,
"common_dataset_id": "openproblems_neurips2022/pbmc_multiome"
},
{
"dataset_id": "openproblems_neurips2021/bmmc_multiome/swap",
@@ -28,18 +17,29 @@
"dataset_description": "Single-cell CITE-Seq data collected from bone marrow mononuclear cells of 12 healthy human donors using the 10X Multiome Gene Expression and Chromatin Accessibility kit. The dataset was generated to support Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site.",
"data_reference": "luecken2021neurips",
"data_url": "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122",
"date_created": "25-11-2024",
"date_created": "09-01-2025",
"file_size": 7883109,
"common_dataset_id": "openproblems_neurips2021/bmmc_multiome"
},
{
"dataset_id": "openproblems_neurips2021/bmmc_cite/normal",
"dataset_name": "NeurIPS2021 CITE-Seq (GEX2ADT)",
"dataset_summary": "Single-cell CITE-Seq (GEX+ADT) data collected from bone marrow mononuclear cells of 12 healthy human donors.",
"dataset_description": "Single-cell CITE-Seq data collected from bone marrow mononuclear cells of 12 healthy human donors using the 10X 3 prime Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site.",
"data_reference": "luecken2021neurips",
"data_url": "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122",
"date_created": "09-01-2025",
"file_size": 704994,
"common_dataset_id": "openproblems_neurips2021/bmmc_cite"
},
{
"dataset_id": "openproblems_neurips2022/pbmc_cite/normal",
"dataset_name": "OpenProblems NeurIPS2022 CITE-Seq (GEX2ADT)",
"dataset_summary": "Single-cell CITE-Seq (GEX+ADT) data collected from bone marrow mononuclear cells of 12 healthy human donors.",
"dataset_description": "Single-cell CITE-Seq data collected from bone marrow mononuclear cells of 12 healthy human donors using the 10X 3 prime Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support Multimodal Single-Cell Data Integration Challenge at NeurIPS 2022. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site.",
"data_reference": "lance2024predicting",
"data_url": "https://www.kaggle.com/competitions/open-problems-multimodal/data",
"date_created": "25-11-2024",
"date_created": "09-01-2025",
"file_size": 591886,
"common_dataset_id": "openproblems_neurips2022/pbmc_cite"
},
@@ -50,7 +50,7 @@
"dataset_description": "Single-cell CITE-Seq data collected from bone marrow mononuclear cells of 12 healthy human donors using the 10X 3 prime Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support Multimodal Single-Cell Data Integration Challenge at NeurIPS 2022. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site.",
"data_reference": "lance2024predicting",
"data_url": "https://www.kaggle.com/competitions/open-problems-multimodal/data",
"date_created": "25-11-2024",
"date_created": "09-01-2025",
"file_size": 32551804,
"common_dataset_id": "openproblems_neurips2022/pbmc_cite"
},
@@ -61,8 +61,30 @@
"dataset_description": "Single-cell CITE-Seq data collected from bone marrow mononuclear cells of 12 healthy human donors using the 10X 3 prime Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site.",
"data_reference": "luecken2021neurips",
"data_url": "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122",
"date_created": "25-11-2024",
"date_created": "09-01-2025",
"file_size": 13467880,
"common_dataset_id": "openproblems_neurips2021/bmmc_cite"
},
{
"dataset_id": "openproblems_neurips2022/pbmc_multiome/normal",
"dataset_name": "OpenProblems NeurIPS2022 Multiome (GEX2ATAC)",
"dataset_summary": "Single-cell Multiome (GEX+ATAC) data collected from bone marrow mononuclear cells of 12 healthy human donors.",
"dataset_description": "Single-cell CITE-Seq data collected from bone marrow mononuclear cells of 12 healthy human donors using the 10X Multiome Gene Expression and Chromatin Accessibility kit. The dataset was generated to support Multimodal Single-Cell Data Integration Challenge at NeurIPS 2022. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site.",
"data_reference": "lance2024predicting",
"data_url": "https://www.kaggle.com/competitions/open-problems-multimodal/data",
"date_created": "09-01-2025",
"file_size": 4322721,
"common_dataset_id": "openproblems_neurips2022/pbmc_multiome"
},
{
"dataset_id": "openproblems_neurips2021/bmmc_multiome/normal",
"dataset_name": "NeurIPS2021 Multiome (GEX2ATAC)",
"dataset_summary": "Single-cell Multiome (GEX+ATAC) data collected from bone marrow mononuclear cells of 12 healthy human donors.",
"dataset_description": "Single-cell CITE-Seq data collected from bone marrow mononuclear cells of 12 healthy human donors using the 10X Multiome Gene Expression and Chromatin Accessibility kit. The dataset was generated to support Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site.",
"data_reference": "luecken2021neurips",
"data_url": "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122",
"date_created": "09-01-2025",
"file_size": 31080807,
"common_dataset_id": "openproblems_neurips2021/bmmc_multiome"
}
]
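The entries above share a few conventions that can be checked mechanically when results are regenerated. The following is a minimal sketch, not part of this PR: the function name `check_dataset_info` is hypothetical, and the three rules it enforces (unique `dataset_id`, `common_dataset_id` being a path prefix of `dataset_id`, and `date_created` in `DD-MM-YYYY` form) are assumptions inferred from the entries shown in this diff rather than a documented schema.

```python
from datetime import datetime


def check_dataset_info(entries):
    """Return a list of human-readable problems found in dataset_info entries.

    The rules are assumptions inferred from the entries in this diff,
    not an official schema.
    """
    problems = []
    seen = set()
    for entry in entries:
        dataset_id = entry.get("dataset_id", "")

        # Assumed rule 1: each dataset_id appears only once.
        if dataset_id in seen:
            problems.append(f"duplicate dataset_id: {dataset_id}")
        seen.add(dataset_id)

        # Assumed rule 2: common_dataset_id is dataset_id minus its last path segment.
        common_id = entry.get("common_dataset_id", "")
        if not common_id or not dataset_id.startswith(common_id + "/"):
            problems.append(f"common_dataset_id does not prefix dataset_id: {dataset_id}")

        # Assumed rule 3: date_created uses the DD-MM-YYYY form seen in the file.
        try:
            datetime.strptime(entry.get("date_created", ""), "%d-%m-%Y")
        except ValueError:
            problems.append(f"bad date_created for {dataset_id}")
    return problems


if __name__ == "__main__":
    # Fields copied from one entry in this diff (other fields omitted).
    sample = [
        {
            "dataset_id": "openproblems_neurips2022/pbmc_multiome/swap",
            "date_created": "09-01-2025",
            "common_dataset_id": "openproblems_neurips2022/pbmc_multiome",
        }
    ]
    print(check_dataset_info(sample))
```

In practice the list would come from `json.load` on `results/predict_modality/data/dataset_info.json`; an empty result means no entry violates the assumed rules.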
64 changes: 48 additions & 16 deletions results/predict_modality/data/method_info.json
@@ -11,9 +11,9 @@
"code_url": "https://github.com/openproblems-bio/task_predict_modality",
"documentation_url": null,
"image": "https://ghcr.io/openproblems-bio/task_predict_modality/control_methods/mean_per_gene:build_main",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/0bd597e201b39fbcbc1fcd7047f7654a9713a197/src/control_methods/mean_per_gene",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/b333268bf19de5c7b9003f69a864bda48ae827a1/src/control_methods/mean_per_gene",
"code_version": "build_main",
"commit_sha": "0bd597e201b39fbcbc1fcd7047f7654a9713a197"
"commit_sha": "b333268bf19de5c7b9003f69a864bda48ae827a1"
},
{
"task_id": "control_methods",
@@ -27,9 +27,9 @@
"code_url": "https://github.com/openproblems-bio/task_predict_modality",
"documentation_url": null,
"image": "https://ghcr.io/openproblems-bio/task_predict_modality/control_methods/random_predict:build_main",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/0bd597e201b39fbcbc1fcd7047f7654a9713a197/src/control_methods/random_predict",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/b333268bf19de5c7b9003f69a864bda48ae827a1/src/control_methods/random_predict",
"code_version": "build_main",
"commit_sha": "0bd597e201b39fbcbc1fcd7047f7654a9713a197"
"commit_sha": "b333268bf19de5c7b9003f69a864bda48ae827a1"
},
{
"task_id": "control_methods",
@@ -43,9 +43,9 @@
"code_url": "https://github.com/openproblems-bio/task_predict_modality",
"documentation_url": null,
"image": "https://ghcr.io/openproblems-bio/task_predict_modality/control_methods/zeros:build_main",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/0bd597e201b39fbcbc1fcd7047f7654a9713a197/src/control_methods/zeros",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/b333268bf19de5c7b9003f69a864bda48ae827a1/src/control_methods/zeros",
"code_version": "build_main",
"commit_sha": "0bd597e201b39fbcbc1fcd7047f7654a9713a197"
"commit_sha": "b333268bf19de5c7b9003f69a864bda48ae827a1"
},
{
"task_id": "control_methods",
@@ -59,9 +59,9 @@
"code_url": "https://github.com/openproblems-bio/task_predict_modality",
"documentation_url": null,
"image": "https://ghcr.io/openproblems-bio/task_predict_modality/control_methods/solution:build_main",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/0bd597e201b39fbcbc1fcd7047f7654a9713a197/src/control_methods/solution",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/b333268bf19de5c7b9003f69a864bda48ae827a1/src/control_methods/solution",
"code_version": "build_main",
"commit_sha": "0bd597e201b39fbcbc1fcd7047f7654a9713a197"
"commit_sha": "b333268bf19de5c7b9003f69a864bda48ae827a1"
},
{
"task_id": "methods",
@@ -75,9 +75,9 @@
"code_url": "https://github.com/openproblems-bio/task_predict_modality",
"documentation_url": null,
"image": "https://ghcr.io/openproblems-bio/task_predict_modality/methods/knnr_py:build_main",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/0bd597e201b39fbcbc1fcd7047f7654a9713a197/src/methods/knnr_py",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/b333268bf19de5c7b9003f69a864bda48ae827a1/src/methods/knnr_py",
"code_version": "build_main",
"commit_sha": "0bd597e201b39fbcbc1fcd7047f7654a9713a197"
"commit_sha": "b333268bf19de5c7b9003f69a864bda48ae827a1"
},
{
"task_id": "methods",
@@ -91,9 +91,9 @@
"code_url": "https://github.com/openproblems-bio/task_predict_modality",
"documentation_url": null,
"image": "https://ghcr.io/openproblems-bio/task_predict_modality/methods/knnr_r:build_main",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/0bd597e201b39fbcbc1fcd7047f7654a9713a197/src/methods/knnr_r",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/b333268bf19de5c7b9003f69a864bda48ae827a1/src/methods/knnr_r",
"code_version": "build_main",
"commit_sha": "0bd597e201b39fbcbc1fcd7047f7654a9713a197"
"commit_sha": "b333268bf19de5c7b9003f69a864bda48ae827a1"
},
{
"task_id": "methods",
@@ -107,9 +107,9 @@
"code_url": "https://github.com/openproblems-bio/task_predict_modality",
"documentation_url": null,
"image": "https://ghcr.io/openproblems-bio/task_predict_modality/methods/lm:build_main",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/0bd597e201b39fbcbc1fcd7047f7654a9713a197/src/methods/lm",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/b333268bf19de5c7b9003f69a864bda48ae827a1/src/methods/lm",
"code_version": "build_main",
"commit_sha": "0bd597e201b39fbcbc1fcd7047f7654a9713a197"
"commit_sha": "b333268bf19de5c7b9003f69a864bda48ae827a1"
},
{
"task_id": "methods",
@@ -123,8 +123,40 @@
"code_url": "https://github.com/openproblems-bio/task_predict_modality",
"documentation_url": null,
"image": "https://ghcr.io/openproblems-bio/task_predict_modality/methods/guanlab_dengkw_pm:build_main",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/0bd597e201b39fbcbc1fcd7047f7654a9713a197/src/methods/guanlab_dengkw_pm",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/b333268bf19de5c7b9003f69a864bda48ae827a1/src/methods/guanlab_dengkw_pm",
"code_version": "build_main",
"commit_sha": "0bd597e201b39fbcbc1fcd7047f7654a9713a197"
"commit_sha": "b333268bf19de5c7b9003f69a864bda48ae827a1"
},
{
"task_id": "methods",
"method_id": "novel",
"method_name": "Novel",
"method_summary": "A method using encoder-decoder MLP model",
"method_description": "This method trains an encoder-decoder MLP model with one output neuron per component in the target. As an input, the encoders use representations obtained from ATAC and GEX data via LSI transform and raw ADT data. The hyperparameters of the models were found via broad hyperparameter search using the Optuna framework.",
"is_baseline": false,
"references_doi": "10.1101/2022.04.11.487796",
"references_bibtex": null,
"code_url": "https://github.com/openproblems-bio/neurips2021_multimodal_topmethods/tree/main/src/predict_modality/methods/novel",
"documentation_url": "https://github.com/openproblems-bio/neurips2021_multimodal_topmethods/tree/main/src/predict_modality/methods/novel#readme",
"image": "https://github.com/orgs/openproblems-bio/packages?repo_name=task_predict_modality&q=methods/novel/novel",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/b333268bf19de5c7b9003f69a864bda48ae827a1/src/methods/novel/novel",
"code_version": "build_main",
"commit_sha": "b333268bf19de5c7b9003f69a864bda48ae827a1"
},
{
"task_id": "methods",
"method_id": "simple_mlp",
"method_name": "Simple MLP",
"method_summary": "Ensemble of MLPs trained on different sites (team AXX)",
"method_description": "This folder contains the AXX solution to the OpenProblems-NeurIPS2021 Single-Cell Multimodal Data Integration.\nTeam took the 4th place of the modality prediction task in terms of overall ranking of 4 subtasks: namely GEX\nto ADT, ADT to GEX, GEX to ATAC and ATAC to GEX. Specifically, our methods ranked 3rd in GEX to ATAC and 4th\nin GEX to ADT. More details about the task can be found in the\n[competition webpage](https://openproblems.bio/events/2021-09_neurips/documentation/about_tasks/task1_modality_prediction).\n",
"is_baseline": false,
"references_doi": "10.1101/2022.04.11.487796",
"references_bibtex": null,
"code_url": "https://github.com/openproblems-bio/neurips2021_multimodal_topmethods/tree/main/src/predict_modality/methods/AXX",
"documentation_url": "https://github.com/openproblems-bio/neurips2021_multimodal_topmethods/tree/main/src/predict_modality/methods/AXX",
"image": "https://github.com/orgs/openproblems-bio/packages?repo_name=task_predict_modality&q=methods/simple_mlp/simple_mlp",
"implementation_url": "https://github.com/openproblems-bio/task_predict_modality/blob/b333268bf19de5c7b9003f69a864bda48ae827a1/src/methods/simple_mlp/simple_mlp",
"code_version": "build_main",
"commit_sha": "b333268bf19de5c7b9003f69a864bda48ae827a1"
}
]