Initial data portal endpoints #324
Draft
jeffbaumes wants to merge 17 commits into main from data-portal
* replace dead link 'nmdc-metadata' with 'issues' repo
* update name of make command to ssh into nersc mongo dbs

---------

Co-authored-by: Jing - Peters MBP <jingcao.yale@gmail.com>
* initial checkin with base class and basic tests
  - Base ChangeSheet write class
  - unit tests for base class
* add conftest and gold changesheet tests
  - move test fixtures to conftest.py
  - add get_biosample_name function and unit test to GoldBiosample generator
* update biosample name unit test: add explicit expected values
* Sketch out functions for gold changesheet generator
* function and test for missing GOLD ecosystem metadata
* add function and test for missing gold_biosample_identifiers
* add get_normalized_gold_biosample_identifier
* update logic with omics processing step
* skeleton find_omics_processing_set function, and updated (correct this time) test data files
* Add Omics to Biosample map
  - add omics_to_biosample map input
  - added nmdc / gold BioSample comparison logic
  - unit tests
  - stub API dependent methods
* Add changesheets.py package for common functions and classes
  - Changesheet and ChangesheetLineItem classes
  - API @op functions
* refactor to split omics processing data file read to stand-alone function
* more refactoring and code cleanup
* add test generation job
* add resource definitions and config
* refactor and code cleanup: simplify to just ChangeSheet and ChangeSheetLineItem classes
* Cleanup this branch to focus on getting assets working
* fix defs and fetch statement
* get basic GOLD asset generation working
* Add Api resources as ConfigurableResources
* Add asset scaffolding
* update normalizer functions to all take and return strings
* update resources; add empty click script
* fix gold ID normalization and add unit tests
* implement compare biosamples and write_changesheet
* add omics record comparison
* Add validate_changesheet method
* cleanup unused data files
* fix validate_changesheet method and add logging
* delete dagster asset based code and tests - move to a demo branch
* add changesheet_output to .gitignore
* add changesheet_output to .gitignore
* remove Dagster-related code and settings
* style: format with black
* Use TypeAlias for JSON_OBJECT
* Removed hard-coded URL from Changesheet.validate()
* remove .tsv file - should be ignored
* clarify function name and blacken formatting
* fix click options help text and blacken
* yet more blackening
* uncomment wait-for-it
* Delete get_data.ipynb
* Revert "Delete get_data.ipynb" This reverts commit fe3e68a.
* add docstring for generate_changesheet
* automatic reformatting
* bring get_data notebook back to original state
* add some logging
* update to use gold_sequencing_identifiers over alternative_identifiers
* Delete neon_cache.sqlite
* strip and de-tab the value in tsv output
* set default line_items in changesheet class correctly
* update output_dir type hint
* remove apply_changes option
* Dry up unfindable logging
* Clean up gold normalization and documentation
* fix: style

---------

Co-authored-by: Donny Winston <donny@polyneme.xyz>
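
For readers unfamiliar with the "strip and de-tab the value in tsv output" item in the commit list above, here is a minimal sketch of what such a cleanup might look like. The function names `clean_tsv_value` and `to_tsv_row` are hypothetical and are not taken from this PR; the actual changesheet code may handle this differently.

```python
def clean_tsv_value(value: str) -> str:
    """Replace embedded tabs with spaces and trim surrounding whitespace.

    Hypothetical helper: an embedded tab would otherwise be read as a
    column separator and shift every later cell in the row.
    """
    return value.replace("\t", " ").strip()


def to_tsv_row(values: list[str]) -> str:
    """Join cleaned values into a single tab-separated line (no newline)."""
    return "\t".join(clean_tsv_value(v) for v in values)


row = to_tsv_row(["id:1", "name\twith tab ", " v"])
```

Cleaning each value before joining keeps the column count of every row equal to the number of values passed in.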
…urrent nmdc-schema
…ce OmicsProcessing and DataObject records
model-field ranges with `Query`-annotated types aren't covered by the automated bump-pydantic tool.
re-submission of "same" changes is a valid use case; closes #340
coordinated with microbiomedata/nmdc-server#1037
note: denormalization of mongo collections for data portal, via a series of mongo aggregation pipelines (
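
As a rough illustration of the denormalization note above, one step in such a series might join one collection into another and write the result to a separate collection. This is only a sketch under stated assumptions: the collection names, field names, and output collection below are hypothetical placeholders, not the pipelines in this PR. It uses the standard MongoDB `$lookup` and `$merge` stages and builds the pipeline as plain data, so no database connection is needed to inspect it.

```python
from typing import Any


def build_denormalization_pipeline(
    lookup_from: str,
    local_field: str,
    foreign_field: str,
    as_field: str,
    out_collection: str,
) -> list[dict[str, Any]]:
    """Build a two-stage aggregation pipeline as plain dicts.

    Stage 1 ($lookup) embeds matching documents from `lookup_from`
    into each input document under `as_field`.
    Stage 2 ($merge) writes the joined documents into `out_collection`,
    replacing documents with the same _id and inserting new ones.
    """
    return [
        {
            "$lookup": {
                "from": lookup_from,
                "localField": local_field,
                "foreignField": foreign_field,
                "as": as_field,
            }
        },
        {
            "$merge": {
                "into": out_collection,
                "whenMatched": "replace",
                "whenNotMatched": "insert",
            }
        },
    ]


# All names below are assumptions made for illustration only.
pipeline = build_denormalization_pipeline(
    lookup_from="biosample_set",
    local_field="has_input",
    foreign_field="id",
    as_field="biosamples",
    out_collection="omics_processing_denormalized",
)
```

With a PyMongo database handle `db`, a pipeline like this would be run with `db["omics_processing_set"].aggregate(pipeline)`. Chaining several such steps, each reading the previous step's output collection, is one way to realize "a series of" pipelines.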
No description provided.