make sure that the ebi-ena null values are lower case #3246

Closed
@antgonza

Description

We currently don't force EBI-ENA null values to be lower case, but we should.

The code is something like this:

update_values = {
    'not collected': 'not collected',
    'not provided': 'not provided',
    'restricted access': 'restricted access',
    'not applicable': 'not applicable',
    'unspecified': 'not applicable',
    'not_collected': 'not collected',
    'not_provided': 'not provided',
    'restricted_access': 'restricted access',
    'not_applicable': 'not applicable',
    'missing: not collected': 'not collected',
    'missing: not provided': 'not provided',
    'missing: restricted access': 'restricted access',
    'missing: not applicable': 'not applicable',
}

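# sample_or_prep_object is a placeholder for the sample or prep template to fix
df = sample_or_prep_object.to_dataframe().fillna("").applymap(str.lower)
# keep only the rows that contain at least one known null-value spelling
ddf = df[df.isin(list(update_values)).any(axis=1)]
if not ddf.empty:
    # restrict the update to the columns that actually contain such values
    cols = [c for c in ddf.columns if set(update_values) & set(ddf[c].values)]
    # note: to_replace holds lower-cased text for every cell in these rows/columns
    to_replace = ddf[cols].replace(update_values)
    sample_or_prep_object.update(to_replace)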
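
One thing to watch in the snippet above: because the whole dataframe is lower-cased first, any other value that sits in an affected row and column would also be written back in lower case. A possible variant, sketched below with plain pandas only, rewrites just the cells that match a known null-value spelling and leaves every other value as it was. The function name `normalize_null_values` and the example dataframe are made up for illustration; this is not existing project code, and the sketch assumes the metadata values are strings.

```python
import pandas as pd

# Same mapping as in the issue: lower-cased spelling -> canonical null value.
UPDATE_VALUES = {
    'not collected': 'not collected',
    'not provided': 'not provided',
    'restricted access': 'restricted access',
    'not applicable': 'not applicable',
    'unspecified': 'not applicable',
    'not_collected': 'not collected',
    'not_provided': 'not provided',
    'restricted_access': 'restricted access',
    'not_applicable': 'not applicable',
    'missing: not collected': 'not collected',
    'missing: not provided': 'not provided',
    'missing: restricted access': 'restricted access',
    'missing: not applicable': 'not applicable',
}


def normalize_null_values(df, mapping=UPDATE_VALUES):
    """Return a copy of df where only null-value cells are normalized.

    Hypothetical helper: cells whose lower-cased text is a key of
    `mapping` are replaced by the mapped value; all other cells keep
    their original case.
    """
    out = df.copy()
    known = list(mapping)
    for col in out.columns:
        # Lower-cased view of the column, used only for matching.
        lowered = out[col].fillna("").astype(str).str.lower()
        hits = lowered.isin(known)
        if hits.any():
            out.loc[hits, col] = lowered[hits].map(mapping)
    return out


# Made-up example data, not taken from a real study.
example = pd.DataFrame({
    "host": ["Not Collected", "Human Gut"],
    "site": ["Missing: Not Provided", "unspecified"],
})
fixed = normalize_null_values(example)
```

In this example `"Human Gut"` keeps its capitalization while the three null-like cells are normalized. The resulting dataframe could then be handed to the object's `update` call in the same way as `to_replace` in the snippet above, assuming `update` accepts a full dataframe.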
