Hi,
I am combining results from hapsb_ind() output using pp_individual_roh(), and I don't want to add meta information. But pp_individual_roh() throws an error even if I set meta_info=False. It seems to me this is because the function reads meta_path before it checks meta_info, which happens only at the merging step. Am I missing something?
Function:
def pp_individual_roh(iids, meta_path="./Data/ReichLabEigenstrat/Raw/meta.csv",
                      base_folder="./Empirical/Eigenstrat/Reichall/",
                      suffix='_roh_full.csv', save_path="", min_cm=[4,8,12], snp_cm=50,
                      gap=0.5, min_len1=2.0, min_len2=4.0,
                      output=True, meta_info=True):
    """Post-process Individual ROH .csv files. Combines them into one summary ROH.csv, saved in save_path.
    Use Individuals iids, create paths and run the combining.
    iids: List of target Individuals
    base_folder: Folder where to find individual results .csvs
    min_cm: Minimum post-processed Length of ROH blocks. Array (to have multiple possible values)
    snp_cm: Minimum Number of SNPs per cM
    gap: Maximum length of gaps to merge
    output: Whether to plot output per Individual.
    meta_info: Whether to merge in Meta-Info from the original Meta File
    save_path: If given, save resulting dataframe there
    min_len1: Minimum Length of shorter block to merge [cM]
    min_len2: Maximum Length of longer block to merge [cM]"""
    ### Look up Individuals in meta_df and extract relevant sub-table
    df_full = pd.read_csv(meta_path)
    df_meta = df_full[df_full["iid"].isin(iids)] # Extract only relevant Indivdiuals
    print(f"Loaded {len(df_meta)} / {len(df_full)} Individuals from Meta")

    paths = give_iid_paths(df_meta["iid"], base_folder=base_folder, suffix=suffix)
    df1 = create_combined_ROH_df(paths, df_meta["iid"].values, df_meta['clst'].values,
                                 min_cm=min_cm, snp_cm=snp_cm, gap=gap,
                                 min_len1=min_len1, min_len2=min_len2, output=output)

    ### Merge results with Meta-Dataframe
    if meta_info:
        df1 = pd.merge(df1, df_meta, on="iid")

    if len(save_path) > 0:
        df1.to_csv(save_path, sep="\t", index=False)
        print(f"Saved to: {save_path}")
    return df1
Ah yes, this function for post-processing is overly complicated as it strictly requires a meta table path.
"meta_info" just indicates whether the info from the meta file should be part of the output table - but it does not turn off the requirement to provide such a meta table.
As a quick fix, you can simply create a CSV table with "iid" and "clst" columns (one row per individual) and pass its path as meta_path - I know that this is not ideal.
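The quick fix could look roughly like the sketch below. The helper name write_minimal_meta, the file name and the placeholder cluster label "unknown" are made up for illustration and are not part of the package. The only things taken from the function quoted above are that the meta file is read with pd.read_csv (so a comma-separated file works) and that the "iid" and "clst" columns are used.

```python
import csv

def write_minimal_meta(iids, path, clst="unknown"):
    """Write a placeholder meta table holding only 'iid' and 'clst'.

    Hypothetical helper for illustration; 'clst' is filled with one
    dummy label because no real cluster information is available.
    """
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["iid", "clst"])   # header expected by the quoted function
        for iid in iids:
            writer.writerow([iid, clst])   # one row per individual
    return path

# Intended use (not run here, requires the package and the per-individual output):
# meta_path = write_minimal_meta(iids, "./meta_minimal.csv")
# df = pp_individual_roh(iids, meta_path=meta_path, base_folder=..., meta_info=False)
```

With meta_info=False the placeholder columns are not merged into the output table, so the dummy label should only be used internally.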
I will try to update this function to make the meta table file fully optional. Stay tuned for the next release!
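Until then, one way an optional meta table might be handled is sketched below. This is only an illustration of the idea, not the actual or planned implementation: the helper load_meta_or_placeholder and its default_clst argument are hypothetical, and the sketch assumes that a table with "iid" and "clst" columns is all the combining step needs.

```python
import pandas as pd

def load_meta_or_placeholder(iids, meta_path=None, default_clst="unknown"):
    """Return the 'iid'/'clst' sub-table for the target individuals.

    Hypothetical helper: if no meta_path is given, build a placeholder
    table from iids instead of reading a file.
    """
    if meta_path is None:
        # No meta file: one row per individual with a dummy cluster label
        return pd.DataFrame({"iid": list(iids), "clst": default_clst})
    df_full = pd.read_csv(meta_path)
    # Same filtering as in the function quoted above
    return df_full[df_full["iid"].isin(iids)]
```

A function built on this would also have to skip the final merge whenever no meta file was given, since there would be nothing meaningful to merge in.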