New json files for MiniAODv4, minor updates in runner and README #88

Merged
merged 29 commits into from
Jan 24, 2024


uttiyasarkar
Collaborator

  1. New json files for 2023 added for QCD, QCDmu, JetMET, and btag
  2. 2022 jsons updated with MiniAODv4 samples
  3. runner.py script updated to pick up either a conda or a micromamba environment
  4. README updated to state that al9 is not recommended
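
As an illustration of item 3, here is a minimal sketch of how a script could decide between micromamba and conda. It is not the code in runner.py: the helper name `pick_env_manager`, the preference order, and the reliance on the executables being discoverable on `PATH` are all assumptions made for this example.

```python
import shutil
from typing import Callable, Optional


def pick_env_manager(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first environment manager found, or None if neither exists.

    Hypothetical helper: the order (micromamba first, then conda) is an
    assumption for illustration, not the behaviour of runner.py.
    """
    for manager in ("micromamba", "conda"):
        # shutil.which returns the executable path, or None when not found
        if which(manager) is not None:
            return manager
    return None


if __name__ == "__main__":
    print("environment manager:", pick_env_manager())
```

The lookup function is injectable so the choice can be checked without either tool being installed.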

Missing jsons:
mc_summer22EE_MINIAODv4_qcdmu
mc_summer22_MINIAODv4_qcdincl
mc_summer22EE_MINIAODv4_BTV

Collaborator

@Ming-Yan Ming-Yan left a comment

Hi @uttiyasarkar thank you for the PR, looks good to me

I have some questions regarding the file names:

  1. Why does the file name include v3?
  2. If you want to use a new file name instead of replacing the original file list, please also change the recommended file name in the README.

DY_test.txt (outdated)
Collaborator

What is this test data for?

Collaborator Author

These are duplicate files used for testing; they can be removed.

Collaborator Author

There are also other duplicate test_*.json files from the past. Should I clean them up as well, or are they kept for a reason?
test_bta_run3.json
test_data.json
test_w_dj_ee.json
test_w_dj_e.json
test_w_dj_emu.json
test_w_dj_mu.json
test_w_dj_mumu.json

Collaborator

These are used to construct the CI pipeline; please leave them there.

src/BTVNanoCommissioning/utils/compile_jec_test.py (outdated)
@Ming-Yan
Collaborator

Please reformat with black, then we can merge :)

Ming-Yan
Ming-Yan previously approved these changes Jan 19, 2024
@Ming-Yan Ming-Yan dismissed their stale review January 19, 2024 13:06

Missing files

@uttiyasarkar
Collaborator Author

In the json filesets, we still have the following datasets missing:

  1. data_2022_btagmu.json
  2. data_2022_jetmet.json
  3. data_2023_Mu.json
  4. data_2023_em.json

@uttiyasarkar uttiyasarkar added the documentation Improvements or additions to documentation label Jan 23, 2024
@uttiyasarkar
Collaborator Author

All jsons are prepared now. @Ming-Yan please review and it should be ready to be merged.

Collaborator

@Ming-Yan Ming-Yan left a comment

Hi @uttiyasarkar thanks for the changes!
Only one minor thing on naming was not clear. I think we would like to format the file names as follows.
$CAMPAIGN refers to --campaign in runner.py
$YEAR refers to --year in runner.py
$SAMPLE is the collection we gathered for QCD/QCD-mu, the rest of MC, and the dataset name
$TAG_EXTENSION refers to the production string in our custom production, e.g. BTV_Run3_2022_Comm_MINIAODv4 for our current production

MC: MC_$CAMPAIGN_$YEAR_$SAMPLE_$TAG_EXTENSION
data: data_$CAMPAIGN_$YEAR_$DATASET_$TAG_EXTENSION
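
To make the proposed convention concrete, here is a small sketch that assembles a name from its parts. The function `fileset_name` is hypothetical and does not exist in the repository; the example values are taken from names that appear in this conversation, and whether a `.json` suffix is appended afterwards is left open.

```python
def fileset_name(kind, campaign, year, sample, tag_extension):
    """Build a fileset name following the convention proposed above.

    Hypothetical helper for illustration only.
    kind          -- "MC" or "data"
    campaign      -- value passed to --campaign in runner.py
    year          -- value passed to --year in runner.py
    sample        -- sample collection (MC) or dataset name (data)
    tag_extension -- production string of the custom production
    """
    if kind not in ("MC", "data"):
        raise ValueError(f"kind must be 'MC' or 'data', got {kind!r}")
    # Join the five components with underscores, in the order given above
    return "_".join([kind, campaign, str(year), sample, tag_extension])


if __name__ == "__main__":
    print(fileset_name("MC", "Summer22Run3", 2022, "QCD",
                       "BTV_Run3_2022_Comm_MINIAODv4"))
    print(fileset_name("data", "Summer22Run3", 2022, "JetMET",
                       "BTV_Run3_2022_Comm_MINIAODv4"))
```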

.sites_map.json (outdated)
README.md Outdated
Comment on lines 244 to 247
-DoubleMuon (BTA,BTV_Comm_v2)| 1243MB | 848MB |1249MB|
+DoubleMuon (BTA,BTV_Comm_v3)| 1243MB | 848MB |1249MB|
 DoubleMuon (PFCands, BTV_Comm_v1)|1650MB |1274MB |1632MB|
 DoubleMuon (Nano_v11)|1183MB| 630MB |1180MB|
-WJets_inc (BTA,BTV_Comm_v2)| 1243MB |848MB |1249MB|
+WJets_inc (BTA,BTV_Comm_v3)| 1243MB |848MB |1249MB|
Collaborator

The tests are based on v2.

README.md Outdated
@@ -208,7 +208,7 @@ python runner.py --workflow valid --json metadata/$json file

Based on Congqiao's [development](notebooks/BTA_array_producer.ipynb) to produce BTA ntuples based on PFNano.

-:exclamation: Only the newest version [BTV_Run3_2022_Comm_v2](https://github.com/cms-jet/PFNano/tree/13_0_7_from124MiniAOD) ntuples work. Example files are given in [this](metadata/test_bta_run3.json) json. Optimize the chunksize(`--chunk`) in terms of the memory usage. This depends on sample, if the sample has huge jet collection/b-c hardons. The more info you store, the more memory you need. I would suggest to test with `iterative` to estimate the size.
+:exclamation: Only the newest version [BTV_Run3_2022_Comm_v3](https://github.com/cms-jet/PFNano/tree/13_0_7_from124MiniAOD) ntuples work. Example files are given in [this](metadata/test_bta_run3.json) json. Optimize the chunksize(`--chunk`) in terms of the memory usage. This depends on sample, if the sample has huge jet collection/b-c hardons. The more info you store, the more memory you need. I would suggest to test with `iterative` to estimate the size.
Collaborator

The newest version should not refer to PFNano but to the btvnano development: cms-sw/cmssw#43485

Collaborator Author

Pointed to the recent framework https://github.com/cms-btv-pog/btvnano-prod

Collaborator

is there any recent to put as BP?

BTW, given Summer23 already refers to run3, I think Run3 is no longer needed in the syntax.

Collaborator Author

Fixed :)

Collaborator

The same comment applies to all BPix samples.

Collaborator

@Ming-Yan Ming-Yan Jan 24, 2024

choices=[
    "Rereco17_94X",
    "Winter22Run3",
    "Summer22Run3",
    "Summer22EERun3",
    "Summer23",
    "Summer23BPix",
    "2018_UL",
    "2017_UL",
    "2016preVFP_UL",
    "2016postVFP_UL",
],
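
For context, a `choices` list like the one above is typically passed to `argparse` so that only known campaign names are accepted. The following is a minimal, self-contained sketch of that pattern. It assumes runner.py uses `argparse`; the help strings, `required=True`, and the `build_parser` wrapper are illustrative assumptions and are not copied from the repository.

```python
import argparse

# Campaign names as listed in the review comment above
CAMPAIGNS = [
    "Rereco17_94X",
    "Winter22Run3",
    "Summer22Run3",
    "Summer22EERun3",
    "Summer23",
    "Summer23BPix",
    "2018_UL",
    "2017_UL",
    "2016preVFP_UL",
    "2016postVFP_UL",
]


def build_parser():
    """Hypothetical parser exposing only --campaign and --year."""
    parser = argparse.ArgumentParser(description="Sketch of campaign selection")
    parser.add_argument(
        "--campaign",
        choices=CAMPAIGNS,  # argparse rejects any value outside this list
        required=True,
        help="Dataset campaign",
    )
    parser.add_argument("--year", help="Data-taking year")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--campaign", "Summer23BPix", "--year", "2023"])
    print(args.campaign, args.year)
```

With this setup, a campaign name that is not in the list makes `argparse` exit with a usage error instead of the run failing later.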

Collaborator

@Ming-Yan Ming-Yan left a comment

Thanks @uttiyasarkar looks good to me! we can merge this PR

@Ming-Yan Ming-Yan merged commit fed444f into cms-btv-pog:master Jan 24, 2024
2 checks passed