Inquiry Regarding Training HAT Model #668
mabubakarsaleem asked this question in Q&A · Unanswered · 0 replies
I recently came across the HAT model on GitHub: https://github.com/XPixelGroup/HAT?tab=readme-ov-file. However, I have encountered a couple of challenges regarding the hardware requirements and configuration settings.
Firstly, I have an RTX 2070 GPU with 8GB of VRAM. The documentation states that the default batch size per GPU is 4, which requires approximately 20GB of memory per GPU for training. Given my limited GPU memory, I am seeking guidance on how to reduce the memory requirements so that training fits on my RTX 2070. Any recommendations on adjusting the batch size or other parameters to suit my hardware would be very helpful.
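For context, the usual place to make such adjustments is the training `.yml` option file. The sketch below shows the kind of edit involved; the key names are assumptions based on BasicSR-style option files (which HAT builds on), so they should be verified against the actual file in the repository:

```yml
# Sketch of memory-reducing edits to the training .yml
# (key names assumed from BasicSR-style option files; verify
# against the actual option file in the HAT repo).
datasets:
  train:
    batch_size_per_gpu: 1   # lowered from the default of 4 to fit ~8GB VRAM
    gt_size: 128            # smaller training patches also reduce memory
    num_worker_per_gpu: 4   # data-loading workers; does not affect VRAM much
```

Lowering `batch_size_per_gpu` typically also calls for rescaling the learning rate or total iterations, so results may differ from the paper's settings.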
Secondly, I am seeking clarification on the training command provided in the documentation. The command includes flags such as "--nproc_per_node=8" and "--master_port=4321". As I am using a system with an 11th-generation Core i9 processor, I would like to know how to adjust the "--nproc_per_node" parameter to match my hardware. Additionally, regarding the "--master_port" parameter, I noticed a different port number (e.g., "29500") in the .yml file of the training folder.
Could anyone please advise me on the appropriate values for these parameters based on my hardware setup?
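In case it helps frame the question: my understanding is that "--nproc_per_node" counts GPUs per machine rather than CPU cores, so with a single RTX 2070 something like the following may apply. This is only a sketch adapted from the README's multi-GPU command; the script path and option file name are assumptions to be checked against the repo:

```shell
# Sketch for single-GPU training (paths and option file assumed from the
# repo's README; --nproc_per_node is the number of GPUs, not CPU cores).
CUDA_VISIBLE_DEVICES=0 \
python -m torch.distributed.launch --nproc_per_node=1 --master_port=4321 \
    hat/train.py -opt options/train/train_HAT_SRx2_from_scratch.yml --launcher pytorch
# --master_port only needs to be a free local port; whether 4321 on the
# command line or 29500 in the .yml takes precedence is part of my question.
```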