-
Notifications
You must be signed in to change notification settings - Fork 15
HWP PCU agent minor modification #599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
for more information, see https://pre-commit.ci
BrianJKoopman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit unusual to me. I believe it could be due to the timeout in the 'acq' process being set to a non-zero value. Try changing the timeout on line 162 to 0 and leaving these timeouts modified here as they were and see if that helps:
socs/socs/agents/hwp_pcu/agent.py
Line 162 in aaf066e
| with self.lock.acquire_timeout(timeout=3, job='acq') as acquired: |
|
@BrianJKoopman a bit confused about your comment (I think it was me who suggested non-zero timeouts). Why would a timeout of zero help here? |
|
I think the problem is the long socs/socs/agents/hwp_pcu/agent.py Line 193 in aaf066e
While this is sleeping, the lock cannot be acquired by other processes. I think to fix we should make this a fast loop (.1 sec or so) and only take and publish data on iterations where 5 sec have elapsed. Or we could also restructure the agent to avoid TimeoutLocks. |
Ah, yup, agreed this 5 second sleep is the issue, the task timeouts would need to be longer than that. Sorry for the noise. I've just seen non-zero acq timeouts cause issues in the past, and the example in the docs has a 0 timeout in long running processes. |
|
I realize it might not be clear what I'm talking about when I say lockless agent restructure, so I threw together what I have in mind in this draft PR for reference which should fix the locking issues. Feel free to use that fix if you would like, but may need some testing: |
|
Thank you so much for your comments and suggestions. |
I tested the recently added hwp_pcu agent, and found that 3 seconds of timeout is sometimes too short to communicate with the HWP PCU module.
This is a request to extend timeout for sending commands/receiving messages.
Description
I extended the timeout for
send_commandfrom 3 sec to 10 sec; and forget_statusto 10 sec.Motivation and Context
Improving the operation stability
How Has This Been Tested?
I tested this agent in the lab and it worked.
The recently added agent worked fine except for the timeout issue.
Types of changes
Checklist: