[CI] Add AWS EC2 dynamic runner support #6471
@apstasen, do you know if it's possible to get remote access to the AWS EC2 machines for debugging failures?
Yes, it is possible. Even with non-admin access to this Intel-provided AWS account, you can create your own SSH key pair in AWS and launch an instance from my pre-created AWS AMI (or a generic one) with that key pair and the SSH port open. You can then reach the host with a regular SSH client (you need to be outside the Intel network or use the Intel SOCKS5 proxy; I will not post details about this proxy here). The AWS instances created dynamically by this PR use the "default" security group, which blocks all incoming connections, so you will not be able to access those instances over SSH. An admin could of course open the SSH port in the default security group, but that is not recommended (and not convenient, since these instances are normally short-lived).
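For illustration only, the manual debug flow described above could be sketched with boto3 roughly as follows. This is not code from this PR: the function names, the `t3.micro` default, and every ID or CIDR passed in are hypothetical placeholders, and the sketch assumes the key pair already exists and that a dedicated (non-default) security group is used.

```python
def ssh_ingress_rule(cidr):
    """Build an EC2 ingress rule that opens TCP port 22 to one CIDR range."""
    return {
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        "IpRanges": [{"CidrIp": cidr}],
    }


def debug_instance_params(ami_id, key_name, security_group_id,
                          instance_type="t3.micro"):
    """Build keyword arguments for EC2 run_instances for one debug host.

    instance_type is a placeholder default, not a value taken from the PR.
    """
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,
        "SecurityGroupIds": [security_group_id],
        "MinCount": 1,
        "MaxCount": 1,
    }


def launch_debug_instance(ami_id, key_name, security_group_id, cidr):
    """Open SSH on a dedicated security group and start one instance.

    Requires boto3 and configured AWS credentials; imported lazily so the
    helpers above can be used without them.
    """
    import boto3

    ec2 = boto3.client("ec2")
    # Open port 22 on the dedicated group, not on the "default" one.
    ec2.authorize_security_group_ingress(
        GroupId=security_group_id,
        IpPermissions=[ssh_ingress_rule(cidr)],
    )
    response = ec2.run_instances(
        **debug_instance_params(ami_id, key_name, security_group_id)
    )
    return response["Instances"][0]["InstanceId"]
```

Keeping the SSH rule on a separate security group leaves the "default" group used by the dynamic runners closed to incoming connections.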
Will the logs be publicly available? We have non-Intel developers who ideally should be able to debug pre-commit issues, and having access to logs is highly desirable (access to HW would be ideal).
Logs from these runners will be visible as usual in the GitHub Actions interface, so if developers can see logs from our persistent runner, they can see these logs too.
As I understand it, the CI linter is not supposed to be applied to JavaScript, so I suggest reverting 565732b to return to the more readable version.
This reverts commit 565732b.
@bader OK. Restored the original formatting. Also, this PR cannot be merged until the "aws" secret environment is created (otherwise the newly added aws-start-matrix and aws-stop-matrix jobs will fail).
LGTM
This adds infrastructure to spawn AWS EC2 runners dynamically for lts suite testing. It is only functional if you add "aws-type" keys, as well as other keys, to the devops/test_configs.json configuration file like this:
Also, please make sure that other non-AMD/NVIDIA GPU jobs do not use overly generic self-hosted runner labels such as "Linux" or "x64"; otherwise they can be scheduled on these AWS hosts, and we do not want to use those hosts for generic workloads.
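To illustrate why generic labels are a problem: GitHub Actions can route a job to any self-hosted runner that carries all of the labels the job requests. A minimal sketch of that matching rule follows; the label sets are hypothetical examples, not the actual labels used by this repo, and treating labels as case-insensitive is an assumption of this sketch.

```python
def runner_can_take_job(runs_on, runner_labels):
    """Return True if the runner carries every label the job requests.

    Labels are compared case-insensitively (an assumption of this sketch).
    """
    requested = {label.lower() for label in runs_on}
    available = {label.lower() for label in runner_labels}
    return requested <= available


# Hypothetical labels for a dynamically created AWS runner.
aws_runner = ["self-hosted", "Linux", "x64", "aws-cuda"]

# A job with only generic labels can land on the AWS host.
generic_job = ["self-hosted", "Linux", "x64"]

# A job that also asks for a specific label (hypothetical "gen12") cannot.
specific_job = ["self-hosted", "Linux", "x64", "gen12"]

generic_matches = runner_can_take_job(generic_job, aws_runner)
specific_matches = runner_can_take_job(specific_job, aws_runner)
```

Here `generic_matches` is true and `specific_matches` is false, which is why adding at least one hardware-specific label to the other jobs keeps them off the AWS hosts.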
The Intel-provided AWS account is supposed to be used. To configure it for this repo, please do the following (I will keep this BKM schematic to avoid disclosing any sensitive info):