Support aarch64 OpenSearch distribution for MacOS architecture #4670

Closed
reta opened this issue May 2, 2024 · 24 comments

Comments

@reta
Contributor

reta commented May 2, 2024

Is your feature request related to a problem? Please describe

OpenSearch core does support the macOS aarch64-based distribution (Mx CPUs), but we never publish those distributions during the release.

Describe the solution you'd like

Publish aarch64 OpenSearch distributions for MacOS architecture.

Describe alternatives you've considered

N/A

Additional context

@peterzhuamazon fyi

Acceptance Criteria

  • Request M2 pro dedicated host in both staging and prod jenkins account
  • Change both mac1 and mac2 instance to have 4 executors max per agent
  • Wait for Jenkins upgrade to latest LTS:2.426.3 opensearch-ci#389 to complete in order to use the latest aws-java-sdk-ec2 to retrieve the mac2-m2pro.metal instance type
  • Add mac2 runner in Jenkins cdk automation
  • Add arm64 macos min distribution build in jenkins pipeline
  • Publish macos arm64 min artifact to public bucket
  • Test with @reta on the distribution retrieval in core code

@reta added the "enhancement" (New Enhancement) and "untriaged" (Issues that have not yet been triaged) labels on May 2, 2024
@peterzhuamazon
Member

Will sync up with team on the next steps after 2.14.0:

  • Request resource from EC2 on M1/M2.
  • Create packer profile for the arm64 macos.
  • Deploy to jenkins.
  • Adding new build sections for macos arm64 snapshot core on jenkinsfile.
  • Trigger build
  • Test pulling artifacts

Thanks.

@peterzhuamazon self-assigned this on May 2, 2024
@peterzhuamazon removed the "untriaged" (Issues that have not yet been triaged) label on May 6, 2024
@peterzhuamazon
Member

Will take a look next week after 2.14.0.

@peterzhuamazon
Member

peterzhuamazon commented May 28, 2024

Looking into adding EC2 instances with the following setup:

  1. Change the existing x64 mac1.metal instance from 6 executors to 4, so that each executor has 3 CPUs / 8 GB RAM.
  2. The new arm64 agent will use the mac2-m2pro.metal instance with the same 4 executors, each container set up with 3 CPUs / 8 GB RAM.
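
The per-executor numbers above follow from dividing the host's resources evenly across the executors. A minimal sketch of that arithmetic, assuming both instance types expose 12 vCPUs and 32 GiB of memory (host sizes are not stated in this thread; they are an assumption consistent with the 3 CPU / 8 GB figure), and with `HostSpec` and `per_executor` being hypothetical names:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    """Resources of one Mac agent host."""
    name: str
    vcpus: int
    memory_gib: int


def per_executor(host: HostSpec, executors: int) -> tuple[float, float]:
    """Split a host's vCPUs and memory evenly across its executors."""
    if executors <= 0:
        raise ValueError("executors must be positive")
    return host.vcpus / executors, host.memory_gib / executors


# ASSUMPTION: 12 vCPUs / 32 GiB per host; not taken from the thread.
MAC1 = HostSpec("mac1.metal", vcpus=12, memory_gib=32)
MAC2 = HostSpec("mac2-m2pro.metal", vcpus=12, memory_gib=32)

if __name__ == "__main__":
    for host in (MAC1, MAC2):
        cpus, mem = per_executor(host, executors=4)
        print(f"{host.name}: {cpus:g} CPUs / {mem:g} GiB per executor")
```

Under the same assumption, the previous 6-executor layout would leave only 2 CPUs per executor, which is the motivation for dropping to 4.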

@peterzhuamazon
Member

It seems mac1.metal is still using host tenancy. Trying to switch to on-demand default tenancy fails with:

2024-05-28 22:02:22.403+0000 [id=115853]        WARNING h.i.i.InstallUncaughtExceptionHandler#handleException: Caught unhandled exception with ID 7ab1f0b4-a262-4186-bdd8-2ba5ecd55bd1
com.amazonaws.services.ec2.model.AmazonEC2Exception: The requested tenancy is not supported for this instance type. Please check the documentation for supported configurations. (Service: AmazonEC2; Status Code: 400; Error Code: Unsupported; Request ID: dfc7b12e-635d-4ed7-8c0c-8117eedafd4b; Proxy: null)

@peterzhuamazon
Member

peterzhuamazon commented May 28, 2024

Per these docs:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-mac-instances.html

Mac instances are available only as bare metal instances on Dedicated Hosts, with a minimum allocation period of 24 hours before you can release the Dedicated Host. You can launch one Mac instance per Dedicated Host. You can share the Dedicated Host with the AWS accounts or organizational units within your AWS organization, or the entire AWS organization.
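
Since Mac instances must land on a Dedicated Host, a launch request has to ask for `host` tenancy rather than the default. A minimal sketch of what the EC2 `RunInstances` parameters could look like; it only builds the request dictionary and makes no AWS call. The helper names, the AMI ID, and the host ID are hypothetical placeholders, and how the Jenkins EC2 plugin builds its own request is not shown in this thread:

```python
from typing import Optional


def dedicated_host_placement(host_id: Optional[str] = None,
                             availability_zone: Optional[str] = None) -> dict:
    """Build the Placement structure for launching onto a Dedicated Host."""
    placement = {"Tenancy": "host"}  # "default" tenancy is rejected for Mac types
    if host_id:
        placement["HostId"] = host_id
    if availability_zone:
        placement["AvailabilityZone"] = availability_zone
    return placement


def run_instances_params(image_id: str, instance_type: str, placement: dict) -> dict:
    """Assemble RunInstances parameters; one Mac instance per Dedicated Host."""
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "Placement": placement,
    }


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    params = run_instances_params(
        image_id="ami-EXAMPLE",
        instance_type="mac2-m2pro.metal",
        placement=dedicated_host_placement(host_id="h-EXAMPLE"),
    )
    print(params)
```

With boto3 installed and credentials configured, a dictionary like this would typically be passed as keyword arguments to the EC2 client's `run_instances` call.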

@peterzhuamazon
Member

Need to apply for a dedicated host for testing.

@peterzhuamazon changed the title from "Support aarch64 OpenSearch distribution for MacOS architecture." to "Support aarch64 OpenSearch distribution for MacOS architecture" on May 29, 2024
@peterzhuamazon
Member

Added acceptance criteria to the description.

@peterzhuamazon
Member

peterzhuamazon commented May 30, 2024

Got the dedicated hosts for M2 Pro and was able to set up 1 host, though there are a few issues provisioning the instance onto the host through Jenkins:

  1. M2 Pro on EC2 was announced last year on 2023/09/19: https://aws.amazon.com/blogs/aws/new-amazon-ec2-m2-pro-mac-instances-built-on-apple-silicon-m2-pro-mac-mini-computers/
  2. The aws-java-sdk-ec2 SDK added this instance type in 1.12.556 (2023/09/22): https://github.com/aws/aws-sdk-java/blame/ee7313b7505156127a3dfe251acdf2a79133e7ad/aws-java-sdk-ec2/src/main/java/com/amazonaws/services/ec2/model/InstanceType.java#L763
  3. Our current Jenkins uses EC2 plugin 2.0.7, which in turn uses SDK 1.12.406 (https://github.com/jenkinsci/ec2-plugin/blob/ec2-2.0.7/pom.xml#L121). When the Jenkins main node Docker image is created, we update the SDK to 1.12.481. We might need to wait for the Jenkins upgrade in this issue to complete: Jenkins upgrade to latest LTS:2.426.3 opensearch-ci#389. Ideally we would use the latest version of the EC2 plugin and at least version 1.12.556 of aws-java-sdk-ec2 to support the mac2-m2pro.metal instance type.
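
The version gap described above can be checked mechanically. A minimal sketch that compares SDK versions numerically against the 1.12.556 threshold mentioned in this comment; the function names are hypothetical, and it assumes plain dotted numeric versions with no qualifiers:

```python
# First aws-java-sdk-ec2 release reported in this thread to include mac2-m2pro.metal.
MIN_SDK_FOR_MAC2_M2PRO = "1.12.556"


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.12.556' into (1, 12, 556) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def supports_mac2_m2pro(sdk_version: str) -> bool:
    """True if the given SDK version is at or above the required minimum."""
    return parse_version(sdk_version) >= parse_version(MIN_SDK_FOR_MAC2_M2PRO)


if __name__ == "__main__":
    for candidate in ("1.12.406", "1.12.481", "1.12.556"):
        print(candidate, supports_mac2_m2pro(candidate))
```

Comparing tuples of integers rather than raw strings avoids lexicographic mistakes such as treating 1.12.99 as newer than 1.12.556.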

Thanks.

@peterzhuamazon
Member

peterzhuamazon commented Jun 6, 2024

Confirmed that the latest version of the EC2 plugin is able to select mac2m2prometal:
[Screenshot 2024-06-06 at 2 34 04 PM]

@peterzhuamazon
Member

The dedicated hosts seem to have availability limitations.
The previous mac1 hosts were all launched in one zone, while this time the host is in a different zone because the previous zone has limited availability.

Need more digging.

@peterzhuamazon
Member

peterzhuamazon commented Jul 2, 2024

I was eventually able to allocate enough dedicated hosts in the particular zone where our Jenkins is located.
Will start the next step of adding a new packer template.

Thanks.

@peterzhuamazon
Member

After scheduling the same version of macOS 12 on the arm64 instance, it failed its status checks (0/2) and was never able to launch.

Investigating why that is the case and potentially moving to macOS 13.

Thanks.

@peterzhuamazon
Member

Apparently, even though macOS 12 supports arm64, it only supports M1, whereas our arm64 instance is M2.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-mac-instances.html#mac-instance-considerations



    macOS Mojave (version 10.14) (x86 Mac instances only)

    macOS Catalina (version 10.15) (x86 Mac instances only)

    macOS Big Sur (version 11) (x86 and M1 Mac instances)

    macOS Monterey (version 12) (x86 and M1 Mac instances)

    macOS Ventura (version 13) (all Mac instances, M2 and M2 Pro Mac instances support macOS Ventura version 13.2 or later)

    macOS Sonoma (version 14) (all Mac instances)

I will now switch arm64 to macOS 13 and potentially update the x64 one to macOS 13 as well.
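
The support list quoted above can be expressed as a small lookup, which makes the macOS 12 failure on M2 Pro easy to see. This is a sketch only: the chip keys and function names are hypothetical, and the table encodes just the versions listed in this comment (up to macOS 14), not any later AWS documentation:

```python
# Minimum macOS version per chip family, transcribed from the list above.
MIN_MACOS = {
    "x86": (10, 14),    # Mojave and later
    "m1": (11,),        # Big Sur and later
    "m2": (13, 2),      # Ventura 13.2 and later
    "m2pro": (13, 2),   # Ventura 13.2 and later
}
MAX_LISTED_MAJOR = 14   # Sonoma is the newest version in the quoted list


def parse_macos(version: str) -> tuple[int, ...]:
    """Turn '13.2' into (13, 2)."""
    return tuple(int(part) for part in version.split("."))


def is_supported(chip: str, macos_version: str) -> bool:
    """True if the quoted list allows this macOS version on this chip."""
    version = parse_macos(macos_version)
    return version >= MIN_MACOS[chip] and version[0] <= MAX_LISTED_MAJOR


if __name__ == "__main__":
    print(is_supported("m1", "12"))       # macOS 12 on M1
    print(is_supported("m2pro", "12"))    # macOS 12 on M2 Pro
    print(is_supported("m2pro", "13.2"))  # macOS 13.2 on M2 Pro
```

Note that a bare "13" is treated as older than 13.2, so it is rejected for M2 and M2 Pro; pass the full minor version when checking those chips.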

@peterzhuamazon
Member

New PR:

@peterzhuamazon
Member

@peterzhuamazon
Member

@peterzhuamazon
Member

@peterzhuamazon
Member

Adding Jenkins Builds:

@peterzhuamazon
Member

peterzhuamazon commented Jul 19, 2024

@reta
Contributor Author

reta commented Jul 19, 2024

Thanks a lot @peterzhuamazon !

@peterzhuamazon
Member

We are closing this issue as the macOS arm64 min snapshot artifact seems stable now.

Thanks.

@peterzhuamazon
Member

peterzhuamazon commented Jul 26, 2024

Able to run the alerting integTest, as an example, on an arm64 macOS system:

$ ./gradlew integTest
......
> Task :alerting:integTest
<===========--> 88% EXECUTING [1m 15s]
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> :alerting:compileTestKotlin
> :alerting-sample-remote-monitor-plugin:integTest > Resolve files of :alerting-sample-remote-monitor-plugin:opensearch_distro_extracted_testclusters-alerting-sample-remote-monitor-plugin-integTest-0-2.16.0-SNAPSHOT- > opensearch-min-2.16.0-SNAPSHOT-darwin-arm64-latest.tar.gz > 80.4 MiB/225.8 MiB downloaded
> IDLE
> IDLE


> Task :alerting:integTest
Picked up JAVA_TOOL_OPTIONS: -Dlog4j2.formatMsgNoLookups=true
<============-> 97% EXECUTING [10m 53s]
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> IDLE
> :alerting:integTest > 102 tests completed, 1 skipped
> :alerting:integTest > Executing test org.opensearch.alerting.MonitorDataSourcesIT
> IDLE
> IDLE



> Task :alerting:integTest
Picked up JAVA_TOOL_OPTIONS: -Dlog4j2.formatMsgNoLookups=true

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.5/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD SUCCESSFUL in 22m 48s
28 actionable tasks: 25 executed, 3 up-to-date
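
The log above shows the test cluster resolving `opensearch-min-2.16.0-SNAPSHOT-darwin-arm64-latest.tar.gz`. A minimal sketch of how that file name is composed, with the pattern inferred solely from that one log line; the function name is hypothetical, and whether other platform or architecture values follow the same pattern is an assumption:

```python
def min_snapshot_artifact_name(version: str,
                               platform: str = "darwin",
                               arch: str = "arm64") -> str:
    """Compose the min snapshot tarball name as it appears in the Gradle log."""
    # Pattern inferred from a single log line; not an official naming spec.
    return f"opensearch-min-{version}-SNAPSHOT-{platform}-{arch}-latest.tar.gz"


if __name__ == "__main__":
    print(min_snapshot_artifact_name("2.16.0"))
```

The download location is not shown in the log, so the sketch deliberately stops at the file name.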
