[FEA] Relax pe_count for CSP vCPU with hard limits #157

Open
@pdmack

Description

Is your feature request related to a problem? Please describe.
Azure subscriptions impose a hard limit on the vCPU resources available to a VM.
So if we run a Morpheus pipeline with 8 threads on an NC6s v2 (6 vCPU) instance, we get an abort in thread_engines.cpp:

F20220809 15:38:51.809185   262 thread_engines.cpp:38] 
Check failed: launch_options().pe_count == m_cpu_set.weight() 
(8 vs. 6) mismatch in the number of cores in the cpu set with respect to the requested pe_count
*** Check failure stack trace: ***
Aborted (core dumped)

Describe the solution(s) you'd like

  • Graceful degradation to the value of m_cpu_set.weight(), with appropriate warnings in the log.
  • Possibly make hard vs. soft limits configurable via the SRF API, and in the hard case throw the error back up to Morpheus (the app layer).
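
To make the two bullets above concrete, here is a minimal sketch of what the relaxed check could look like. It is not SRF code: `LimitPolicy` and `resolve_pe_count` are hypothetical names, the warning goes to `std::cerr` rather than to the project's logger, and the sketch assumes that a request at or below the core count can be passed through unchanged (the current check requires strict equality).

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical policy switch; nothing like this exists in the SRF API today.
enum class LimitPolicy { Hard, Soft };

// Hypothetical helper: decide how many processing elements to launch given
// the requested pe_count and the number of cores in the cpu set
// (what m_cpu_set.weight() reports in the failing check).
std::size_t resolve_pe_count(std::size_t requested, std::size_t available, LimitPolicy policy)
{
    assert(available > 0 && "cpu set must contain at least one core");

    // Assumption: requests that fit within the cpu set are honored as-is.
    if (requested <= available)
    {
        return requested;
    }

    const std::string msg = "requested pe_count " + std::to_string(requested) +
                            " exceeds cores in cpu set " + std::to_string(available);

    if (policy == LimitPolicy::Hard)
    {
        // Hard limit: raise an exception the application layer can catch,
        // instead of aborting the process with a fatal check.
        throw std::runtime_error(msg);
    }

    // Soft limit: warn and degrade to the number of cores actually available.
    std::cerr << "WARNING: " << msg << "; falling back to " << available << '\n';
    return available;
}
```

With the numbers from the log above, `resolve_pe_count(8, 6, LimitPolicy::Soft)` would warn and return 6, while the same call with `LimitPolicy::Hard` would throw a `std::runtime_error` that Morpheus could catch and report.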

Describe alternatives you've considered
The workaround is to run the Morpheus pipeline with a thread count no greater than the vCPU count of the Azure VM.
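
Until something like that exists, a caller could apply the workaround programmatically by clamping the thread count before building the pipeline. The sketch below is Linux-only and assumes the vCPU limit is visible to the process through its CPU affinity mask; `available_cpus` and `clamp_threads` are hypothetical helper names, not Morpheus or SRF functions.

```cpp
#include <sched.h>  // sched_getaffinity, cpu_set_t, CPU_ZERO, CPU_COUNT (Linux/glibc)

#include <algorithm>
#include <cstddef>
#include <thread>

// Hypothetical helper: number of CPUs the current process may run on.
std::size_t available_cpus()
{
    cpu_set_t set;
    CPU_ZERO(&set);

    // pid 0 means "the calling thread"; returns 0 on success.
    if (sched_getaffinity(0, sizeof(set), &set) == 0)
    {
        return static_cast<std::size_t>(CPU_COUNT(&set));
    }

    // Fallback if the affinity query fails; hardware_concurrency() may
    // return 0 when the value is not computable, so use at least 1.
    const unsigned n = std::thread::hardware_concurrency();
    return n > 0 ? n : 1;
}

// Hypothetical helper: cap a requested thread count at the available CPUs.
std::size_t clamp_threads(std::size_t requested)
{
    return std::min(requested, available_cpus());
}
```

On the 6 vCPU instance described above, `clamp_threads(8)` would be expected to yield 6, which avoids the equality check failure.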

Additional context
https://docs.microsoft.com/en-us/azure/databricks/kb/clusters/azure-core-limit
