Skip to content

Expose resource based autotuning #293

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Jul 1, 2024
Merged

Expose resource based autotuning #293

merged 23 commits into from
Jul 1, 2024

Conversation

Sushisource
Copy link
Member

What was changed

Expose resource based auto-tuner

Why?

Adding in all SDKs

Checklist

  1. Closes

  2. How was this tested:
    Added some tests, manual verification with omes / existing core tests

  3. Any docs updates needed?
    Will be after experimental phase

Comment on lines -455 to -458
var nonDetWorkflowFailForTypes = options.Workflows.Where(
w => w.FailureExceptionTypes?.Any(
t => t.IsAssignableFrom(typeof(WorkflowNondeterminismException))) ?? false).
Select(w => w.Name).ToArray();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was unused

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Wonder why there isn't an unused-local-var check in .NET, maybe I have it turned off

Copy link
Member

@cretz cretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM. I do have a suggestion about reworking the worker tuner hierarchy in a comment below.

Comment on lines +372 to +387
typedef enum SlotSupplier_Tag {
FixedSize,
ResourceBased,
} SlotSupplier_Tag;

typedef struct SlotSupplier {
SlotSupplier_Tag tag;
union {
struct {
struct FixedSizeSlotSupplier fixed_size;
};
struct {
struct ResourceBasedSlotSupplier resource_based;
};
};
} SlotSupplier;
Copy link
Member

@cretz cretz Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes our header file a bit uglier to consume from C compared to mutually exclusive options (technically doesn't matter though)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷‍♂️ Eh, not really a problem IMO. If we care about the header being as easy as possible we can do that when we have another consumer, but honestly I think this is pretty easy to use as-is.


#[repr(C)]
#[derive(Clone, Copy, PartialEq, Debug)]
pub struct ResourceBasedTunerConfig {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub struct ResourceBasedTunerConfig {
pub struct ResourceBasedTunerOptions {

@@ -49,6 +48,43 @@ pub struct WorkerOptions {
nondeterminism_as_workflow_fail_for_types: ByteArrayRefArray,
}

#[repr(C)]
pub struct TunerHolder {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub struct TunerHolder {
pub struct WorkerTuner {

I think this is a better name, but doesn't matter much to have this separate/different "holder" terminology in the bridge only I suppose so long as user's don't have to see it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This matches the thing it's getting turned into Core side so I think it's worth keeping

/// </summary>
public int MaxConcurrentActivities { get; set; } = 100;
public int? MaxConcurrentActivities { get; set; }
Copy link
Member

@cretz cretz Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is technically a backwards incompatible change (people could use this property in some log or something). I wonder if it'd be better to leave these and just say this is ignored if Tuner is non-null? I don't have a super strong opinion here, we can probably break this kind of compat if we must.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's probably worth it. It's more clear at the type level about what's going on and I think is fairly unlikely to break anyone and easy to fix if it is.

/// Gets a slot supplier for workflow tasks.
/// </summary>
/// <returns>A slot supplier for workflow tasks.</returns>
public abstract ISlotSupplier GetWorkflowTaskSlotSupplier();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public abstract ISlotSupplier GetWorkflowTaskSlotSupplier();
public abstract ISlotSupplier WorkflowTaskSlotSupplier { get; }

Lets make these abstract properties instead of getters

/// <summary>
/// WorkerTuners allow for the dynamic customization of some aspects of worker configuration.
/// </summary>
public abstract class WorkerTuner
Copy link
Member

@cretz cretz Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an alternative suggestion here. .NET prefers interfaces and the default worker tuner can just be composite, so how about something like:

public interface IWorkerTuner
{
    ISlotSupplier WorkflowTaskSlotSupplier { get; }
    ISlotSupplier ActivityTaskSlotSupplier { get; }
    ISlotSupplier LocalActivitySlotSupplier { get; }
}

// Not a record on purpose
public class WorkerTuner : IWorkerTuner
{
    public static WorkerTuner CreateResourceBased(...)

    public static WorkerTuner CreateFixed(...)

    public WorkerTuner(
        ISlotSupplier workflowTaskSlotSupplier,
        ISlotSupplier activityTaskSlotSupplier,
        ISlotSupplier localActivitySlotSupplier)
    {
        WorkflowTaskSlotSupplier = workerTaskSlotSupplier;
        ActivityTaskSlotSupplier = activityTaskSlotSupplier;
        LocalActivitySlotSupplier = localActivitySlotSupplier;
    }

    public ISlotSupplier WorkflowTaskSlotSupplier { get; init; }
    public ISlotSupplier ActivityTaskSlotSupplier { get; init; }
    public ISlotSupplier LocalActivitySlotSupplier { get; init; }
}

And get rid of CompositeTuner and accept IWorkerTuner in the worker options. I think this is more idiomatic .NET (have interface and common impl as the primary impl). Arguably it'd be more idiomatic in .NET (unlike Python and others) to also make ResourceBasedTuner a class with a constructor instead of a static helper here. So could be something like:

public interface IWorkerTuner
{
    ISlotSupplier WorkflowTaskSlotSupplier { get; }
    ISlotSupplier ActivityTaskSlotSupplier { get; }
    ISlotSupplier LocalActivitySlotSupplier { get; }
}

// Not a record on purpose
public class WorkerTuner : IWorkerTuner
{
    public WorkerTuner(
        ISlotSupplier workflowTaskSlotSupplier,
        ISlotSupplier activityTaskSlotSupplier,
        ISlotSupplier localActivitySlotSupplier)
    {
        WorkflowTaskSlotSupplier = workerTaskSlotSupplier;
        ActivityTaskSlotSupplier = activityTaskSlotSupplier;
        LocalActivitySlotSupplier = localActivitySlotSupplier;
    }

    public ISlotSupplier WorkflowTaskSlotSupplier { get; init; }
    public ISlotSupplier ActivityTaskSlotSupplier { get; init; }
    public ISlotSupplier LocalActivitySlotSupplier { get; init; }
}

public class ResourceBasedTuner : WorkerTuner
{
    public ResourceBasedTuner(
        double targetMemoryUsage,
        double targetCpuUsage,
        ResourceBasedSlotSupplierOptions? workflowOptions = null,
        ResourceBasedSlotSupplierOptions? activityOptions = null,
        ResourceBasedSlotSupplierOptions? localActivityOptions = null)
        : base(
            new ResourceBasedSlotSupplier(workflowOptions ?? new(), new(targetMemoryUsage, targetCpuUsage),
            new ResourceBasedSlotSupplier(activityOptions ?? new(), new(targetMemoryUsage, targetCpuUsage),
            new ResourceBasedSlotSupplier(localActivityOptions ?? new(), new(targetMemoryUsage, targetCpuUsage))
    {
    }
}

public class FixedSizeTuner : WorkerTuner
{
    public FixedSizeTuner(int workflowTaskSlots, int activityTaskSlots, int localActivitySlots)
        : base(
            new FixedSizeSlotSupplier(workflowTaskSlots),
            new FixedSizeSlotSupplier(activityTaskSlots),
            new FixedSizeSlotSupplier(localActivitySlots))
    {
    }
}

I think that last one is the most idiomatic

/// <param name="numActivityTaskSlots">The number of available activity task slots.</param>
/// <param name="numLocalActivitySlots">The number of available local activity slots.</param>
/// <returns>The tuner.</returns>
public static WorkerTuner CreateFixed(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public static WorkerTuner CreateFixed(
public static WorkerTuner CreateFixedSize(

To be consistent

Comment on lines 533 to 534
this Temporalio.Worker.Tuning.ISlotSupplier supplier,
bool isWorkflow)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
this Temporalio.Worker.Tuning.ISlotSupplier supplier,
bool isWorkflow)
this Temporalio.Worker.Tuning.ISlotSupplier supplier,
bool isWorkflow)

Inconsistent spacing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dotnet format seems to just... not format stuff I would expect it to fairly often

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't use it to rewrite code, I just use it to know CI will pass

Comment on lines 60 to 62
uint numWorkflowTaskSlots,
uint numActivityTaskSlots,
uint numLocalActivitySlots)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it makes more sense in Rust, it's not that common in .NET to use uints unless you really need to (note our options don't really). Can check for negative at runtime if needed. Same type of thing as Go, most people pass around the common integer type and usually they aren't expected to convert for value-safety reasons.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what would rise to the definition of "really need to" in the sense that things that shouldn't be negative shouldn't be negative and that's a fairly binary distinction.

It's annoying as hell that unsigned types exist in these languages but aren't used for whatever reason. Oh well though, I'll change it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that is a bit annoying. I think the reasoning is more about the specific-bit-size integers and just carried over to here. For those, people avoid them unless they need specific binary sizes or are in a space-conscious place (e.g. embedded). And I think the general "use int for integers unless there is a good use case not to" was kinda the general accepted thing and applied to unsigned as well as specific-bit-size types. The only reason I bring up here is because in cases where uints or other things are used, .NET has a whole thing about unchecked conversions and it's a bit of a pain for devs using ints everywhere else.

/// A slot supplier that will only ever issue at most a fixed number of slots.
/// </summary>
/// <param name="NumSlots">The maximum number of slots that will ever be issued.</param>
public sealed record FixedSizeSlotSupplier(uint NumSlots) : ISlotSupplier;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public sealed record FixedSizeSlotSupplier(uint NumSlots) : ISlotSupplier;
public sealed record FixedSizeSlotSupplier(int SlotCount) : ISlotSupplier;

Besides the int/uint thing, I think .NET would prefer either Count suffix or just Slots over "num".

Comment on lines 23 to 24
uint? MinimumSlots = null,
uint? MaximumSlots = null,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we switch these to ints too?

Copy link
Member

@cretz cretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May need to downgrade proto to 23.x and regen those

Comment on lines 96 to 98
int numWorkflowTaskSlots,
int numActivityTaskSlots,
int numLocalActivitySlots)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int numWorkflowTaskSlots,
int numActivityTaskSlots,
int numLocalActivitySlots)
int workflowTaskSlots,
int activityTaskSlots,
int localActivitySlots)

Not a big deal, but if you do end up making another commit, might as well (or suffix with Count or whatever). Matters a bit more here than in Java because .NET supports named parameters.

Comment on lines 96 to 98
int numWorkflowTaskSlots,
int numActivityTaskSlots,
int numLocalActivitySlots)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, for single-line method implementations, can use => instead of a block + return. Makes no difference though.

@Sushisource Sushisource force-pushed the expose-autotune branch 3 times, most recently from 6e7d4cd to fabd19b Compare June 28, 2024 23:26
Copy link
Member

@cretz cretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, will leave unapproved pending further investigation for the worker.is_null() thing on worker_free and there may be some whitespace issues.

@Sushisource Sushisource merged commit 29753e5 into main Jul 1, 2024
7 checks passed
@Sushisource Sushisource deleted the expose-autotune branch July 1, 2024 22:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants