Telemetry Support #24

mrm9084 · 2024-04-17T19:56:22Z

Description

Adds Telemetry support + changes from design meeting

user_id without parameter
Added Targeting_Context

featuremanagement/_featuremanager.py

featuremanagement/_models/_targeting_context.py

featuremanagement/_models/_evaluation_event.py

featuremanagement/_models/_feature_flag.py

featuremanagement/_models/_telemetry.py

featuremanagement/_models/_targeting_context.py

zhiyuanliang-ms · 2024-05-13T06:00:15Z

@mrm9084
Please refer to this PR: microsoft/FeatureManagement-Dotnet#290

The variant assignment reason should be VariantAssignmentReason.NONE only when there is no Variant section in feature flag configuration. Discussion can be found here

BTW, I also didn't find the VariantAssignmentReason.DEFAULT_WHEN_ENABLED and DEFAULT_WHEN_DISABLED are used anywhere. Please refer to the .NET feature management.

zhiyuanliang-ms · 2024-05-13T08:29:30Z

The overall variant assign reason should be:

Only when there is no Variants section in feature flag configuration, the reason is None. In this case, in the telemetry sent to app insights, we will not attach variant and variant assignment reason.

If Allocation is none/empty, which can be considered as none/empty "DefaultWhenEnabled" and "DefaultWhenDisabled"
even if we get null/none variant in this case, the reason should be "DefaultWhenEnabled" or "DefaultWhenDisabled"

if allocation is not none and we hit some allocation rule and get a variant, the reason should be that allocation rule name.

if no allocation rule is hit and we get null/none variant, the reason should be "DefaultWhenEnabled" or "DefaultWhenDisabled"

The status_override of variant will not affect the variant assignment reason, because it takes place after the variant is assigned.

https://github.com/microsoft/FeatureManagement-Dotnet/blob/preview/src/Microsoft.FeatureManagement/FeatureManager.cs#L293

@zhenlan @jimmyca15 Please confirm whether this is correct.

tests/test_send_telemetry_appinsights.py

featuremanagement/aio/_featuremanager.py

featuremanagement/_models/_variant_assignment_reason.py

Co-authored-by: Zhiyuan Liang <141655842+zhiyuanliang-ms@users.noreply.github.com>

mrm9084 · 2024-05-13T18:33:06Z

The overall variant assign reason should be:

if there is no Variants, the reason is "None"

if allocation is none/empty, which can be considered as none/empty "DefaultWhenEnabled" and "DefaultWhenDisabled" even if we get null/none variant in this case, the reason should be "DefaultWhenEnabled" or "DefaultWhenDisabled"

if allocation is not none and we hit some allocation rule and get a variant, the reason should be that allocation rule name.

if no allocation rule is hit and we get null/none variant, the reason should be "DefaultWhenEnabled" or "DefaultWhenDisabled"

The status_override of variant will not affect the variant assignment reason, because it takes place after the variant is assigned.

https://github.com/microsoft/FeatureManagement-Dotnet/blob/preview/src/Microsoft.FeatureManagement/FeatureManager.cs#L293

@zhenlan @jimmyca15 Please confirm whether this is correct.

So, basically the default value of reason should be None unless otherwise specified? I think the code is as is because no telemetry is sent when the reason is set to None so I ended up not setting a value.

zhiyuanliang-ms · 2024-05-14T01:03:46Z

So, basically the default value of reason should be None unless otherwise specified? I think the code is as is because no telemetry is sent when the reason is set to None so I ended up not setting a value

Sorry, I just updated my previous comments. I didn't explain it clearly.

Only when there is no Variants in the feature flag configuration, we will set variant assignment reason as None and it will be sent to the telemetry as a "None" string. So, I think you may remove the "NONE" in the VariantAssignmentReason enum in this PR because it will never be used. If you choose to set the reason of evaluation event as None instead of VariantAssignmentReason.NONE.

.

mrm9084 · 2024-05-14T17:04:28Z

So, basically the default value of reason should be None unless otherwise specified? I think the code is as is because no telemetry is sent when the reason is set to None so I ended up not setting a value

Sorry, I just updated my previous comments. I didn't explain it clearly.

Only when there is no Variants in the feature flag configuration, we will set variant assignment reason as None and it will be sent to the telemetry as a "None" string. So, I think you may remove the "NONE" in the VariantAssignmentReason enum in this PR because it will never be used. If you choose to set the reason of evaluation event as None instead of VariantAssignmentReason.NONE.

.

We should try and be consistent in what we do between the libraries. As customers can access the EvaluationEvent. I'll just try summing up what I'm thinking.

By default, the value is VariantAssignmentReason.NONE.

But if the feature flag has any Variants then the VaraintAssignmentReason should be set to something else while we process the feature flag.

Implementation wise. Both Python and Dot Net use the EvaluationEvent as a main object in the evaluation logic and the only difference between is_enabled and get_variant is how it is read at the end.

samples/feature_variant_sample_with_telemetry.py

featuremanagement/opentelemetry/_send_telemetry.py

zhiyuanliang-ms · 2024-07-16T03:37:39Z

featuremanagement/azuremonitor/_send_telemetry.py

+        event[VARIANT] = evaluation_event.variant.name
+        event[REASON] = evaluation_event.reason.value
+
+    event["ETag"] = evaluation_event.feature.telemetry.metadata.get("etag", "")


We should publish all fields in feature.telemetry.metadata like this

If python provider sets "etag", "feature_flag_reference", "feature_flag_id" fields to the telemetry metadata, I think we should have another PR to update them to pascal case names.

@rossgrambo has this changed? When I wrote this originally, I copied the say outputs as DotNet.

.net provider: https://github.com/Azure/AppConfiguration-DotnetProvider/blob/preview/src/Microsoft.Extensions.Configuration.AzureAppConfiguration/FeatureManagement/FeatureManagementConstants.cs#L44

@rossgrambo has this changed? When I wrote this originally, I copied the say outputs as DotNet.

You might be thinking of the provider. FeatureManagement never knew what an ETag was- it just grabs everything in metadata and blindly adds it.

This should be correct now.

The telemetry metadata field name is inconsistent with .NET provider.

samples/feature_variant_sample_with_telemetry.py

featuremanagement/_featuremanager.py

featuremanagement/aio/_featuremanager.py

featuremanagement/_featuremanager.py

featuremanagement/aio/_featuremanager.py

zhiyuanliang-ms · 2024-07-16T05:18:43Z

featuremanagement/aio/_featuremanager.py

-        evaluation_event.variant = self._variant_name_to_variant(feature_flag, variant_name)
-        evaluation_event.feature = feature_flag
-        return evaluation_event
+            variant_name = self._assign_variant(targeting_context, evaluation_event)


Inconsistent with another feature manager. I am ok with another version.

This should be fixed. I went through an made sure the sync and async code matches. After I finish this PR I really need to make a base class both sync and async extend that should remove a good amount of duplicate code.

After I finish this PR I really need to make a base class both sync and async extend that should remove a good amount of duplicate code.

YESSSS!

zhiyuanliang-ms · 2024-07-16T05:35:52Z

featuremanagement/_featuremanager.py

        if not feature_flag.enabled:
            # Feature flags that are disabled are always disabled
-            evaluation_event = FeatureManager._check_default_disabled_variant(feature_flag)
+            FeatureManager._check_default_disabled_variant(evaluation_event)


check_variant_override is called in _check_default_disabled_variant. However, status_override should not work when feature_flag.enabled == false. Current behavior is wrong.

Currently, you have check_variant_override called in _check_default_disabled_variant, _check_default_enabled_variant and _assign_variant. The code path is complex.

I suggest we have the code path as below

def _check_feature(): feature_flag = _get_feature_flag() ... evaluation_event = EvaluationEvent(feature_flag) _check_feature_filters(evaluation_event) _assign_allocation(evaluation_event) if feature_flag.enabled: _check_variant_override()

_assign_allocation follows
#24 (comment)

I am ok with the current change on the _assign_variant method where default_when_enabled is also handled there.

@rossgrambo, @zhiyuanliang-ms, doesn't the fact that this override doesn't work mean that it is more complex as in the is_enabled case we need to return false always if the feature flag isn't enabled, but the correct override variant in the case of get_variant?

However, status_override should not work when feature_flag.enabled == false. Current behavior is wrong.

I don't think this is correct. We run both IsEnabled and Variant allocation on both .IsEnabled and .GetVariant calls, regardless of if the flag is enabled or disabled- then apply the override https://github.com/microsoft/FeatureManagement-Dotnet/blob/preview/src/Microsoft.FeatureManagement/FeatureManager.cs#L282

But according to the flowcharts it should be disabled if the flag is disabled... So actually you're right.

I misread the code. We have a check at the end to see if the flag is not disabled before overriding: https://github.com/microsoft/FeatureManagement-Dotnet/blob/preview/src/Microsoft.FeatureManagement/FeatureManager.cs#L354

doesn't the fact that this override doesn't work mean that it is more complex as in the is_enabled case we need to return false always if the feature flag isn't enabled, but the correct override variant in the case of get_variant?

Yes but it's not complex if you just do both everytime. Dotnet just always determines both the IsEnabled result and the Variant result (which will be defaultWhenDisabled in the disabled case). Then if IsEnabled was the calling function- just returns the bool, and if GetVariant was the calling function, returns the variant.

This code works the same as DotNet, which is why is does the slightly more complex operation, but it does simplify the whole logic as variants and is_enabled mostly work the same.

zhiyuanliang-ms

check_variant_override is called in _check_default_disabled_variant. However, status_override should not work when feature_flag.enabled == false. Current behavior is wrong.

Currently, you have check_variant_override called in _check_default_disabled_variant, _check_default_enabled_variant and _assign_variant. The code path is complex.

I suggest we have the code path as below

def _check_feature():
    feature_flag = _get_feature_flag()
    ...

    evaluation_event = EvaluationEvent(feature_flag)

    _check_feature_filters(evaluation_event)

    _assign_allocation(evaluation_event)

    if feature_flag.enabled:
        _check_variant_override()

_assign_allocation follows
#24 (comment)

I am ok with the current change on the _assign_variant method where default_when_enabled is also handled there.

zhiyuanliang-ms · 2024-07-23T03:26:36Z

featuremanagement/aio/_featuremanager.py

-            feature_flag.variants, feature_flag.allocation.default_when_disabled, False
+        evaluation_event.reason = VariantAssignmentReason.DEFAULT_WHEN_DISABLED
+        if not evaluation_event.feature.allocation:
+            evaluation_event.enabled = False


In line 286,

if not evaluation_event.enabled: FeatureManager._check_default_disabled_variant(evaluation_event)

Setting evaluation_event.enabled = False in _check_default_disabled_variant is redundant.

In line 327,

if not feature_flag.enabled: # Feature flags that are disabled are always disabled FeatureManager._check_default_disabled_variant(evaluation_event) if feature_flag.allocation: variant_name = feature_flag.allocation.default_when_disabled evaluation_event.variant = self._variant_name_to_variant(feature_flag, variant_name) evaluation_event.feature = feature_flag # If a feature flag is disabled and override can't enable it evaluation_event.enabled = False return evaluation_event

After calling _check_default_disabled_variant, you set evaluation_event.enabled = False again. I understand this is handing the corner case for status_override.

Why not extract the logic of handling status_override and put it at the end of `_check_feature"?

You have short cut when if not feature_flag.enabled. Except for this case, the code path will always fall into the status override check.

I don't think it's necessary to add this layer of abstraction for both _check_default_disabled_variant and _check_default_enabled_variant methods. The logic within these two functions isn't complex, and they aren't used frequently. Writing the logic inline won't make the code harder to read.

For the second part, I still want to set the variant info for is_enabled as it will be provided to telemetry either way. And I believe that is how DotNet also provides things.

zhiyuanliang-ms · 2024-07-23T04:24:32Z

featuremanagement/azuremonitor/_send_telemetry.py

+        event[REASON] = evaluation_event.reason.value
+
+    for metadata_key, metadata_value in evaluation_event.feature.telemetry.metadata.items():
+        if metadata_key not in event:


.NET Provider will put these fields in feature flag telemetry metadata.

public const string ETag = "ETag"; public const string FeatureFlagId = "FeatureFlagId"; public const string FeatureFlagReference = "FeatureFlagReference";

https://github.com/Azure/AppConfiguration-DotnetProvider/blob/preview/src/Microsoft.Extensions.Configuration.AzureAppConfiguration/FeatureManagement/FeatureManagementConstants.cs#L44

However,in this PR

python provider will put these fields

ETAG_KEY = "etag" FEATURE_FLAG_REFERENCE_KEY = "feature_flag_reference" FEATURE_FLAG_ID_KEY = "feature_flag_id"

Will be updated in the Provider PR once Auto Failover has been completed.

featuremanagement/_featuremanager.py

mrm9084 added 9 commits April 4, 2024 13:35

telemetry support

e3fd558

updating from feedback

a9847cc

Changes from design meeting

de6a442

formatting, updating samples

089268f

Merge branch 'main' into Telemetry

2dbecec

fixing merge issue

43e2c04

typing issue

4219544

Merge branch 'main' into Telemetry

86a50e1

fixing test validations from changes

545175a

rossgrambo reviewed May 3, 2024

View reviewed changes

featuremanagement/_featuremanager.py Outdated Show resolved Hide resolved

Update test_json_validations.py

eb1c8c2

zhiyuanliang-ms reviewed May 13, 2024

View reviewed changes

featuremanagement/_models/_targeting_context.py Show resolved Hide resolved

zhiyuanliang-ms reviewed May 13, 2024

View reviewed changes

featuremanagement/_models/_evaluation_event.py Outdated Show resolved Hide resolved

zhiyuanliang-ms reviewed May 13, 2024

View reviewed changes

featuremanagement/_models/_feature_flag.py Outdated Show resolved Hide resolved

zhiyuanliang-ms reviewed May 13, 2024

View reviewed changes

featuremanagement/_models/_telemetry.py Outdated Show resolved Hide resolved

zhiyuanliang-ms reviewed May 13, 2024

View reviewed changes

featuremanagement/_models/_targeting_context.py Show resolved Hide resolved

zhiyuanliang-ms reviewed May 13, 2024

View reviewed changes

tests/test_send_telemetry_appinsights.py Show resolved Hide resolved

zhiyuanliang-ms reviewed May 13, 2024

View reviewed changes

featuremanagement/aio/_featuremanager.py Outdated Show resolved Hide resolved

zhiyuanliang-ms reviewed May 13, 2024

View reviewed changes

featuremanagement/_models/_variant_assignment_reason.py Outdated Show resolved Hide resolved

mrm9084 and others added 4 commits May 13, 2024 10:07

Apply suggestions from code review

61d0022

Co-authored-by: Zhiyuan Liang <141655842+zhiyuanliang-ms@users.noreply.github.com>

Updating doc strings

b1a2b25

Spelling

8025453

Updating async name.

06f9536

Adding missing eval reason. Fixed formatting.

40a1275

mrm9084 changed the base branch from main to beta May 13, 2024 22:05