feat(features) Use dataclasses for flagpole instead of pydantic (#75859)
These changes switch flagpole's base classes from pydantic to stdlib
`dataclasses`. During the pydantic 2 upgrade, the time to parse features
rose sharply and unexpectedly. We have also been unimpressed with
feature flag match times and suspected that pydantic was contributing
overhead.
This pull request re-implements flagpole with basic Python dataclasses.
The new implementation reduces the time to build feature flags, which
should improve the overall performance of our feature flagging.
Improvements were measured with both cProfile and local
microbenchmarks (scripts provided below).
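To make the shape of the change concrete, here is a minimal sketch of building objects from dictionaries with stdlib dataclasses. It is not the flagpole source: the `Operator` values, the `Condition` and `Feature` fields, and the example config are all hypothetical, chosen only to mirror the `from_feature_dictionary` and `from_dict` names that appear in the profiles in this description.

```python
from __future__ import annotations

import dataclasses
from enum import Enum
from typing import Any


class Operator(Enum):
    # Hypothetical operator set, for illustration only.
    IN = "in"
    EQUALS = "equals"


@dataclasses.dataclass(frozen=True)
class Condition:
    # Hypothetical fields; the real flagpole classes may differ.
    property: str
    operator: Operator
    value: Any

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> Condition:
        # Plain construction: nothing is type-checked here except that
        # the Enum lookup rejects unknown operator strings.
        return cls(
            property=data["property"],
            operator=Operator(data["operator"]),
            value=data.get("value"),
        )


@dataclasses.dataclass(frozen=True)
class Feature:
    name: str
    enabled: bool
    conditions: list[Condition]

    @classmethod
    def from_feature_dictionary(cls, name: str, config: dict[str, Any]) -> Feature:
        return cls(
            name=name,
            enabled=bool(config.get("enabled", True)),
            conditions=[Condition.from_dict(c) for c in config.get("conditions", [])],
        )


feature = Feature.from_feature_dictionary(
    "organizations:example-flag",
    {
        "enabled": True,
        "conditions": [{"property": "org_slug", "operator": "in", "value": ["acme"]}],
    },
)
```

The trade-off is visible in `from_dict`: construction is a handful of dictionary lookups and one Enum lookup, with no per-field validation pass.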
## cProfile results
The cProfile script builds a feature 1000 times and collects profiling
data from those operations.
*current master (pydantic)*
```
9004 function calls in 1.125 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 1.125 1.125 <string>:1(<module>)
1000 1.123 0.001 1.124 0.001 __init__.py:117(from_feature_dictionary)
1 0.001 0.001 1.125 1.125 flagpole-profile:21(main)
2000 0.001 0.000 0.001 0.000 typing.py:2424(get_origin)
1 0.000 0.000 1.125 1.125 {built-in method builtins.exec}
6000 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
```
*after (dataclasses)*
```
41004 function calls in 0.009 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.010 0.010 <string>:1(<module>)
3000 0.000 0.000 0.000 0.000 <string>:2(__init__)
1000 0.001 0.000 0.009 0.000 __init__.py:128(from_feature_dictionary)
1000 0.001 0.000 0.008 0.000 __init__.py:134(<listcomp>)
2000 0.002 0.000 0.004 0.000 conditions.py:193(_condition_from_dict)
3000 0.002 0.000 0.007 0.000 conditions.py:217(from_dict)
3000 0.000 0.000 0.004 0.000 conditions.py:219(<listcomp>)
2000 0.000 0.000 0.000 0.000 enum.py:1093(__new__)
2000 0.000 0.000 0.000 0.000 enum.py:1255(value)
2000 0.000 0.000 0.000 0.000 enum.py:193(__get__)
2000 0.000 0.000 0.001 0.000 enum.py:686(__call__)
1 0.000 0.000 0.010 0.010 flagpole-profile:21(main)
1 0.000 0.000 0.010 0.010 {built-in method builtins.exec}
1000 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
19000 0.001 0.000 0.001 0.000 {method 'get' of 'dict' objects}
```
While significantly more function calls were made (41004 vs. 9004), the
overall runtime is *much* better: 0.009 seconds instead of 1.125 seconds.
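Both listings are ordered by standard name, so the expensive call is not at the top. A small sketch, using only stdlib `cProfile` and `pstats`, of collecting the same kind of data sorted by cumulative time instead; `build_feature` is a hypothetical stand-in for the call being profiled.

```python
import cProfile
import io
import pstats


def build_feature() -> dict:
    # Hypothetical stand-in for the work being profiled.
    return {"name": "example", "segments": [{"id": i} for i in range(10)]}


def profile_sorted(iterations: int = 1000, top: int = 10) -> str:
    """Profile repeated builds and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(iterations):
        build_feature()
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time and keep only the `top` most expensive entries.
    stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(top)
    return out.getvalue()


report = profile_sorted()
print(report)
```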
<details>
<summary>flagpole-profile script</summary>
```python
#!/usr/bin/env python
from sentry.runner import configure
from flagpole import Feature as FlagpoleFeature
import cProfile

configure()

feature_names = (
    # Largest feature flag is 2688 bytes
    "organizations:user-feedback-ui",
    # 404 bytes is around median
    "organizations:performance-chart-interpolation",
    # 97 bytes is smallest feature.
    "organizations:new-weekly-report",
)
feature_config = {}


def main():
    feature_name = "organizations:user-feedback-ui"
    # feature_name = "organizations:performance-chart-interpolation"
    # feature_name = "organizations:new-weekly-report"
    for _ in range(1000):
        FlagpoleFeature.from_feature_dictionary(feature_name, feature_config[feature_name])


if __name__ == "__main__":
    from sentry import options

    # Load the option values up front so only the build step is profiled.
    for feature_name in feature_names:
        feature_config[feature_name] = options.get(f"feature.{feature_name}")
    cProfile.run('main()')
```
</details>
## Microbenchmark
In the microbenchmark I looked at three feature flags (the largest, the
approximate median, and the smallest). Each feature was parsed 10000
times, and I recorded the min, mean and max duration for building a
Feature object from dictionary data loaded from options seeded with the
current feature flag inventory.
*before (pydantic)*
```
Results for organizations:user-feedback-ui
option load_time 0.0018911361694335938
build_time min 0.0004580021
build_time max 0.0011467934
build_time mean 0.0004902341
RSS memory usage 272400384
Results for organizations:performance-chart-interpolation
option load_time 0.0030400753021240234
build_time min 0.0000336170
build_time max 0.0001301765
build_time mean 0.0000367536
RSS memory usage 272400384
Results for organizations:new-weekly-report
option load_time 0.0022568702697753906
build_time min 0.0000057220
build_time max 0.0000231266
build_time mean 0.0000069423
RSS memory usage 272400384
```
*after (dataclasses)*
```
Results for organizations:user-feedback-ui
option load_time 0.0033750534057617188
build_time min 0.0000026226
build_time max 0.0000209808
build_time mean 0.0000032377
RSS memory usage 276054016
Results for organizations:performance-chart-interpolation
option load_time 0.0033571720123291016
build_time min 0.0000016689
build_time max 0.0000610352
build_time mean 0.0000028541
RSS memory usage 276054016
Results for organizations:new-weekly-report
option load_time 0.003008127212524414
build_time min 0.0000000000
build_time max 0.0000047684
build_time mean 0.0000007447
RSS memory usage 276070400
```
<details>
<summary>flagpole-timing script</summary>
```python
#!/usr/bin/env python
from sentry.runner import configure
from flagpole import Feature as FlagpoleFeature
import gc
import statistics
import time
import psutil

configure()


def main():
    from sentry import options

    # Keep garbage collection pauses out of the timings.
    gc.disable()
    feature_names = (
        # Largest feature flag is 2688 bytes
        "organizations:user-feedback-ui",
        # 404 bytes is around median
        "organizations:performance-chart-interpolation",
        # 97 bytes is smallest feature.
        "organizations:new-weekly-report",
    )
    for feature_name in feature_names:
        load_start = time.time()
        option_val = options.get(f"feature.{feature_name}")
        load_end = time.time()

        build_durations = []
        for _ in range(0, 10000):
            build_start = time.time()
            FlagpoleFeature.from_feature_dictionary(feature_name, option_val)
            build_end = time.time()
            build_durations.append(build_end - build_start)

        load_time = load_end - load_start
        rss_mem = psutil.Process().memory_info().rss
        print("")
        print(f"Results for {feature_name}")
        print("")
        print(f"option load_time {load_time}")
        print("")
        print("build_time min", "{:.10f}".format(min(build_durations)))
        print("build_time max", "{:.10f}".format(max(build_durations)))
        print("build_time mean", "{:.10f}".format(statistics.mean(build_durations)))
        print("")
        print(f"RSS memory usage {rss_mem}")


if __name__ == "__main__":
    main()
```
</details>
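Again we see significant improvements in runtime: mean build time drops
by about 150x for the largest flag and by roughly 9x to 13x for the
median and smallest flags. Memory usage is essentially unchanged (RSS
rose from about 272.4 MB to 276.1 MB, roughly 1.3%).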
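One caveat when reading these numbers: the smallest flag reports a minimum build time of exactly 0, which suggests that individual builds are at or below the resolution of `time.time()` on the machine used. A hedged alternative, which is not what produced the results in this description, is to time with `time.perf_counter_ns()`, a clock intended for measuring short intervals. The `build` callable here is a hypothetical placeholder for the `FlagpoleFeature.from_feature_dictionary` call.

```python
from __future__ import annotations

import statistics
import time
from typing import Callable


def time_builds(build: Callable[[], object], iterations: int = 10_000) -> dict[str, float]:
    """Time repeated calls to `build` and return min/max/mean in seconds."""
    durations_ns = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        build()
        durations_ns.append(time.perf_counter_ns() - start)
    return {
        "min_s": min(durations_ns) / 1e9,
        "max_s": max(durations_ns) / 1e9,
        "mean_s": statistics.mean(durations_ns) / 1e9,
    }


# Placeholder workload standing in for building a Feature from a dict.
results = time_builds(lambda: {"name": "example", "segments": []})
```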
## What do we lose?
While moving to dataclasses gives us some gains in performance, it has a
few drawbacks:
- The dataclass implementation isn't as type-safe as pydantic would be.
Individual properties are no longer validated for type correctness.
- Updating the jsonschema will be manual going forward. Pydantic has a
great integration with jsonschema that allows you to generate jsonschema
documents from objects. Dataclasses do not.
- Drift between the schema and implementation could occur. Because the
jsonschema and code are no longer connected, they could diverge in the
future.
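One possible way to limit the drift risk is a unit test that compares dataclass fields against the hand-maintained schema. The sketch below uses only stdlib `dataclasses`; the `Condition` class, the `CONDITION_SCHEMA` dictionary, and the `schema_drift` helper are hypothetical and not taken from flagpole.

```python
import dataclasses
from typing import Any


@dataclasses.dataclass(frozen=True)
class Condition:
    # Hypothetical fields, for illustration only.
    property: str
    operator: str
    value: Any = None


# Hypothetical hand-written JSON Schema fragment for the class above.
CONDITION_SCHEMA = {
    "type": "object",
    "properties": {
        "property": {"type": "string"},
        "operator": {"type": "string"},
        "value": {},
    },
    "required": ["property", "operator"],
}


def schema_drift(cls: type, schema: dict) -> dict:
    """Report names that exist on only one side of the class/schema pair."""
    fields = dataclasses.fields(cls)
    field_names = {f.name for f in fields}
    schema_names = set(schema.get("properties", {}))
    # A field is required when it has neither a default nor a default factory.
    required_fields = {
        f.name
        for f in fields
        if f.default is dataclasses.MISSING and f.default_factory is dataclasses.MISSING
    }
    return {
        "missing_from_schema": field_names - schema_names,
        "missing_from_code": schema_names - field_names,
        "required_mismatch": required_fields ^ set(schema.get("required", [])),
    }


drift = schema_drift(Condition, CONDITION_SCHEMA)
```

A test can assert that every set in `drift` is empty, so adding a field to the class without updating the schema fails CI. This only checks names and required-ness, not types.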