test: Add blue/green deployment test #24142

def- · 2023-12-28T23:33:34Z

Following https://www.notion.so/materialize/Testing-Plan-Blue-Green-Deployments-01528a1eec3b42c3a25d5faaff7a9bf9#f53b51b110b044859bf954afc771c63a

I've seen latencies go up to 3 seconds, so I hope the 5-second limit is OK.

There isn't really a useful check for correctness at the moment.

I'm wondering whether taking > 60 s to wait for pending dataflows is acceptable.

Checklist

This PR has adequate test coverage / QA involvement has been duly considered.
This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
This PR includes the following user-facing behavior changes:

def- · 2023-12-28T23:34:23Z

test/cluster/mzcompose.py

@@ -205,7 +205,7 @@ def workflow_test_github_12251(c: Composition) -> None:
    c.down(destroy_volumes=True)
    c.up("materialized")

-    start_time = time.process_time()
+    start_time = time.time()


process_time doesn't include the time waiting for Mz to return results: https://docs.python.org/3/library/time.html#time.process_time
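
For illustration, here is a minimal standalone sketch of the difference (not part of the PR). `time.sleep` stands in for blocking on a query result, and the `measure` helper is a hypothetical name:

```python
import time


def measure(clock, wait_seconds=0.2):
    """Return how much time `clock` reports across a blocking wait."""
    start = clock()
    time.sleep(wait_seconds)  # stand-in for waiting on a query result
    return clock() - start


wall = measure(time.time)          # wall-clock time: includes the wait
cpu = measure(time.process_time)   # CPU time of this process: excludes sleep
print(f"time.time: {wall:.3f}s, time.process_time: {cpu:.3f}s")
```

The wall-clock measurement covers the whole wait, while the `process_time` measurement stays close to zero because the process uses almost no CPU while it is blocked.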

def- · 2023-12-29T11:56:23Z

I got the test green by bumping the testdrive timeout to 5 minutes and the load generator interval to 1s; with 100ms I was often seeing dataflows pending. I'm wondering whether the way we measure pending dataflows fails to work when data arrives more often than once a second.

philip-stoev · 2023-12-30T12:38:51Z

test/cluster/blue-green-deployment/deploy.td

+  FOR ALL TABLES
+> CREATE MATERIALIZED VIEW prod_deploy.tpch_mv
+  IN CLUSTER prod_deploy AS
+  SELECT


You may wish to have a dedicated constant column that distinguishes the original view from the swapped-in one. Then check that the value of that column is as expected post-swap.
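
A rough sketch of what such a check could look like on the Python side, assuming the view's rows are available as tuples and the marker is a constant string column. The `unexpected_markers` helper, the `"blue"`/`"green"` values, and the sample rows are all hypothetical:

```python
def unexpected_markers(rows, marker_index, expected):
    """Return the rows whose marker column differs from `expected`."""
    return [row for row in rows if row[marker_index] != expected]


# Hypothetical query results: (value, marker) before and after the swap.
rows_before_swap = [(1, "blue"), (2, "blue")]
rows_after_swap = [(1, "green"), (2, "green")]

# After the swap, every row should carry the marker of the new view.
stale = unexpected_markers(rows_after_swap, marker_index=1, expected="green")
assert stale == [], f"rows still served by the old view: {stale}"
```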

philip-stoev · 2023-12-30T12:39:59Z

test/cluster/blue-green-deployment/setup.td

+  STORAGE ADDRESSES ['clusterd1:2103'],
+  COMPUTECTL ADDRESSES ['clusterd1:2101'],
+  COMPUTE ADDRESSES ['clusterd1:2102'],
+  WORKERS 1))


Would more workers and/or more storageds make this test more stressful/realistic?

For example, have the two clusters have different sizes/shapes?

I'm not sure. My understanding is that one of the motivations is also changing the schema without any downtime. I'll try a different cluster though.

philip-stoev · 2023-12-30T12:41:41Z

test/cluster/mzcompose.py

+                while running:
+                    total_runtime = 0
+                    queries = [
+                        "SELECT * FROM prod.counter_mv",


Do you need to direct those SELECTs to a particular cluster, or is the default one good enough?

As I understand it, the default should be good enough.

philip-stoev · 2023-12-30T12:46:30Z

test/cluster/mzcompose.py

+        threads = [PropagatingThread(target=fn) for fn in (selects, subscribe)]
+        for thread in threads:
+            thread.start()
+        time.sleep(3)  # some time to make sure the queries run fine


I think this interval may need to be longer. Or, alternatively, some check is needed to confirm that every type of query ran at least once post-swap.
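
One possible shape for the second option, as a sketch rather than the code in this PR: each worker records a successful run, the counters are reset right after the swap, and the test waits until every query type has been seen again. `QueryTracker`, `wait_for_all`, and the stand-in workers are hypothetical names:

```python
import threading
import time


class QueryTracker:
    """Counts successful runs per query type; reset it right after the swap."""

    def __init__(self, kinds):
        self._lock = threading.Lock()
        self._counts = {kind: 0 for kind in kinds}

    def record(self, kind):
        with self._lock:
            self._counts[kind] += 1

    def reset(self):
        with self._lock:
            for kind in self._counts:
                self._counts[kind] = 0

    def missing(self):
        """Return the query types that have not run since the last reset."""
        with self._lock:
            return sorted(k for k, n in self._counts.items() if n == 0)


def wait_for_all(tracker, timeout, poll=0.01):
    """Block until every query type ran at least once, or until timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not tracker.missing():
            return True
        time.sleep(poll)
    return not tracker.missing()


tracker = QueryTracker(["select", "subscribe"])
stop = threading.Event()


def worker(kind):
    while not stop.is_set():
        tracker.record(kind)  # a real worker would run its query first
        time.sleep(0.005)


threads = [threading.Thread(target=worker, args=(k,)) for k in ("select", "subscribe")]
for thread in threads:
    thread.start()

tracker.reset()  # this is where the swap would have just completed
all_ran = wait_for_all(tracker, timeout=2.0)
stop.set()
for thread in threads:
    thread.join()
assert all_ran, f"query types that never ran post-swap: {tracker.missing()}"
```

Compared with a fixed `time.sleep(3)`, this returns as soon as every query type has been observed and fails loudly if one of them never runs.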

philip-stoev · 2023-12-30T12:47:50Z

test/cluster/blue-green-deployment/deploy.td

+# by the Apache License, Version 2.0.
+
+> CREATE SCHEMA prod_deploy
+> CREATE SOURCE prod_deploy.counter IN CLUSTER prod_deploy FROM LOAD GENERATOR counter (TICK INTERVAL '1s')


You can use a sub-second TICK INTERVAL here, like 0.01s, to make sure stuff actively happens in the system all the time while the swap is in progress.

I tried this before, and the pending dataflows were never considered ready because their lag stayed at > 1 s. I'm not sure whether this is a limitation of Mz or of the way we check the pending dataflows.
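
To make the failure mode concrete, here is a simplified sketch of a readiness wait with a fixed lag threshold. It is an assumption about the general pattern, not the check the test or Mz actually uses; `wait_until_caught_up` and the lag values are hypothetical:

```python
import time


def wait_until_caught_up(get_lag, max_lag, timeout, poll=0.01):
    """Poll `get_lag()` until it reports at most `max_lag` seconds.

    Returns True once the lag is within the threshold, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_lag() <= max_lag:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


# A dataflow whose lag shrinks over time passes the check.
lags = iter([5.0, 2.0, 0.4])
converging = wait_until_caught_up(lambda: next(lags), max_lag=1.0, timeout=1.0)

# A dataflow whose reported lag hovers just above the threshold never does,
# no matter how healthy it is otherwise.
stuck = wait_until_caught_up(lambda: 1.2, max_lag=1.0, timeout=0.1)

print(converging, stuck)
```

If the reported lag never drops below the threshold under a fast tick interval, a wait like this can only end in a timeout, which would match the behavior described above.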

benesch · 2023-12-31T17:22:19Z

Won't have time to review, sorry!

def- requested review from benesch, philip-stoev, ggnall, ParkMyCar and teskje December 28, 2023 23:33

def- force-pushed the pr-blue-green-deployment-test branch from ac8e1a0 to e8f6247 Compare December 29, 2023 11:25

philip-stoev reviewed Dec 30, 2023

philip-stoev approved these changes Dec 30, 2023

benesch removed their request for review December 31, 2023 17:22

def- added 2 commits January 2, 2024 11:22

test: Add blue/green deployment test

f714764

Following https://www.notion.so/materialize/Testing-Plan-Blue-Green-Deployments-01528a1eec3b42c3a25d5faaff7a9bf9#f53b51b110b044859bf954afc771c63a I've seen latencies go to 3 seconds

Address reviewer comments

323a66b

def- force-pushed the pr-blue-green-deployment-test branch from e8f6247 to 323a66b Compare January 2, 2024 11:25

def- merged commit a4bd149 into MaterializeInc:main Jan 2, 2024
12 checks passed

def- deleted the pr-blue-green-deployment-test branch January 2, 2024 16:48

ParkMyCar approved these changes Jan 2, 2024
