Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add weekly tests for memory growth #3101

Merged
merged 26 commits into from
Jul 16, 2021
Merged

Add weekly tests for memory growth #3101

merged 26 commits into from
Jul 16, 2021

Conversation

krishung5
Copy link
Contributor

No description provided.

@krishung5 krishung5 requested a review from CoderHam July 9, 2021 17:15
@krishung5 krishung5 changed the title [WIP] Add memory weekly tests [WIP] Add weekly tests for memory growth Jul 9, 2021
qa/L0_client_memory_growth/test.sh Outdated Show resolved Hide resolved
qa/L0_memory_growth/test.sh Outdated Show resolved Hide resolved
Dockerfile.QA Outdated Show resolved Hide resolved
Dockerfile.QA Outdated Show resolved Hide resolved
qa/L0_memory_growth/test.sh Outdated Show resolved Hide resolved
Dockerfile.QA Outdated Show resolved Hide resolved
@krishung5 krishung5 marked this pull request as ready for review July 9, 2021 22:50
CoderHam
CoderHam previously approved these changes Jul 9, 2021
@krishung5 krishung5 changed the title [WIP] Add weekly tests for memory growth Add weekly tests for memory growth Jul 9, 2021
Dockerfile.QA Outdated
@@ -191,6 +191,12 @@ RUN if [ -d qa/L0_model_control_stress ]; then \
cp -r qa/L0_model_control_stress/. qa/L0_model_control_stress_valgrind_massif; \
fi

# Create L0_client_memory_growth_weekly and L0_memory_growth_weekly for weekly test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't need to copy or rename the tests. We are just going to run the existing tests weekly (instead of nightly). We may make changes to the existing tests but we can to that in the existing code.

@@ -35,7 +35,11 @@

if __name__ == '__main__':
today = date.today().strftime("%Y-%m-%d")
subject = "Triton Client Memory Growth Summary: " + today
# Set the subject for weekly and nightly tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can keep this, but for now we will just run this test weekly.

@@ -35,7 +35,11 @@

if __name__ == '__main__':
today = date.today().strftime("%Y-%m-%d")
subject = "Triton Server Memory Growth Summary: " + today
# Set the subject for weekly and nightly tests
if (sys.argv[1] == "weekly"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can keep this, but for now we will just run this test weekly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the argument is now a string, you don't need the conditional, instead, you can just set the subject variable based on the argument subject = "Triton Server Memory Growth " + sys.argv[1] + " Summary: " + today

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

GuanLuo
GuanLuo previously approved these changes Jul 13, 2021
deadeyegoodwin
deadeyegoodwin previously approved these changes Jul 13, 2021
CoderHam
CoderHam previously approved these changes Jul 13, 2021
RET=1
fi
set -e
# The busy op model causes issues when running the CI.
Copy link
Contributor

@CoderHam CoderHam Jul 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call out PTX here so we can quickly re-enable tests that were disabled due to PTX fsilures
Also make this a TODO

# TODO Re-enable after PTX issues are resolved. 

CoderHam
CoderHam previously approved these changes Jul 14, 2021
@@ -37,10 +37,12 @@
today = date.today().strftime("%Y-%m-%d")
subject = "Triton Client Memory Growth " + sys.argv[1] + " Summary: " + today
memory_graphs = glob.glob("client_memory_growth*.log")
html_content = "<html><head></head><body><pre style=\"font-size:11pt;font-family:Consolas;\">"
write_up = "<p>This test is run for both HTTP and GRPC protocols using C++ and Python test scripts. The thresholds of memory growth are set to 10MB and 1MB for C++ and Python tests individually as the max-allowed difference between mean and maximum memory usage.</p>"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The threshold is not 10MB, the max difference between mean and max memory usage is 10MB

for mem_graph in sorted(memory_graphs):
html_content += "\n" + mem_graph + "\n"
with open(mem_graph, "r") as f:
html_content += f.read() + "\n"
html_content += "</pre></body></html>"
nightly_email_helper.send(subject, html_content, is_html=True)
nightly_email_helper.send(subject, html_content, is_html=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add back empty line at end

for mem_graph in sorted(memory_graphs):
memory_graphs_resnet = glob.glob("memory_growth_resnet*.log")
memory_graphs_busyop = glob.glob("memory_growth_busyop.log")
write_up = "<p>This test uses perf_analyzer as clients running on 4 different models. The threshold of memory growth is set to 150MB as the max allowed difference between mean and maximum memory usage.</p>"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same fix here

@krishung5 krishung5 requested a review from CoderHam July 15, 2021 16:22
Copy link
Contributor

@CoderHam CoderHam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes, otherwise LGTM

Dockerfile.QA Outdated
@@ -1,4 +1,4 @@
# Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.
# Copyright (c) 2018-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove changes for copyright in files that were not modified.

@krishung5 krishung5 requested a review from CoderHam July 15, 2021 17:02
CoderHam
CoderHam previously approved these changes Jul 15, 2021
@krishung5 krishung5 merged commit f235751 into main Jul 16, 2021
@krishung5 krishung5 deleted the krish-memory-weekly-tests branch July 16, 2021 18:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants