
Do not hardcode llama2 model in perf test #4657

Merged · 18 commits · Aug 12, 2024
Conversation

@huydhn (Contributor) commented Aug 10, 2024

This fixes the hack in #4642. While I wanted to do the fix directly in #4642, it became more complicated because the hardcoded path is also used in the instrumented test case, so I spun this out into a separate PR.

This PR also fixes the way the test reports the TPS metric by calling sendStatus to correctly populate the instrumentation log. For example:

INSTRUMENTATION_STATUS: class=com.example.executorchllamademo.PerfTest
INSTRUMENTATION_STATUS: current=1
INSTRUMENTATION_STATUS: id=AndroidJUnitRunner
INSTRUMENTATION_STATUS: numtests=1
INSTRUMENTATION_STATUS: stream=
com.example.executorchllamademo.PerfTest:
INSTRUMENTATION_STATUS: test=testTokensPerSecond
INSTRUMENTATION_STATUS_CODE: 1
INSTRUMENTATION_STATUS: TPS=224.24243 <--- Report here
INSTRUMENTATION_STATUS_CODE: 0
INSTRUMENTATION_STATUS: class=com.example.executorchllamademo.PerfTest
INSTRUMENTATION_STATUS: current=1
INSTRUMENTATION_STATUS: id=AndroidJUnitRunner
INSTRUMENTATION_STATUS: numtests=1
INSTRUMENTATION_STATUS: stream=.
INSTRUMENTATION_STATUS: test=testTokensPerSecond
INSTRUMENTATION_STATUS_CODE: 0
INSTRUMENTATION_RESULT: stream=

Time: 0.678

OK (1 test)


INSTRUMENTATION_CODE: -1
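
For reference, a minimal sketch of how a metric like this can be emitted from an instrumented test via Instrumentation.sendStatus; the class name and metric value below are illustrative, not the exact code in this PR:

import android.app.Instrumentation;
import android.os.Bundle;
import androidx.test.platform.app.InstrumentationRegistry;
import org.junit.Test;

public class MetricReportExample {
  @Test
  public void reportTps() {
    // Report a custom key/value pair so it shows up as an
    // INSTRUMENTATION_STATUS line in the `adb shell am instrument` output.
    Bundle status = new Bundle();
    status.putString("TPS", Float.toString(224.24243f)); // illustrative value
    Instrumentation instrumentation = InstrumentationRegistry.getInstrumentation();
    instrumentation.sendStatus(0, status); // resultCode 0, as in the log above
  }
}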

Testing

https://github.com/pytorch/executorch/actions/runs/10332366636

pytorch-bot (bot) commented Aug 10, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/4657

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d08a5eb with merge base a70d070:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label on Aug 10, 2024 (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed).
@huydhn marked this pull request as ready for review on August 10, 2024 02:05
@facebook-github-bot (Contributor) commented:
@huydhn has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


@huydhn (Contributor, Author) commented Aug 10, 2024

@guangy10 The instrumentation log from https://github.com/pytorch/executorch/actions/runs/10332366636/job/28606203444, i.e. https://gha-artifacts.s3.amazonaws.com/device_farm/10332366636/2/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_8e51b233-2a4c-4380-85f5-7ad1c948805b_00002_00001_00000_00001_Customer_Artifacts.zip, now highlights the error with stories110M, where it fails to load:

INSTRUMENTATION_STATUS: class=com.example.executorchllamademo.PerfTest
INSTRUMENTATION_STATUS: current=1
INSTRUMENTATION_STATUS: id=AndroidJUnitRunner
INSTRUMENTATION_STATUS: numtests=1
INSTRUMENTATION_STATUS: stream=
com.example.executorchllamademo.PerfTest:
INSTRUMENTATION_STATUS: test=testTokensPerSecond
INSTRUMENTATION_STATUS_CODE: 1
INSTRUMENTATION_STATUS: ModelName=stories110M.pt <-- the model name is now reported here
INSTRUMENTATION_STATUS_CODE: 0
INSTRUMENTATION_STATUS: class=com.example.executorchllamademo.PerfTest
INSTRUMENTATION_STATUS: current=1
INSTRUMENTATION_STATUS: id=AndroidJUnitRunner
INSTRUMENTATION_STATUS: numtests=1
INSTRUMENTATION_STATUS: stack=java.lang.AssertionError: expected:<0> but was:<35>
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.failNotEquals(Assert.java:835)
	at org.junit.Assert.assertEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:633)
	at com.example.executorchllamademo.PerfTest.lambda$testTokensPerSecond$1$com-example-executorchllamademo-PerfTest(PerfTest.java:51)
	at com.example.executorchllamademo.PerfTest$$ExternalSyntheticLambda1.accept(Unknown Source:6)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:185)
	at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:485)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:475)
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:133)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:236)
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:435)
	at com.example.executorchllamademo.PerfTest.testTokensPerSecond(PerfTest.java:43)

@kirklandsign would probably know what this error means.
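
For context, a minimal sketch of how a ModelName status like the one in the log above could be emitted from an instrumented test. The RESOURCE_PATH value and the file filter are assumptions that mirror the test spec and the diff quoted later in this review; the rest is illustrative, not the exact code in this PR:

import android.os.Bundle;
import androidx.test.platform.app.InstrumentationRegistry;
import java.io.File;
import java.util.Arrays;

public class ModelNameReportExample {
  // Assumed location that the device-farm test spec pushes model files to.
  private static final String RESOURCE_PATH = "/data/local/tmp/llama/";

  // Called from an instrumented test running on the device.
  public static void reportModelNames() {
    File directory = new File(RESOURCE_PATH);
    Arrays.stream(directory.listFiles())
        // Only exported .pte models are relevant here.
        .filter(file -> file.getName().endsWith(".pte"))
        .forEach(
            model -> {
              // Emits an "INSTRUMENTATION_STATUS: ModelName=<file>" line in the log.
              Bundle status = new Bundle();
              status.putString("ModelName", model.getName());
              InstrumentationRegistry.getInstrumentation().sendStatus(0, status);
            });
  }
}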

@@ -11,10 +11,12 @@ phases:
# Prepare the model and the tokenizer
- adb -s $DEVICEFARM_DEVICE_UDID shell "ls -la /sdcard/"
- adb -s $DEVICEFARM_DEVICE_UDID shell "mkdir -p /data/local/tmp/llama/"
- adb -s $DEVICEFARM_DEVICE_UDID shell "mv /sdcard/tokenizer.bin /data/local/tmp/llama/tokenizer.bin"
Contributor:

Oh, I just noticed that this file actually has a local copy stored in the demo-apps dir! Curious how it is uploaded to S3 so that the link https://ossci-assets.s3.amazonaws.com/android-llama2-device-farm-test-spec-v2.yml works?

Contributor (Author):

Oh, the upload is still done manually. I plan to write a workflow to automatically upload it to S3 in the next PR, so that we can just update this file.

# The test spec can be downloaded from https://ossci-assets.s3.amazonaws.com/android-llama2-device-farm-test-spec.yml
test-spec: arn:aws:devicefarm:us-west-2:308535385114:upload:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/abd86868-fa63-467e-a5c7-218194665a77
# The test spec can be downloaded from https://ossci-assets.s3.amazonaws.com/android-llama2-device-farm-test-spec-v2.yml
test-spec: https://ossci-assets.s3.amazonaws.com/android-llama2-device-farm-test-spec-v2.yml
Contributor:

Can we rename it to something more generic, e.g. android-llm-device-farm-test-spec-v2.yml?

- adb -s $DEVICEFARM_DEVICE_UDID shell "mv /sdcard/*.pt /data/local/tmp/llama/"
- adb -s $DEVICEFARM_DEVICE_UDID shell "chmod 664 /data/local/tmp/llama/*.bin"
- adb -s $DEVICEFARM_DEVICE_UDID shell "chmod 664 /data/local/tmp/llama/*.pte"
- adb -s $DEVICEFARM_DEVICE_UDID shell "chmod 664 /data/local/tmp/llama/*.pt"
Contributor:

We probably won't need this line.

Contributor (Author):

Is there a bug in the .ci/scripts/test_llama.sh script? I only see two files, stories110M.pt and the tokenizer, in https://gha-artifacts.s3.amazonaws.com/pytorch/executorch/10332366636/artifact/stories110M_xnnpack/model.zip. That's why I attempted to add the *.pt file here.

Comment on lines +48 to +50
int loadResult = mModule.load();
// Check that the model can be loaded successfully
assertEquals(0, loadResult);
Contributor:

linter

// Find out the model name
File directory = new File(RESOURCE_PATH);
Arrays.stream(directory.listFiles())
.filter(file -> file.getName().endsWith(".pte") || file.getName().endsWith(".pt"))
Contributor:

We don't need the .pt file; it's the suffix of the weights for stories110M.

Contributor (Author) @huydhn commented Aug 12, 2024:

I see, let me remove this. But it looks like something is not right: https://gha-artifacts.s3.amazonaws.com/pytorch/executorch/10332366636/artifact/stories110M_xnnpack/model.zip contains only stories110M.pt and the tokenizer. Shouldn't there be an exported stories110M.pte model there? Maybe that explains why it fails to load.

Contributor:

yeah, the fix is in #4642

Contributor:

So you'll merge the test spec fix and ignore the "failed to load model" issue?

Contributor (Author) @huydhn commented Aug 12, 2024:

> yeah, the fix is in #4642

Oh, got it. Let me push a commit to fix the other comments and land this PR.

Contributor @guangy10 left a comment:

Thanks for the fix. Make sure the comments are addressed before merging.

@guangy10 (Contributor) commented:

LGTM. @kirklandsign may want to review the app as he is working on integrating a new activity.

@facebook-github-bot (Contributor) commented:

@huydhn has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Comment on lines +14 to 18
- adb -s $DEVICEFARM_DEVICE_UDID shell "mv /sdcard/*.bin /data/local/tmp/llama/"
- adb -s $DEVICEFARM_DEVICE_UDID shell "mv /sdcard/*.pte /data/local/tmp/llama/"
- adb -s $DEVICEFARM_DEVICE_UDID shell "chmod 664 /data/local/tmp/llama/*.bin"
- adb -s $DEVICEFARM_DEVICE_UDID shell "chmod 664 /data/local/tmp/llama/*.pte"
- adb -s $DEVICEFARM_DEVICE_UDID shell "ls -la /data/local/tmp/llama/"
Contributor @guangy10 commented Aug 12, 2024:

Replace "llama" with "llm" for those paths as well.

Contributor (Author):

Oh, I remember this path /data/local/tmp/llama/ was hard-coded in the app before. If that's not the case anymore, I could update it, maybe after #4676 lands, to avoid the need to upload the spec manually.

@facebook-github-bot merged commit d53f8fa into main on Aug 12, 2024
48 of 49 checks passed