Experiment with private rooted Pixel 3 devices #10192

Merged 7 commits on Apr 15, 2025
1 change: 1 addition & 0 deletions .ci/scripts/gather_benchmark_configs.py
@@ -23,6 +23,7 @@
    "samsung_galaxy_s22": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/e59f866a-30aa-4aa1-87b7-4510e5820dfa",
    "samsung_galaxy_s24": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/98f8788c-2e25-4a3c-8bb2-0d1e8897c0db",
    "google_pixel_8_pro": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/d65096ab-900b-4521-be8b-a3619b69236a",
    "google_pixel_3_private_rooted": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/98d23ca8-ea9e-4fb7-b725-d402017b198d",
}

# Predefined benchmark configurations
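
As a rough illustration of how a name-to-ARN mapping like the one in this hunk might be consumed, the sketch below resolves a comma-separated `devices` string to device pool ARNs. The names `DEVICE_POOLS_SKETCH` and `resolve_device_pools` are hypothetical and are not taken from `gather_benchmark_configs.py`; only the entry added in this hunk is reproduced.

```python
# Hypothetical sketch; NOT the actual code in gather_benchmark_configs.py.
# Only the entry added by this PR is copied here.
DEVICE_POOLS_SKETCH = {
    "google_pixel_3_private_rooted": (
        "arn:aws:devicefarm:us-west-2:308535385114:devicepool:"
        "02a2cf0f-6d9b-45ee-ba1a-a086587469e6/"
        "98d23ca8-ea9e-4fb7-b725-d402017b198d"
    ),
}


def resolve_device_pools(devices: str) -> dict[str, str]:
    """Map a comma-separated list of device names to device pool ARNs.

    Raises ValueError for any name that is missing from the mapping.
    """
    names = [d.strip() for d in devices.split(",") if d.strip()]
    unknown = [n for n in names if n not in DEVICE_POOLS_SKETCH]
    if unknown:
        raise ValueError(f"Unknown device pool name(s): {', '.join(unknown)}")
    return {n: DEVICE_POOLS_SKETCH[n] for n in names}
```

Failing loudly on an unknown name is one possible design choice: a typo in the `devices` input would then surface before any benchmark job is scheduled.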
62 changes: 62 additions & 0 deletions .github/workflows/android-perf-private-device-experiment.yml
@@ -0,0 +1,62 @@
name: android-perf (private devices)

on:
  schedule:
    - cron: 0 0,4,8,12,16,20 * * *
  pull_request:
    paths:
      - .github/workflows/android-perf-private-device-experiment.yml
  push:
    branches:
      - main
    paths:
      - .github/workflows/android-perf-private-device-experiment.yml
  # Note: GitHub has an upper limit of 10 inputs
  workflow_dispatch:
    inputs:
      models:
        description: Models to be benchmarked
        required: false
        type: string
        default: mv3,meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8,meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8
      devices:
        description: Target devices to run benchmark
        required: false
        type: string
        default: google_pixel_3_private_rooted
      benchmark_configs:
        description: The list of configs used the benchmark
        required: false
        type: string
  workflow_call:
    inputs:
      models:
        description: Models to be benchmarked
        required: false
        type: string
        default: mv3,meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8,meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8
      devices:
        description: Target devices to run benchmark
        required: false
        type: string
        default: google_pixel_3_private_rooted
      benchmark_configs:
        description: The list of configs used the benchmark
        required: false
        type: string

concurrency:
  group: android-perf-private-devices-${{ github.event.pull_request.number || github.ref_name }}-${{ github.ref_type == 'branch' && github.sha }}-${{ github.event_name == 'workflow_dispatch' }}-${{ github.event_name == 'schedule' }}
  cancel-in-progress: true

jobs:
  android:
    uses: ./.github/workflows/android-perf.yml
    secrets: inherit
    permissions:
      id-token: write
      contents: read
    with:
      models: ${{ inputs.models }}
Contributor Author

@guangy10 @kirklandsign I'd appreciate it if you could send me the list of models and benchmark configs that you want to cover here. We only have 2 devices there, though, so let's reduce the number of models we cover for now.

Contributor

Let's pick one non-genAI model (mv3, xnnpack_q8) and one genAI model (llama3.2-1b, spinquant & qlora)

mv3,meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8,meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8
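
For reference, a minimal sketch of how a comma-separated `models` value like the one above could be split into individual entries. The helper names `split_models` and `looks_like_hf_repo_id` are hypothetical, and treating any name containing `/` as a Hugging Face style repo ID is an assumption made for illustration, not the workflow's actual logic.

```python
# Hypothetical helpers for illustration; not part of the workflow or its scripts.
MODELS_INPUT = (
    "mv3,"
    "meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8,"
    "meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8"
)


def split_models(models: str) -> list[str]:
    """Split a comma-separated models string, dropping blanks and whitespace."""
    return [m.strip() for m in models.split(",") if m.strip()]


def looks_like_hf_repo_id(name: str) -> bool:
    """Assumed heuristic: an "org/model" style name contains a slash."""
    return "/" in name


models = split_models(MODELS_INPUT)
repo_style = [m for m in models if looks_like_hf_repo_id(m)]
plain_names = [m for m in models if not looks_like_hf_repo_id(m)]
```

With the list above, `plain_names` holds `mv3` and `repo_style` holds the two Llama 3.2 1B variants, which matches the "one non-genAI model, one genAI model" selection.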

      devices: google_pixel_3_private_rooted
Contributor

Unfortunately, we don't have data points from the same type of device in the public pool, so we can't do a direct comparison. We will mainly rely on the new data from the private, rooted devices and compare it across runs.
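
One possible way to compare a metric across runs on the same device is to measure each new value against the median of earlier runs. The sketch below is an assumption about how such a comparison could look, not the project's dashboard logic; the function name `relative_change` is hypothetical, and any numbers passed to it are placeholders rather than measured results.

```python
from statistics import median


def relative_change(previous_runs: list[float], current: float) -> float:
    """Return the fractional change of `current` vs. the median of earlier runs.

    A positive value means the metric went up (for latency, a slowdown).
    Raises ValueError if there is no usable baseline.
    """
    if not previous_runs:
        raise ValueError("need at least one previous run as a baseline")
    baseline = median(previous_runs)
    if baseline == 0:
        raise ValueError("baseline median is zero; relative change is undefined")
    return (current - baseline) / baseline
```

Using the median rather than the mean keeps a single noisy run from shifting the baseline much, which matters when only 2 devices are producing data.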

      benchmark_configs: ${{ inputs.benchmark_configs }}