Usage Stats Collection #2852 (Merged)

62 commits:
e0e1386  Write info to local json
739f4a1  Merge branch 'main' of github.com:vllm-project/vllm into usage (yhu422)
c33b4cc  add usage context (yhu422)
b74e3a6  removed usage_context from Engine_args (yhu422)
c988e07  Move IO to another process (yhu422)
88c5187  added http request (yhu422)
85adbab  Merge branch 'main' of github.com:vllm-project/vllm into usage (yhu422)
33c9dff  Added additional arg for from_engine_args (yhu422)
ad609f0  comments (yhu422)
8a2f18a  Write info to local json (yhu422)
8e9e5be  Merge branch 'usage' of https://github.com/yhu422/vllm into usage (yhu422)
f537692  Added Comments (yhu422)
0f1ba7f  . (yhu422)
abc3948  Collect usage info on engine initialization (yhu422)
ec54145  Merge branch 'usage' of https://github.com/yhu422/vllm into usage (yhu422)
f84ccaa  Write usage to local file for testing (yhu422)
b08ba86  Fixed Formatting (yhu422)
83ff459  Merge branch 'vllm-project:main' into usage (yhu422)
73b689a  formatting changes (yhu422)
86da72f  Merge branch 'usage' of https://github.com/yhu422/vllm into usage (yhu422)
9c9a188  Minor bug fixed (yhu422)
d2f84cf  tmp (yhu422)
4e888e0  Merge branch 'main' of github.com:vllm-project/vllm into usage (yhu422)
eb48061  Fixed Bug (yhu422)
0684c06  Add Google Cloud Run service URL (yhu422)
8e9890e  More GPU CPU Mem info (yhu422)
5cf652a  Merge branch 'main' of github.com:vllm-project/vllm into usage (yhu422)
d910b05  Added context constant (yhu422)
8cf264b  Formatting & CPU Info (yhu422)
93b8773  Update vllm/usage/usage_lib.py (yhu422)
fe39b84  Added CPU info, new stat file path (yhu422)
fc6e374  added gpu memory (yhu422)
ab23171  added memory (yhu422)
686c84a  Distinguish production/testing usage, added custom domain (yhu422)
877eb78  formatting (yhu422)
bc89a66  Merge branch 'main' of github.com:vllm-project/vllm into usage (yhu422)
36fd304  Merge branch 'main' of github.com:vllm-project/vllm into usage (yhu422)
e54f15b  test/prod distinction (yhu422)
4e35b3b  Remove cpuinfo import (yhu422)
a1597fb  ruff (yhu422)
c580797  Merge branch 'main' of github.com:vllm-project/vllm into usage (yhu422)
84353d4  fixed merge (yhu422)
f2e69fc  Pass up model architecture info for GPUExecutor (yhu422)
4e19967  formatting (yhu422)
f327f3c  formatting (yhu422)
d9c8a44  Get architecture directly from configs (yhu422)
59f0f10  Merge branch 'main' of github.com:vllm-project/vllm into usage
f34259a  edits round (simon-mo)
30df77c  ruff (simon-mo)
60b652b  Merge branch 'main' of github.com:vllm-project/vllm into usage (simon-mo)
be91bab  fix format (simon-mo)
4f04743  finish all code level functionality (simon-mo)
f4bf862  add wip doc (simon-mo)
6b968db  Merge branch 'main' of github.com:vllm-project/vllm into usage (simon-mo)
2006788  revert some fixes (simon-mo)
db715c8  more fixes (simon-mo)
2c1e557  finish doc, readability pass (simon-mo)
42e66b8  edit pass (simon-mo)
a4e5742  fix doc and isort (simon-mo)
9652830  Merge branch 'main' of github.com:vllm-project/vllm into usage (simon-mo)
ba63b44  bad merge (simon-mo)
58fb78d  add to amd req txt (simon-mo)
Changes from 1 commit: fe39b8410ca962d986994f9babecc484e378e494 ("Added CPU info, new stat file path")
Changes to `vllm/usage/usage_lib.py`:

```diff
@@ -6,17 +6,22 @@
 import requests
 import datetime
 import psutil
+import cpuinfo
 from threading import Thread
 from pathlib import Path
 from typing import Optional
 from enum import Enum
 
+_xdg_config_home = os.getenv('XDG_CONFIG_HOME',
+                             os.path.expanduser('~/.config'))
+_vllm_internal_path = 'vllm/usage_stats.json'
+
 _USAGE_STATS_FILE = os.path.join(
-    os.path.dirname(os.path.abspath(__file__)),
-    'usage_stats.json')  #File path to store usage data locally
+    _xdg_config_home,
+    _vllm_internal_path)  #File path to store usage data locally
 _USAGE_STATS_ENABLED = None
-_USAGE_STATS_SEVER = os.environ.get('VLLM_USAGE_STATS_SERVER',
-                                    'https://stats.vllm.ai')
+_USAGE_STATS_SERVER = os.environ.get('VLLM_USAGE_STATS_SERVER',
+                                     'https://stats.vllm.ai')
 _USAGE_STATS_URL = "https://vector-dev-server-uzyrqjjayq-uc.a.run.app"  #Placeholder for sending usage data to vector.dev http server
```
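The relocated stats file follows the XDG Base Directory convention: use `$XDG_CONFIG_HOME` when it is set, otherwise fall back to `~/.config`. A minimal sketch of that resolution logic, assuming a hypothetical helper `resolve_stats_path` (not part of the PR) that takes the environment as a dict so the behavior is easy to test:

```python
import os

def resolve_stats_path(env: dict) -> str:
    """Resolve the usage-stats path the way the diff does:
    $XDG_CONFIG_HOME if set, else ~/.config, then the vllm subpath."""
    xdg = env.get('XDG_CONFIG_HOME') or os.path.expanduser('~/.config')
    return os.path.join(xdg, 'vllm/usage_stats.json')

# With XDG_CONFIG_HOME set, the file lands under that directory.
print(resolve_stats_path({'XDG_CONFIG_HOME': '/tmp/conf'}))
# With it unset, the path falls back under the home directory.
print(resolve_stats_path({}))
```

One subtlety: `os.getenv('XDG_CONFIG_HOME', default)` in the diff returns an empty string when the variable is set but empty, while the `or` form above falls through to the default in that case.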
```diff
@@ -60,7 +65,7 @@ def _detect_cloud_provider() -> str:
         'google': "GCP",
         'oraclecloud': "OCI",
     }
-
+
     for vendor_file in vendor_files:
         path = Path(vendor_file)
         if path.is_file():
```
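The hunk above is context from the cloud-provider detection: read DMI-style vendor files and match known vendor strings. A hedged reimplementation sketch of that matching loop (the keyword dict here shows only the two entries visible in the hunk; the real function's dict covers more vendors):

```python
import tempfile
from pathlib import Path

# Keyword-to-provider mapping; only the entries visible in the hunk.
CLOUD_KEYWORDS = {
    'google': 'GCP',
    'oraclecloud': 'OCI',
}

def detect_cloud_provider(vendor_files: list) -> str:
    """Scan vendor files (e.g. under /sys/class/dmi/id) for a known vendor."""
    for vendor_file in vendor_files:
        path = Path(vendor_file)
        if path.is_file():
            content = path.read_text().lower()
            for keyword, provider in CLOUD_KEYWORDS.items():
                if keyword in content:
                    return provider
    return 'UNKNOWN'

# Simulate a vendor file such as /sys/class/dmi/id/product_version.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
    f.write('Google Compute Engine')
print(detect_cloud_provider([f.name]))   # GCP
print(detect_cloud_provider([]))         # UNKNOWN
```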
```diff
@@ -115,7 +120,7 @@ def _report_usage(self, model: str, context: UsageContext) -> None:
         self.model = model
         self.log_time = _get_current_timestamp_ns()
         self.num_cpu = os.cpu_count()
-        self.cpu_type = platform.processor()
+        self.cpu_type = cpuinfo.get_cpu_info()['brand_raw']
         self.total_memory = psutil.virtual_memory().total
         self._write_to_file()
         headers = {'Content-type': 'application/json'}
```

Review comment: It would be good to get the type of CPU as well, such as its product name, so you can be aware of what ISA extensions are available for performance.

Reply: +1!
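The switch from `platform.processor()` to `py-cpuinfo` is motivated by the comment above: `platform.processor()` often returns a generic or empty string on Linux, while `cpuinfo.get_cpu_info()['brand_raw']` reports the actual product name. A sketch of a defensive variant (not the PR's code, which calls `cpuinfo` unconditionally), assuming `py-cpuinfo` may or may not be installed:

```python
import platform

def get_cpu_brand() -> str:
    """Prefer py-cpuinfo's product name; fall back to platform.processor().

    py-cpuinfo is an optional third-party dependency (pip install py-cpuinfo)
    and its probe can take a few seconds, so guard the import and the call.
    """
    try:
        import cpuinfo  # third-party; may be absent
        brand = cpuinfo.get_cpu_info().get('brand_raw', '')
        if brand:
            return brand
    except Exception:
        pass
    # platform.processor() may be empty or generic, but never raises.
    return platform.processor() or 'unknown'

print(get_cpu_brand())
```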
```diff
@@ -126,7 +131,8 @@ def _report_usage(self, model: str, context: UsageContext) -> None:
             print("Usage Log Request Failed")
 
     def _write_to_file(self):
-        with open(_USAGE_STATS_FILE, "w") as outfile:
+        os.makedirs(os.path.dirname(_USAGE_STATS_FILE), exist_ok=True)
+        with open(_USAGE_STATS_FILE, "w+") as outfile:
             json.dump(vars(self), outfile)
```
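The `makedirs` call matters because the stats file now lives under `~/.config/vllm/`, a directory that may not exist yet; opening the file directly would raise `FileNotFoundError`. A small self-contained sketch of the create-then-write pattern, using a temporary directory in place of the real config path:

```python
import json
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    stats_file = os.path.join(tmp, 'vllm', 'usage_stats.json')
    # Create the parent directory first; exist_ok avoids failing on reruns.
    os.makedirs(os.path.dirname(stats_file), exist_ok=True)
    with open(stats_file, 'w') as outfile:
        json.dump({'num_cpu': os.cpu_count()}, outfile)
    # Read it back to confirm the round trip.
    with open(stats_file) as infile:
        data = json.load(infile)
print(data['num_cpu'] == os.cpu_count())  # True
```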
Review comment: I wonder if it makes more sense to get the model architecture rather than the model name? It's probably more useful to know it's the Llama architecture at size X than to have the name string, for tracking purposes. Otherwise you'll have to derive this on the backend, and it may not be recoverable for local models.

Reply: +100
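The commit "Get architecture directly from configs" in the history above points at one way to do this: HuggingFace-style model configs carry an `architectures` field, so the class name (e.g. `LlamaForCausalLM`) can be reported even when the model "name" is an arbitrary local path. A minimal sketch over a plain config dict (the helper name and the `'UnknownArchitecture'` sentinel are illustrative, not vLLM's API):

```python
def get_model_architecture(hf_config: dict) -> str:
    """Pull the architecture class name out of a HuggingFace-style config.

    The 'architectures' key holds a list such as ['LlamaForCausalLM'];
    fall back to a sentinel when it is absent (as for some local models).
    """
    architectures = hf_config.get('architectures') or []
    return architectures[0] if architectures else 'UnknownArchitecture'

# A local checkout may have an arbitrary path as its "name", but the
# config still identifies the architecture.
print(get_model_architecture({'architectures': ['LlamaForCausalLM']}))
# → LlamaForCausalLM
print(get_model_architecture({}))
# → UnknownArchitecture
```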