[Misc]add coding benchmark for speculative decoding #15303
Changes from all commits
e5e1c73
807b771
e47203b
9e7c597
18dc0c0
6bbe72e
d393769
50ad733
de192e5
a6e95c7
@@ -53,8 +53,9 @@
 from argparse import ArgumentParser as FlexibleArgumentParser

 from benchmark_dataset import (BurstGPTDataset, HuggingFaceDataset,
-                               RandomDataset, SampleRequest, ShareGPTDataset,
-                               SonnetDataset, VisionArenaDataset)
+                               InstructCoderDataset, RandomDataset,
+                               SampleRequest, ShareGPTDataset, SonnetDataset,
+                               VisionArenaDataset)
 from benchmark_utils import convert_to_pytorch_benchmark_format, write_to_json

 MILLISECONDS_TO_SECONDS_CONVERSION = 1000
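For reference, the newly imported InstructCoderDataset is constructed through the same code path as the other HuggingFace-backed benchmark datasets (see the second hunk below). A minimal standalone sketch of that usage follows; the dataset_split keyword and the sample() call are assumptions based on the surrounding benchmark code, not something shown in this diff:

# Sketch only: the constructor keywords beyond dataset_path/dataset_subset
# (visible in the second hunk of this diff) are assumptions.
from benchmark_dataset import InstructCoderDataset

dataset = InstructCoderDataset(
    dataset_path="likaixin/InstructCoder",  # HF dataset id used in this PR
    dataset_subset=None,
    dataset_split="train",  # mirrors the split hardcoded in this PR
)
# requests = dataset.sample(num_requests=256, tokenizer=tokenizer)  # assumed API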
@@ -588,9 +589,14 @@ def main(args: argparse.Namespace):
     elif args.dataset_name == "hf":
         # Choose between VisionArenaDataset
         # and HuggingFaceDataset based on provided parameters.
-        dataset_class = (VisionArenaDataset if args.dataset_path
-                         == VisionArenaDataset.VISION_ARENA_DATASET_PATH
-                         and args.hf_subset is None else HuggingFaceDataset)
+        dataset_class = HuggingFaceDataset
+        if args.dataset_path == VisionArenaDataset.VISION_ARENA_DATASET_PATH:
+            assert args.hf_subset is None, "VisionArenaDataset needs hf_subset to be None."  #noqa: E501
+            dataset_class = VisionArenaDataset
+        elif args.dataset_path == "likaixin/InstructCoder":
+            dataset_class = InstructCoderDataset
+            args.hf_split = "train"
Review thread (on the added line args.hf_split = "train"):

Reviewer: Curious why this was hardcoded to "train" rather than parameterized by args.hf_split as in the other cases. If we are evaluating a model, shouldn't it be tested on a non-train split?

Author: It is hardcoded for now; we could certainly add it to args.hf_split. As for the train/eval split, everyone's way of splitting may differ slightly, so this is just an example.

         input_requests = dataset_class(
             dataset_path=args.dataset_path,
             dataset_subset=args.hf_subset,
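Following up on the review thread above, one way to avoid always overriding the split would be to fall back to "train" only when the user did not pass --hf-split. The sketch below shows that alternative; it is not part of this PR, resolve_hf_split is a hypothetical helper, and it assumes args.hf_split defaults to None when the flag is not given:

def resolve_hf_split(requested_split):
    """Prefer an explicitly requested --hf-split; otherwise fall back to
    "train", the split this PR hardcodes for InstructCoder."""
    return requested_split if requested_split is not None else "train"

# Possible use inside the branch added by this PR:
#     elif args.dataset_path == "likaixin/InstructCoder":
#         dataset_class = InstructCoderDataset
#         args.hf_split = resolve_hf_split(args.hf_split)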