API: Fix CFG reporting
The model endpoint wasn't reporting whether CFG is on.

Signed-off-by: kingbri <bdashore3@proton.me>
bdashore3 committed Jan 2, 2024
1 parent bbd4ee5 commit 6b04463
Showing 3 changed files with 3 additions and 1 deletion.
1 change: 1 addition & 0 deletions OAI/types/model.py
```diff
@@ -18,6 +18,7 @@ class ModelCardParameters(BaseModel):
     cache_mode: Optional[str] = "FP16"
     prompt_template: Optional[str] = None
     num_experts_per_token: Optional[int] = None
+    use_cfg: Optional[bool] = None
     draft: Optional["ModelCard"] = None
```
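A minimal sketch of the behavior this field addition enables, using a simplified stand-in for the real `ModelCardParameters` class in `OAI/types/model.py` (only the fields relevant to this commit are shown):

```python
from typing import Optional

from pydantic import BaseModel


class ModelCardParameters(BaseModel):
    # Simplified sketch; the real class has more fields.
    cache_mode: Optional[str] = "FP16"
    num_experts_per_token: Optional[int] = None
    use_cfg: Optional[bool] = None  # field added by this commit


# Before the fix, use_cfg was never populated, so clients saw no CFG state.
default_params = ModelCardParameters()
print(default_params.use_cfg)  # None when not reported

# After the fix, the endpoint fills it in from the model container.
reported_params = ModelCardParameters(use_cfg=True)
print(reported_params.use_cfg)  # True
```

Declaring the field as `Optional[bool] = None` keeps it backward compatible: existing clients that don't know about CFG simply see a null value.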
2 changes: 1 addition & 1 deletion config_sample.yml
```diff
@@ -87,7 +87,7 @@ model:

   # Enables CFG support (default: False)
   # WARNING: This flag disables Flash Attention! (a stopgap fix until it's fixed in upstream)
-  use_cfg: False
+  #use_cfg: False

   # Options for draft models (speculative decoding). This will use more VRAM!
   #draft:
```
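Commenting out the sample line means the key is simply absent from the parsed config, so a loader falls back to the documented default. A hypothetical sketch of that lookup (the dict below stands in for the parsed `model:` section of `config_sample.yml`):

```python
# Hypothetical parsed form of the "model" section after this commit;
# the commented-out key does not appear in the mapping at all.
model_cfg = {
    "cache_mode": "FP16",
    # "use_cfg" is absent because the sample line is commented out
}

# A loader honoring the documented default (False) when the key is missing.
use_cfg = model_cfg.get("use_cfg", False)
print(use_cfg)  # False
```

This keeps the sample config from forcing a value: users who want CFG uncomment the line, and everyone else gets the default without the flag's Flash Attention warning applying to them.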
1 change: 1 addition & 0 deletions main.py
```diff
@@ -122,6 +122,7 @@ async def get_current_model():
         cache_mode="FP8" if MODEL_CONTAINER.cache_fp8 else "FP16",
         prompt_template=prompt_template.name if prompt_template else None,
         num_experts_per_token=MODEL_CONTAINER.config.num_experts_per_token,
+        use_cfg=MODEL_CONTAINER.use_cfg,
     ),
     logging=gen_logging.PREFERENCES,
 )
```
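The fix itself is the one-line forwarding of the container's CFG state into the response. A minimal sketch, using a `SimpleNamespace` as a hypothetical stand-in for the real `MODEL_CONTAINER` in `main.py`:

```python
from types import SimpleNamespace

# Hypothetical stand-in for the real MODEL_CONTAINER; the real object is a
# model wrapper, but only these two attributes matter for this sketch.
MODEL_CONTAINER = SimpleNamespace(cache_fp8=False, use_cfg=True)


def get_current_model_params() -> dict:
    # Mirrors the fixed endpoint: use_cfg is now forwarded from the
    # container instead of being left unset.
    return {
        "cache_mode": "FP8" if MODEL_CONTAINER.cache_fp8 else "FP16",
        "use_cfg": MODEL_CONTAINER.use_cfg,
    }


print(get_current_model_params())
```

Before this commit the `use_cfg` key was never set, so the model card always reported it as null regardless of the container's actual CFG state.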
