grammar doesn't work in parallel decoding even when np = 1 #3650

Closed
@ibehnam

Expected Behavior

When using ./main with the --grammar flag, llama.cpp successfully generates an output according to the grammar string.

It is expected that this behavior transfers to ./parallel as well.

Current Behavior

./parallel <args> ... --grammar <grammar_string> doesn't respect the grammar, so llama.cpp generates free-form text.

Environment and Context

MacBook Pro, M1 Pro chip, macOS Sonoma

  • Operating System, e.g. for Linux:

$ uname -a

Darwin <my_username>.local 23.0.0 Darwin Kernel Version 23.0.0: Fri Sep 15 14:41:43 PDT 2023; root:xnu-10002.1.13~1/RELEASE_ARM64_T6000 arm64

  • SDK version, e.g. for Linux:
$ python3 --version
$ make --version

GNU Make 3.81
Copyright (C) 2006  Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.

This program built for i386-apple-darwin11.3.0

$ g++ --version

Apple clang version 15.0.0 (clang-1500.0.40.1)
Target: arm64-apple-darwin23.0.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

Failure Information (for bugs)

I'm not sure if it's related, but I noticed parallel decoding treats each line of the prompt as a separate prompt (for separate sequences).

Also, parallel decoding seems to take place in a chat setting rather than a completion setting.
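To make the first observation concrete, here is a minimal sketch of what "each line is a separate prompt" would mean for a multi-line prompt. The helper name `split_into_sequence_prompts` is hypothetical, and dropping blank lines is an assumption; this only illustrates the behavior described above and is not taken from the llama.cpp source.

```python
def split_into_sequence_prompts(prompt: str) -> list[str]:
    """Hypothetical helper: treat every non-empty line as its own prompt."""
    # Assumption: blank lines are skipped rather than becoming empty prompts.
    return [line for line in prompt.split("\n") if line.strip()]


if __name__ == "__main__":
    multi_line = "What's your favorite number?\nAnswer with a single digit."
    # Each entry would be handed to a separate sequence under this assumption.
    for i, p in enumerate(split_into_sequence_prompts(multi_line)):
        print(i, p)
```

If the behavior is as described, a prompt that is meant to be one multi-line instruction ends up as several unrelated single-line prompts.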

Steps to Reproduce

For example, try this:

./parallel --prompt "What's your favorite number?" --in-prefix '' --in-suffix '' --model <model_path> --ctx-size 8192 --color --n-predict 128 --keep 0 --temp 0.8 --repeat-penalty 1.1 --repeat-last-n 64 --grammar '# `root` specifies the pattern for the overall output
root ::= (
    value
)

value ::= "1" | "2" | "3"
' --parallel 1 --sequences 1 --threads 10 --n-gpu-layers 128 --main-gpu 0
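One rough way to tell whether a given run respected the grammar is to check the generated text against it by hand. The sketch below hard-codes a regular expression equivalent to the grammar in this report (`root ::= ( value )`, `value ::= "1" | "2" | "3"`). The regex and the function name `respects_grammar` are illustrative assumptions, not anything llama.cpp provides, and the check is strict: surrounding whitespace or extra text counts as a violation.

```python
import re

# Hand-translated from the grammar above: the whole output must be "1", "2" or "3".
# This pattern is an illustrative stand-in, not something generated by llama.cpp.
GRAMMAR_PATTERN = re.compile(r"[123]")


def respects_grammar(output: str) -> bool:
    """Return True only if the entire output is exactly one allowed value."""
    return GRAMMAR_PATTERN.fullmatch(output) is not None


if __name__ == "__main__":
    # Made-up sample strings, not real model output.
    for sample in ["2", "My favorite number is 7."]:
        print(repr(sample), respects_grammar(sample))
```

Running the same check on the output of ./main and ./parallel with identical arguments would show whether only ./parallel produces text outside the grammar.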
