Closed
Description
Expected Behavior
When using ./main with the --grammar flag, llama.cpp successfully generates output that conforms to the grammar string. It is expected that this behavior carries over to ./parallel as well.
Current Behavior
./parallel <args> ... --grammar <grammar_string> doesn't respect the grammar; llama.cpp generates free-form text instead.
Environment and Context
MacBook Pro, M1 Pro chip, macOS Sonoma
- Operating System, e.g. for Linux:
$ uname -a
Darwin <my_username>.local 23.0.0 Darwin Kernel Version 23.0.0: Fri Sep 15 14:41:43 PDT 2023; root:xnu-10002.1.13~1/RELEASE_ARM64_T6000 arm64
- SDK version, e.g. for Linux:
$ python3 --version
$ make --version
GNU Make 3.81
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
This program built for i386-apple-darwin11.3.0
$ g++ --version
Apple clang version 15.0.0 (clang-1500.0.40.1)
Target: arm64-apple-darwin23.0.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
Failure Information (for bugs)
I'm not sure if it's related, but I noticed that parallel decoding treats each line of the prompt as a separate prompt (for separate sequences). Also, parallel decoding seems to take place in a chat setting, not a completion setting.
Steps to Reproduce
For example, try this:
./parallel --prompt "What's your favorite number?" --in-prefix '' --in-suffix '' --model <model_path> --ctx-size 8192 --color --n-predict 128 --keep 0 --temp 0.8 --repeat-penalty 1.1 --repeat-last-n 64 --grammar '# `root` specifies the pattern for the overall output
root ::= (
value
)
value ::= "1" | "2" | "3"
' --parallel 1 --sequences 1 --threads 10 --n-gpu-layers 128 --main-gpu 0
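As a side note, an inline multi-line grammar is easy to mangle with shell quoting (the prompt above also contains an apostrophe). A more robust sketch, assuming ./parallel accepts the --grammar-file flag that llama.cpp's common argument parser provides, writes the grammar to a file first (digits.gbnf is a hypothetical filename):

```shell
# Write the grammar from the report to a standalone GBNF file so the
# multi-line rules survive shell quoting intact.
cat > digits.gbnf <<'EOF'
# `root` specifies the pattern for the overall output
root ::= (
value
)
value ::= "1" | "2" | "3"
EOF

# Then pass the file instead of an inline string, e.g.:
#   ./parallel --prompt "What's your favorite number?" ... --grammar-file digits.gbnf
```

The heredoc with a quoted delimiter ('EOF') prevents any expansion inside the grammar, so the rules reach llama.cpp byte-for-byte.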