Eval bug: Ctx shifting (llama_memory_seq_rm + llama_memory_seq_add) no longer works with GLM-4-32B-0414 #19292

### Name and Version

latest build b7921

### Operating systems

Windows

### GGML backends

Vulkan

### Hardware

RTX 4090

### Models

GLM-4-32B-0414

### Problem description & steps to reproduce

By bisecting the commits, I isolated this regression to the merge of https://github.com/ggml-org/llama.cpp/pull/18986.

Before that PR, context shifting (`llama_memory_seq_rm` + `llama_memory_seq_add` in the middle of the context) worked fine. After it, shifting no longer produces coherent output.

To reproduce: 

- Populate KV with a long example prompt
- Remove some tokens in the middle by using `llama_memory_seq_rm` + `llama_memory_seq_add`
- Generate from the end and observe the output
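
For reference, the position bookkeeping behind these steps can be sketched with a toy model. This is only an illustration under assumptions: `ToyCache`, `toy_seq_rm`, `toy_seq_add` and `toy_ctx_shift` are hypothetical names that imitate the documented range semantics of the two API calls, and they are not llama.cpp's implementation. The exact arguments used in my reproduction are not shown here. With the real API, the equivalent sequence would presumably be `llama_memory_seq_rm(mem, seq_id, p0, p1)` followed by `llama_memory_seq_add(mem, seq_id, p1, -1, -(p1 - p0))`, with `mem` obtained from `llama_get_memory(ctx)`.

```cpp
#include <cstdint>
#include <vector>

// Toy stand-in for one sequence of a KV cache: each cell only records the
// position of the token it holds. This is NOT llama.cpp's data structure.
struct ToyCache {
    std::vector<int32_t> pos;
};

// Imitates the range semantics of llama_memory_seq_rm:
// drop every cell whose position lies in [p0, p1).
void toy_seq_rm(ToyCache & c, int32_t p0, int32_t p1) {
    std::vector<int32_t> kept;
    for (int32_t p : c.pos) {
        if (p < p0 || p >= p1) {
            kept.push_back(p);
        }
    }
    c.pos = kept;
}

// Imitates the range semantics of llama_memory_seq_add:
// add delta to every position in [p0, p1); p1 < 0 means "up to the end".
void toy_seq_add(ToyCache & c, int32_t p0, int32_t p1, int32_t delta) {
    for (int32_t & p : c.pos) {
        if (p >= p0 && (p1 < 0 || p < p1)) {
            p += delta;
        }
    }
}

// Fill positions 0..n_tokens-1, remove [p0, p1) from the middle,
// then shift the tail left so the positions are contiguous again.
ToyCache toy_ctx_shift(int32_t n_tokens, int32_t p0, int32_t p1) {
    ToyCache c;
    for (int32_t i = 0; i < n_tokens; ++i) {
        c.pos.push_back(i);
    }
    toy_seq_rm(c, p0, p1);
    toy_seq_add(c, p1, -1, -(p1 - p0));
    return c;
}
```

After `toy_ctx_shift(10, 3, 6)` the seven remaining cells hold positions 0 through 6 with no gap, which is the state generation is expected to continue from. The toy model covers only the position arithmetic; it does not model how the real library updates the cached keys for the shifted positions.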

The loss of coherence is not immediately obvious, so I have attached an example output below to illustrate it.

### First Bad Commit

https://github.com/ggml-org/llama.cpp/commit/a5eaa1d6a3732bc0f460b02b61c95680bba5a012

### Relevant log output

This is a partial example, but it shows the effect of this regression on the text completion: 

Partial prompt:
```
The edge of the town loomed closer, its crooked shapes stark against the deepening indigo of the night sky. The air grew heavier with the smell of woodsmoke and 
```

Ctx Shift + Continuation, Before a5eaa1 (good): 
```
damp cobblestones. He couldn't risk entering the main streets just yet; dawn was still hours away, and Silas Vane’s hunters would likely be patrolling with purpose. He needed a different approach.
```

Ctx Shift + Continuation, After a5eaa1 (bad):
```
the pile of shivering documents would have been better under the lock picked. He had one more thing: the only thing he ownedle in the world. He had to pick up the pieces he had just found out of the basket, trying to get of course of all his pockets, watching the ground of his possessions with a wary hand.
```
