Request: Allow for adjustments at the layer-level, for a practically two-fold increase in LLM handling ability by prompters #4843

# Feature Description

The project ["Brain Hacking Chip"](https://github.com/SoylentMithril/BrainHackingChip) demonstrates a sophisticated, albeit conceptually simple, method of manipulating LLM inference that yields a powerful increase in obedience. It has great potential to practically double a prompter's ability to guide an LLM toward desirable behaviors, because it allows a prompter to *directly discourage* undesirable behaviors without implying that those behaviors are even possibilities.

It is my understanding that this kind of feature is currently very difficult to implement in LLaMA-CPP.

# Motivation

The "Brain Hacking Chip" project allows for negative prompts, which have been [demonstrated](https://github.com/SoylentMithril/BrainHackingChip#explain-softmax-with-an-instruction-to-type-in-l33t-sp34k) by the creator to allow for immediate gains in model obedience. I think this is significant, because negative prompting is relatively intuitive and accessible, especially for non-technical prompters.

Negative prompts are *especially* useful when trying to discourage the LLM from undesirable behaviors via prompting, because they circumvent the "Don't think of a pink elephant" problem: explicitly mentioning the thing the LLM *shouldn't* do *necessarily* puts that idea in mind, and thus pollutes the LLM's inference with the implication that the undesired idea is a possibility in the first place.

It is akin to the difference between telling a child, "Eat the vegetables on your plate, but don't take the candy inside the jar next to your plate," and telling a child, "Eat the vegetables on your plate" and erasing the jar from existence. 

If one's ability to command an LLM's behavior could be measured with a scalar, I'd say this could double it.

# Possible Implementation

I don't understand the details beyond the general idea of vector manipulation; I assume those details are elaborated upon in the [repo](https://github.com/SoylentMithril/BrainHackingChip).

But, as someone who has spent a lot of time trying to guide LLM behavior through prompting, I recognize this as an extremely powerful way to improve the consistency and usefulness of LLMs for end users, and think the community could greatly benefit from these kinds of experiments being easier to implement in LLaMA-CPP.
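
To make the request a little more concrete, here is a minimal sketch of what a layer-level adjustment might look like. It is not taken from the Brain Hacking Chip repo or from LLaMA-CPP. It assumes the adjustment works like classifier-free guidance applied to each layer's output, where the positive and negative prompts are both run through a layer and the positive activations are pushed away from the negative ones. The names `steer_hidden_state`, `run_layers`, and `weight` are hypothetical, and the "layers" are plain Python functions standing in for real transformer layers.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]  # stand-in for a real transformer layer


def steer_hidden_state(positive: Vector, negative: Vector, weight: float) -> Vector:
    """Push a layer's output away from the negative prompt's output.

    Assumed formula (guidance-style): out = pos + weight * (pos - neg).
    weight = 0.0 leaves the positive activations unchanged.
    """
    if len(positive) != len(negative):
        raise ValueError("activation vectors must have the same length")
    return [p + weight * (p - n) for p, n in zip(positive, negative)]


def run_layers(
    layers: Sequence[Layer], positive: Vector, negative: Vector, weight: float
) -> Vector:
    """Run both prompts through every layer, steering the positive path after each."""
    for layer in layers:
        positive = layer(positive)
        negative = layer(negative)
        positive = steer_hidden_state(positive, negative, weight)
    return positive


def double(v: Vector) -> Vector:
    """Toy layer used only for illustration: doubles every activation."""
    return [2.0 * x for x in v]


if __name__ == "__main__":
    # Two toy layers, a one-dimensional "activation", and a steering weight of 0.5.
    print(run_layers([double, double], [1.0], [0.0], 0.5))
```

The point of the sketch is only that this kind of experiment needs a hook between layers where activations from a second (negative) prompt can be read and combined with the main ones; that hook is the part I understand to be hard to add today.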
