-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Full DynaTemp implementation + UI #600
Full DynaTemp implementation + UI #600
Conversation
Controllable through experts.txt
Checkbox partial implemented, Min and Max Temp implemented
Trigger DynaTemp on checkbox
Hell Yeah! DynaTemp!
Fixes broken presets and mirostat
Fixed broken presets and miro
Also removed unnecessary softmax double precision
Reintroduce unused rep pen function, move temp functions first before entropy dynamic temp
and also delete experts.txt since adjustable routing is also being removed for the PR
@@ -723,6 +723,15 @@ extern "C" { | |||
float p, | |||
size_t min_keep); | |||
|
|||
/// @details DYNATEMP! #TODO KALO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the assumption he made is there'd be documentation / explanation in line with the other samplers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that is the assumption I made
gpttype_adapter.cpp
Outdated
@@ -1943,7 +1958,7 @@ generation_outputs gpttype_generate(const generation_inputs inputs, generation_o | |||
|
|||
id = SampleLogits(logitsPtr, nctx, n_vocab, last_n_size, repeat_penalty, presence_penalty, | |||
top_k, top_a, top_p, min_p, typical_p, tfs_z, temp, rng, | |||
kcpp_params->mirostat, kcpp_params->mirostat_tau, kcpp_params->mirostat_eta, sampler_order, grammar); | |||
kcpp_params->mirostat, kcpp_params->mirostat_tau, kcpp_params->mirostat_eta, sampler_order, grammar, dynatemp, min_temp, max_temp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LostRuins Not related to this PR but it feels like kcpp_params should just be passed in instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, will probably refactor it soon.
void llama_sample_temperature(struct llama_context * ctx, llama_token_data_array * candidates_p, float temp) { | ||
llama_sample_temp(ctx, candidates_p, temp); | ||
} | ||
|
||
void llama_sample_entropy(struct llama_context * ctx, llama_token_data_array * candidates_p, float temp, float min_temp = 0, float max_temp = 2.0f) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be good to hide all of the debugging printfs in this function behind the debugmode flag.
... although that's not accessible from here given that this is llama.cpp
void llama_sample_temperature(struct llama_context * ctx, llama_token_data_array * candidates_p, float temp) { | ||
llama_sample_temp(ctx, candidates_p, temp); | ||
} | ||
|
||
void llama_sample_entropy(struct llama_context * ctx, llama_token_data_array * candidates_p, float temp, float min_temp = 0, float max_temp = 2.0f) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess temp just isn't used at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's not ran with dynamic temp turned on to avoid obvious conflicts
common/sampling.h
Outdated
@@ -25,6 +25,9 @@ typedef struct llama_sampling_params { | |||
int32_t mirostat = 0; // 0 = disabled, 1 = mirostat, 2 = mirostat 2.0 | |||
float mirostat_tau = 5.00f; // target entropy | |||
float mirostat_eta = 0.10f; // learning rate | |||
bool dynatemp = false; // dynamic temperature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the first thing that stands out first is - I don't think a bool for toggling dynatemp is necessary. I think this can be simplified and there are 2 very good ways of doing it.
First way is to make it part of the sampler, so having dynatemp_min
and dynatemp_max
both be 0 would leave dynatemp inactive, otherwise it would be active and applied (overriding temperature). That puts it in line with all other samplers that have an "inactive" value and an "active" value.
The second option which I am strongly in support is a single value dynatemp_range
. This is a float which represents the allowed deviation from the min and max temperature when using dynatemp.
Thus, if we want a value of dynatemp_min=0.3, dynatemp_max=0.5
, then we would simply set temperature=0.4
and dynatemp_range=0.1
If you want a dynatemp_min=0.8, dynatemp_max=2.0
, then you set temperature=1.4
and dynatemp_range=0.6
.
It will work for any value of min and max
To disable, set dynatemp_range=0
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this idea and it's nicely intuitive.
…which represents the allowed deviation from the min and max temperature when using dynatemp. Thus, if we want a value of dynatemp_min=0.3, dynatemp_max=0.5, then we would simply set temperature=0.4 and dynatemp_range=0.1. Functionally dynatemp would operate the same, but it would simplify usage and make it a single easy to adjust value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kalomaze please take a look see if I've missed anything - this is dynatemp refactored into a single value dynatemp_range
. This is a float which represents the allowed deviation from the min and max temperature when using dynatemp. Thus, if we want a value of dynatemp_min=0.3, dynatemp_max=0.5, then we would simply set temperature=0.4 and dynatemp_range=0.1. Functionally dynatemp would operate the same, but it would simplify usage and make it a single easy to adjust value.
Fully implemented in the UI + backend. Allows the user to use a Dynamic Temperature that scales based on the entropy of token probabilities (normalized by the maximum possible entropy for a distribution so it scales well across different K values).
You can set a minimum Temperature and a maximum Temperature like so:
Allows for more unique and varied outputs, especially when paired with Min P and/or a reasonable Top K.