
llama : model-based max number of graph nodes #8622

Merged — 2 commits merged Jul 27, 2024
Conversation

ggerganov (Owner)

Fixes #8615

Proposes determining the max number of graph nodes from the model info (arch, hparams, etc.) rather than the fixed LLAMA_MAX_NODES constant.
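The patch itself isn't shown in the thread, but the idea can be sketched as follows. This is a minimal illustration, not the merged code: the struct and function names are hypothetical, and while the n_layer > 400 threshold mirrors the check discussed below, both budget values are assumptions.

```cpp
#include <cstdint>

// Hypothetical subset of the model hyper-parameters; field names are
// illustrative, not the exact llama.cpp definitions.
struct hparams_sketch {
    uint32_t n_layer;
};

// Sketch of a model-based node budget: keep a fixed default for typical
// models, but grow the graph for very deep ones (e.g. a >400-layer
// 405B configuration). Concrete values are illustrative only.
static int32_t max_nodes_sketch(const hparams_sketch & hp) {
    if (hp.n_layer > 400) {
        return 32768; // larger budget for very deep models (illustrative)
    }
    return 8192; // default budget (illustrative)
}
```

Making the budget a function of the model keeps small models on the small default, which matters because (as noted in the review below) some per-node bookkeeping overhead scales with the node count.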

@ggerganov requested a review from slaren on July 22, 2024
@mofosyne added the Review Complexity : Low label (trivial changes to code that most beginner devs, or those who want a break, can tackle; e.g. a UI fix) on Jul 22, 2024
slaren (Collaborator) left a comment

This should solve the immediate problem. It would be good to detect the necessary number of nodes automatically and always use the lowest possible value, because there is an overhead from clearing some buffers that is proportional to the number of nodes. I have been doing some work to reduce this overhead, but it is not ready yet.

ggerganov (Owner, Author)

OK, I will merge this after the 405B model is released and the need for this change is confirmed. The proposed n_layer > 400 check would likely have to be updated, because that number seems too big to me.

ceddybi commented Jul 26, 2024

@ggerganov what's left at this point?

ggerganov (Owner, Author)

I haven't noticed any reports of the 405B model failing, so I removed the increased max-nodes limit for now and plan to merge just the new llama_model_max_nodes function.
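With the 405B path dropped, the merged shape reduces to a trivial function of the model. Only the existence of llama_model_max_nodes is confirmed by this thread; the signature, the stand-in model type, and the 8192 value below are assumptions for illustration.

```cpp
#include <cstddef>

// Illustrative stand-in for the model type; the real function would
// take a reference to the actual llama_model.
struct llama_model_sketch {};

// Sketch of the simplified function: with the deep-model path disabled,
// it can simply return a fixed default (8192 is an assumed value).
static constexpr size_t llama_model_max_nodes_sketch(const llama_model_sketch & /*model*/) {
    return 8192;
}
```

Keeping the function (rather than a macro) preserves the hook, so a model-dependent budget can be reintroduced later without touching call sites.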

@ggerganov ggerganov merged commit 92090ec into master Jul 27, 2024
59 checks passed
@ggerganov ggerganov deleted the gg/custom-max-nodes branch July 27, 2024 11:59
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jul 27, 2024
…anov#8622)

* llama : model-based max number of graph nodes

ggml-ci

* llama : disable 405B max_nodes path due to lack of complaints

ggml-ci
Development

Successfully merging this pull request may close these issues:

- Bug: LLAMA_MAX_NODES must be increased to run 405B Mega merge

4 participants