Smoldocling support #14597
Conversation
Thank you for tackling this one @ryan-mangeno! A couple of whitespace nits, and one request since you're coming out of IBM with me: our OSS policy requires that we sign off our commits when committing to open source. To do this for your branch, you should be able to simply run:
git fetch origin && git rebase -i --signoff --rebase-merges origin/master
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
Signed-off-by: ryan-mangeno <ryanmangeno@gmail.com>
* v1
* push more fixes
* another fix
* fix
* more fixes
* minor fix
* more cleaning on python code
* python fixes
* changed precision for multipliers float 32->64
* fixes
* another fix
* fix
* pre-norm -> norm
* fix
* Revert "fix" (reverts commit 243e4d1)
* fix
* small fix ffn_norm
* try
* mix instead of max
* fix vocab size
* conflict solve
* fixed multipliers
* falcon-h1 specific vocab resolved
* read arch from gguf.MODEL_ARCH
* mamba_d_ssm added to d_inner find_hparam
* remove unused functions from gguf_writer.py
* override modify_tensors instead of get_tensors
* fix conversion and d_inner
* added some cb functions for debugging purposes
* inp_out_ids moved outside of layers loop
* mup_vec create as float64
* fix rope_theta
* injected mup
* clean ups
* rm extra space
* rm unused MAMBA_CHUNK_SIZE
* rm unused key
* add bos False
* changed ROPE_TYPE
* cleaning debugging stuff
* cleaning debug quant
* fix comment
* some cleanups
* some cleanups
* Update src/llama-model-loader.cpp
* more cleanups
* moe cleanups
* d_ssm -> d_inner
* cleaning unused hparams
* cleanup
* more cleanups
* more cleanups on python conversion
* minor cleanups
* Apply suggestions from code review (Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>)
* remove todo
* added falcon-h1
* tensor not required
* clean
* remove unneeded attributes
* more cleanups and fixed conversion
* remove final_norm
* flake8 fixes
* Update src/llama-model.cpp (Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>)
* flake8 fixes
* Update src/llama-hparams.cpp (Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>)
* Update src/llama-model.cpp (Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>)
* Update src/llama-model.cpp (Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>)
* Update src/llama-arch.cpp (Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>)
* Update convert_hf_to_gguf.py (Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>)
* added hashes
* Update src/llama-arch.cpp (Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>)
* Update src/llama-vocab.cpp (Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>)
* update the update file
* Revert "update the update file" (reverts commit 082ab4a)
* fix: address suggestions
* fix: update convert_hf_to_gguf.py
* Update gguf-py/gguf/constants.py (Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>)
* Update src/llama-model-loader.cpp (Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>)
* d_inner fixed
* Update src/llama-model.cpp (Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>)
* reshaping ssm_norm for 34B
* removing generate_mup
* remove duplicates metadata keys
* rm comment
* final comment
* fix unused args
* fix constants
* fix bad merge
* Update src/llama-model.cpp (Co-authored-by: compilade <git@compilade.net>)
* falcon-h1: remove unused ssm_in_b and bad merge
* Update src/llama-model.cpp (Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>)
* falcon-h1: fix last comment
* Update convert_hf_to_gguf.py (Co-authored-by: compilade <git@compilade.net>)
* falcon-h1: revert add_add_bos(False)
* falcon-h1: fix tied weights
* falcon-h1: remove whitespace
* falcon-h1: fix wrong size param
* falcon-h1: fix whitespace issues

---------

Co-authored-by: younesbelkada <younes.belkada@tii.ae>
Co-authored-by: Younes B <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: compilade <git@compilade.net>
Signed-off-by: ryan-mangeno <ryanmangeno@gmail.com>
Signed-off-by: ryan-mangeno <ryanmangeno@gmail.com>
…-org#14595) Signed-off-by: ryan-mangeno <ryanmangeno@gmail.com>
Unless I missed something, SmolDocling-256M-preview is fine-tuned from SmolVLM, so it should work out of the box. Are you 100% sure that we need to modify the conversion script?
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Signed-off-by: ryan-mangeno <ryanmangeno@gmail.com>
I changed the tensor mappings for the safetensors file on Hugging Face; all the tensors are the same for SmolVLM and SmolDocling, but I noticed that there weren't tensor mappings for the text model for SmolVLM ... https://huggingface.co/ds4sd/SmolDocling-256M-preview?show_file_info=model.safetensors. I noticed when I didn't have
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
…pp into smoldocling-support
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Please re-check that conversion and inference are ok.
will do right now 👍
Actually, according to this, you should not need to add most of the tensor mappings either: llama.cpp/convert_hf_to_gguf.py, lines 1963 to 1964 in 4a5686d
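The reason a fine-tune rarely needs new entries can be sketched roughly as follows. This is a simplified illustration of templated tensor-name lookup, not the actual convert_hf_to_gguf.py code, and the table entries are a hypothetical subset: the lookup replaces the numeric layer index with a `{bid}` placeholder and consults one shared table, so any checkpoint that reuses its base model's tensor names resolves for free.

```python
import re

# Hypothetical subset of a shared HF -> GGUF tensor-name table;
# "{bid}" stands in for the block (layer) index.
TENSOR_MAP = {
    "model.embed_tokens.weight": "token_embd.weight",
    "model.layers.{bid}.self_attn.q_proj.weight": "blk.{bid}.attn_q.weight",
}

def map_tensor_name(hf_name: str) -> str:
    """Resolve an HF tensor name through the shared templated table."""
    m = re.search(r"\.(\d+)\.", hf_name)
    if m is None:
        # Non-layer tensors are looked up directly.
        return TENSOR_MAP[hf_name]
    bid = m.group(1)
    # Swap the concrete index for the placeholder, look up, substitute back.
    templated = hf_name.replace(f".{bid}.", ".{bid}.", 1)
    return TENSOR_MAP[templated].format(bid=bid)
```

Under this scheme, `model.layers.7.self_attn.q_proj.weight` maps to `blk.7.attn_q.weight` with no per-layer (or per-fine-tune) table entries.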
ahh ok that makes sense, the tensor names were the same for the text model so I'm pretty sure those can be removed
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
I reran conversion and inference and the output looks the same
Thank you, hope you enjoyed the ride, will merge when CI goes green. :)
Thank you too! I had a lot of fun making my first contribution to llama.cpp and hope to make more :)
* origin/master:
  Smoldocling support (ggml-org#14597)
  Docs: script to auto-generate ggml operations docs (ggml-org#14598)
Added an end-of-token check for SmolDocling (<end_of_utterance>), added tensor names to tensor_mappings.py after dumping tensors with llama-eval-callback, and added a regex for the pre-tokenizer defined in the Hugging Face config file. The SmolDocling text model architecture is based on llama, so there was no need to implement its own architecture. I compared both the Hugging Face tensors and the GGUF tensors and they checked out to be the same.
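The two pieces described above (pre-tokenizer split, then stopping on the end-of-utterance marker) can be illustrated with a minimal sketch. The regex here is a simplified GPT-2-style stand-in, not the actual pattern from the model's Hugging Face config:

```python
import re

EOT_TOKEN = "<end_of_utterance>"  # SmolDocling's end-of-generation marker

# Simplified stand-in split pattern; the real pre-tokenizer regex is
# defined in the model's Hugging Face tokenizer config.
PRETOKENIZE_RE = re.compile(r" ?[A-Za-z]+| ?\d+|\s+|[^\sA-Za-z\d]+")

def pre_tokenize(text: str) -> list[str]:
    """Split raw text into pre-tokens before vocabulary lookup."""
    return PRETOKENIZE_RE.findall(text)

def generation_finished(pieces: list[str]) -> bool:
    """The decode loop stops once the EOT marker is the last piece emitted."""
    return bool(pieces) and pieces[-1] == EOT_TOKEN
```

In llama.cpp itself the stop check operates on token ids rather than strings; this string-level version only mirrors the control flow.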