Tags: strangedove/mergekit
Fix plumbing of apply_chat_template/fewshot_as_multiturn in mergekit-evolve (arcee-ai#535)
LoRA extraction fixes (arcee-ai#522)

Addresses arcee-ai#521. Also adds:

* `--lora-merge-dtype` to specify the dtype to use when applying LoRA adapters to models
* `--gpu-rich` alias for convenience
* Organized display of options in `--help`
Compute-graph based `mergekit-extract-lora` (arcee-ai#505)

Now with better embedding handling, multi-GPU execution, and lazy loading/saving of tensors. When extracting a LoRA from an 8B model, execution time drops from ~6 minutes to 40 seconds with `--cuda --multi-gpu` on an 8-GPU machine.

Additionally, the `--sv-epsilon` flag sets a tolerance for singular values, opportunistically reducing the rank when the fine-tuned difference is inherently lower rank.

Also reimplements a couple of merge methods using the `@easy_define` decorator and adds some missing tests.
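To illustrate the idea behind a singular-value tolerance such as `--sv-epsilon`, here is a minimal NumPy sketch of SVD-based LoRA extraction. It is not mergekit's implementation: the function name `extract_lora`, its parameters, and the choice of a tolerance relative to the largest singular value are all assumptions made for the example.

```python
import numpy as np


def extract_lora(w_base, w_tuned, max_rank, sv_epsilon=0.0):
    """Factor the weight delta into low-rank matrices (hypothetical helper).

    Returns (lora_a, lora_b) such that lora_b @ lora_a approximates
    w_tuned - w_base.
    """
    delta = w_tuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)

    rank = min(max_rank, len(s))
    if sv_epsilon > 0 and s[0] > 0:
        # Assumption: the tolerance is relative to the largest singular value.
        # Singular values at or below it are treated as noise and dropped.
        significant = int(np.sum(s > sv_epsilon * s[0]))
        rank = max(1, min(rank, significant))

    # Split each singular value evenly between the two factors.
    sqrt_s = np.sqrt(s[:rank])
    lora_b = u[:, :rank] * sqrt_s           # shape: (out_features, rank)
    lora_a = sqrt_s[:, None] * vt[:rank]    # shape: (rank, in_features)
    return lora_a, lora_b


# Synthetic example: the fine-tuned delta has true rank 4,
# but the caller requests up to rank 16.
rng = np.random.default_rng(0)
w_base = rng.standard_normal((64, 32))
w_tuned = w_base + rng.standard_normal((64, 4)) @ rng.standard_normal((4, 32))

a_full, b_full = extract_lora(w_base, w_tuned, max_rank=16)
a_trim, b_trim = extract_lora(w_base, w_tuned, max_rank=16, sv_epsilon=1e-6)

print("rank without tolerance:", a_full.shape[0])
print("rank with tolerance:", a_trim.shape[0])
```

With the tolerance set, the sketch keeps only the singular directions that carry signal, so the adapter is smaller while still reconstructing the same delta.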