convert-*.py: GGUF Naming Convention Refactor and Metadata Override Refactor #7499

Merged
merged 66 commits into from
Jul 18, 2024
Changes from 59 commits
d3a936f
convert-*.py: licence -> license
mofosyne May 28, 2024
dbb1b47
convert-*.py: add --get-outfile command and refactor
mofosyne May 23, 2024
a42c2b7
convert-*.py: add basename and finetune metadata
mofosyne May 30, 2024
916872f
convert-*.py: model card metadata
mofosyne May 31, 2024
4d5f18a
convert-*.py: metadata class moved to utility
mofosyne Jun 1, 2024
5c263cb
convert-*.py: encoding_scheme --> output_type
mofosyne Jun 1, 2024
b36e391
convert-*.py: parse model card in metadata util. Add license_link and…
mofosyne Jun 2, 2024
8f73408
convert-*.py: add base_version and add tags
mofosyne Jun 2, 2024
0f1d50f
convert-*.py: add parameter size class
mofosyne Jun 2, 2024
684c604
convert-*.py: add datasets and language to KV store
mofosyne Jun 2, 2024
b1927ee
convert-*.py: move per model weight estimation away from util back to…
mofosyne Jun 2, 2024
f7c2079
convert-*.py: enable --model-name direct metadata override
mofosyne Jun 2, 2024
5a86dfa
convert-*.py: add general.organization to kv store
mofosyne Jun 2, 2024
dd15712
convert-*.py: add quantized_by and enhance heuristics
mofosyne Jun 3, 2024
b0553f4
convert-*.py: adjust help message
mofosyne Jun 3, 2024
4d5cd06
convert-*.py: use heuristics to parse _name_or_path
mofosyne Jun 3, 2024
32e80e0
convert-*.py: base_model is actually in spec for model cards
mofosyne Jun 3, 2024
54918ad
convert-*.py: refactor parameter weight class
mofosyne Jun 3, 2024
39472a0
convert-*.py: need to include self in per_model_weight_count_estimati…
mofosyne Jun 3, 2024
3625a42
convert-*.py: add heuristic to directory name fallback
mofosyne Jun 3, 2024
91e65d9
convert-*.py: add unittest to metadata class
mofosyne Jun 4, 2024
d060fcd
convert-*.py: adjusted authorship KV store
mofosyne Jun 6, 2024
eaa47f5
convert-*.py: separated unit test, hf_repo to repo_url
mofosyne Jun 8, 2024
e973443
convert-*.py: Remove self.model_name that was left in since last rebase
mofosyne Jun 9, 2024
5011eef
convert_hf_to_gguf.py: optional, dataclass removed from type as it wa…
mofosyne Jul 7, 2024
2f23927
convert_hf_to_gguf.py: rebase error correction
mofosyne Jul 7, 2024
4dc8ddd
convert_hf_to_gguf.py: Remove code that is already in fill_templated_…
mofosyne Jul 7, 2024
007708e
gguf_writer.py: generate tensor uuid if missing
mofosyne Jul 8, 2024
7ecb8f0
test: remove test_gguf.py and remove test_generate_any_missing_uuid()
mofosyne Jul 9, 2024
fdc5a3f
convert-*.py: autogenerate general.uuid if missing
mofosyne Jul 9, 2024
2a976e1
convert-*.py: write_tensors() --> prepare_tensors_for_writing()
mofosyne Jul 10, 2024
59a01df
convert-*.py: refactor per model weight count estimation
mofosyne Jul 10, 2024
dd14b8f
convert-*.py: pyright type fixes
mofosyne Jul 10, 2024
74383ba
Apply suggestions from code review
mofosyne Jul 11, 2024
4c91d07
convert-*.py: cast not required if Metadata.load_metadata_override re…
mofosyne Jul 11, 2024
6eb08ac
convert-*.py: Removing the redundant metadata is not None from all co…
mofosyne Jul 11, 2024
f8b5931
convert-*.py: parameter_class_attribute --> size_label
mofosyne Jul 11, 2024
64707b6
convert-*.py: remove redundant gguf_writer.add_name() calls
mofosyne Jul 11, 2024
04c4fff
convert-*.py: prepare_tensors_for_writing() --> prepare_tensors()
mofosyne Jul 11, 2024
f2b425c
convert-*.py: import cast from typing and other refactor
mofosyne Jul 11, 2024
ad217d7
convert-*.py: remove autogenerated uuid
mofosyne Jul 13, 2024
60278e4
Update convert_hf_to_gguf.py
mofosyne Jul 13, 2024
aa4e589
Update convert_hf_to_gguf.py
mofosyne Jul 13, 2024
2c06030
Update constants.py : spacing correction
mofosyne Jul 13, 2024
8156835
constants.py : Revert removal of backward compatibility KEY_GENERAL_S…
mofosyne Jul 13, 2024
ccff6c7
convert-*.py: remove reference to uuid generation
mofosyne Jul 13, 2024
455c0e5
Apply suggestions from code review
mofosyne Jul 14, 2024
5ab1a84
convert-*.py: dict_item --> Iterable
mofosyne Jul 14, 2024
5cdb03b
convert-*.py: update nix package to add python frontmatter
mofosyne Jul 14, 2024
9954b64
convert-*.py: add logger and refactor load_model_card()
mofosyne Jul 14, 2024
abc351c
convert-*.py: quantized_by in model card is not relevant for converte…
mofosyne Jul 14, 2024
144a7ec
convert-*.py: pathlib.Path exist() --> is_file() or is_dir()
mofosyne Jul 14, 2024
8629b7b
covert-*.py: per_model_weight_count_estimation() tensor arg type is I…
mofosyne Jul 14, 2024
4e37611
covert-*.py: flake8 newline missing
mofosyne Jul 14, 2024
f98f109
convert-*.py: more rigorous regexp for get_model_id_components()
mofosyne Jul 14, 2024
3b1766a
convert-*.py: flake8 remove blank line
mofosyne Jul 14, 2024
78a42fb
gguf-py : use pyyaml instead of python-frontmatter
compilade Jul 14, 2024
417d7a7
convert_hf : use GGUFWriter to count model parameters
compilade Jul 15, 2024
9a925b5
metadata.py: account for decimal point in size label within model id …
mofosyne Jul 15, 2024
c7b3616
Update convert_hf_to_gguf.py
mofosyne Jul 15, 2024
5da16bb
Merge branch 'master' into refactor-convert-py
mofosyne Jul 16, 2024
eb0bf6b
convert-*.py: Add naming_convention_vocab_only()
mofosyne Jul 16, 2024
7e9271c
convert_lora_to_gguf.py: remove model_name parameter. Doesn't exist i…
mofosyne Jul 16, 2024
2c18a9a
gguf-py : extract metadata from model name more resiliently
compilade Jul 18, 2024
4c9932c
gguf-py : fix flake8 lint
compilade Jul 18, 2024
73899f7
gguf-py : handle more name metadata extraction edge cases
compilade Jul 18, 2024
175 changes: 100 additions & 75 deletions convert_hf_to_gguf.py

Large diffs are not rendered by default.

235 changes: 128 additions & 107 deletions examples/convert_legacy_llama.py

Large diffs are not rendered by default.

8 changes: 8 additions & 0 deletions gguf-py/README.md
@@ -78,5 +78,13 @@ python -m build
python -m twine upload dist/*
```

## Run Unit Tests

From the root of this repository, run the following command to run all the unit tests:

```bash
python -m unittest discover ./gguf-py -v
```

## TODO
- [ ] Include conversion scripts as command line entry points in this package.
2 changes: 2 additions & 0 deletions gguf-py/gguf/__init__.py
@@ -5,3 +5,5 @@
from .quants import *
from .tensor_mapping import *
from .vocab import *
from .utility import *
from .metadata import *
68 changes: 54 additions & 14 deletions gguf-py/gguf/constants.py
@@ -19,19 +19,60 @@

class Keys:
class General:
TYPE = "general.type"
ARCHITECTURE = "general.architecture"
QUANTIZATION_VERSION = "general.quantization_version"
ALIGNMENT = "general.alignment"
NAME = "general.name"
AUTHOR = "general.author"
VERSION = "general.version"
URL = "general.url"
DESCRIPTION = "general.description"
LICENSE = "general.license"
SOURCE_URL = "general.source.url"
SOURCE_HF_REPO = "general.source.huggingface.repository"
FILE_TYPE = "general.file_type"
TYPE = "general.type"
ARCHITECTURE = "general.architecture"
QUANTIZATION_VERSION = "general.quantization_version"
ALIGNMENT = "general.alignment"
FILE_TYPE = "general.file_type"

# Authorship Metadata
NAME = "general.name"
AUTHOR = "general.author"
VERSION = "general.version"
ORGANIZATION = "general.organization"

FINETUNE = "general.finetune"
BASENAME = "general.basename"

DESCRIPTION = "general.description"
QUANTIZED_BY = "general.quantized_by"

SIZE_LABEL = "general.size_label"

# Licensing details
LICENSE = "general.license"
LICENSE_NAME = "general.license.name"
LICENSE_LINK = "general.license.link"

# Typically represents the converted GGUF repo (Unless native)
URL = "general.url" # Model Website/Paper
DOI = "general.doi"
UUID = "general.uuid"
REPO_URL = "general.repo_url" # Model Source Repository (git/svn/etc...)

# Model Source during conversion
SOURCE_URL = "general.source.url" # Model Website/Paper
SOURCE_DOI = "general.source.doi"
SOURCE_UUID = "general.source.uuid"
SOURCE_REPO_URL = "general.source.repo_url" # Model Source Repository (git/svn/etc...)

# Base Model Source. There can be more than one source if it's a merged
# model like with 'Mistral-7B-Merge-14-v0.1'. This will assist in
# tracing the lineage of models as they are finetuned or merged over time.
BASE_MODEL_COUNT = "general.base_model.count"
BASE_MODEL_NAME = "general.base_model.{id}.name"
BASE_MODEL_AUTHOR = "general.base_model.{id}.author"
BASE_MODEL_VERSION = "general.base_model.{id}.version"
BASE_MODEL_ORGANIZATION = "general.base_model.{id}.organization"
BASE_MODEL_URL = "general.base_model.{id}.url" # Model Website/Paper
BASE_MODEL_DOI = "general.base_model.{id}.doi"
BASE_MODEL_UUID = "general.base_model.{id}.uuid"
BASE_MODEL_REPO_URL = "general.base_model.{id}.repo_url" # Model Source Repository (git/svn/etc...)

# Array based KV stores
TAGS = "general.tags"
LANGUAGES = "general.languages"
DATASETS = "general.datasets"
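
The `{id}` placeholder in the `BASE_MODEL_*` keys is filled in with `str.format`, which is what the `add_base_model_*` setters in `gguf_writer.py` do. Below is a minimal sketch of how a two-parent merge might expand into flat keys. The templates are copied from this diff rather than imported from `gguf`; the `base_model_keys` helper, the zero-based indexing, and the parent models are assumptions made for illustration only.

```python
from __future__ import annotations

# Templates copied from Keys.General in this diff (not imported from gguf).
BASE_MODEL_COUNT = "general.base_model.count"
BASE_MODEL_NAME = "general.base_model.{id}.name"
BASE_MODEL_REPO_URL = "general.base_model.{id}.repo_url"


def base_model_keys(base_models: list[dict[str, str]]) -> dict[str, object]:
    """Hypothetical helper: flatten a list of base models into key/value pairs."""
    kv: dict[str, object] = {BASE_MODEL_COUNT: len(base_models)}
    # Assumption: base models are numbered from zero.
    for idx, model in enumerate(base_models):
        if "name" in model:
            kv[BASE_MODEL_NAME.format(id=idx)] = model["name"]
        if "repo_url" in model:
            kv[BASE_MODEL_REPO_URL.format(id=idx)] = model["repo_url"]
    return kv


# Made-up parents of a merged model, for illustration only.
parents = [
    {"name": "Example Base A", "repo_url": "https://example.com/org/base-a"},
    {"name": "Example Base B"},
]
for key, value in base_model_keys(parents).items():
    print(f"{key} = {value}")
```

Fields that a parent does not provide are simply left out, so a reader of the file would need `general.base_model.count` to know how many indices to probe.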

class LLM:
VOCAB_SIZE = "{arch}.vocab_size"
@@ -1233,7 +1274,6 @@ def get_type(val: Any) -> GGUFValueType:
KEY_GENERAL_DESCRIPTION = Keys.General.DESCRIPTION
KEY_GENERAL_LICENSE = Keys.General.LICENSE
KEY_GENERAL_SOURCE_URL = Keys.General.SOURCE_URL
KEY_GENERAL_SOURCE_HF_REPO = Keys.General.SOURCE_HF_REPO
KEY_GENERAL_FILE_TYPE = Keys.General.FILE_TYPE

# LLM
139 changes: 121 additions & 18 deletions gguf-py/gguf/gguf_writer.py
@@ -17,6 +17,7 @@
from .constants import (
GGUF_DEFAULT_ALIGNMENT,
GGUF_MAGIC,
GGML_QUANT_SIZES,
GGUF_VERSION,
GGMLQuantizationType,
GGUFEndian,
@@ -106,6 +107,36 @@ def __init__(

self.add_architecture()

    def get_total_parameter_count(self) -> tuple[int, int, int, int]:
        total_params = 0
        shared_params = 0
        expert_params = 0

        expert_sum = 0
        n_expert_tensors = 0

        for tensors in self.tensors:
            for name, info in tensors.items():

                block_size, type_size = GGML_QUANT_SIZES[info.dtype]

                size = (info.nbytes // type_size) * block_size

                if "_exps." in name:
                    expert_params += (size // info.shape[-3])
                    expert_sum += info.shape[-3]
                    n_expert_tensors += 1
                else:
                    shared_params += size

                total_params += size

        # Hopefully this should work even for variable-expert-count models
        expert_count = (expert_sum // n_expert_tensors) if n_expert_tensors > 0 else 0

        # NOTE: keep the output in the same order as accepted by 'size_label' in gguf-py/gguf/utility.py
        return total_params, shared_params, expert_params, expert_count
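
The element count of each tensor is recovered from its byte size: `nbytes // type_size` gives the number of quantization blocks, and multiplying by `block_size` gives the number of weights. The sketch below replays that arithmetic on two made-up tensors. `TensorInfo`, `QUANT_SIZES`, and `count_parameters` are hypothetical stand-ins written for this example (the real method reads `self.tensors` and `GGML_QUANT_SIZES`), and the tensor names and shapes are invented.

```python
from __future__ import annotations

from dataclasses import dataclass

# (block_size, type_size) pairs for a few types. A stand-in for GGML_QUANT_SIZES,
# keyed by name here instead of GGMLQuantizationType.
QUANT_SIZES = {"F32": (1, 4), "F16": (1, 2), "Q8_0": (32, 34)}


@dataclass
class TensorInfo:
    """Hypothetical stand-in for the writer's per-tensor record."""
    shape: tuple[int, ...]
    dtype: str
    nbytes: int


def count_parameters(shards: list[dict[str, TensorInfo]]) -> tuple[int, int, int, int]:
    total = shared = expert = 0
    expert_sum = n_expert_tensors = 0
    for tensors in shards:
        for name, info in tensors.items():
            block_size, type_size = QUANT_SIZES[info.dtype]
            # Number of blocks times weights per block.
            size = (info.nbytes // type_size) * block_size
            if "_exps." in name:
                # Stacked expert tensor: count the weights of one expert only.
                expert += size // info.shape[-3]
                expert_sum += info.shape[-3]
                n_expert_tensors += 1
            else:
                shared += size
            total += size
    expert_count = expert_sum // n_expert_tensors if n_expert_tensors > 0 else 0
    return total, shared, expert, expert_count


shards = [{
    # 4 x 8 F32 tensor: 32 elements, 4 bytes each.
    "blk.0.attn_norm.weight": TensorInfo(shape=(4, 8), dtype="F32", nbytes=128),
    # 4 experts of 8 x 32 in Q8_0: 1024 elements = 32 blocks of 34 bytes.
    "blk.0.ffn_up_exps.weight": TensorInfo(shape=(4, 8, 32), dtype="Q8_0", nbytes=1088),
}]
print(count_parameters(shards))  # (1056, 32, 256, 4)
```

Note that `expert_params` holds the per-expert weight count, while `total_params` still includes every expert.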

def format_shard_names(self, path: Path) -> list[Path]:
if len(self.tensors) == 1:
return [path]
@@ -115,6 +146,7 @@ def open_output_file(self, path: Path | None = None) -> None:
if self.state is WriterState.EMPTY and self.fout is not None and (path is None or path == self.path):
# allow calling this multiple times as long as the path is the same
return

if self.state is not WriterState.NO_FILE:
raise ValueError(f'Expected output file to be not yet opened, got {self.state}')

@@ -430,43 +462,114 @@ def add_type(self, type_name: str) -> None:
def add_architecture(self) -> None:
self.add_string(Keys.General.ARCHITECTURE, self.arch)

def add_quantization_version(self, quantization_version: int) -> None:
self.add_uint32(Keys.General.QUANTIZATION_VERSION, quantization_version)

def add_custom_alignment(self, alignment: int) -> None:
self.data_alignment = alignment
self.add_uint32(Keys.General.ALIGNMENT, alignment)

def add_file_type(self, ftype: int) -> None:
self.add_uint32(Keys.General.FILE_TYPE, ftype)

def add_name(self, name: str) -> None:
self.add_string(Keys.General.NAME, name)

def add_author(self, author: str) -> None:
self.add_string(Keys.General.AUTHOR, author)

def add_version(self, version: str) -> None:
self.add_string(Keys.General.VERSION, version)

def add_tensor_data_layout(self, layout: str) -> None:
self.add_string(Keys.LLM.TENSOR_DATA_LAYOUT.format(arch=self.arch), layout)
def add_organization(self, organization: str) -> None:
self.add_string(Keys.General.ORGANIZATION, organization)

def add_url(self, url: str) -> None:
self.add_string(Keys.General.URL, url)
def add_finetune(self, finetune: str) -> None:
self.add_string(Keys.General.FINETUNE, finetune)

def add_basename(self, basename: str) -> None:
self.add_string(Keys.General.BASENAME, basename)

def add_description(self, description: str) -> None:
self.add_string(Keys.General.DESCRIPTION, description)

def add_licence(self, licence: str) -> None:
self.add_string(Keys.General.LICENSE, licence)
def add_quantized_by(self, quantized: str) -> None:
self.add_string(Keys.General.QUANTIZED_BY, quantized)

def add_size_label(self, size_label: str) -> None:
self.add_string(Keys.General.SIZE_LABEL, size_label)

def add_license(self, license: str) -> None:
self.add_string(Keys.General.LICENSE, license)

def add_license_name(self, license: str) -> None:
self.add_string(Keys.General.LICENSE_NAME, license)

def add_license_link(self, license: str) -> None:
self.add_string(Keys.General.LICENSE_LINK, license)

def add_url(self, url: str) -> None:
self.add_string(Keys.General.URL, url)

def add_doi(self, doi: str) -> None:
self.add_string(Keys.General.DOI, doi)

def add_uuid(self, uuid: str) -> None:
self.add_string(Keys.General.UUID, uuid)

def add_repo_url(self, repo_url: str) -> None:
self.add_string(Keys.General.REPO_URL, repo_url)

def add_source_url(self, url: str) -> None:
self.add_string(Keys.General.SOURCE_URL, url)

def add_source_hf_repo(self, repo: str) -> None:
self.add_string(Keys.General.SOURCE_HF_REPO, repo)
def add_source_doi(self, doi: str) -> None:
self.add_string(Keys.General.SOURCE_DOI, doi)

def add_file_type(self, ftype: int) -> None:
self.add_uint32(Keys.General.FILE_TYPE, ftype)
def add_source_uuid(self, uuid: str) -> None:
self.add_string(Keys.General.SOURCE_UUID, uuid)

def add_name(self, name: str) -> None:
self.add_string(Keys.General.NAME, name)
def add_source_repo_url(self, repo_url: str) -> None:
self.add_string(Keys.General.SOURCE_REPO_URL, repo_url)

def add_quantization_version(self, quantization_version: int) -> None:
self.add_uint32(
Keys.General.QUANTIZATION_VERSION, quantization_version)
def add_base_model_count(self, source_count: int) -> None:
self.add_uint32(Keys.General.BASE_MODEL_COUNT, source_count)

def add_custom_alignment(self, alignment: int) -> None:
self.data_alignment = alignment
self.add_uint32(Keys.General.ALIGNMENT, alignment)
def add_base_model_name(self, source_id: int, name: str) -> None:
self.add_string(Keys.General.BASE_MODEL_NAME.format(id=source_id), name)

def add_base_model_author(self, source_id: int, author: str) -> None:
self.add_string(Keys.General.BASE_MODEL_AUTHOR.format(id=source_id), author)

def add_base_model_version(self, source_id: int, version: str) -> None:
self.add_string(Keys.General.BASE_MODEL_VERSION.format(id=source_id), version)

def add_base_model_organization(self, source_id: int, organization: str) -> None:
self.add_string(Keys.General.BASE_MODEL_ORGANIZATION.format(id=source_id), organization)

def add_base_model_url(self, source_id: int, url: str) -> None:
self.add_string(Keys.General.BASE_MODEL_URL.format(id=source_id), url)

def add_base_model_doi(self, source_id: int, doi: str) -> None:
self.add_string(Keys.General.BASE_MODEL_DOI.format(id=source_id), doi)

def add_base_model_uuid(self, source_id: int, uuid: str) -> None:
self.add_string(Keys.General.BASE_MODEL_UUID.format(id=source_id), uuid)

def add_base_model_repo_url(self, source_id: int, repo_url: str) -> None:
self.add_string(Keys.General.BASE_MODEL_REPO_URL.format(id=source_id), repo_url)

def add_tags(self, tags: Sequence[str]) -> None:
self.add_array(Keys.General.TAGS, tags)

def add_languages(self, languages: Sequence[str]) -> None:
self.add_array(Keys.General.LANGUAGES, languages)

def add_datasets(self, datasets: Sequence[str]) -> None:
self.add_array(Keys.General.DATASETS, datasets)

def add_tensor_data_layout(self, layout: str) -> None:
self.add_string(Keys.LLM.TENSOR_DATA_LAYOUT.format(arch=self.arch), layout)

def add_vocab_size(self, size: int) -> None:
self.add_uint32(Keys.LLM.VOCAB_SIZE.format(arch=self.arch), size)
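
Each setter in this hunk is a thin wrapper that pairs one key with one typed write (`add_string`, `add_uint32`, or `add_array`). The sketch below imitates that pattern with a hypothetical `KVRecorder` class that only stores values in a dict. It is not the real `GGUFWriter`, it mirrors just a few of the setters, and the metadata values are made up. It also reflects the rename in this diff: `add_licence` is gone and callers use `add_license`.

```python
from __future__ import annotations

from typing import Any, Sequence


class KVRecorder:
    """Hypothetical stand-in for GGUFWriter that only records key/value pairs."""

    def __init__(self) -> None:
        self.kv: dict[str, Any] = {}

    # Typed writes, reduced to dict assignments for this sketch.
    def add_string(self, key: str, val: str) -> None:
        self.kv[key] = val

    def add_array(self, key: str, val: Sequence[Any]) -> None:
        self.kv[key] = list(val)

    # Setters mirroring a few of the methods added in this diff.
    def add_license(self, license: str) -> None:
        self.add_string("general.license", license)

    def add_size_label(self, size_label: str) -> None:
        self.add_string("general.size_label", size_label)

    def add_tags(self, tags: Sequence[str]) -> None:
        self.add_array("general.tags", tags)


writer = KVRecorder()
writer.add_license("apache-2.0")                 # made-up value
writer.add_size_label("8x7B")                    # made-up value
writer.add_tags(["text-generation", "example"])  # made-up values
for key, value in writer.kv.items():
    print(f"{key} = {value!r}")
```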