gguf : track writer state, free unneeded tensors, cleanup #3871

cebtenzzre · 2023-10-31T16:09:57Z

The idea of checking for incorrect use of GGUFWriter was described here: #3838 (comment)

The methods of GGUFWriter are designed to be called in a specific order. This PR enforces that order and raises an exception when it is violated. Previously, incorrect use would silently fail in most cases, leaving the user wondering why they got a file with the correct size that fails to load.
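To illustrate the idea of tracking writer state, here is a minimal sketch. It is not the code from this PR: the class `OrderedWriter`, the enum `WriterState`, and the method names are hypothetical stand-ins, and the "file" is just a list of strings. Each write step checks that the previous step has happened and raises `ValueError` otherwise.

```python
from enum import Enum, auto


class WriterState(Enum):
    # Hypothetical states for illustration; not taken from gguf.py.
    EMPTY = auto()
    HEADER = auto()
    KV_DATA = auto()
    TENSOR_INFO = auto()


class OrderedWriter:
    """Toy writer whose write_* methods must be called in a fixed order."""

    def __init__(self):
        self.state = WriterState.EMPTY
        self.tensors = []
        self.log = []  # stands in for the output file

    def _advance(self, expected, new):
        # Refuse to continue unless the previous step has been completed.
        if self.state is not expected:
            raise ValueError(
                f"expected writer state {expected.name}, got {self.state.name}"
            )
        self.state = new

    def add_tensor(self, name):
        # Tensors may only be registered before anything is written.
        if self.state is not WriterState.EMPTY:
            raise ValueError(
                f"cannot add tensor in writer state {self.state.name}"
            )
        self.tensors.append(name)

    def write_header(self):
        self._advance(WriterState.EMPTY, WriterState.HEADER)
        self.log.append("header")

    def write_kv_data(self):
        self._advance(WriterState.HEADER, WriterState.KV_DATA)
        self.log.append("kv_data")

    def write_tensor_info(self):
        self._advance(WriterState.KV_DATA, WriterState.TENSOR_INFO)
        self.log.append("tensor_info")
```

With this shape, calling `write_kv_data()` on a fresh writer, or `add_tensor()` after `write_header()`, fails immediately instead of producing a file that only breaks at load time.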

Also, when use_temp_file is False, we should remove tensors from the list as soon as we are done with them in order to free the associated memory sooner. This can reduce unnecessary swapping when converting large models with scripts that set that parameter, such as convert-llama-ggml-to-gguf.py.
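The effect of dropping tensors as they are written can be sketched as follows. This is an assumption-laden illustration, not the PR's implementation: `write_and_release` is a hypothetical helper, the "tensors" are plain `bytes` objects held in a `deque`, and the output is an in-memory buffer.

```python
import io
from collections import deque


def write_and_release(tensors, fout):
    """Write each pending buffer to fout, dropping it from the queue first.

    `tensors` is a deque of bytes-like objects (a stand-in for tensor data).
    Removing each item before writing means the queue no longer keeps it
    alive, so its memory can be reclaimed once the write is done, provided
    nothing else still references it.
    """
    written = 0
    while tensors:
        data = tensors.popleft()  # the queue gives up its reference here
        written += fout.write(data)
        del data  # drop the local reference before the next iteration
    return written


pending = deque([b"\x00" * 4, b"\x01" * 8])
out = io.BytesIO()
total = write_and_release(pending, out)
```

By contrast, iterating with `for data in tensors:` would keep every buffer referenced by the container until the whole loop finishes.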

Making an assignment in a class outside of a method does not set the default value; it actually sets the attribute on the class itself. Instances of the class inherit these, but it's incorrect to expose these fields here.
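The difference between a class-body assignment and an assignment in `__init__` can be shown with a small, self-contained example. The class names below are hypothetical and unrelated to gguf.py. The shared-list behavior is a general Python pitfall used here for illustration, not a bug this PR claims to have fixed.

```python
class WithClassVars:
    # Assigned in the class body: a single attribute on the class object,
    # visible through every instance.
    tensors = []

    def add(self, name):
        self.tensors.append(name)  # mutates the one shared list


class WithInstanceVars:
    def __init__(self):
        # Assigned in __init__: each instance gets its own list.
        self.tensors = []

    def add(self, name):
        self.tensors.append(name)


a, b = WithClassVars(), WithClassVars()
a.add("x")

c, d = WithInstanceVars(), WithInstanceVars()
c.add("x")
```

After this runs, `b.tensors` contains `"x"` because both instances see the class attribute, while `d.tensors` is still empty. The attribute is also reachable as `WithClassVars.tensors`, whereas `WithInstanceVars` has no `tensors` attribute on the class at all.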

gguf-py/gguf/gguf.py

Galunid

Are we good to merge?

monatis

Let's also bump the version please

…erganov#3981)

* gguf-py: Refactor and add file reading support
* Replay changes from ggerganov#3871
  Credit to @cebtenzzre for that pull
* Various type annotation fixes.
* sort imports with isort (again)
* Fix missing return statement in add_tensor
* style cleanup with flake8
* fix NamedTuple and Enum usage
* Fix an issue with state init in GGUFReader
  Move examples to an examples/ directory
  Clean up examples
  Add an example of modifying keys in a GGUF file
  Update documentation with info on examples
  Try to support people importing gguf/gguf.py directly
* Damagage is not a word.
* Clean up gguf-py/examples/modify_gguf.py whitespace
  Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
* Update gguf-py/examples/modify_gguf.py formatting
  Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
* Update gguf-py/gguf/gguf_reader.py type hint
  Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
* Make examples executable, formatting changes
* Add more information to GGUFReader and examples comments
* Include a gguf Python package version bump
* Add convert-gguf-endian.py script
* cleanup
* gguf-py : bump minor version
* Reorganize scripts
* Make GGUFReader endian detection less arbitrary
* Add JSON dumping support to gguf-dump.py
  Which I kind of regret now
* A few for gguf-dump.py cleanups
* Murder accidental tuple in gguf-py/scripts/gguf-dump.py
  Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
* cleanup
* constants : remove unneeded type annotations
* fix python 3.8 compat
* Set up gguf- scripts in pyproject.toml
* And include scripts/__init__.py, derp
* convert.py: We can't currently support Q8_0 on big endian.
* gguf-py: SpecialVocab: Always try available sources for special token ids
  gguf-py: SpecialVocab: Try to load merges from merges.txt if not in tokenizer.json
  gguf-py: SpecialVocab: Add 'add_bos_token' type bools to GGUF metadata u
* cleanup
* Promote add_X_token to GGUF metadata for BOS and EOS

---------

Co-authored-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>

cebtenzzre added 5 commits October 31, 2023 12:02

gguf : do not store defaults in class vars

6df988d

Making an assignment in a class outside of a method does not set the default value; it actually sets the attribute on the class itself. Instances of the class inherit these, but it's incorrect to expose these fields here.

gguf : cleanup tensor padding

3fcdc93

gguf : track writer state

d97afcf

gguf : free tensors as they are written

389d2e6

gguf : prevent adding tensors after header is written

d09772a

Galunid reviewed Oct 31, 2023

gguf-py/gguf/gguf.py (outdated, resolved)

remove commented out code

9de3e6c

Galunid approved these changes Nov 7, 2023

monatis requested changes Nov 7, 2023

gguf : bump version to 0.4.6

beb986c

cebtenzzre requested a review from monatis November 7, 2023 16:26

monatis approved these changes Nov 7, 2023

cebtenzzre merged commit 0a7c980 into ggerganov:master Nov 7, 2023
6 checks passed

cebtenzzre deleted the gguf-writer-checks branch November 7, 2023 17:43

KerfuffleV2 added a commit to KerfuffleV2/llama.cpp that referenced this pull request Nov 7, 2023

Replay changes from ggerganov#3871

8047aa1

Credit to @cebtenzzre for that pull

olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023

gguf : track writer state, free unneeded tensors, cleanup (ggerganov#…

893ac07

…3871)
