Description
So back when the project started, we had the first "unversioned" model format without the embedded tokens, with the magic 0x67676d6c (ggml).
The problem with that was that it didn't have any versioning support, so newer/older versions would just think "I don't know what this is, this is not a model file".
Then in commit 074bea2, which added the embedded tokens, we got a new versioned model format with magic 0x67676d66 (ggmf), so it could now say "this is definitely a model file, but a wrong version" as shown here:
https://github.com/ggerganov/llama.cpp/blob/3bcc129ba881c99795e850b0a23707a4dfdabe9d/llama.h#L22
That was definitely a good move towards future proofing. Any breaking changes could just add +1 to that version and all would be fine and dandy for the next 4294967295 versions of the model format.
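To spell out what the versioning buys us, here is a minimal sketch of the check. It is not the actual llama.cpp loader: `check_header` and `enum header_status` are hypothetical names, and it assumes the file starts with a 32-bit magic followed by a 32-bit version.

```c
#include <stdint.h>

/* Values quoted in this issue for the versioned 'ggmf' format. */
#define LLAMA_FILE_MAGIC   0x67676d66u /* 'ggmf' in hex */
#define LLAMA_FILE_VERSION 1u

/* Hypothetical result type, not part of llama.cpp. */
enum header_status {
    HEADER_OK,
    HEADER_WRONG_VERSION, /* a model file, but not a version this build reads */
    HEADER_NOT_A_MODEL    /* magic does not match at all */
};

/* Hypothetical helper: classify a header from its magic and version. */
static enum header_status check_header(uint32_t magic, uint32_t version) {
    if (magic != LLAMA_FILE_MAGIC) {
        return HEADER_NOT_A_MODEL;
    }
    if (version != LLAMA_FILE_VERSION) {
        return HEADER_WRONG_VERSION;
    }
    return HEADER_OK;
}
```

With a check like this, a file with the ggmf magic and version 2 comes back as `HEADER_WRONG_VERSION`, while any other magic can only be reported as `HEADER_NOT_A_MODEL`.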
But then came commit 78ca983, which for absolutely no good reason changed the magic to 0x67676a74 (ggjt) and kept the version at 1, completely breaking the whole versioning system and making it worthless.
Now we're back to the system where the different versions of llama.cpp don't understand that "yes, these are indeed models, but older/newer versions". We already fixed this problem, so why the absolute f was there a need to break something that was already perfectly fine?
I just cannot understand the reasoning behind this, except maybe vanity, I guess (as the new magic uses the initials of the one who did the commit)? Absolutely ridiculous to break a perfectly functional system. Or is there actually some proper reason for this that I'm completely missing?
It is already a struggle, since various older forks like alpaca.cpp / gpt4all use the unversioned format. The move to the versioned format already fractured the community a bit, but it was a good and necessary change overall and fixed the version confusion problem for future versions. But now comes the third format change, which is made intentionally worse by changing the magic instead of doing it properly and using the versioning system put in place back then. This causes even more confusion, because every commit since 074bea2, where this whole problem was already fixed, is now broken again: those versions would say "I do not know what this is, it is not a model file" about the new format. WHY?
Again, the proper way of updating the model as envisioned by the versioning system is to:
-#define LLAMA_FILE_VERSION 1
+#define LLAMA_FILE_VERSION 2
#define LLAMA_FILE_MAGIC 0x67676d66 // 'ggmf' in hex
and not
#define LLAMA_FILE_VERSION 1
-#define LLAMA_FILE_MAGIC 0x67676d66 // 'ggmf' in hex
+#define LLAMA_FILE_MAGIC 0x67676a74 // 'ggjt' in hex
as was committed in 78ca983.
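For anyone who wants to verify the "in hex" comments, here is a small sketch that unpacks a magic into its four ASCII characters. `magic_to_tag` is a hypothetical helper written for this issue, not something from llama.cpp.

```c
#include <stdint.h>

/* Hypothetical helper: unpack a 32-bit magic into its four ASCII
 * characters, most significant byte first, plus a terminating NUL. */
static void magic_to_tag(uint32_t magic, char out[5]) {
    out[0] = (char)((magic >> 24) & 0xff);
    out[1] = (char)((magic >> 16) & 0xff);
    out[2] = (char)((magic >> 8) & 0xff);
    out[3] = (char)(magic & 0xff);
    out[4] = '\0';
}
```

Passing 0x67676d6c, 0x67676d66 and 0x67676a74 fills the buffer with ggml, ggmf and ggjt respectively.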
What is actually the line of thinking here? Are we just going to keep the version at 1, completely disuse the versioning system, and keep changing the magic to the initials of whoever is making the change? How the everliving F does that make any sense?!
If this was actually done by accident, from not understanding the versioning system rather than by intention, sorry for my scathing remarks. If it's intentional and breaking a perfectly functional system for vanity's sake, all the scathe is well deserved.
Pulling dumb shit like this is a good way to make a fantastic open-source project fall apart quickly.