Description
System: Mac M2 Max, OS version Sonoma 14.2.1
llama.cpp version: latest main branch as of Feb 29, 2024
Steps to reproduce:
- get the miniLM v6 embedding model from https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
- convert it to a gguf:
python convert-hf-to-gguf.py --outfile minilm.gguf --outtype f16 all-MiniLM-L6-v2
- get an embedding by running the embedding example:
./embedding -m minilm.gguf --log-disable -p "prince" -ngl 99
output:
embedding 0: -0.036860 0.041229 0.041869 0.041696 -0.024387 0.021268 0.099094 0.011098 0.004876 -0.046105 -0.059572 -0.029206 -0.002308 -0.083870 0.013792 0.045119 0.036514 0.097176 -0.023481 0.003588 -0.043461 -0.074656 -0.040720 0.011558 -0.078210 0.046951 0.077435 0.004489 -0.060391 -0.080243 0.012820 -0.069973 0.019214 0.052969 -0.030262 -0.031453 -0.038022 0.017023 -0.020902 0.049196 0.017024 0.016393 0.008936 0.048631 0.069907 -0.020548 -0.046252 -0.026483 0.003169 0.013299 -0.079572 0.067772 0.040068 -0.022476 -0.014747 0.048846 0.033857 -0.032799 0.091509 0.051690 0.008189 -0.021081 0.001400 0.008993 0.021849 0.031127 -0.006536 0.050509 0.000690 0.028135 -0.014092 0.036171 -0.006388 -0.098974 -0.053224 0.001512 0.039952 -0.103960 -0.017700 0.057037 -0.057424 -0.036943 -0.117668 -0.062323 -0.055580 -0.013580 0.019112 -0.006367 -0.044186 0.028640 0.034396 0.007887 -0.003876 0.035907 -0.105881 0.043552 0.109067 -0.079995 -0.056017 0.249003 -0.012027 0.053932 0.008201 0.036213 -0.010334 -0.036057 0.002546 0.002683 0.025382 -0.012716 -0.020140 -0.078044 -0.039887 0.008628 0.067469 -0.022117 0.026651 0.109164 0.025518 0.054249 0.032891 0.058545 -0.060185 -0.029968 -0.064054 -0.055134 -0.032062 -0.000000 0.023324 -0.049649 0.117273 0.017088 0.019652 0.014610 -0.022676 0.011467 -0.011677 0.041203 0.031052 -0.027259 -0.110914 -0.038720 0.008130 0.052588 -0.028568 -0.003301 0.003100 0.006353 -0.040994 0.127952 0.021877 0.031906 -0.059635 -0.035798 0.031328 -0.068779 0.069636 0.045009 0.014252 0.011030 0.031737 0.002799 -0.033560 -0.047101 -0.037738 -0.087675 -0.027649 -0.025608 -0.060913 -0.039024 0.059457 0.010664 -0.121365 0.004126 0.012507 -0.000979 0.027688 -0.006558 -0.002946 -0.058397 -0.050696 0.001921 -0.053137 -0.115931 -0.066889 0.010602 0.062312 -0.006016 0.122319 0.041626 0.040150 0.072560 0.002546 -0.014622 0.038451 0.046488 0.053678 -0.031259 -0.035950 0.078569 0.064906 0.009808 0.041488 0.000068 -0.116803 0.023502 -0.065634 0.007458 -0.072649 0.051103 
0.002858 0.019553 0.069425 -0.018859 -0.049330 -0.055800 0.067243 0.079522 -0.080934 -0.026456 0.001355 0.035326 -0.037046 0.000000 -0.017859 -0.090733 0.059320 0.116348 0.035735 -0.031313 -0.003781 0.047099 -0.047402 -0.008223 -0.070947 -0.049794 0.117659 -0.091668 0.087372 0.001555 0.072384 0.017297 -0.044615 0.050339 -0.011038 -0.085056 -0.026181 -0.050121 0.014113 0.036018 -0.056591 -0.023850 -0.032202 0.011580 0.039819 0.107372 -0.033940 0.037161 -0.045220 0.065391 -0.047808 -0.007598 0.053954 0.021879 -0.010463 -0.055802 0.013686 0.107812 0.052527 0.006023 0.043839 0.066618 0.125267 0.018725 -0.106315 0.037015 -0.106218 -0.013532 0.039010 -0.004414 -0.057585 -0.003809 0.003101 0.074716 0.009683 0.002036 0.006178 -0.003161 -0.073237 0.124757 0.086564 0.037509 0.049667 -0.044155 0.084439 0.070531 -0.017895 0.068558 -0.018488 0.014648 -0.024090 0.032648 -0.036987 -0.051995 -0.023524 -0.045418 -0.060536 0.016920 -0.006316 -0.036464 0.091840 0.011222 -0.040349 -0.045818 0.034356 -0.018599 -0.035700 -0.064079 -0.020323 -0.000000 -0.027605 -0.003925 0.010733 -0.067407 0.036873 0.058545 -0.006928 -0.027731 0.018732 0.016576 0.076662 -0.074990 0.019847 -0.015870 0.035534 -0.048891 0.000796 -0.033112 -0.000590 0.017791 -0.023484 -0.021673 0.042539 -0.044049 -0.049730 -0.017081 -0.047250 -0.016705 0.033407 -0.013550 0.081062 -0.000081 -0.044371 0.035021 0.037153 0.030417 -0.021360 -0.015494 0.048064 -0.026997 -0.028125 0.004956 -0.014228 0.006874 -0.015346 -0.013070 0.011872 -0.036261 -0.044620 0.003815 -0.025334 -0.007049 0.077197 0.041967 0.083626 0.010578 0.028892 0.044975 -0.127511 0.023095 0.127018 -0.046558 -0.012191 0.023901
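To compare this output with the server response programmatically, the stdout above can be parsed into a float vector. A small sketch; `parse_embedding_output` is a hypothetical helper matched to the `embedding 0:` prefix seen above:

```python
def parse_embedding_output(text: str) -> list[float]:
    # Strip the "embedding 0:" label emitted by the embedding example
    # and convert the remaining whitespace-separated values to floats.
    values = []
    for line in text.splitlines():
        if line.startswith("embedding"):
            line = line.split(":", 1)[1]
        values.extend(float(tok) for tok in line.split())
    return values

sample = "embedding 0: -0.036860 0.041229 0.041869"
print(parse_embedding_output(sample))  # [-0.03686, 0.041229, 0.041869]
```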
- run the model in server mode:
./server -ngl 99 -m minilm.gguf --port 8019 --host 0.0.0.0 --embedding
- make a curl request to embed the same word 'prince':
curl http://localhost:8019/v1/embeddings \
-H "Content-Type: application/json" \
-H "Authorization: Bearer no-key" \
-d '{
"input": ["prince"],
"model":"minilm",
"encoding_format": "float"
}'
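The same request can also be issued from Python with only the standard library (a sketch; the helper name is mine, and the URL/port match the server command above):

```python
import json
import urllib.request

def build_embedding_request(text: str) -> urllib.request.Request:
    # Mirrors the curl invocation above: same endpoint, headers, and body.
    body = json.dumps({
        "input": [text],
        "model": "minilm",
        "encoding_format": "float",
    }).encode()
    return urllib.request.Request(
        "http://localhost:8019/v1/embeddings",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer no-key",
        },
    )

# With the server running:
# resp = urllib.request.urlopen(build_embedding_request("prince"))
# embedding = json.load(resp)["data"][0]["embedding"]
```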
Output:
{"data":[{"embedding":[-0.9113406538963318,0.20497266948223114,-0.05761261284351349,-0.1636194884777069,0.44455260038375854,0.08682599663734436,0.744450569152832,-0.03699329495429993,-0.5454420447349548,-0.4367120862007141,-0.07723181694746017,-0.19148540496826172,0.13116762042045593,-0.11302244663238525,0.22115185856819153,-0.2790590822696686,-0.35611769556999207,-0.3604397475719452,-0.046874552965164185,0.2406463325023651,-0.3163539171218872,0.10239306837320328,0.19693052768707275,-0.0763457864522934,-0.09630294144153595,-0.05589187890291214,-0.09276457875967026,0.21844173967838287,-0.0804717019200325,-0.36773544549942017,-0.34730562567710876,0.15269294381141663,0.525522768497467,-0.07095099985599518,-0.23811766505241394,-0.006564701441675425,-0.4374021589756012,-0.46858465671539307,0.12920492887496948,0.33534786105155945,0.11649847030639648,-0.35319891571998596,-0.1711725890636444,0.4761241674423218,0.413016676902771,-0.5310176014900208,0.13408483564853668,0.10075130313634872,0.42842817306518555,-0.2796638011932373,-0.15863917768001556,-0.14105407893657684,0.008249428123235703,-0.20821517705917358,0.3040767014026642,-0.14677953720092773,0.02599211037158966,-0.43027985095977783,0.18312017619609833,-0.13386058807373047,0.04302752763032913,0.07234133034944534,-0.9491630792617798,0.5000106692314148,-0.2805364727973938,0.13198977708816528,0.2868933081626892,-0.023559710010886192,-0.10929538309574127,0.3662813603878021,0.4353457987308502,-0.2235320508480072,0.03139911964535713,-0.6918119192123413,0.2975662350654602,0.20270520448684692,0.36246049404144287,0.1681171953678131,0.382007896900177,0.2189856618642807,-0.03305833786725998,0.07860100269317627,-0.5726684331893921,-0.016888771206140518,0.05296405777335167,-0.2583688795566559,-0.10431863367557526,-0.17162971198558807,-0.20041799545288086,0.3255464434623718,-0.33213892579078674,-0.38952669501304626,0.5833108425140381,-0.3714039921760559,-0.5180072784423828,-0.3262985646724701,0.19199618697166443,0.2610461413860321,-
0.5567361116409302,2.5617096424102783,0.30518174171447754,0.17281579971313477,0.6737371683120728,-0.06384599953889847,0.2474653720855713,-0.3317680358886719,-0.3517586588859558,-0.1164378821849823,0.042621731758117676,0.059900663793087006,-0.0794406533241272,-0.17526094615459442,-0.08849326521158218,0.26080191135406494,0.23311978578567505,0.5595173239707947,-0.2197084128856659,-0.03350114822387695,0.20333600044250488,0.0007340013980865479,-0.04374226927757263,-0.23907612264156342,0.050346773117780685,0.06973200291395187,0.521949827671051,-0.774983286857605,0.026607058942317963,-4.3562683350689554e-32,0.04155222326517105,0.09096819907426834,0.053881049156188965,0.06294772773981094,0.2517699599266052,0.2299978882074356,0.039326563477516174,-0.26331254839897156,-0.106712207198143,0.3398314118385315,0.18540745973587036,0.18217933177947998,-0.09577855467796326,0.2027318924665451,0.3415830135345459,-0.0406864732503891,0.029755856841802597,-0.07023705542087555,0.01803077943623066,0.050523675978183746,-0.01507059670984745,-0.10759429633617401,-0.21249070763587952,0.25786375999450684,-0.14086893200874329,-0.34865590929985046,0.20274148881435394,-0.5245954990386963,0.19924187660217285,0.05390845239162445,-0.0734206885099411,0.31001031398773193,-0.2408670336008072,0.07897859811782837,-0.28300997614860535,-0.10916518419981003,0.11163628846406937,-0.46925920248031616,-0.21956908702850342,0.24989114701747894,-0.2724396586418152,0.07712733000516891,-0.18821056187152863,0.03654399886727333,-0.11255963891744614,1.0593624114990234,-0.11500044167041779,-0.03266199678182602,-0.3653186559677124,0.3872261941432953,-0.04578177630901337,-0.12280713766813278,-0.7358079552650452,0.20281924307346344,-0.1127614974975586,-0.21654832363128662,0.12830042839050293,-0.16045168042182922,0.02366318553686142,-0.005510266870260239,0.07435282319784164,0.17047753930091858,-0.01223795861005783,-0.0014442875981330872,-0.5768333077430725,-0.33469757437705994,0.20493263006210327,-0.0976196825504303,0.1579537
2426509857,-0.1524704098701477,-0.3397308588027954,0.1795063316822052,0.2294308990240097,0.2730967402458191,-0.1142972856760025,-0.13708257675170898,-0.12865528464317322,0.15601131319999695,0.44888660311698914,0.015956934541463852,0.37817105650901794,-0.34698113799095154,-0.0024727322161197662,-0.09719957411289215,0.6553595662117004,0.31680095195770264,-0.04055440053343773,-0.1605125516653061,0.15430790185928345,-0.23208746314048767,-0.2697872519493103,0.42774200439453125,-0.10204043984413147,0.5559654831886292,0.11096914112567902,2.616153190820014e-32,-0.5045391917228699,0.11107780784368515,-0.18468350172042847,0.5339704155921936,-0.08558398485183716,0.03379639610648155,0.2631123960018158,0.7414908409118652,-0.5430196523666382,0.5138211250305176,0.12716256082057953,-0.31557697057724,0.19252167642116547,0.253345787525177,-0.02899402379989624,0.11646998673677444,0.39367929100990295,-0.03387301415205002,-0.28333616256713867,-0.21347321569919586,-0.11846308410167694,-0.20535527169704437,-0.016133740544319153,0.18760168552398682,-0.2259252518415451,0.1494227647781372,0.18264584243297577,0.3958255350589752,-0.14623713493347168,0.2270394116640091,0.1714852750301361,-0.07063024491071701,-0.2905765771865845,-0.36629199981689453,0.3616235852241516,0.5592570304870605,0.10643155872821808,0.024111144244670868,0.21289567649364471,-0.31754186749458313,0.15855254232883453,-0.008792846463620663,0.004980940371751785,0.6545747518539429,-0.19380448758602142,-0.03998925909399986,-0.20742420852184296,0.12010551989078522,0.28698107600212097,0.5303142666816711,-0.6662395000457764,-0.17525595426559448,-0.09595256298780441,0.3452081084251404,-0.152899831533432,0.10215282440185547,-0.20828397572040558,0.5832223296165466,-0.18213698267936707,0.0345468744635582,-0.10129953920841217,0.16032829880714417,-0.13427498936653137,0.047849901020526886,-0.0865342766046524,0.22245003283023834,-0.1609944850206375,-0.1925918459892273,-0.07725381851196289,-0.37833285331726074,0.05138351023197174,-0.01293423
2130646706,-0.8680322766304016,0.10170803964138031,-0.2544623613357544,-0.32966041564941406,-0.08313576132059097,0.008441970683634281,-0.03992752358317375,-0.2733607590198517,-0.2615377604961395,0.10742823779582977,-0.06549906730651855,-0.1308857947587967,-0.09799693524837494,-0.08787287026643753,0.06873519718647003,0.09506669640541077,-0.3629777133464813,0.15921449661254883,-0.06500373035669327,0.17909352481365204,0.8790280818939209,0.03772491589188576,0.08673910051584244,-8.565214670852583e-08,-0.33929646015167236,0.3118705451488495,-0.4805440902709961,-0.032953061163425446,0.34920498728752136,0.24195101857185364,-0.1646021157503128,0.2519652247428894,0.1799640655517578,-0.22356748580932617,0.5443459749221802,0.13451939821243286,-0.29193437099456787,0.3693276643753052,0.6018487811088562,-0.07137903571128845,-0.7007991671562195,-0.38815778493881226,-0.09892711043357849,-0.21787768602371216,-0.07027873396873474,-0.118606798350811,0.10142236948013306,-0.5028542280197144,0.061366863548755646,-0.010358983650803566,-0.3588971197605133,0.42797258496284485,0.46163269877433777,-0.1796817183494568,0.044185034930706024,0.10516093671321869,0.07580085843801498,0.1000712662935257,0.17967012524604797,-0.46841052174568176,-0.4112969636917114,0.10569800436496735,-0.0003655441105365753,0.034334391355514526,-0.37717923521995544,-0.11426356434822083,0.39737415313720703,0.1742405742406845,-0.43300509452819824,-0.09926651418209076,0.07001541554927826,-0.06793393939733505,-0.1917962282896042,-0.4800603985786438,0.06117616593837738,-0.08069508522748947,-0.025915905833244324,0.5345154404640198,0.6164687275886536,0.056552939116954803,0.11888577044010162,-0.17393869161605835,0.52826988697052,0.18463830649852753,1.0523483753204346,0.8015890717506409,0.259770929813385,-0.11053761094808578],"index":0,"object":"embedding"}],"model":"minilm","object":"list","usage":{"prompt_tokens":0,"total_tokens":0}}
Expected Behavior: the embeddings from these two approaches should be (nearly) identical.
Actual Behavior: as you can see, the embedding returned by the server looks completely different from the one produced in step 3: not only do the individual values differ, the scales differ as well.
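The scale mismatch suggests one of the two outputs may be unnormalized (an assumption on my part, not a confirmed diagnosis). L2-normalizing both vectors before comparing removes scale and isolates the direction difference; a minimal sketch:

```python
import math

def l2_normalize(vec):
    # Scale a vector to unit length so embeddings from the two
    # pipelines can be compared by direction alone.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def cosine_similarity(a, b):
    # For unit vectors the dot product equals the cosine similarity;
    # a value near 1.0 means the two outputs agree up to scale.
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))
```

If the cosine similarity of the two outputs were close to 1.0, the difference would be just a missing normalization step; here the directions also disagree, so the server appears to compute something else entirely.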
=============================================================
And by the way, the embedding output I get from step 3 is almost identical to the one I get from the sentence_transformers Python library, for example:
from sentence_transformers import SentenceTransformer
sentences = ["prince"]
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
embeddings = model.encode(sentences)
print(embeddings)
This indicates that the model conversion works correctly. I think something is wrong with the BERT embedding path in server mode.