Closed
Description
Type of issue
- I ran some benchmarks on an Intel Core Ultra 7 155H about 3 months ago using release b2568, and these are the results I obtained for llama-2-7B-Q4_0.gguf:
system_info: n_threads = 18 / 22 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 | MATMUL_INT8 = 0 |
sampling:
repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature
generate: n_ctx = 512, n_batch = 2048, n_predict = 256, n_keep = 1
Building a website can be done in 10 simple steps:\nStep 1: Choosing a Web Hosting Company\n\nStep 2: Creating an Account\n\nStep 3: Choose a Domain\n\nStep 4: Design Your Website\n\nStep 5: Register a Domain Name\n\nStep 6: Set Up a Website Template\n\nStep 7: Link Your Domain to the Website\n\nStep 8: Use a Content Management System\n\nStep 9: Set Up Your Website's Email Addresses\n\nStep 10: Check the Website's SEO and Security\n\nStep 11: Link to Other Social Media Platforms\n\nStep 12: Add SEO to Your Website\n\nStep 13: Add More Content to Your Website\n\nStep 14: Start Publishing Regularly\n\nStep 15: Get Feedback and Make Improvements\n\n\n
Whether you’re a beginner or an expert, you can build a website using a simple tool like Wix. Wix makes it easy to design and publish your website, regardless of your technical expertise.
llama_print_timings: load time = 1416.75 ms
llama_print_timings: sample time = 6.75 ms / 256 runs ( 0.03 ms per token, 37897.85 tokens per second)
llama_print_timings: prompt eval time = 857.20 ms / 19 tokens ( 45.12 ms per token, 22.17 tokens per second)
llama_print_timings: eval time = 22132.23 ms / 255 runs ( 86.79 ms per token, 11.52 tokens per second)
llama_print_timings: total time = 23052.86 ms / 274 tokens
Log end
- Using the latest release (b3317), I observe a drop in next-token generation speed of about 2.3 tok/s (from 11.52 to 9.19 tokens per second, roughly 20%):
system_info: n_threads = 18 / 22 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 0 |
sampling:
repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature
generate: n_ctx = 4096, n_batch = 2048, n_predict = 256, n_keep = 1
Building a website can be done in 10 simple steps:
Step 1: Select your website type
Step 2: Choose a domain name
Step 4: Build your site
Step 5: Connect your site to your domain
Step 6: Install a content management system (CMS)
Step 7: Optimize the website
Step 8: Promote the website
Step 10: Maintain the site
How long does it take to create a website from scratch?
Can you learn to code in 10 days?
How can I build a website in 7 days?
Can I learn how to code in 10 days?
How long does it take to learn HTML and CSS?
How long does it take to learn JavaScript?
A website is a collection of web pages and associated files that are hosted on a server. The pages are typically written in HTML (hypertext markup language) and linked to each other by hypertext links.
Websites can be either static or dynamic. A static website consists of a single page with no interactive components, while a dynamic website can be updated and changed without the need for a web developer.
There are many different types of websites, but the most common are:
-Personal websites: These are typically created
llama_print_timings: load time = 2072.39 ms
llama_print_timings: sample time = 10.37 ms / 256 runs ( 0.04 ms per token, 24686.60 tokens per second)
llama_print_timings: prompt eval time = 876.59 ms / 19 tokens ( 46.14 ms per token, 21.67 tokens per second)
llama_print_timings: eval time = 27753.15 ms / 255 runs ( 108.84 ms per token, 9.19 tokens per second)
llama_print_timings: total time = 28721.61 ms / 274 tokens
Log end
- Is performance being monitored across different hardware when new code changes are introduced? It's great to get better performance, but not at the cost of degrading it on other ranges of hardware.
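For reference, the size of the regression can be recomputed from the two `eval time` lines in the logs above. This is only a minimal sketch, assuming the log lines keep the `llama_print_timings` format shown in this issue; the helper name `eval_tokens_per_second` and the constants are mine and not part of llama.cpp.

```python
import re

# Matches only the generation "eval time" line, not "prompt eval time".
# Assumes the llama_print_timings format pasted in this issue.
EVAL_RE = re.compile(
    r"llama_print_timings:\s+eval time\s*=\s*([\d.]+)\s*ms\s*/\s*(\d+)\s*runs"
)

def eval_tokens_per_second(log_text):
    """Return generation speed (tokens/s) from a llama_print_timings log."""
    m = EVAL_RE.search(log_text)
    if m is None:
        raise ValueError("no 'eval time' line found in log")
    total_ms = float(m.group(1))
    runs = int(m.group(2))
    return runs / (total_ms / 1000.0)

# Timing lines copied from the two runs above.
OLD_LOG = """\
llama_print_timings: prompt eval time = 857.20 ms / 19 tokens ( 45.12 ms per token, 22.17 tokens per second)
llama_print_timings: eval time = 22132.23 ms / 255 runs ( 86.79 ms per token, 11.52 tokens per second)
"""
NEW_LOG = """\
llama_print_timings: prompt eval time = 876.59 ms / 19 tokens ( 46.14 ms per token, 21.67 tokens per second)
llama_print_timings: eval time = 27753.15 ms / 255 runs ( 108.84 ms per token, 9.19 tokens per second)
"""

old_tps = eval_tokens_per_second(OLD_LOG)  # b2568
new_tps = eval_tokens_per_second(NEW_LOG)  # b3317
change_pct = (new_tps - old_tps) / old_tps * 100.0

print(f"b2568: {old_tps:.2f} tok/s, b3317: {new_tps:.2f} tok/s, change: {change_pct:.1f}%")
```

Prompt eval speed is nearly unchanged between the two runs (22.17 vs 21.67 tokens per second), so the drop is specific to token generation.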
Name and Version
./llama-cli.exe release b3317 vs ./main.exe release b2568
What operating system are you seeing the problem on?
Windows 11
Relevant log output
See issue description