
Huge performance degradation using latest branch on Intel Core Ultra 7 155H #8328

Closed
@aahouzi

Description

Type of issue

  • About 3 months ago I benchmarked llama-2-7B-Q4_0.gguf on an Intel Core Ultra 7 155H using release b2568, and these are the results I obtained:
system_info: n_threads = 18 / 22 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 | MATMUL_INT8 = 0 |
sampling:
        repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
        top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature
generate: n_ctx = 512, n_batch = 2048, n_predict = 256, n_keep = 1


 Building a website can be done in 10 simple steps:\nStep 1: Choosing a Web Hosting Company\n\nStep 2: Creating an Account\n\nStep 3: Choose a Domain\n\nStep 4: Design Your Website\n\nStep 5: Register a Domain Name\n\nStep 6: Set Up a Website Template\n\nStep 7: Link Your Domain to the Website\n\nStep 8: Use a Content Management System\n\nStep 9: Set Up Your Website's Email Addresses\n\nStep 10: Check the Website's SEO and Security\n\nStep 11: Link to Other Social Media Platforms\n\nStep 12: Add SEO to Your Website\n\nStep 13: Add More Content to Your Website\n\nStep 14: Start Publishing Regularly\n\nStep 15: Get Feedback and Make Improvements\n\n\n
Whether you’re a beginner or an expert, you can build a website using a simple tool like Wix. Wix makes it easy to design and publish your website, regardless of your technical expertise.
llama_print_timings:        load time =    1416.75 ms
llama_print_timings:      sample time =       6.75 ms /   256 runs   (    0.03 ms per token, 37897.85 tokens per second)
llama_print_timings: prompt eval time =     857.20 ms /    19 tokens (   45.12 ms per token,    22.17 tokens per second)
llama_print_timings:        eval time =   22132.23 ms /   255 runs   (   86.79 ms per token,    11.52 tokens per second)
llama_print_timings:       total time =   23052.86 ms /   274 tokens
Log end
  • Using the latest branch (release b3317), I observe a drop in next-token generation speed of about 2.3 tok/s (from 11.52 to 9.19 tokens per second):
system_info: n_threads = 18 / 22 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 0 |
sampling:
        repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
        top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature
generate: n_ctx = 4096, n_batch = 2048, n_predict = 256, n_keep = 1


 Building a website can be done in 10 simple steps:
Step 1: Select your website type
Step 2: Choose a domain name
Step 4: Build your site
Step 5: Connect your site to your domain
Step 6: Install a content management system (CMS)
Step 7: Optimize the website
Step 8: Promote the website
Step 10: Maintain the site
How long does it take to create a website from scratch?
Can you learn to code in 10 days?
How can I build a website in 7 days?
Can I learn how to code in 10 days?
How long does it take to learn HTML and CSS?
How long does it take to learn JavaScript?
A website is a collection of web pages and associated files that are hosted on a server. The pages are typically written in HTML (hypertext markup language) and linked to each other by hypertext links.
Websites can be either static or dynamic. A static website consists of a single page with no interactive components, while a dynamic website can be updated and changed without the need for a web developer.
There are many different types of websites, but the most common are:
-Personal websites: These are typically created
llama_print_timings:        load time =    2072.39 ms
llama_print_timings:      sample time =      10.37 ms /   256 runs   (    0.04 ms per token, 24686.60 tokens per second)
llama_print_timings: prompt eval time =     876.59 ms /    19 tokens (   46.14 ms per token,    21.67 tokens per second)
llama_print_timings:        eval time =   27753.15 ms /   255 runs   (  108.84 ms per token,     9.19 tokens per second)
llama_print_timings:       total time =   28721.61 ms /   274 tokens
Log end
  • Is performance even being monitored across different hardware when you introduce new code changes? It's great to get better performance, but not at the cost of degrading performance on other ranges of hardware.
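To make the comparison above easier to repeat, here is a minimal sketch that recomputes tokens per second from the `llama_print_timings` lines of both runs and reports the relative slowdown. The helper names (`parse_timings`, `tokens_per_second`, `compare`) and the regular expression are illustrative assumptions, not part of llama.cpp. Note that the two runs also report different `n_ctx` values (512 vs 4096), which this comparison does not control for.

```python
import re

# Timing lines copied verbatim from the two runs in this report.
LOG_B2568 = """\
llama_print_timings: prompt eval time =     857.20 ms /    19 tokens (   45.12 ms per token,    22.17 tokens per second)
llama_print_timings:        eval time =   22132.23 ms /   255 runs   (   86.79 ms per token,    11.52 tokens per second)
"""
LOG_B3317 = """\
llama_print_timings: prompt eval time =     876.59 ms /    19 tokens (   46.14 ms per token,    21.67 tokens per second)
llama_print_timings:        eval time =   27753.15 ms /   255 runs   (  108.84 ms per token,     9.19 tokens per second)
"""

# Hypothetical pattern: captures the stage name, total milliseconds, and token/run count.
TIMING = re.compile(
    r"llama_print_timings:\s+(?P<stage>[a-z ]+?) time\s*=\s*(?P<ms>[\d.]+) ms\s*/\s*(?P<count>\d+)"
)


def parse_timings(log):
    """Return {stage: (total_ms, count)} for every timing line that has a count."""
    return {m["stage"]: (float(m["ms"]), int(m["count"])) for m in TIMING.finditer(log)}


def tokens_per_second(total_ms, count):
    """Convert a total duration in milliseconds and a token count to tokens/second."""
    return count / (total_ms / 1000.0)


def compare(stage):
    """Return (old tok/s, new tok/s, percentage slowdown) for one stage."""
    old = tokens_per_second(*parse_timings(LOG_B2568)[stage])
    new = tokens_per_second(*parse_timings(LOG_B3317)[stage])
    return old, new, 100.0 * (old - new) / old


for stage in ("prompt eval", "eval"):
    old_tps, new_tps, drop_pct = compare(stage)
    print(f"{stage:>11}: {old_tps:.2f} -> {new_tps:.2f} tok/s ({drop_pct:.1f}% slower)")
```

For the logs in this report, the eval stage works out to a drop of about 2.3 tok/s, roughly 20%, while prompt eval differs by only about 2%.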

Name and Version

./llama-cli.exe release b3317 vs ./main.exe release b2568
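Since the two binaries come from different releases, it may also help to diff the feature flags each one prints in its `system_info` line. The sketch below does that for the two lines quoted in this report; `parse_system_info` and `diff_flags` are hypothetical helper names, and whether any flag difference is related to the slowdown is not established here.

```python
# system_info lines copied verbatim from the two runs in this report.
INFO_B2568 = (
    "system_info: n_threads = 18 / 22 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | "
    "AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | "
    "FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 | MATMUL_INT8 = 0 |"
)
INFO_B3317 = (
    "system_info: n_threads = 18 / 22 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | "
    "AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | "
    "ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | "
    "VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 0 |"
)


def parse_system_info(line):
    """Parse the 'NAME = value' fields that follow the leading n_threads field."""
    _, _, body = line.partition("|")  # drop 'system_info: n_threads = 18 / 22'
    flags = {}
    for field in body.split("|"):
        if "=" in field:
            key, value = field.split("=", 1)
            flags[key.strip()] = int(value)
    return flags


def diff_flags(old, new):
    """Return (changed, added, removed) flag dictionaries between two builds."""
    changed = {k: (old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k]}
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    return changed, added, removed


changed, added, removed = diff_flags(
    parse_system_info(INFO_B2568), parse_system_info(INFO_B3317)
)
print("changed:", changed)
print("added:  ", sorted(added))
print("removed:", sorted(removed))
```

On the two lines above, the only flag whose value differs is `SSSE3` (0 in b2568, 1 in b3317); `AVX512_BF16`, `SVE`, and `LLAMAFILE` appear only in the newer output, all reported as 0.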

What operating system are you seeing the problem on?

Windows 11

Relevant log output

See issue description


Labels: bug-unconfirmed, medium severity, stale
