-nkvo causes IO_PAGE_FAULT on ROCm #4983

Closed
Artefact2 opened this issue Jan 16, 2024 · 1 comment

@Artefact2
Collaborator

I am running c37b347, compiled with make LLAMA_HIPBLAS=1 AMDGPU_TARGETS=gfx1030 main, on Arch Linux. My GPU is a 6750XT and I am using ROCm 5.7.

I get IO_PAGE_FAULT errors from amdgpu when running Mixtral at large context sizes with the -nkvo option.

I can reproduce the issue with: ./main -m ~/KoboldCpp/models/mixtral-instruct-8x7b-q4k-small.gguf -c 32768 -ngl 2 -nkvo -p "The quick brown fox jumps over " -n 128

system_info: n_threads = 8 / 16 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 
sampling: 
        repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
        top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order: 
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temp 
generate: n_ctx = 512, n_batch = 512, n_predict = 128, n_keep = 0


 The quick brown fox jumps over  [end of text]

And I see lots of errors like these in my journal:

Jan 16 18:11:16 Silmeria kernel: amd_iommu_report_page_fault: 821417 callbacks suppressed
Jan 16 18:11:16 Silmeria kernel: amdgpu 0000:0a:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0013 address=0x0 flags=0x0000]
Jan 16 18:11:16 Silmeria kernel: amdgpu 0000:0a:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0013 address=0x700 flags=0x0000]
Jan 16 18:11:16 Silmeria kernel: amdgpu 0000:0a:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0013 address=0xe00 flags=0x0000]
Jan 16 18:11:16 Silmeria kernel: amdgpu 0000:0a:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0013 address=0x500 flags=0x0000]
Jan 16 18:11:16 Silmeria kernel: amdgpu 0000:0a:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0013 address=0xc00 flags=0x0000]
Jan 16 18:11:16 Silmeria kernel: amdgpu 0000:0a:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0013 address=0x300 flags=0x0000]
Jan 16 18:11:16 Silmeria kernel: amdgpu 0000:0a:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0013 address=0xa00 flags=0x0000]
Jan 16 18:11:16 Silmeria kernel: amdgpu 0000:0a:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0013 address=0xf00 flags=0x0000]
Jan 16 18:11:16 Silmeria kernel: amdgpu 0000:0a:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0013 address=0x400 flags=0x0000]
Jan 16 18:11:16 Silmeria kernel: amdgpu 0000:0a:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0013 address=0x500 flags=0x0000]
Jan 16 18:11:16 Silmeria kernel: amd_iommu_restart_log: 1430 callbacks suppressed
Jan 16 18:11:16 Silmeria kernel: AMD-Vi: IOMMU Event log restarting
Jan 16 18:11:16 Silmeria kernel: AMD-Vi: IOMMU Event log restarting
Jan 16 18:11:16 Silmeria kernel: AMD-Vi: IOMMU Event log restarting
Jan 16 18:11:16 Silmeria kernel: AMD-Vi: IOMMU Event log restarting
Jan 16 18:11:16 Silmeria kernel: AMD-Vi: IOMMU Event log restarting
Jan 16 18:11:16 Silmeria kernel: AMD-Vi: IOMMU Event log restarting
Jan 16 18:11:16 Silmeria kernel: AMD-Vi: IOMMU Event log restarting
Jan 16 18:11:16 Silmeria kernel: AMD-Vi: IOMMU Event log restarting
Jan 16 18:11:16 Silmeria kernel: AMD-Vi: IOMMU Event log restarting
Jan 16 18:11:16 Silmeria kernel: AMD-Vi: IOMMU Event log restarting
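
For anyone trying to confirm the same symptom, here is a minimal sketch of how journal output like the excerpt above could be summarized. The function name summarize_page_faults is hypothetical (it is not part of llama.cpp or ROCm), and it assumes the kernel log lines keep the address=0x... field format shown in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of llama.cpp: reads kernel log text on stdin
# and prints how many IO_PAGE_FAULT events were logged per faulting address.
summarize_page_faults() {
    # 1. keep only the fault lines
    # 2. extract the hex value of the address= field
    # 3. count occurrences per address, most frequent first
    grep 'IO_PAGE_FAULT' \
        | sed -n 's/.*address=\(0x[0-9a-fA-F]*\).*/\1/p' \
        | sort | uniq -c | sort -rn
}

# Example usage (assumes systemd's journalctl is available):
#   journalctl -k --since "10 minutes ago" | summarize_page_faults
```

Running it once after a run with -nkvo and once after a run without it should show whether the faults only appear with the option set. Note that the kernel rate-limits these messages ("callbacks suppressed"), so the counts are a lower bound.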

Without -nkvo, the model operates normally: ./main -m ~/KoboldCpp/models/mixtral-instruct-8x7b-q4k-small.gguf -c 32768 -ngl 2 -p "The quick brown fox jumps over " -n 128

system_info: n_threads = 8 / 16 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 
sampling: 
        repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
        top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order: 
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temp 
generate: n_ctx = 512, n_batch = 512, n_predict = 128, n_keep = 0


 The quick brown fox jumps over  lazy dog.

It’s a phrase we learned to type in elementary school, and one I’ve never forgotten because it includes every letter of the alphabet – A to Z!

But there are other phrases that include all the

I think this is a llama.cpp bug. Can anyone else reproduce?

@ggerganov
Owner

Yes, -nkvo is currently broken (#4766)
