intel_adsp: many tests fail due to change to timer interrupts on secondary cores not being enabled #70494

Closed
majunkier opened this issue Mar 20, 2024 · 4 comments · Fixed by #70579
Labels: bug, platform: Intel ADSP, priority: high

Comments

@majunkier
Collaborator

Describe the bug
Tested on intel_adsp boards (cavs25, ace15_mtpm).

tests/kernel/semaphore/semaphore/kernel.semaphore

The test hangs after its first cases pass (the log stops at START - test_sem_queue_mutual_exclusion), and the run ends in a timeout.

To Reproduce
twister --device-testing --hardware-map PATH-TO-HWMAP -s tests/kernel/semaphore/semaphore/kernel.semaphore

Expected behavior
The test should pass.

Logs and console output
handler.log

Running TESTSUITE semaphore
===================================================================
START - test_k_sem_correct_count_limit
PASS - test_k_sem_correct_count_limit in 0.001 seconds
===================================================================
START - test_k_sem_define
PASS - test_k_sem_define in 0.001 seconds
===================================================================
START - test_k_sem_init
PASS - test_k_sem_init in 0.001 seconds
===================================================================
START - test_sem_count_get
PASS - test_sem_count_get in 0.001 seconds
===================================================================
START - test_sem_give_from_isr
PASS - test_sem_give_from_isr in 0.001 seconds
===================================================================
START - test_sem_give_from_thread
PASS - test_sem_give_from_thread in 0.001 seconds
===================================================================
START - test_sem_give_take_from_isr
PASS - test_sem_give_take_from_isr in 0.001 seconds
===================================================================
START - test_sem_measure_timeout_from_thread
PASS - test_sem_measure_timeout_from_thread in 0.001 seconds
===================================================================
START - test_sem_measure_timeouts
PASS - test_sem_measure_timeouts in 1.001 seconds
===================================================================
START - test_sem_multi_take_timeout_diff_sem
SKIP - test_sem_multi_take_timeout_diff_sem in 0.001 seconds
===================================================================
START - test_sem_multiple_threads_wait
PASS - test_sem_multiple_threads_wait in 2.001 seconds
===================================================================
START - test_sem_reset
PASS - test_sem_reset in 0.101 seconds
===================================================================
START - test_sem_reset_waiting
PASS - test_sem_reset_waiting in 0.002 seconds
===================================================================
START - test_sem_take_multiple
PASS - test_sem_take_multiple in 0.941 seconds
===================================================================
START - test_sem_take_no_wait
PASS - test_sem_take_no_wait in 0.001 seconds
===================================================================
START - test_sem_take_no_wait_fails
PASS - test_sem_take_no_wait_fails in 0.001 seconds
===================================================================
START - test_sem_take_timeout
PASS - test_sem_take_timeout in 0.001 seconds
===================================================================
START - test_sem_take_timeout_fails
PASS - test_sem_take_timeout_fails in 0.501 seconds
===================================================================
START - test_sem_take_timeout_forever
PASS - test_sem_take_timeout_forever in 0.101 seconds
===================================================================
START - test_sem_thread2isr
PASS - test_sem_thread2isr in 0.001 seconds
===================================================================
START - test_sem_thread2thread
PASS - test_sem_thread2thread in 0.001 seconds
===================================================================
TESTSUITE semaphore succeeded
Running TESTSUITE semaphore_1cpu
===================================================================
START - test_sem_multiple_take_and_timeouts
SKIP - test_sem_multiple_take_and_timeouts in 0.002 seconds
===================================================================
START - test_sem_queue_mutual_exclusion

Environment (please complete the following information):

  • OS: Linux
  • Toolchain: Zephyr SDK
  • Version used: v3.6.0-1236-g8f4ac0d4ab45
@majunkier majunkier added bug The issue is a bug, or the PR is fixing a bug platform: Intel ADSP Intel Audio platforms labels Mar 20, 2024
@nashif nashif changed the title intel_adsp: tests/kernel/semaphore/semaphore/kernel.semaphore fails intel_adsp: tmany tests fail due to change to timer interrupts on secondary cores not being enabled Mar 20, 2024
@nashif
Member

nashif commented Mar 20, 2024

commit 315ee38b95a2356eb3cd266ce4bc789f48d7c59c
Author: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Date:   Tue Feb 27 14:13:10 2024 +0100

    ADSP: don't use timer interrupts on secondary cores

    When running SOF on Intel ADSP we choose to only serve the timer
    interrupt on the primary core.

    Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>

diff --git a/drivers/timer/intel_adsp_timer.c b/drivers/timer/intel_adsp_timer.c
index 330e5bbd425..d1f37b123c2 100644
--- a/drivers/timer/intel_adsp_timer.c
+++ b/drivers/timer/intel_adsp_timer.c
@@ -210,7 +210,6 @@ static void irq_init(void)

 void smp_timer_init(void)
 {
-       irq_init();
 }

 /* Runs on core 0 only */

This commit is causing all of those failures. @lyakh, can you please take a look?

@nashif
Member

nashif commented Mar 20, 2024

@kv2019i FYI

@marc-hb marc-hb changed the title intel_adsp: tmany tests fail due to change to timer interrupts on secondary cores not being enabled intel_adsp: many tests fail due to change to timer interrupts on secondary cores not being enabled Mar 20, 2024
@peter-mitsis
Collaborator

peter-mitsis commented Mar 20, 2024

For what it is worth, I have a theory that may explain the reported failure.

Background
The failing test (test_sem_queue_mutual_exclusion) is part of the semaphore_1cpu test suite. This test suite "disables" other CPUs. It does this by having those other CPUs busy wait with interrupts locked for a long period of time.

The Theory
I suspect that core 0, the only core to which timer interrupts are being delivered, is one of those CPUs that has interrupts locked for a long duration. If so, core 0 never processes the timer interrupt, and anything that sleeps or waits on a queue with a timeout will never time out.
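
The theory can be illustrated with a toy model. This is only a sketch and not Zephyr code: the `toy_` names, the two-core layout, and the rule that "a tick is counted if any core that receives the timer interrupt has interrupts unlocked" are all assumptions made for illustration. It shows why a timeout would never expire when the only core receiving the timer interrupt keeps its interrupts locked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_CPUS 2 /* assumption: one primary core and one secondary core */

/* Toy model only: each simulated core just records whether its IRQs are locked. */
struct toy_system {
	bool irq_locked[TOY_NUM_CPUS];
	bool timer_on_secondary; /* false models the "cpu0 only" scheme */
	uint32_t ticks;          /* advanced only when a timer interrupt is handled */
};

/* One hardware timer period: the tick is counted only if some core that
 * receives the timer interrupt currently has interrupts unlocked.
 */
void toy_timer_fire(struct toy_system *sys)
{
	assert(sys != NULL);

	size_t receivers = sys->timer_on_secondary ? TOY_NUM_CPUS : 1;

	for (size_t cpu = 0; cpu < receivers; cpu++) {
		if (!sys->irq_locked[cpu]) {
			sys->ticks++;
			return;
		}
	}
}

/* Returns true if a wait of timeout_ticks expires within max_periods
 * timer periods, false if it is still pending after that.
 */
bool toy_wait_times_out(struct toy_system *sys, uint32_t timeout_ticks,
			uint32_t max_periods)
{
	assert(sys != NULL);

	uint32_t deadline = sys->ticks + timeout_ticks;

	for (uint32_t period = 0; period < max_periods; period++) {
		toy_timer_fire(sys);
		if (sys->ticks >= deadline) {
			return true;
		}
	}
	return false;
}
```

In this model, with core 0 locked and `timer_on_secondary` set to false, `toy_wait_times_out()` returns false for any `max_periods`, which matches the observed hang. With `timer_on_secondary` set to true, the unlocked secondary core handles the tick and the wait expires.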

@andyross
Contributor

I was a little worried about this in the original PR. There are a lot of weird assumptions in some of the first era of tests that were ported to SMP, and broadcast timer interrupts were a requirement for a long time. (Even now it's a little ambiguous: riscv64, for example, will deliver the timer interrupt on the CPU where it happened to be scheduled, leaving any other scheduled interrupts still pending on other CPUs. That is fine too, but it is neither this "cpu0 only" scheme nor broadcast.)

FWIW: given the one-line nature of the patch, it seems like an easy fix would be to define a CONFIG_SYS_CLOCK_MP_BROADCAST (or whatever) kconfig that selects whether or not the timer is broadcast, gate that line with an #if, and set it for whatever test cases are failing. For extra credit, the platform layers could expose it as a "HAS_BROADCAST" flag, etc.
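
A minimal sketch of what that gating might look like, written as a standalone model rather than as a patch to drivers/timer/intel_adsp_timer.c. `CONFIG_SYS_CLOCK_MP_BROADCAST` is the placeholder name from the comment above and is not an existing Zephyr symbol. `current_cpu`, `timer_irq_enabled`, `secondary_core_boot()` and the body of `irq_init()` are hypothetical stand-ins so the snippet compiles on its own. Only the `smp_timer_init()` shape comes from the diff.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_CPUS 2 /* assumption for the model */

/* Hypothetical stand-ins for driver and kernel state. */
unsigned int current_cpu;
bool timer_irq_enabled[NUM_CPUS];

/* Stand-in for the driver's irq_init(): here it only records that the
 * timer interrupt was enabled on the calling core.
 */
void irq_init(void)
{
	assert(current_cpu < NUM_CPUS);
	timer_irq_enabled[current_cpu] = true;
}

/* Called on each secondary core. The call removed by commit 315ee38b95a2
 * is restored, but only when the (hypothetical) option is set, for
 * example by the test configurations that rely on broadcast ticks.
 */
void smp_timer_init(void)
{
#if defined(CONFIG_SYS_CLOCK_MP_BROADCAST)
	irq_init();
#endif
}

/* Hypothetical helper modelling a secondary core running its timer init. */
void secondary_core_boot(unsigned int cpu)
{
	current_cpu = cpu;
	smp_timer_init();
}
```

Building with `-DCONFIG_SYS_CLOCK_MP_BROADCAST=1` models a test configuration that opts in to timer interrupts on secondary cores. Without it, the behaviour matches the commit, which keeps the timer interrupt on the primary core only.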
