Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kernel: sched: Fix meta-IRQ preemption tracking for the idle thread #84436

Merged
merged 1 commit into from
Jan 27, 2025

Conversation

kietavainen
Copy link
Collaborator

When the PM subsystem is enabled, the idle thread locks the scheduler for the duration the system is suspended. If a meta-IRQ preempts the idle thread in this state, the idle thread is tracked in metairq_preempted. However, when returning from the preemption, the idle thread is not removed from metairq_preempted, unlike all the other threads. As a result, the scheduler keeps running the idle thread even if there are higher priority threads ready to run.

This change treats the idle thread the same way as all other threads when returning from a meta-IRQ preemption.

Fixes #64705

When the PM subsystem is enabled, the idle thread locks the scheduler for
the duration the system is suspended. If a meta-IRQ preempts the idle
thread in this state, the idle thread is tracked in `metairq_preempted`.
However, when returning from the preemption, the idle thread is not removed
from `metairq_preempted`, unlike all the other threads. As a result, the
scheduler keeps running the idle thread even if there are higher priority
threads ready to run.

This change treats the idle thread the same way as all other threads when
returning from a meta-IRQ preemption.

Fixes zephyrproject-rtos#64705

Signed-off-by: Kalle Kietäväinen <[email protected]>
Copy link
Contributor

@andyross andyross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, though there has historically been an edge case here. Zephyr used to be able to be configured with only cooperative priorities, meaning that the idle thread would be cooperative, which obviously doesn't work (you couldn't preempt it with actual work!). So there was special handling for that. IIRC that all got fixed long ago, so this check is likely just vestigial. Nonetheless we should keep this in mind if something starts breaking.

That said, isn't the root cause here the fact that the PM layer is using a sched_lock and not a spinlock to prevent preemption on the suspend path? That seems wrong, even if you don't wake up a thread you could have device and subsystem and even app ISRs (c.f. k_timer) stepping in to make a mess of things in a context where the system isn't going to stay awake like they expect.

Can you point to the lock you're seeing? We should probably fix that.

@kietavainen
Copy link
Collaborator Author

That said, isn't the root cause here the fact that the PM layer is using a sched_lock and not a spinlock to prevent preemption on the suspend path? That seems wrong, even if you don't wake up a thread you could have device and subsystem and even app ISRs (c.f. k_timer) stepping in to make a mess of things in a context where the system isn't going to stay awake like they expect.

Can you point to the lock you're seeing? We should probably fix that.

The scheduler is locked in the PM in pm_system_suspend():

zephyr/subsys/pm/pm.c

Lines 218 to 229 in 441112f

k_sched_lock();
pm_stats_start();
/* Enter power state */
pm_state_notify(true);
atomic_set_bit(z_post_ops_required, id);
pm_state_set(z_cpus_pm_state[id].state, z_cpus_pm_state[id].substate_id);
pm_stats_stop();
/* Wake up sequence starts here */
pm_stats_update(z_cpus_pm_state[id].state);
pm_system_resume();
k_sched_unlock();

@kartben kartben merged commit a2eb78c into zephyrproject-rtos:main Jan 27, 2025
34 checks passed
@kietavainen kietavainen deleted the fix-meta-irq-pm branch January 27, 2025 12:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Kernel size: XS A PR changing only a single line of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PM subsystem bug when using Meta IRQ threads
6 participants