NAME¶
curpriority_cmp
,
maybe_resched
,
resetpriority
,
roundrobin
,
roundrobin_interval
,
sched_setup
,
schedclock
,
schedcpu
,
setrunnable
,
updatepri
—
perform round-robin scheduling of runnable
processes
SYNOPSIS¶
#include
<sys/param.h>
#include
<sys/proc.h>
int
curpriority_cmp
(
struct
proc *p);
void
maybe_resched
(
struct
thread *td);
void
propagate_priority
(
struct
proc *p);
void
resetpriority
(
struct
ksegrp *kg);
void
roundrobin
(
void
*arg);
int
roundrobin_interval
(
void);
void
sched_setup
(
void
*dummy);
void
schedclock
(
struct
thread *td);
void
schedcpu
(
void
*arg);
void
setrunnable
(
struct
thread *td);
void
updatepri
(
struct
thread *td);
DESCRIPTION¶
Each process has three different priorities stored in
struct proc:
p_usrpri,
p_nativepri, and
p_priority.
The
p_usrpri member is the user priority of the
process calculated from a process' estimated CPU time and nice level.
The
p_nativepri member is the saved priority
used by
propagate_priority
(). When a
process obtains a mutex, its priority is saved in
p_nativepri. While it holds the mutex, the
process's priority may be bumped by another process that blocks on the mutex.
When the process releases the mutex, then its priority is restored to the
priority saved in
p_nativepri.
The
p_priority member is the actual priority of
the process and is used to determine what
runqueue(9) it runs on, for example.
The
curpriority_cmp
() function compares the
cached priority of the currently running process with process
p. If the currently running process has a
higher priority, then it will return a value less than zero. If the current
process has a lower priority, then it will return a value greater than zero.
If the current process has the same priority as
p, then
curpriority_cmp
() will return zero. The
cached priority of the currently running process is updated when a process
resumes from
tsleep(9) or returns to userland in
userret
() and is stored in the private
variable
curpriority.
The
maybe_resched
() function compares the
priorities of the current thread and
td. If
td has a higher priority than the current
thread, then a context switch is needed, and
KEF_NEEDRESCHED
is set.
The
propagate_priority
() looks at the process
that owns the mutex
p is blocked on. That
process's priority is bumped to the priority of
p if needed. If the process is currently
running, then the function returns. If the process is on a
runqueue(9), then the process is moved to the
appropriate
runqueue(9) for its new priority. If
the process is blocked on a mutex, its position in the list of processes
blocked on the mutex in question is updated to reflect its new priority. Then,
the function repeats the procedure using the process that owns the mutex just
encountered. Note that a process's priorities are only bumped to the priority
of the original process
p, not to the
priority of the previously encountered process.
The
resetpriority
() function recomputes the
user priority of the ksegrp
kg (stored in
kg_user_pri) and calls
maybe_resched
() to force a reschedule of
each thread in the group if needed.
The
roundrobin
() function is used as a
timeout(9) function to force a reschedule every
sched_quantum ticks.
The
roundrobin_interval
() function simply
returns the number of clock ticks in between reschedules triggered by
roundrobin
(). Thus, all it does is return
the current value of
sched_quantum.
The
sched_setup
() function is a
SYSINIT(9) that is called to start the callout
driven scheduler functions. It just calls the
roundrobin
() and
schedcpu
() functions for the first time.
After the initial call, the two functions will propagate themselves by
registering their callout event again at the completion of the respective
function.
The
schedclock
() function is called by
statclock
() to adjust the priority of the
currently running thread's ksegrp. It updates the group's estimated CPU time
and then adjusts the priority via
resetpriority
().
The
schedcpu
() function updates all process
priorities. First, it updates statistics that track how long processes have
been in various process states. Secondly, it updates the estimated CPU time
for the current process such that about 90% of the CPU usage is forgotten in 5
* load average seconds. For example, if the load average is 2.00, then at
least 90% of the estimated CPU time for the process should be based on the
amount of CPU time the process has had in the last 10 seconds. It then
recomputes the priority of the process and moves it to the appropriate
runqueue(9) if necessary. Thirdly, it updates the
%CPU estimate used by utilities such as
ps(1) and
top(1) so that 95% of the CPU usage is forgotten
in 60 seconds. Once all process priorities have been updated,
schedcpu
() calls
vmmeter
() to update various other
statistics including the load average. Finally, it schedules itself to run
again in
hz clock ticks.
The
setrunnable
() function is used to change
a process's state to be runnable. The process is placed on a
runqueue(9) if needed, and the swapper process is
woken up and told to swap the process in if the process is swapped out. If the
process has been asleep for at least one run of
schedcpu
(), then
updatepri
() is used to adjust the priority
of the process.
The
updatepri
() function is used to adjust
the priority of a process that has been asleep. It retroactively decays the
estimated CPU time of the process for each
schedcpu
() event that the process was
asleep. Finally, it calls
resetpriority
()
to adjust the priority of the process.
SEE ALSO¶
mi_switch(9),
runqueue(9),
sleepqueue(9),
tsleep(9)
BUGS¶
The
curpriority variable really should be
per-CPU. In addition,
maybe_resched
()
should compare the priority of
chk with that
of each CPU, and then send an IPI to the processor with the lowest priority to
trigger a reschedule if needed.
Priority propagation is broken and is thus disabled by default. The
p_nativepri variable is only updated if a
process does not obtain a sleep mutex on the first try. Also, if a process
obtains more than one sleep mutex in this manner, and had its priority bumped
in between, then
p_nativepri will be
clobbered.