NAME
sdiag - Scheduling diagnostic tool for SLURM
SYNOPSIS
sdiag [OPTIONS...]
DESCRIPTION
sdiag shows information about slurmctld execution: threads, agents, jobs, and
scheduling algorithms. The goal is to obtain data on slurmctld behaviour that
helps in adjusting configuration parameters or queue policies. Its main
purpose is to show how SLURM behaves on systems with high throughput.
It has two execution modes. The default mode, --all, shows the counters and
statistics explained below. The other mode, --reset, resets those values.
By default, values are also reset at midnight UTC.
The first block of information is related to global slurmctld execution:
- Server thread count
- The number of currently active slurmctld threads. A high number means a
high load from processing events such as job submissions, job dispatching,
job completions, etc. If this is often close to MAX_SERVER_THREADS, it could
point to a potential bottleneck.
- Agent queue size
- SLURM is designed with scalability in mind, and sending messages to
thousands of nodes is not a trivial task. The agent mechanism helps to
control communication between the slurm daemons and the controller on a
best-effort basis. If this value is close to MAX_AGENT_CNT, there could be
some delays affecting job management.
- Jobs submitted
- Number of jobs submitted since last reset.
- Jobs started
- Number of jobs started since last reset. This includes backfilled jobs.
- Jobs completed
- Number of jobs completed since last reset.
- Jobs canceled
- Number of jobs canceled since last reset.
- Jobs failed
- Number of jobs failed since last reset.
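As a rough illustration of how these counters might be consumed by a monitoring script, the sketch below parses counter lines into a dictionary and checks whether a value is close to a limit. The "Name: value" layout and all numbers in SAMPLE_OUTPUT are assumptions made for the example, not captured sdiag output, and the limit of 100 is a hypothetical stand-in, since the value of MAX_SERVER_THREADS is not stated here.

```python
import re

# Illustrative text only: the "Name: value" layout is an assumption about how
# the counters are printed, and every number below is made up.
SAMPLE_OUTPUT = """\
Server thread count: 3
Agent queue size:    0
Jobs submitted: 120
Jobs started:   95
Jobs completed: 90
Jobs canceled:  4
Jobs failed:    1
"""


def parse_counters(text):
    """Return {counter name: integer value} for every 'Name: <int>' line."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([^:]+?):\s*(-?\d+)\s*$", line)
        if match:
            counters[match.group(1).strip()] = int(match.group(2))
    return counters


def near_limit(value, limit, fraction=0.9):
    """True when value has reached at least `fraction` of `limit`."""
    return value >= fraction * limit


counters = parse_counters(SAMPLE_OUTPUT)
# Hypothetical limit of 100, used only to demonstrate the check.
print(near_limit(counters["Server thread count"], 100))  # prints False
```

Lines whose value is not a plain integer (for example a timestamp) are skipped by the pattern, so the same helper can be pointed at a larger block of text.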
The second block of information is related to the main scheduling algorithm,
which is based on job priorities. A scheduling cycle acquires the
job_write_lock lock, then tries to get resources for pending jobs, starting
with the highest-priority job and proceeding in descending order. Once a job
cannot get resources, the loop keeps going, but only for jobs requesting other
partitions. Jobs with dependencies or affected by account limits are not
processed.
- Last cycle
- Time in microseconds for last scheduling cycle.
- Max cycle
- Time in microseconds for the maximum scheduling cycle since last reset.
- Total cycles
- Number of scheduling cycles since last reset. Scheduling is done
periodically and when a job is submitted or a job is completed.
- Mean cycle
- Mean scheduling cycle time since last reset.
- Mean depth cycle
- Mean cycle depth, where depth is the number of jobs processed in a
scheduling cycle.
- Cycles per minute
- Number of scheduling cycles executed per minute.
- Last queue length
- Length of the pending job queue.
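The main scheduling cycle described above can be sketched as a priority-ordered loop. This is a deliberately simplified model, not slurmctld's implementation: the Job fields, the per-partition idle node counts, and the rule for what counts toward depth are all assumptions made for illustration, and locking is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int
    partition: str
    nodes: int
    runnable: bool = True  # False models a dependency or an account limit


def schedule_cycle(pending, free_nodes):
    """Run one simplified cycle.

    free_nodes maps partition name -> idle node count (hypothetical model).
    Returns (names of started jobs, depth), where depth is the number of
    jobs the loop actually tried to place.
    """
    free = dict(free_nodes)
    blocked = set()  # partitions where a higher-priority job is waiting
    started = []
    depth = 0
    for job in sorted(pending, key=lambda j: j.priority, reverse=True):
        if not job.runnable:
            continue  # dependencies or limits: not processed
        if job.partition in blocked:
            continue  # a higher-priority job already failed in this partition
        depth += 1
        if job.nodes <= free.get(job.partition, 0):
            free[job.partition] -= job.nodes
            started.append(job.name)
        else:
            blocked.add(job.partition)
    return started, depth


pending = [
    Job("a", 100, "batch", 4),
    Job("b", 90, "batch", 8),
    Job("c", 80, "batch", 1),
    Job("d", 70, "debug", 2),
    Job("e", 60, "debug", 1, runnable=False),
]
print(schedule_cycle(pending, {"batch": 6, "debug": 2}))  # prints (['a', 'd'], 3)
```

Job "c" would fit in the two remaining batch nodes, but it is skipped because "b" has already failed to get resources in the same partition. Starting jobs like "c" without delaying "b" is the backfilling algorithm's task.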
The third block of information is related to the backfilling scheduling
algorithm. A backfilling scheduling cycle acquires locks for the job, node and
partition objects, then tries to get resources for pending jobs. Jobs are
processed in priority order. If a job cannot get resources, the algorithm
calculates when it could get them, obtaining a future start time for the job.
The next job is then processed, and the algorithm tries to get resources for
it without affecting the previous ones; again, it calculates a future start
time if resources are not currently available. The backfilling algorithm takes
more time for each new job it processes, since higher-priority jobs must not
be affected. The algorithm itself takes measures to avoid a long execution
cycle and to avoid holding all the locks for too long.
- Total backfilled jobs (since last slurm start)
- Number of jobs started thanks to backfilling since last slurm start.
- Total backfilled jobs (since last stats cycle start)
- Number of jobs started thanks to backfilling since the last time stats were
reset. By default these values are reset at midnight UTC.
- Total cycles
- Number of backfilling scheduling cycles since last reset.
- Last cycle when
- Time when the last execution cycle happened, in the format "weekday Month
MonthDay hour:minute.seconds year".
- Last cycle
- Time in microseconds of the last backfilling cycle. It counts only
execution time, excluding the sleep time inside a scheduling cycle that takes
too long. Note that locks are released during the sleep time so that other
work can proceed.
- Max cycle
- Time in microseconds of the longest backfilling cycle since last reset. It
counts only execution time, excluding the sleep time inside a scheduling
cycle that takes too long. Note that locks are released during the sleep time
so that other work can proceed.
- Mean cycle
- Mean time of backfilling scheduling cycles, in microseconds, since last
reset.
- Last depth cycle
- Number of jobs processed during the last backfilling scheduling cycle. It
counts every job, even one that has no chance to run because of
dependencies or limits.
- Last depth cycle (try sched)
- Number of jobs processed during the last backfilling scheduling cycle. It
counts only jobs that have a chance to run and are waiting for available
resources. These are the jobs that make the backfilling algorithm heavier.
- Depth Mean
- Mean number of jobs processed during backfilling scheduling cycles since
last reset.
- Depth Mean (try sched)
- Mean number of jobs processed during backfilling scheduling cycles since
last reset. It counts only jobs that have a chance to run and are waiting for
available resources. These are the jobs that make the backfilling algorithm
heavier.
- Last queue length
- Number of jobs pending to be processed by the backfilling algorithm. A job
appears once for each partition it requested.
- Queue length Mean
- Mean number of jobs pending to be processed by the backfilling algorithm.
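The idea behind backfilling, as described at the start of this block, can be illustrated with a small planner: each job, taken in priority order, gets the earliest start time that does not disturb the start times already planned for higher-priority jobs. The sketch assumes a single pool of identical nodes and a known run time per job, and it ignores partitions, locks and sleep time. It is a simplified model for illustration, not the algorithm slurmctld runs.

```python
def usage_at(reservations, t):
    """Nodes in use at time t, given (start, end, nodes) reservations."""
    return sum(n for s, e, n in reservations if s <= t < e)


def fits(reservations, total_nodes, start, runtime, nodes):
    """True if `nodes` are free during the whole window [start, start + runtime)."""
    end = start + runtime
    # Usage only rises when a reservation starts, so checking the window
    # start plus every reservation start inside the window is sufficient.
    points = [start] + [s for s, _, _ in reservations if start < s < end]
    return all(usage_at(reservations, p) + nodes <= total_nodes for p in points)


def backfill_plan(jobs, total_nodes):
    """Plan start times for jobs given as (name, nodes, runtime) tuples.

    `jobs` must already be sorted from highest to lowest priority.
    Returns {job name: planned start time}; time 0 means "start now".
    """
    reservations = []
    plan = {}
    for name, nodes, runtime in jobs:
        # Candidate start times: now, or whenever an earlier reservation ends.
        candidates = sorted({0} | {e for _, e, _ in reservations})
        for t in candidates:
            if fits(reservations, total_nodes, t, runtime, nodes):
                reservations.append((t, t + runtime, nodes))
                plan[name] = t
                break
    return plan


jobs = [("A", 8, 10), ("B", 10, 5), ("C", 2, 20), ("D", 2, 10)]
print(backfill_plan(jobs, total_nodes=10))
# prints {'A': 0, 'B': 10, 'C': 15, 'D': 0}
```

Job D is backfilled: it starts immediately on the two idle nodes because it ends before B's planned start. Job C needs the same two nodes but would still be running when B is due to start, so it is planned for later. Each new job is checked against every reservation made so far, which mirrors why the cost per job grows as the cycle proceeds.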
OPTIONS
- -a, --all
- Get and report information. This is the default mode of operation.
- -h, --help
- Print description of options and exit.
- -r, --reset
- Reset counters. Only user SlurmUser or root can use this option.
- --usage
- Print list of options and exit.
- -V, --version
- Print current version number and exit.
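A minimal sketch of calling sdiag from a script, using only the --all and --reset options listed above. The helper names sdiag_command and run_sdiag are hypothetical, and the sketch assumes that sdiag is on the PATH and that the caller has the privileges that --reset requires.

```python
import shutil
import subprocess


def sdiag_command(reset=False):
    """Build the argument list for one sdiag invocation."""
    return ["sdiag", "--reset" if reset else "--all"]


def run_sdiag(reset=False):
    """Run sdiag and return its standard output, or None if it is not installed."""
    if shutil.which("sdiag") is None:
        return None
    result = subprocess.run(
        sdiag_command(reset), capture_output=True, text=True, check=True
    )
    return result.stdout


print(sdiag_command())            # prints ['sdiag', '--all']
print(sdiag_command(reset=True))  # prints ['sdiag', '--reset']
```

With check=True, a non-zero exit status raises subprocess.CalledProcessError, so a failed reset attempt is not silently ignored.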
COPYING
Copyright (C) 2010-2011 Barcelona Supercomputing Center.
Copyright (C) 2010-2013 SchedMD LLC.
SLURM is free software; you can redistribute it and/or modify it under the terms
of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
SLURM is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
A PARTICULAR PURPOSE. See the GNU General Public License for more details.
SEE ALSO
sinfo(1),
squeue(1),
scontrol(1),
slurm.conf(5)