SQUEUE(1)

Slurm components

NAME

squeue - view information about jobs located in the SLURM scheduling queue.

SYNOPSIS

squeue [OPTIONS...]

DESCRIPTION

squeue is used to view job and job step information for jobs managed by SLURM.

OPTIONS

-A <account_list>, --account=<account_list>: Specify the accounts of the jobs to view. Accepts a comma separated list of account names. This has no effect when listing job steps.

-a, --all: Display information about jobs and job steps in all partitions. This causes information to be displayed about partitions that are configured as hidden and partitions that are unavailable to the user's group.

-r, --array: Display one job array element per line. Without this option, the display will be optimized for use with job arrays (pending job array elements will be combined on one line of output with the array index values printed using a regular expression).

-h, --noheader: Do not print a header on the output.

--help: Print a help message describing all squeue options.

--hide: Do not display information about jobs and job steps in all partitions. Information about partitions that are configured as hidden or are not available to the user's group will not be displayed. This is the default behavior.

-i <seconds>, --iterate=<seconds>: Repeatedly gather and report the requested information at the interval specified (in seconds). By default, prints a time stamp with the header.

-j <job_id_list>, --jobs=<job_id_list>: Requests a comma separated list of job IDs to display. Defaults to all jobs. The --jobs=<job_id_list> option may be used in conjunction with the --steps option to print step information about specific jobs. Note: If a list of job IDs is provided, the jobs are displayed even if they are on hidden partitions. Since this option's argument is optional, for proper parsing the single letter option must be followed immediately with the value and not include a space between them. For example "-j1008" and not "-j 1008". The job ID format is "job_id[_array_id]". Performance of the command can be measurably improved for systems with large numbers of jobs when a single job ID is specified.
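Because the value has to be attached directly to the single letter option, a script that assembles an squeue command line needs to join the two itself. The following is a minimal Python sketch of that rule; the helper name jobs_argv is hypothetical and not part of SLURM.

```python
from typing import Iterable, List, Union


def jobs_argv(job_ids: Iterable[Union[int, str]]) -> List[str]:
    """Build an argument vector for squeue limited to the given job IDs.

    Hypothetical helper: the comma separated list is attached directly
    to "-j" with no space, as required for the single letter option.
    """
    ids = [str(job_id) for job_id in job_ids]
    if not ids:
        # With no job ID list, squeue defaults to all jobs.
        return ["squeue"]
    return ["squeue", "-j" + ",".join(ids)]


# IDs may be plain job IDs or "job_id_array_id" strings.
print(jobs_argv([12345, "12346_3"]))
```

The resulting list can be passed to a process launcher as is; no shell quoting is involved because each argument is a separate list element.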

-l, --long: Report more of the available information for the selected jobs or job steps, subject to any constraints specified.

-L <license_list>, --licenses=<license_list>: Request jobs requesting or using one or more of the named licenses. The license list consists of a comma separated list of license names.

-M <string>, --clusters=<string>: Clusters to issue commands to. Multiple cluster names may be comma separated. A value of 'all' will query all clusters.

-n <name_list>, --name=<name_list>: Request jobs or job steps having one of the specified names. The list consists of a comma separated list of job names.

-o <output_format>, --format=<output_format>: Specify the information to be displayed, its size and position (right or left justified). The default formats with various options are

default: "%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R"

-l, --long: "%.18i %.9P %.8j %.8u %.8T %.10M %.9l %.6D %R"

-s, --steps: "%.15i %.8j %.9P %.8u %.9M %N"

The format of each field is "%[.][size]type".

size: is the minimum field size. If no size is specified, whatever is needed to print the information will be used.

.: indicates the output should be right justified and size must be specified. By default, output is left justified.
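The "%[.][size]type" syntax can be illustrated with a short Python sketch that splits a format string into its fields and pads a value accordingly. This is not squeue's own parser; the names Field, parse_format and render are hypothetical, and the sketch only pads to the minimum size without modelling any other behavior squeue may have.

```python
import re
from typing import List, NamedTuple, Optional


class Field(NamedTuple):
    right_justified: bool   # True when "." was given
    size: Optional[int]     # minimum field size, None if omitted
    type: str               # type specification, e.g. "i" or "all"


# "%", optional ".", optional decimal size, then "all" or a single letter.
_FIELD_RE = re.compile(r"%(\.?)(\d*)(all|[A-Za-z])")


def parse_format(fmt: str) -> List[Field]:
    """Split a format string into its field specifications."""
    fields = []
    for dot, size, typ in _FIELD_RE.findall(fmt):
        fields.append(Field(dot == ".", int(size) if size else None, typ))
    return fields


def render(field: Field, value: str) -> str:
    """Pad value to the minimum size; left justified unless "." was given."""
    if field.size is None:
        return value
    if field.right_justified:
        return value.rjust(field.size)
    return value.ljust(field.size)


# The default job format listed above.
for spec in parse_format("%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R"):
    print(spec)
```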

Note that many of these type specifications are valid only for jobs while others are valid only for job steps. Valid type specifications include:

%all: Print all fields available for this data type with a vertical bar separating each field.

%a: Account associated with the job. (Valid for jobs only)

%A: Number of tasks created by a job step. This reports the value of the srun --ntasks option. (Valid for job steps only)

%A: Job id. This will have a unique value for each element of job arrays. (Valid for jobs only)

%b: Generic resources (gres) required by the job or step. (Valid for jobs and job steps)

%B: Executing (batch) host. For an allocated session, this is the host on which the session is executing (i.e. the node from which the srun or the salloc command was executed). For a batch job, this is the node executing the batch script. In the case of a typical Linux cluster, this would be the compute node zero of the allocation. In the case of a BlueGene or a Cray system, this would be the front-end host whose slurmd daemon executes the job script.

%c: Minimum number of CPUs (processors) per node requested by the job. This reports the value of the srun --mincpus option with a default value of zero. (Valid for jobs only)

%C: Number of CPUs (processors) requested by the job or allocated to it if already running. As a job is completing this number will reflect the current number of CPUs allocated. (Valid for jobs only)

%d: Minimum size of temporary disk space (in MB) requested by the job. (Valid for jobs only)

%D: Number of nodes allocated to the job or the minimum number of nodes required by a pending job. The actual number of nodes allocated to a pending job may exceed this number if the job specified a node range count (e.g. minimum and maximum node counts) or the job specifies a processor count instead of a node count and the cluster contains nodes with varying processor counts. As a job is completing this number will reflect the current number of nodes allocated. (Valid for jobs only)

%e: Time at which the job ended or is expected to end (based upon its time limit). (Valid for jobs only)

%E: Job dependency. This job will not begin execution until the dependent job completes. A value of zero implies this job has no dependencies. (Valid for jobs only)

%f: Features required by the job. (Valid for jobs only)

%F: Job array's job ID. This is the base job ID. For non-array jobs, this is the job ID. (Valid for jobs only)

%g: Group name of the job. (Valid for jobs only)

%G: Group ID of the job. (Valid for jobs only)

%h: Can the resources allocated to the job be shared with other jobs. The resources to be shared can be nodes, sockets, cores, or hyperthreads depending upon configuration. The value will be "yes" if the job was submitted with the shared option or the partition is configured with Shared=Force. (Valid for jobs only)

%H: Number of sockets per node requested by the job. This reports the value of the srun --sockets-per-node option. When --sockets-per-node has not been set, "*" is displayed. (Valid for jobs only)

%i: Job or job step id. In the case of job arrays, the job ID format will be of the form "<base_job_id>_<index>". (Valid for jobs and job steps)

%I: Number of cores per socket requested by the job. This reports the value of the srun --cores-per-socket option. When --cores-per-socket has not been set, "*" is displayed. (Valid for jobs only)

%j: Job or job step name. (Valid for jobs and job steps)

%J: Number of threads per core requested by the job. This reports the value of the srun --threads-per-core option. When --threads-per-core has not been set, "*" is displayed. (Valid for jobs only)

%k: Comment associated with the job. (Valid for jobs only)

%K: Job array index. (Valid for jobs only)

%l: Time limit of the job or job step in days-hours:minutes:seconds. The value may be "NOT_SET" if not yet established or "UNLIMITED" for no limit. (Valid for jobs and job steps)

%L: Time left for the job to execute in days-hours:minutes:seconds. This value is calculated by subtracting the job's time used from its time limit. The value may be "NOT_SET" if not yet established or "UNLIMITED" for no limit. (Valid for jobs only)

%m: Minimum size of memory (in MB) requested by the job. (Valid for jobs only)

%M: Time used by the job or job step in days-hours:minutes:seconds. The days and hours are printed only as needed. For job steps this field shows the elapsed time since execution began and thus will be inaccurate for job steps which have been suspended. Clock skew between nodes in the cluster will cause the time to be inaccurate. If the time is obviously wrong (e.g. negative), it displays as "INVALID". (Valid for jobs and job steps)

%n: List of node names (or base partitions on BlueGene systems) explicitly requested by the job. (Valid for jobs only)

%N: List of nodes allocated to the job or job step. In the case of a COMPLETING job, the list of nodes will comprise only those nodes that have not yet been returned to service. (Valid for jobs and job steps)

%o: The command to be executed.

%O: Are contiguous nodes requested by the job. (Valid for jobs only)

%p: Priority of the job (converted to a floating point number between 0.0 and 1.0). Also see %Q. (Valid for jobs only)

%P: Partition of the job or job step. (Valid for jobs and job steps)

%q: Quality of service associated with the job. (Valid for jobs only)

%Q: Priority of the job (generally a very large unsigned integer). Also see %p. (Valid for jobs only)

%r: The reason a job is in its current state. See the JOB REASON CODES section below for more information. (Valid for jobs only)

%R: For pending jobs: the reason a job is waiting for execution is printed within parentheses. For terminated jobs with failure: an explanation as to why the job failed is printed within parentheses. For all other job states: the list of allocated nodes. See the JOB REASON CODES section below for more information. (Valid for jobs only)

%s: Node selection plugin specific data for a job. Possible data includes: Geometry requirement of resource allocation (X,Y,Z dimensions), Connection type (TORUS, MESH, or NAV == torus else mesh), Permit rotation of geometry (yes or no), Node use (VIRTUAL or COPROCESSOR), etc. (Valid for jobs only)

%S: Actual or expected start time of the job or job step. (Valid for jobs and job steps)

%t: Job state, compact form: PD (pending), R (running), CA (cancelled), CF (configuring), CG (completing), CD (completed), F (failed), TO (timeout), NF (node failure) and SE (special exit state). See the JOB STATE CODES section below for more information. (Valid for jobs only)

%T: Job state, extended form: PENDING, RUNNING, SUSPENDED, CANCELLED, COMPLETING, COMPLETED, CONFIGURING, FAILED, TIMEOUT, PREEMPTED, NODE_FAIL and SPECIAL_EXIT. See the JOB STATE CODES section below for more information. (Valid for jobs only)

%u: User name for a job or job step. (Valid for jobs and job steps)

%U: User ID for a job or job step. (Valid for jobs and job steps)

%v: Reservation for the job. (Valid for jobs only)

%V: The job's submission time.

%w: Workload Characterization Key (wckey). (Valid for jobs only)

%W: Licenses reserved for the job. (Valid for jobs only)

%x: List of node names explicitly excluded by the job. (Valid for jobs only)

%X: Count of cores reserved on each node for system use (core specialization). (Valid for jobs only)

%y: Nice value (adjustment to a job's scheduling priority). (Valid for jobs only)

%z: Number of requested sockets, cores, and threads (S:C:T) per node for the job. When (S:C:T) has not been set, "*" is displayed. (Valid for jobs only)

%Z: The job's working directory.
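The days-hours:minutes:seconds notation used by %l, %L and %M can be converted to a number of seconds for further processing. The Python sketch below assumes that a value with two components is minutes:seconds (days and hours being printed only as needed) and treats the special values "NOT_SET", "UNLIMITED" and "INVALID" as having no numeric equivalent; the function name time_to_seconds is hypothetical.

```python
from typing import Optional

# Special values that carry no numeric duration.
_SPECIAL = {"NOT_SET", "UNLIMITED", "INVALID"}


def time_to_seconds(text: str) -> Optional[int]:
    """Convert "[days-][hours:]minutes:seconds" to seconds, or None."""
    text = text.strip()
    if text in _SPECIAL:
        return None
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    if len(parts) > 3:
        raise ValueError("too many ':' separated components")
    # Left-pad with zeros so parts is always [hours, minutes, seconds].
    parts = [0] * (3 - len(parts)) + parts
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


for sample in ("0:23", "1:43:21", "1-02:03:04", "UNLIMITED"):
    print(sample, time_to_seconds(sample))
```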

-p <part_list>, --partition=<part_list>: Specify the partitions of the jobs or steps to view. Accepts a comma separated list of partition names.

-q <qos_list>, --qos=<qos_list>: Specify the qos(s) of the jobs or steps to view. Accepts a comma separated list of qos's.

-R <reservation_name>, --reservation=<reservation_name>: Specify the reservation of the jobs to view.

-s, --steps: Specify the job steps to view. This flag indicates that a comma separated list of job steps to view follows without an equal sign (see examples). The job step format is "job_id[_array_id].step_id". Defaults to all job steps. Since this option's argument is optional, for proper parsing the single letter option must be followed immediately with the value and not include a space between them. For example "-s1008.0" and not "-s 1008.0".
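The "job_id[_array_id].step_id" format, and the "job_id[_array_id]" format accepted by --jobs, can be taken apart with a small regular expression. The Python sketch below assumes purely numeric components; StepId and parse_id are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple, Optional


class StepId(NamedTuple):
    job_id: int
    array_id: Optional[int]   # None when no "_array_id" part is present
    step_id: Optional[int]    # None when no ".step_id" part is present


# job_id, optional "_array_id", optional ".step_id" (all numeric here).
_ID_RE = re.compile(r"^(\d+)(?:_(\d+))?(?:\.(\d+))?$")


def parse_id(text: str) -> StepId:
    """Parse a job ID or job step ID string into its components."""
    match = _ID_RE.match(text.strip())
    if match is None:
        raise ValueError("not a job or job step ID: %r" % text)
    job, array, step = match.groups()
    return StepId(
        int(job),
        int(array) if array is not None else None,
        int(step) if step is not None else None,
    )


print(parse_id("1008.0"))
print(parse_id("65552_3.1"))
```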

-S <sort_list>, --sort=<sort_list>: Specification of the order in which records should be reported. This uses the same field specification as the <output_format>. Multiple sorts may be performed by listing multiple sort fields separated by commas. The field specifications may be preceded by "+" or "-" for ascending (default) and descending order respectively. For example, a sort value of "P,U" will sort the records by partition name then by user id. The default value of sort for jobs is "P,t,-p" (increasing partition name then within a given partition by increasing job state and then decreasing priority). The default value of sort for job steps is "P,i" (increasing partition name then within a given partition by increasing step id).
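The structure of a sort list (comma separated fields, each optionally preceded by "+" or "-") can be sketched in Python as follows. This illustrates the specification only and is not how squeue itself orders records: values are compared as ordinary Python values, and parse_sort and sort_records are hypothetical names.

```python
from typing import Dict, List, Tuple


def parse_sort(spec: str) -> List[Tuple[str, bool]]:
    """Turn "P,t,-p" into [(field, descending), ...]."""
    keys = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        descending = item.startswith("-")
        keys.append((item.lstrip("+-"), descending))
    return keys


def sort_records(records: List[Dict], spec: str) -> List[Dict]:
    """Sort records by several keys, the first key being most significant."""
    ordered = list(records)
    # list.sort is stable, so sorting from the least significant key to
    # the most significant one yields a correct multi-key ordering.
    for field, descending in reversed(parse_sort(spec)):
        ordered.sort(key=lambda rec: rec[field], reverse=descending)
    return ordered


jobs = [
    {"P": "debug", "p": 1},
    {"P": "batch", "p": 2},
    {"P": "debug", "p": 5},
]
# Increasing partition name, then decreasing priority within a partition.
print(sort_records(jobs, "P,-p"))
```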

--start: Report the expected start time of pending jobs in order of increasing start time. This is equivalent to the following options: --format="%.7i %.9P %.8j %.8u %.2t %.19S %.6D %R", --sort=S and --states=PENDING. Any of these options may be explicitly changed as desired by combining the --start option with other option values (e.g. to use a different output format). The expected start time of pending jobs is only available if SLURM is configured to use the backfill scheduling plugin.

-t <state_list>, --states=<state_list>: Specify the states of jobs to view. Accepts a comma separated list of state names or "all". If "all" is specified then jobs of all states will be reported. If no state is specified then pending, running, and completing jobs are reported. Valid states (in both extended and compact form) include: PENDING (PD), RUNNING (R), SUSPENDED (S), COMPLETING (CG), COMPLETED (CD), CONFIGURING (CF), CANCELLED (CA), FAILED (F), TIMEOUT (TO), PREEMPTED (PR), BOOT_FAIL (BF), NODE_FAIL (NF) and SPECIAL_EXIT (SE). Note the <state_list> supplied is case insensitive ("pd" and "PD" are equivalent). See the JOB STATE CODES section below for more information.
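The rules for a state list (compact or extended names, case insensitive, "all", and the default when nothing is given) can be summarized in a short Python sketch. The table is copied from the list of valid states for this option; normalize_states is a hypothetical helper, not a SLURM interface.

```python
from typing import List

# Compact form -> extended form, as listed for --states.
STATE_CODES = {
    "PD": "PENDING", "R": "RUNNING", "S": "SUSPENDED",
    "CG": "COMPLETING", "CD": "COMPLETED", "CF": "CONFIGURING",
    "CA": "CANCELLED", "F": "FAILED", "TO": "TIMEOUT",
    "PR": "PREEMPTED", "BF": "BOOT_FAIL", "NF": "NODE_FAIL",
    "SE": "SPECIAL_EXIT",
}

# Reported when no state is specified.
DEFAULT_STATES = ["PENDING", "RUNNING", "COMPLETING"]


def normalize_states(state_list: str) -> List[str]:
    """Expand a comma separated state list to extended state names."""
    if not state_list.strip():
        return list(DEFAULT_STATES)
    extended = set(STATE_CODES.values())
    result: List[str] = []
    for item in state_list.split(","):
        key = item.strip().upper()       # the list is case insensitive
        if key == "ALL":
            return sorted(extended)
        key = STATE_CODES.get(key, key)  # accept compact or extended form
        if key not in extended:
            raise ValueError("unknown job state: %r" % item)
        if key not in result:
            result.append(key)
    return result


print(normalize_states("pd,R,completing"))
```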

-u <user_list>, --user=<user_list>: Request jobs or job steps from a comma separated list of users. The list can consist of user names or user id numbers. Performance of the command can be measurably improved for systems with large numbers of jobs when a single user is specified.

--usage: Print a brief help message listing the squeue options.

-v, --verbose: Report details of squeue's actions.

-V, --version: Print version information and exit.

-w <hostlist>, --nodelist=<hostlist>: Report only on jobs allocated to the specified node or list of nodes. This may either be the NodeName or NodeHostname as defined in slurm.conf(5) in the event that they differ. A node_name of localhost is mapped to the current host name.

JOB REASON CODES

These codes identify the reason that a job is waiting for execution. A job may be waiting for more than one reason, in which case only one of those reasons is displayed.

AssociationJobLimit: The job's association has reached its maximum job count.

AssociationResourceLimit: The job's association has reached some resource limit.

AssociationTimeLimit: The job's association has reached its time limit.

BadConstraints: The job's constraints can not be satisfied.

BeginTime: The job's earliest start time has not yet been reached.

BlockFreeAction: An IBM BlueGene block is being freed and can not allow more jobs to start.

BlockMaxError: An IBM BlueGene block has too many cnodes in error state to allow more jobs to start.

Cleaning: The job is being requeued and still cleaning up from its previous execution.

Dependency: This job is waiting for a dependent job to complete.

FrontEndDown: No front end node is available to execute this job.

InactiveLimit: The job reached the system InactiveLimit.

InvalidAccount: The job's account is invalid.

InvalidQOS: The job's QOS is invalid.

JobHeldAdmin: The job is held by a system administrator.

JobHeldUser: The job is held by the user.

JobLaunchFailure: The job could not be launched. This may be due to a file system problem, invalid program name, etc.

Licenses: The job is waiting for a license.

NodeDown: A node required by the job is down.

NonZeroExitCode: The job terminated with a non-zero exit code.

PartitionDown: The partition required by this job is in a DOWN state.

PartitionInactive: The partition required by this job is in an Inactive state and not able to start jobs.

PartitionNodeLimit: The number of nodes required by this job is outside of its partition's current limits. Can also indicate that required nodes are DOWN or DRAINED.

PartitionTimeLimit: The job's time limit exceeds its partition's current time limit.

Priority: One or more higher priority jobs exist for this partition or advanced reservation.

Prolog: The job's PrologSlurmctld program is still running.

QOSJobLimit: The job's QOS has reached its maximum job count.

QOSResourceLimit: The job's QOS has reached some resource limit.

QOSTimeLimit: The job's QOS has reached its time limit.

ReqNodeNotAvail: Some node specifically required by the job is not currently available.

Reservation: The job is waiting for its advanced reservation to become available.

Resources: The job is waiting for resources to become available.

SystemFailure: Failure of the SLURM system, a file system, the network, etc.

TimeLimit: The job exhausted its time limit.

QOSUsageThreshold: Required QOS threshold has been breached.

WaitingForScheduling: No reason has been set for this job yet. Waiting for the scheduler to determine the appropriate reason.
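In the %R field a reason code is printed within parentheses, while other job states show the list of allocated nodes instead. A script reading that column can tell the two apart as in the Python sketch below; parse_nodelist_reason is a hypothetical name and the sketch assumes the value has already been isolated from the rest of the line.

```python
from typing import Tuple


def parse_nodelist_reason(value: str) -> Tuple[str, str]:
    """Classify a %R value as a parenthesized reason or a node list."""
    value = value.strip()
    if value.startswith("(") and value.endswith(")"):
        # Reason code, e.g. "(Resources)" -> "Resources".
        return ("reason", value[1:-1])
    # Anything else is treated as the list of allocated nodes.
    return ("nodes", value)


for sample in ("(Resources)", "(Priority)", "dev[9-12]"):
    print(parse_nodelist_reason(sample))
```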

JOB STATE CODES

Jobs typically pass through several states in the course of their execution. The typical states are PENDING, RUNNING, SUSPENDED, COMPLETING, and COMPLETED. An explanation of each state follows.

BF BOOT_FAIL: Job terminated due to launch failure, typically due to a hardware failure (e.g. unable to boot the node or block and the job can not be requeued).

CA CANCELLED: Job was explicitly cancelled by the user or system administrator. The job may or may not have been initiated.

CD COMPLETED: Job has terminated all processes on all nodes.

CF CONFIGURING: Job has been allocated resources, but is waiting for them to become ready for use (e.g. booting).

CG COMPLETING: Job is in the process of completing. Some processes on some nodes may still be active.

F FAILED: Job terminated with non-zero exit code or other failure condition.

NF NODE_FAIL: Job terminated due to failure of one or more allocated nodes.

PD PENDING: Job is awaiting resource allocation.

PR PREEMPTED: Job terminated due to preemption.

R RUNNING: Job currently has an allocation.

S SUSPENDED: Job has an allocation, but execution has been suspended.

TO TIMEOUT: Job terminated upon reaching its time limit.

SE SPECIAL_EXIT: The job was requeued in a special state. This state can be set by users, typically in EpilogSlurmctld, if the job has terminated with a particular exit value.

ENVIRONMENT VARIABLES

Some squeue options may be set via environment variables. These environment variables, along with their corresponding options, are listed below. (Note: Command line options will always override these settings.)

SLURM_CLUSTERS: Same as --clusters

SLURM_CONF: The location of the SLURM configuration file.

SLURM_TIME_FORMAT: Specify the format used to report time stamps. A value of standard, the default value, generates output in the form "year-month-dateThour:minute:second". A value of relative returns only "hour:minute:second" if the current day. For other dates in the current year it prints the "hour:minute" preceded by "Tomorr" (tomorrow), "Ystday" (yesterday), the name of the day for the coming week (e.g. "Mon", "Tue", etc.), otherwise the date (e.g. "25 Apr"). For other years it returns a date month and year without a time (e.g. "6 Jun 2012"). All of the time stamps use a 24 hour format.
A valid strftime() format can also be specified. For example, a value of "%a %T" will report the day of the week and a time stamp (e.g. "Mon 12:34:56").

SQUEUE_ACCOUNT: -A <account_list>, --account=<account_list>

SQUEUE_ALL: -a, --all

SQUEUE_ARRAY: -r, --array

SQUEUE_NAMES: --name=<name_list>

SQUEUE_FORMAT: -o <output_format>, --format=<output_format>

SQUEUE_LICENSES: -L <license_list>, --licenses=<license_list>

SQUEUE_PARTITION: -p <part_list>, --partition=<part_list>

SQUEUE_QOS: -q <qos_list>, --qos=<qos_list>

SQUEUE_SORT: -S <sort_list>, --sort=<sort_list>

SQUEUE_STATES: -t <state_list>, --states=<state_list>

SQUEUE_USERS: -u <user_list>, --users=<user_list>
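The effect of the standard and strftime() style values of SLURM_TIME_FORMAT can be previewed with Python's datetime.strftime, which accepts the same kind of format directives. In this sketch the sample date and the helper format_stamp are illustrative assumptions, and "%H:%M:%S" is written out in place of "%T" because support for "%T" depends on the platform's C library.

```python
from datetime import datetime

# Layout of the "standard" value: year-month-dateThour:minute:second.
STANDARD = "%Y-%m-%dT%H:%M:%S"


def format_stamp(when: datetime, fmt: str = STANDARD) -> str:
    """Format a time stamp with a strftime() style format string."""
    return when.strftime(fmt)


# Arbitrary sample time stamp (4 June 2012 was a Monday).
stamp = datetime(2012, 6, 4, 12, 34, 56)

print(format_stamp(stamp))
# Day of the week and time; "%H:%M:%S" stands in for "%T".
print(format_stamp(stamp, "%a %H:%M:%S"))
```

Day names produced by "%a" follow the active locale, so the abbreviated English names appear only under the default C locale or an English one.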

EXAMPLES

Print the jobs scheduled in the debug partition and in the COMPLETED state in the format with six right justified digits for the job id followed by the priority with an arbitrary field size:

# squeue -p debug -t COMPLETED -o "%.6i %p"
 JOBID PRIORITY
 65543 99993
 65544 99992
 65545 99991

Print the job steps in the debug partition sorted by user:

# squeue -s -p debug -S u
  STEPID     NAME PARTITION     USER      TIME NODELIST
 65552.1    test1     debug    alice      0:23 dev[1-4]
 65562.2  big_run     debug      bob      0:18 dev22
 65550.1   param1     debug  candice   1:43:21 dev[6-12]

Print information only about jobs 12345, 12346, and 12348:

# squeue --jobs 12345,12346,12348
 JOBID PARTITION NAME USER ST  TIME NODES NODELIST(REASON)
 12345     debug job1 dave  R  0:21     4 dev[9-12]
 12346     debug job2 dave PD  0:00     8 (Resources)
 12348     debug job3   ed PD  0:00     4 (Priority)

Print information only about job step 65552.1:

# squeue --steps 65552.1
  STEPID     NAME PARTITION     USER      TIME NODELIST
 65552.1    test2     debug    alice     12:49 dev[1-4]
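Output such as the job listing shown in these examples is often post-processed by scripts. The Python sketch below turns a header line and its rows into dictionaries; it assumes that no field except the last contains whitespace, which holds for the sample rows but is not guaranteed for arbitrary job names. The function name parse_table is hypothetical.

```python
from typing import Dict, List

SAMPLE = """\
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
12345 debug job1 dave R 0:21 4 dev[9-12]
12346 debug job2 dave PD 0:00 8 (Resources)
12348 debug job3 ed PD 0:00 4 (Priority)
"""


def parse_table(text: str) -> List[Dict[str, str]]:
    """Map each output row to a dict keyed by the header names."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Split into at most len(header) fields; any extra whitespace
        # stays inside the last field.
        values = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, values)))
    return rows


for row in parse_table(SAMPLE):
    print(row["JOBID"], row["ST"], row["NODELIST(REASON)"])
```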

COPYING

This file is part of SLURM, a resource management program. For details, see <http://slurm.schedmd.com/>.

SLURM is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

SLURM is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

Source file:	squeue.1.en.gz (from slurm-client 14.03.9-5+deb8u2)
Source last updated:	2018-06-15T21:01:58Z
Converted to HTML:	2019-03-10T04:07:51Z