condor_q(1) | General Commands Manual | condor_q(1) |
Name¶
condor_q - Display information about jobs in queue
Synopsis¶
condor_q [-help [Universe | State]]
condor_q [-debug] [general options] [restriction list] [output options] [analyze options]
Description¶
condor_q displays information about jobs in the HTCondor job queue. By default, condor_q queries the local job queue, but this behavior may be modified by specifying one of the general options. To restrict the display to jobs of interest, a list of zero or more restriction options may be supplied. Each restriction may be one of:
- a cluster and a process, which matches jobs that belong to the specified cluster and have the specified process number
- a cluster without a process, which matches all jobs belonging to the specified cluster
- an owner, which matches all jobs owned by the specified owner
- a -constraint expression, which matches all jobs that satisfy the specified ClassAd expression
If no owner restrictions are present in the list, a job matches the restriction list if it matches at least one restriction in the list. If owner restrictions are present, the job matches the list if it matches one of the owner restrictions and at least one non-owner restriction.
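The matching rule for the restriction list can be sketched in Python as follows. This is only an illustration of the rule as described above, not HTCondor's implementation: the `Restriction` type and the helper names are hypothetical, a job ClassAd is reduced to a plain dictionary, and two cases the text does not spell out (an empty list, and a list holding only owner restrictions) are filled in by assumption.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Sequence


@dataclass(frozen=True)
class Restriction:
    """One entry of the restriction list (hypothetical helper type)."""
    matches: Callable[[Mapping], bool]
    is_owner: bool = False


def by_cluster(cluster: int, proc: Optional[int] = None) -> Restriction:
    """A cluster, optionally narrowed to a single process number."""
    def check(job: Mapping) -> bool:
        if job.get("ClusterId") != cluster:
            return False
        return proc is None or job.get("ProcId") == proc
    return Restriction(check)


def by_owner(owner: str) -> Restriction:
    """An owner restriction."""
    return Restriction(lambda job: job.get("Owner") == owner, is_owner=True)


def job_matches(job: Mapping, restrictions: Sequence[Restriction]) -> bool:
    owners = [r for r in restrictions if r.is_owner]
    others = [r for r in restrictions if not r.is_owner]
    if not owners:
        # No owner restrictions: any single match is enough.
        # Assumption: an empty list restricts nothing.
        return not others or any(r.matches(job) for r in others)
    owner_ok = any(r.matches(job) for r in owners)
    # Owner restrictions present: need an owner match AND a non-owner match.
    # Assumption: with owner restrictions only, the owner match decides.
    return owner_ok and (not others or any(r.matches(job) for r in others))
```

For example, a job in cluster 12 owned by jdoe matches the list "jdoe 12" but not the list "jdoe 13", because the owner restriction alone is not sufficient once a non-owner restriction is present.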
- The cluster/process id of the condor job.
- The owner of the job.
- The month, day, hour, and minute the job was submitted to the queue.
- Wall-clock time accumulated by the job to date in days, hours, minutes, and seconds.
- Current status of the job, which varies somewhat according to the job universe and the timing of updates. H = on hold, R = running, I = idle (waiting for a machine to execute on), C = completed, X = removed, < = transferring input (or queued to do so), and > = transferring output (or queued to do so).
- User specified priority of the job, displayed as an integer, with higher numbers corresponding to greater priority.
- The value of job ClassAd attribute MemoryUsage (in Mbytes), when the attribute is defined, and ImageSize (in Kbytes), otherwise.
- The name of the executable.
- The host where the job is running.
- The state that HTCondor believes the job is in. Possible values are
- PENDING
- The job is waiting for resources to become available in order to run.
- ACTIVE
- The job has received resources, and the application is executing.
- FAILED
- The job terminated before completion because of an error, user-triggered cancel, or system-triggered cancel.
- DONE
- The job completed successfully.
- SUSPENDED
- The job has been suspended. Resources which were allocated for this job may have been released due to a scheduler-specific reason.
- UNSUBMITTED
- The job has not been submitted to the scheduler yet, pending the reception of the GLOBUS_GRAM_PROTOCOL_JOB_SIGNAL_COMMIT_REQUEST signal from a client.
- STAGE_IN
- The job manager is staging in files, in order to run the job.
- STAGE_OUT
- The job manager is staging out files generated by the job.
- UNKNOWN
- A guess at what remote batch system is running the job. It is a guess, because HTCondor looks at the Globus jobmanager contact string to attempt identification. If the value is fork, the job is running on the remote host without a jobmanager. Values may also be condor, lsf, or pbs.
- The host to which the job was submitted.
- The job as specified as the executable in the submit description file.
- The percentage of RUN_TIME for this job which has been saved in a checkpoint. A low GOODPUT value indicates that the job is failing to checkpoint. If a job has not yet attempted a checkpoint, this column contains [?????].
- The ratio of CPU_TIME to RUN_TIME for checkpointed work. A low CPU_UTIL indicates that the job is not running efficiently, perhaps because it is I/O bound or because the job requires more memory than available on the remote workstations. If the job has not (yet) checkpointed, this column contains [??????].
- The network usage of this job, in Megabits per second of run-time.
- READ The total number of bytes the application has read from files and sockets.
- WRITE The total number of bytes the application has written to files and sockets.
- SEEK The total number of seek operations the application has performed on files.
- XPUT The effective throughput (average bytes read and written per second) from the application's point of view.
- BUFSIZE The maximum number of bytes to be buffered per file.
- BLOCKSIZE The desired block size for large data transfers.
- The remote CPU time accumulated by the job to date (which has been stored in a checkpoint) in days, hours, minutes, and seconds. (If the job is currently running, time accumulated during the current run is not shown. If the job has not produced a checkpoint, this column contains 0+00:00:00.)
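The time columns above are printed in a days+hours:minutes:seconds form such as 0+00:00:00. The sketch below shows one way a script might convert such a value to seconds, along with the arithmetic behind the GOODPUT and CPU_UTIL descriptions. The function names are hypothetical, the regular expression assumes two-digit hour, minute, and second fields, and the two ratio helpers reflect one reading of the column descriptions rather than HTCondor's own code.

```python
import re
from typing import Optional

# Assumed layout: <days>+<HH>:<MM>:<SS>, e.g. "0+00:00:00".
_TIME_RE = re.compile(r"^(\d+)\+(\d{2}):(\d{2}):(\d{2})$")


def parse_condor_time(text: str) -> int:
    """Convert a days+HH:MM:SS string to a number of seconds."""
    m = _TIME_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not a days+HH:MM:SS value: {text!r}")
    days, hours, minutes, seconds = (int(g) for g in m.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def goodput_percent(checkpointed_run_s: int, run_time_s: int) -> Optional[float]:
    """Percentage of RUN_TIME that has been saved in a checkpoint."""
    if run_time_s == 0:
        return None  # nothing to report yet
    return 100.0 * checkpointed_run_s / run_time_s


def cpu_util(cpu_time_s: int, checkpointed_run_s: int) -> Optional[float]:
    """Ratio of CPU time to run time, both for checkpointed work."""
    if checkpointed_run_s == 0:
        return None  # the job has not checkpointed
    return cpu_time_s / checkpointed_run_s
```

Under these assumptions, a job with one hour of RUN_TIME of which 30 minutes are checkpointed has a GOODPUT of 50 percent, and 15 minutes of CPU time over those 30 minutes gives a CPU_UTIL of 0.5.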
Options¶
-debug
- Causes debugging information to be sent to stderr, based on the value of the configuration variable TOOL_DEBUG.
- (general option) Queries all job queues in the pool.
- (general option) List jobs of a specific submitter.
- (general option) Query only the job queue of the named condor_schedd daemon.
- (general option) Use the centralmanagerhostname as the central manager to locate condor_schedd daemons. The default is the COLLECTOR_HOST, as specified in the configuration.
- (general option) Display jobs from a list of ClassAds from a file, instead of the real ClassAds from the condor_schedd daemon. This is most useful for debugging purposes. The ClassAds appear as if condor_q -long is used with the header stripped out.
- (general option) Display jobs, with job information coming from a job event log, instead of from the real ClassAds from the condor_schedd daemon. This is most useful for automated testing of the status of jobs known to be in the given job event log, because it reduces the load on the condor_schedd. A job event log does not contain all of the job information, so some fields in the normal output of condor_q will be blank.
- (output option) Instead of wall-clock allocation time (RUN_TIME), display remote CPU time accumulated by the job to date in days, hours, minutes, and seconds. If the job is currently running, time accumulated during the current run is not shown.
- (output option) Normally, RUN_TIME contains all the time accumulated during the current run plus all previous runs. If this option is specified, RUN_TIME only displays the time accumulated so far on this current run.
- (output option) Display DAG node jobs under their condor_dagman instance. Child nodes are listed using indentation to show the structure of the DAG.
- (output option) Display shorter error messages.
- (output option) Get information only about jobs submitted to grid resources described as gt2 or gt5.
- (output option) Display job goodput statistics.
- (output option) Print usage info, and additionally print job universes or job states.
- (output option) Get information about jobs in the hold state. Also displays the time the job was placed into the hold state and the reason why the job was placed in the hold state.
- (output option) Display job input/output summaries.
- (output option) Display entire job ClassAds in long format.
- (output option) Get information about running jobs.
- (output option) Display results as jobs are fetched from the job queue rather than storing results in memory until all jobs have been fetched. This can reduce memory consumption when fetching large numbers of jobs, but if condor_q is paused while displaying results, this could result in a timeout in communication with condor_schedd.
- (output option) Print the HTCondor version and exit.
- (output option) If this option is specified, and the command portion of the output would cause the output to extend beyond 80 columns, display beyond the 80 columns.
- (output option) Display entire job ClassAds in XML format. The XML format is fully defined in the reference manual, obtained from the ClassAds web page, with a link at http://research.cs.wisc.edu/htcondor/research.html.
- (output option) Explicitly list the attributes, by name in a comma separated list, which should be displayed when using the -xml or -long options. Limiting the number of attributes increases the efficiency of the query.
- (output option) Display attribute or expression attr in format fmt. To display the attribute or expression, the format must contain a single printf(3)-style conversion specifier. Attributes must be from the job ClassAd. Expressions are ClassAd expressions and may refer to attributes in the job ClassAd. If the attribute is not present in a given ClassAd and cannot be parsed as an expression, then the format option will be silently skipped. The conversion specifier must match the type of the attribute or expression. %s is suitable for strings such as Owner, %d for integers such as ClusterId, and %f for floating point numbers such as RemoteWallClockTime. %v identifies the type of the attribute, and then prints the value in an appropriate format. %V identifies the type of the attribute, and then prints the value in an appropriate format as it would appear in the -long format. As an example, strings used with %V will have quote marks. An incorrect format will result in undefined behavior. Do not use more than one conversion specifier in a given format; doing so will result in undefined behavior. To output multiple attributes, repeat the -format option once for each desired attribute. As with printf(3)-style formats, one may include other text that will be reproduced directly. A format without any conversion specifiers may be specified, but an attribute is still required. Include \n to specify a line break.
- (output option) Display machine ClassAd attribute values formatted in a default way according to their attribute types. This option takes an arbitrary number of attribute names as arguments, and prints out their values. It is like the -format option, but no format strings are required. It is assumed that no attribute names begin with a dash character, so that the next word that begins with dash is the start of the next option. The autoformat option may be followed by a colon character and formatting qualifiers:
- t: add a tab character before each field instead of the default space character,
- n: add a newline character after each field,
- ,: add a comma character after each field,
- l: label each field,
- V: use %V rather than %v for formatting,
- h: print headings before the first line of output.
- The newline and comma characters may not be used together.
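A script that assembles the autoformat option can check the qualifier rules above before running anything. The following is a minimal sketch: the helper names are hypothetical, and it assumes the option is spelled -autoformat with the qualifiers appended after a colon, as the description suggests.

```python
from typing import List, Sequence

# Qualifier characters listed in the option description.
ALLOWED_QUALIFIERS = set("tn,lVh")


def autoformat_option(qualifiers: str = "") -> str:
    """Build the option word, e.g. '-autoformat:lh' (assumed spelling)."""
    unknown = set(qualifiers) - ALLOWED_QUALIFIERS
    if unknown:
        raise ValueError(f"unknown qualifiers: {sorted(unknown)}")
    if "n" in qualifiers and "," in qualifiers:
        # The newline and comma qualifiers may not be used together.
        raise ValueError("qualifiers 'n' and ',' are mutually exclusive")
    return "-autoformat" + (":" + qualifiers if qualifiers else "")


def autoformat_args(attrs: Sequence[str], qualifiers: str = "") -> List[str]:
    """Return the option word followed by the attribute names."""
    for attr in attrs:
        if attr.startswith("-"):
            # A leading dash would be read as the start of the next option.
            raise ValueError(f"attribute name may not begin with a dash: {attr}")
    return [autoformat_option(qualifiers), *attrs]
```

Calling `autoformat_args(["ClusterId", "Owner"], "lh")` yields a list of three arguments that can be appended to a condor_q command line.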
- (analyze option) Perform a matchmaking analysis on why the requested jobs are not running. First a simple analysis determines if the job is not running due to not being in a runnable state. If the job is in a runnable state, then this option is equivalent to -better-analyze. <qual> is a comma separated list containing one or more of
- priority to consider user priority during the analysis
- summary to show a one line summary for each job or machine
- reverse to analyze machines, rather than jobs
- (analyze option) Perform a more detailed matchmaking analysis to determine how many resources are available to run the requested jobs. This option is never meaningful for Scheduler universe jobs and only meaningful for grid universe jobs doing matchmaking. <qual> is a comma separated list containing one or more of
- priority to consider user priority during the analysis
- summary to show a one line summary for each job or machine
- reverse to analyze machines, rather than jobs
- (analyze option) When doing matchmaking analysis, analyze only machine ClassAds that have slot or machine names that match the given name.
- (analyze option) When doing matchmaking analysis, match only machine ClassAds which match the ClassAd expression constraint.
- (analyze option) When doing matchmaking analysis, use the machine ClassAds from the file instead of the ones from the condor_collector daemon. This is most useful for debugging purposes. The ClassAds appear as if condor_status -long is used.
- (analyze option) When doing matchmaking analysis with priority, read user priorities from the file rather than the ones from the condor_negotiator daemon. This is most useful for debugging purposes or to speed up analysis in situations where the condor_negotiator daemon is slow to respond to condor_userprio requests. The file should be in the format produced by condor_userprio -long.
- (analyze option) Do not consider user priority during the analysis.
- (analyze option) Analyze machine requirements against jobs.
- (analyze option) When doing analysis, show progress and include the names of specific machines in the output.
General Remarks¶
The default output from condor_q is formatted to be human readable, not script readable. In an effort to make the output fit within 80 characters, values in some fields might be truncated. Furthermore, the HTCondor Project can (and does) change the formatting of this default output as we see fit. Therefore, any script that is attempting to parse data from condor_q is strongly encouraged to use the -format option (described above, examples given below).
Although -analyze provides a very good first approximation, the analyzer cannot diagnose all possible situations, because the analysis is based on instantaneous and local information. Therefore, some situations, such as when several submitters are contending for resources or when the pool is rapidly changing state, cannot be accurately diagnosed.
Options -goodput, -cputime, and -io are most useful for standard universe jobs, since they rely on values computed when a job produces a checkpoint.
It is possible to hold jobs that are in the X state. To avoid this, it is best to construct a -constraint expression that contains JobStatus != 3.
Examples¶
The -format option provides a way to specify both the job attributes and formatting of those attributes. There must be only one conversion specification per -format option. As an example, to list only Jane Doe's jobs in the queue, choosing to print and format only the owner of the job, the command line arguments for the job, and the process ID of the job:
% condor_q -submitter jdoe -format "%s" Owner -format " %s " Args -format "ProcId = %d\n" ProcId
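For scripts, the same -format command line can be assembled programmatically, which makes it easy to enforce the rule of one conversion specifier per -format option. The sketch below is illustrative only: the helper names are hypothetical, the regular expression is a simplified approximation of printf(3) conversion syntax, and the command is printed rather than executed.

```python
import re
import shlex
from typing import List, Sequence, Tuple

# Simplified printf-style matcher; "%%" is a literal percent sign.
_SPEC = re.compile(r"%%|%[-+ #0]*\d*(?:\.\d+)?[A-Za-z]")


def count_specifiers(fmt: str) -> int:
    """Count conversion specifiers, ignoring escaped percent signs."""
    return sum(1 for m in _SPEC.findall(fmt) if m != "%%")


def format_args(pairs: Sequence[Tuple[str, str]]) -> List[str]:
    """Expand (fmt, attr) pairs into repeated -format options."""
    args: List[str] = []
    for fmt, attr in pairs:
        if count_specifiers(fmt) > 1:
            raise ValueError(f"more than one conversion specifier in {fmt!r}")
        args += ["-format", fmt, attr]
    return args


# The example from the text; the raw string keeps "\n" as two characters,
# exactly as the shell would pass it inside double quotes.
cmd = ["condor_q", "-submitter", "jdoe"] + format_args([
    ("%s", "Owner"),
    (" %s ", "Args"),
    (r"ProcId = %d\n", "ProcId"),
])
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run` on a machine where condor_q is installed; passing a list avoids a second round of shell quoting.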
Exit Status¶
condor_q will exit with a status value of 0 (zero) upon success, and it will exit with the value 1 (one) upon failure.
Author¶
Center for High Throughput Computing, University of Wisconsin-Madison
Copyright¶
Copyright (C) 1990-2014 Center for High Throughput Computing, Computer Sciences Department, University of Wisconsin-Madison, Madison, WI. All Rights Reserved. Licensed under the Apache License, Version 2.0.
January 2015