
ACCOUNTING(5) Grid Engine File Formats ACCOUNTING(5)

NAME

accounting - Grid Engine accounting file format

DESCRIPTION

An accounting record is written to the Grid Engine accounting file $SGE_ROOT/$SGE_CELL/common/accounting for each finished job if accounting=true is specified in the sge_conf(5) reporting_params. Records are flushed to the file at intervals of the accounting_flush_time specified in the same place. The accounting file is processed by qacct(1) to derive accounting statistics.

If output to the reporting(5) file is enabled, accounting records containing similar data are written there. They include "intermediate" records written at midnight for long-running jobs, not just ones written at the end of the jobs, and so may be more appropriate to process for some purposes than the accounting file.

FORMAT

Each job is represented by a line in the accounting file. Empty lines and lines that contain at most one character are ignored by qacct. The entries of an accounting record are separated by colon (':') characters. In order of appearance, the entries are:

1. qname

Name of the cluster queue in which the job has run.

2. hostname

Name of the execution host.

3. group

The effective group id of the job owner when executing the job.

4. owner

Owner of the Grid Engine job.

5. job_name

Job name.

6. job_number

Job identifier (job number).

7. account

An account string as specified by the qsub(1) or qalter(1) -A option.

8. priority

Priority value assigned to the job, corresponding to the priority parameter in the queue configuration (see queue_conf(5)).

9. submission_time

Submission time in seconds since the Unix epoch (1970-01-01 00:00:00 UTC).

10. start_time

Start time in seconds since the epoch.

11. end_time

End time in seconds since the epoch.

12. failed

Indicates the problem which occurred in case a job failed (at the system level, as opposed to the job script or binary having non-zero exit status, see below). Possibly the job could not be started on the execution host (e.g. because the owner of the job did not have a valid account on that machine), or didn't finish successfully (e.g. because an execution host crashed). If Grid Engine tries to start a job multiple times, there may be multiple entries in the accounting file corresponding to the same job ID. See sge_status(5) for a list of the possible values.
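Because a restarted job can leave several records with the same job ID, it can help to group records before interpreting them. The following is a minimal sketch, not part of Grid Engine: it assumes each record has already been parsed into a dictionary keyed by the field names used in this page, and the helper names group_by_job and retried_jobs are hypothetical. Note that per-task records of parallel jobs (see pe_taskid) also share a job ID, so more than one record does not necessarily mean a restart.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]
JobKey = Tuple[str, str]


def group_by_job(records: Iterable[Record]) -> Dict[JobKey, List[Record]]:
    """Group records by (job_number, task_number), preserving file order."""
    grouped: Dict[JobKey, List[Record]] = defaultdict(list)
    for rec in records:
        grouped[(rec["job_number"], rec["task_number"])].append(rec)
    return dict(grouped)


def retried_jobs(grouped: Dict[JobKey, List[Record]]) -> List[JobKey]:
    """Return the keys that have more than one accounting record."""
    return [key for key, recs in grouped.items() if len(recs) > 1]
```

The values of the failed field in each group can then be looked up in sge_status(5).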

13. exit_status

Exit status of the job script (or a Grid Engine-specific status in case of certain error conditions). The exit status follows the normal shell conventions: if the command terminates normally, the exit status is the value the command returned. If the command is terminated by a signal, 0200 (octal), i.e. 128 (decimal), is added to the signal number to make up the exit status.

For example: If a job dies through signal 9 (SIGKILL) - probably issued by Grid Engine through qdel(1), or because the job exceeded time or memory hard limits - then the exit status is 128 + 9 = 137. The reason Grid Engine killed a job is recorded in the execd messages file at "W" or "I" level, depending on why it was killed.
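The convention above can be sketched in Python as follows. This is an illustration only: the function name describe_exit_status is hypothetical, signal names are taken from the platform running the script (which may differ from the execution host), and a job script can also return a value above 128 on its own, so the interpretation is a heuristic.

```python
import signal


def describe_exit_status(exit_status: int) -> str:
    """Interpret exit_status using the 128 + signal number convention."""
    if exit_status > 128:
        signum = exit_status - 128
        try:
            # Look up the symbolic name known to the local platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    return f"exited normally with status {exit_status}"


print(describe_exit_status(137))
print(describe_exit_status(0))
```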

14. ru_wallclock

Difference between end_time and start_time (see above), except that if the job fails, it is zero.
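As a small illustration of how the three time fields relate, the sketch below converts epoch seconds to UTC timestamps and computes the elapsed and queue-wait intervals. The helper names are hypothetical. The elapsed value computed here can differ from the recorded ru_wallclock, which is zero for failed jobs.

```python
from datetime import datetime, timezone


def to_utc(epoch_seconds: int) -> datetime:
    """Convert seconds since the epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def elapsed_seconds(start_time: int, end_time: int) -> int:
    """Difference between end_time and start_time, in seconds."""
    return end_time - start_time


def queue_wait_seconds(submission_time: int, start_time: int) -> int:
    """Time the job spent waiting between submission and start, in seconds."""
    return start_time - submission_time
```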

15. ru_utime

16. ru_stime

17. ru_maxrss

18. ru_ixrss

19. ru_ismrss

20. ru_idrss

21. ru_isrss

22. ru_minflt

23. ru_majflt

24. ru_nswap

25. ru_inblock

26. ru_oublock

27. ru_msgsnd

28. ru_msgrcv

29. ru_nsignals

30. ru_nvcsw

31. ru_nivcsw

These entries follow the contents of the standard Unix rusage structure as described in getrusage(2). Depending on the operating system where the job was executed, some of the fields may be 0.

32. project

The project which was assigned to the job.

33. department

The department which was assigned to the job.

34. granted_pe

The parallel environment which was selected for the job.

35. slots

The number of slots which were dispatched to the job by the scheduler.

36. task_number

Array job task index number.

37. cpu

The CPU time usage in seconds. The value may be affected by the ACCT_RESERVED_USAGE execd parameter (see sge_conf(5)).

38. mem

The integral memory usage in Gbyte seconds. The value may be affected by the ACCT_RESERVED_USAGE execd parameter (see sge_conf(5)).

39. io

The amount of data transferred in input/output operations in GB (if available, otherwise 0). On Linux, this is summed over calls to read(2), pread(2), write(2), and pwrite(2); thus it includes I/O via the cache, and may not reflect the data actually written to the file system.

40. category

A string specifying the job category. This contains a space-separated pseudo options list for the job, with components as follows:

The -U component: an owner/group ACL list composed from host_conf(5), sge_pe(5), and queue_conf(5) user_lists/xuser_lists entries. Entries from sge_conf(5) are not considered since they can only cause a job to be accepted/rejected at submit time. Omitted if there are no such configuration entries.
Like -U, but for project/xproject entries.
The owner's user name, if it was referenced in any RQS (see sge_resource_quota(5)). Omitted if there was no such reference.
The hard queue list (only if one was specified).
The master queue list (only if one was specified).
The hard resource list (only if hard resources were specified).
The soft resource list (only if soft resources were specified).
The parallel environment specified for the job (only for parallel jobs).
The job's checkpointing environment (only if one was specified).
Present only for interactive jobs.
The advance reservation into which the job was submitted (only if one was specified).

41. iow

The input/output wait time in seconds (if available, otherwise 0).

42. pe_taskid

If this identifier is not equal to NONE, the task was part of a parallel job, and was passed to Grid Engine via the qrsh -inherit interface. Such per-task records are not produced if the PE's accounting_summary parameter is true (see sge_pe(5)); in that case a single summary record is written for the whole job.

43. maxvmem

The maximum vmem size in bytes. The value may be affected by the ACCT_RESERVED_USAGE execd parameter (see sge_conf(5)).

44. arid

Advance reservation identifier. If the job used the resources of an advance reservation, then this field contains a positive integer identifier; otherwise the value is "0".

45. ar_sub_time

Advance reservation submission time if the job uses the resources of an advance reservation; otherwise "0".
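The record layout described above can be read with a few lines of Python. This is a minimal sketch rather than a reference implementation. It assumes exactly the 45 fields listed in this page and that no field contains a literal colon. It also skips lines beginning with '#', which is a precaution and not something this page specifies. The names FIELD_NAMES, parse_record and read_accounting are hypothetical. All values are kept as strings, leaving type conversion to the caller.

```python
from typing import Dict, Iterator, Optional

# Field names in their order of appearance, as listed in FORMAT.
FIELD_NAMES = [
    "qname", "hostname", "group", "owner", "job_name", "job_number",
    "account", "priority", "submission_time", "start_time", "end_time",
    "failed", "exit_status", "ru_wallclock", "ru_utime", "ru_stime",
    "ru_maxrss", "ru_ixrss", "ru_ismrss", "ru_idrss", "ru_isrss",
    "ru_minflt", "ru_majflt", "ru_nswap", "ru_inblock", "ru_oublock",
    "ru_msgsnd", "ru_msgrcv", "ru_nsignals", "ru_nvcsw", "ru_nivcsw",
    "project", "department", "granted_pe", "slots", "task_number",
    "cpu", "mem", "io", "category", "iow", "pe_taskid", "maxvmem",
    "arid", "ar_sub_time",
]


def parse_record(line: str) -> Optional[Dict[str, str]]:
    """Parse one accounting line; return None for lines that are skipped."""
    line = line.rstrip("\n")
    # Lines with at most one character are ignored, as qacct does.
    # Skipping '#' lines is an assumption made by this sketch.
    if len(line) <= 1 or line.startswith("#"):
        return None
    parts = line.split(":")
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} fields, found {len(parts)}"
        )
    return dict(zip(FIELD_NAMES, parts))


def read_accounting(path: str) -> Iterator[Dict[str, str]]:
    """Yield one dictionary per record in the accounting file at path."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            record = parse_record(line)
            if record is not None:
                yield record
```

For routine reporting, qacct(1) remains the supported way to process the file; a parser like this is mainly useful for ad hoc analysis.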

FILES

$SGE_ROOT/$SGE_CELL/common/accounting

SEE ALSO

sge_intro(1), qacct(1), qalter(1), qsub(1), getrusage(2), queue_conf(5), sge_conf(5), sge_pe(5), sge_status(5), reporting(5).

COPYRIGHT

See sge_intro(1) for a full statement of rights and permissions.

2011-11-17 SGE 8.1.3pre