NAME
reporting - Grid Engine reporting file format
DESCRIPTION
A Grid Engine system writes a reporting file to
$SGE_ROOT/default/common/reporting. The reporting file contains
data that can be used for accounting, monitoring and analysis
purposes. It contains information about the cluster (hosts,
queues, load values, consumables, etc.), about the jobs running
in the cluster and about sharetree configuration and usage. All
information is time related: events are dumped to the reporting
file at a configurable interval. This makes it possible to
monitor a "real time" status of the cluster as well as to carry
out historical analysis.
FORMAT
The reporting file is an ASCII file. Each line contains one
record, and the fields of a record are separated by a delimiter
(:). The reporting file contains records of different types.
Each record type has a specific record structure.
The first two fields are common to all reporting records:
time Time (GMT unix timestamp) when the record was created.
record type
Type of the reporting record. The different types of
records and their structure are described in the following
text.
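
As an illustration of the two common fields, the following minimal
Python sketch splits one line on the ":" delimiter. The function name
split_record and the sample line are hypothetical. The sketch assumes
that the record-specific fields follow the two common fields and that
no field contains the delimiter.

```python
def split_record(line):
    """Split one reporting line into (time, record_type, remaining_fields)."""
    fields = line.rstrip("\n").split(":")
    if len(fields) < 2:
        raise ValueError("a reporting record needs at least two fields")
    # First field: GMT unix timestamp; second field: record type.
    return int(fields[0]), fields[1], fields[2:]


# Hypothetical sample line, not taken from a real cluster.
timestamp, record_type, rest = split_record(
    "1100000000:queue:all.q:host1:1100000000:d"
)
```

The record type can then be used to dispatch to a parser for the
specific record structure.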
new_job
The new_job record is written whenever a new job enters the
system (usually by a submitting command). It has the following
fields:
submission_time
Time (GMT unix time stamp) when the job was submitted.
job_number
The job number.
task_number
The array task id. Always has the value -1 for new_job
records (as we don't have array tasks yet).
pe_taskid
The task id of parallel tasks. Always has the value
"none" for new_job records.
job_name
The job name (from -N submission option)
owner
The job owner.
group
The unix group of the job owner.
project
The project the job is running in.
department
The department the job owner is in.
account
The account string specified for the job (from -A
submission option).
priority
The job priority (from -p submission option).
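
The field list above can be mapped onto names as in the following
Python sketch. It assumes that the fields appear in the listed order
directly after the two common fields and that no field contains a
colon. NEW_JOB_FIELDS, parse_new_job and the sample line are
hypothetical.

```python
# Field names in the order given in the new_job description.
NEW_JOB_FIELDS = (
    "submission_time", "job_number", "task_number", "pe_taskid",
    "job_name", "owner", "group", "project", "department",
    "account", "priority",
)


def parse_new_job(line):
    """Return a dict for one new_job record (values are kept as strings)."""
    parts = line.rstrip("\n").split(":")
    if len(parts) != 2 + len(NEW_JOB_FIELDS) or parts[1] != "new_job":
        raise ValueError("not a complete new_job record")
    record = dict(zip(NEW_JOB_FIELDS, parts[2:]))
    record["time"] = int(parts[0])  # common field: record creation time
    return record


# Hypothetical sample line, not taken from a real cluster.
job = parse_new_job(
    "1100000000:new_job:1100000000:42:-1:none:sleeper:alice:staff:"
    "proj1:dept1:sge:0"
)
```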
job_log
The job_log record is written whenever a job, an array task
or a pe task changes status. A status change can be the
transition from pending to running, but can also be triggered
by user actions like suspension of a job. It has the following
fields:
event_time
Time (GMT unix time stamp) when the event was generated.
event
A one word description of the event.
job_number
The job number.
task_number
The array task id. Always has the value -1 for new_job
records (as we don't have array tasks yet).
pe_taskid
The task id of parallel tasks. Always has the value
"none" for new_job records.
state
The state of the job after the event was processed.
user The user who initiated the event (or special usernames
"qmaster", "scheduler" and "execd" for actions of the
system itself like scheduling jobs, executing jobs
etc.).
host The host from which the action was initiated (e.g. the
submit host, the qmaster host, etc.).
state_time
Reserved field for later use.
submission_time
Time (GMT unix time stamp) when the job was submitted.
job_name
The job name (from -N submission option)
owner
The job owner.
group
The unix group of the job owner.
project
The project the job is running in.
department
The department the job owner is in.
account
The account string specified for the job (from -A
submission option).
priority
The job priority (from -p submission option).
message
A message describing the reported action.
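
Because message is the last field and is free text, a parser may want
to tolerate colons inside it. The following Python sketch does so by
limiting the number of splits. It assumes the fields appear in the
listed order after the two common fields. JOB_LOG_FIELDS,
parse_job_log, the sample line and its event and state values are
hypothetical.

```python
# Field names in the order given in the job_log description.
JOB_LOG_FIELDS = (
    "event_time", "event", "job_number", "task_number", "pe_taskid",
    "state", "user", "host", "state_time", "submission_time",
    "job_name", "owner", "group", "project", "department",
    "account", "priority", "message",
)


def parse_job_log(line):
    """Return a dict for one job_log record (values are kept as strings)."""
    total = 2 + len(JOB_LOG_FIELDS)
    # Limit the number of splits so that colons in the trailing
    # message field stay part of the message.
    parts = line.rstrip("\n").split(":", total - 1)
    if len(parts) != total or parts[1] != "job_log":
        raise ValueError("not a complete job_log record")
    record = dict(zip(JOB_LOG_FIELDS, parts[2:]))
    record["time"] = int(parts[0])  # common field: record creation time
    return record


# Hypothetical sample line, not taken from a real cluster.
entry = parse_job_log(
    "1100000010:job_log:1100000010:pending:42:-1:none:q:alice:submit1:"
    "0:1100000000:sleeper:alice:staff:proj1:dept1:sge:0:new job: accepted"
)
```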
acct
Records of type acct are accounting records. They are written
whenever a job, a task of an array job or the task of a
parallel job terminates. Accounting records comprise the
following fields:
qname
Name of the cluster queue in which the job has run.
hostname
Name of the execution host.
group
The effective group id of the job owner when executing
the job.
owner
Owner of the Grid Engine job.
job_name
Job name.
job_number
Job identifier - job number.
account
An account string as specified by the qsub(1) or
qalter(1) -A option.
priority
Priority value assigned to the job corresponding to the
priority parameter in the queue configuration (see
queue_conf(5)).
submission_time
Submission time (GMT unix time stamp). For slave tasks
of tightly integrated parallel jobs, the
submission_time is always set to 0.
start_time
Start time (GMT unix time stamp).
end_time
End time (GMT unix time stamp).
failed
Indicates the problem which occurred in case a job
could not be started on the execution host (e.g.
because the owner of the job did not have a valid
account on that machine). If Grid Engine tries to start
a job multiple times, this may lead to multiple entries
in the accounting file corresponding to the same job
ID.
exit_status
Exit status of the job script (or Grid Engine specific
status in case of certain error conditions).
ru_wallclock
Difference between end_time and start_time (see above).
The remainder of the accounting entries follows the contents
of the standard UNIX rusage structure as described in
getrusage(2). Depending on the operating system where the
job was executed some of the fields may be 0. The following
entries are provided:
ru_utime
ru_stime
ru_maxrss
ru_ixrss
ru_ismrss
ru_idrss
ru_isrss
ru_minflt
ru_majflt
ru_nswap
ru_inblock
ru_oublock
ru_msgsnd
ru_msgrcv
ru_nsignals
ru_nvcsw
ru_nivcsw
project
The project which was assigned to the job.
department
The department which was assigned to the job.
granted_pe
The parallel environment which was selected for that
job.
slots
The number of slots which were dispatched to the job by
the scheduler.
task_number
Array job task index number.
cpu The cpu time usage in seconds.
mem The integral memory usage in Gbyte-seconds.
io The amount of data transferred in input/output operations.
category
A string specifying the job category.
iow The io wait time in seconds.
pe_taskid
If this identifier is set the task was part of a parallel
job and was passed to Grid Engine via the qrsh -inherit
interface.
maxvmem
The maximum vmem size in bytes.
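
The relation between start_time, end_time and ru_wallclock can be
checked with a sketch like the following. It reads only the leading
fields (qname through ru_wallclock) by position, so it does not depend
on the layout of later fields. It assumes those fields appear in the
listed order after the two common fields. ACCT_HEAD, parse_acct_head,
wallclock_consistent and the sample line are hypothetical, and the
sample line is truncated after ru_wallclock.

```python
# Leading acct fields in the order given in the acct description.
ACCT_HEAD = (
    "qname", "hostname", "group", "owner", "job_name", "job_number",
    "account", "priority", "submission_time", "start_time", "end_time",
    "failed", "exit_status", "ru_wallclock",
)


def parse_acct_head(line):
    """Return the leading acct fields as a dict of strings."""
    parts = line.rstrip("\n").split(":")
    if len(parts) < 2 + len(ACCT_HEAD) or parts[1] != "acct":
        raise ValueError("not an acct record")
    return dict(zip(ACCT_HEAD, parts[2:2 + len(ACCT_HEAD)]))


def wallclock_consistent(record):
    """Check that ru_wallclock equals end_time minus start_time."""
    elapsed = float(record["end_time"]) - float(record["start_time"])
    return elapsed == float(record["ru_wallclock"])


# Hypothetical sample, truncated after ru_wallclock; a full acct
# record continues with the rusage fields and the fields after them.
head = parse_acct_head(
    "1100000100:acct:all.q:host1:staff:alice:sleeper:42:sge:0:"
    "1100000000:1100000040:1100000100:0:0:60"
)
ok = wallclock_consistent(head)
```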
queue
Records of type queue contain state information for queues
(queue instances). A queue record has the following fields:
qname
The cluster queue name.
hostname
The hostname of a specific queue instance.
report_time
The time (GMT unix time stamp) when a state change was
triggered.
state
The new queue state.
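
Since each queue record reports a state change, the current state of
every queue instance can be derived by replaying the records and
keeping the most recent one per (qname, hostname) pair. The following
Python sketch illustrates this, assuming the four fields follow the
two common fields in the listed order. latest_queue_states, the sample
lines and their state values are hypothetical.

```python
def latest_queue_states(lines):
    """Map (qname, hostname) to the (report_time, state) seen last."""
    states = {}
    for line in lines:
        parts = line.rstrip("\n").split(":")
        if len(parts) < 6 or parts[1] != "queue":
            continue  # skip other record types and short lines
        qname, hostname, report_time, state = parts[2:6]
        key = (qname, hostname)
        stamp = int(report_time)
        # Keep the entry with the newest report_time per queue instance.
        if key not in states or stamp >= states[key][0]:
            states[key] = (stamp, state)
    return states


# Hypothetical sample lines, not taken from a real cluster.
sample = [
    "1100000000:queue:all.q:host1:1100000000:d",
    "1100000050:queue:all.q:host1:1100000050:u",
    "1100000020:queue:all.q:host2:1100000020:s",
    "1100000060:host:host1:1100000060:X:load_avg=0.1",
]
current = latest_queue_states(sample)
```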
queue_consumable
A queue_consumable record contains information about queue
consumable values in addition to queue state information:
qname
The cluster queue name.
hostname
The hostname of a specific queue instance.
report_time
The time (GMT unix time stamp) when a state change was
triggered.
state
The new queue state.
consumables
Description of consumable values. Information about
multiple consumables is separated by space. A consumable
description has the format
<name>=<actual_value>=<configured_value>.
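
The consumables field can be decoded as in the following Python
sketch, which follows the format described above: entries separated by
space, each with a name, an actual value and a configured value. The
function name parse_consumables and the sample values are
hypothetical, and the sketch assumes that both values are numeric.

```python
def parse_consumables(field):
    """Map each consumable name to (actual_value, configured_value)."""
    result = {}
    for item in field.split():
        # Each entry has the form name=actual=configured.
        name, actual, configured = item.split("=")
        result[name] = (float(actual), float(configured))
    return result


# Hypothetical sample field, not taken from a real cluster.
consumables = parse_consumables(
    "slots=2.000000=4.000000 license=1.000000=5.000000"
)
```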
host
A host record contains information about hosts and host load
values. It contains the following information:
hostname
The name of the host.
report_time
The time (GMT unix time stamp) when the reported
information was generated.
state
The new host state. Currently, Grid Engine doesn't
track a host state, the field is reserved for future
use. Always contains the value X.
load values
Description of load values. Information about multiple
load values is separated by space. A load value
description has the format <name>=<actual_value>.
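
A load values field in the format described above can be split into
name and value pairs as in this Python sketch. Values are kept as
strings because not every load value has to be numeric. The function
name parse_load_values and the sample field are hypothetical.

```python
def parse_load_values(field):
    """Map each load value name to its reported value (as a string)."""
    values = {}
    for item in field.split():
        # Split at the first "=" only, so the value is kept unchanged.
        name, _, value = item.partition("=")
        values[name] = value
    return values


# Hypothetical sample field, not taken from a real cluster.
load = parse_load_values("load_avg=0.25 mem_free=1024.000000M arch=lx24-amd64")
```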
host_consumable
A host_consumable record contains information about hosts
and host consumables. Host consumables can for example be
licenses. It contains the following information:
hostname
The name of the host.
report_time
The time (GMT unix time stamp) when the reported
information was generated.
state
The new host state. Currently, Grid Engine doesn't
track a host state, the field is reserved for future
use. Always contains the value X.
consumables
Description of consumable values. Information about
multiple consumables is separated by space. A consumable
description has the format
<name>=<actual_value>=<configured_value>.
SEE ALSO
sge_conf(5), host_conf(5).
COPYRIGHT
See sge_intro(1) for a full statement of rights and
permissions.