|PMLOGREDUCE(1)||General Commands Manual||PMLOGREDUCE(1)|
NAME¶pmlogreduce - temporal reduction of Performance Co-Pilot archives
SYNOPSIS¶$PCP_BINADM_DIR/pmlogreduce [-z?] [-A align] [-s samples] [-S starttime] [-t interval] [-T endtime] [-v volsamples] [-Z timezone] input output
DESCRIPTION¶pmlogreduce reads one set of Performance Co-Pilot (PCP) archives identified by input and creates a temporally reduced PCP archive in output. input is a comma-separated list of names, each of which may be the base name of an archive or the name of a directory containing one or more archives. The data reduction involves statistical and temporal reduction of samples with an output sampling interval defined by the -t option in the output archive (independent of the sampling intervals in the input archives), and is further controlled by other command line arguments.
Temporal data reduction is not helpful for some metrics: if metrics of type PM_TYPE_AGGREGATE or PM_TYPE_EVENT are found in input, a warning is issued and they are skipped, so they do not appear in the output archive.
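As an illustration of how the synopsis fits together, the following Python sketch assembles an argument list for pmlogreduce without running it. The helper name build_pmlogreduce_argv and the fallback to a bare program name when $PCP_BINADM_DIR is unset are assumptions made for this example; only the flags themselves come from the synopsis above.

```python
import os


def build_pmlogreduce_argv(inputs, output, interval=None, samples=None,
                           start=None, end=None, binadm_dir=None):
    """Return an argv list for pmlogreduce; nothing is executed here.

    Hypothetical helper: option values are passed through as strings and
    are not validated against the formats described in PCPIntro(1).
    """
    # The synopsis places the tool under $PCP_BINADM_DIR. Falling back to
    # a bare name (resolved via PATH) is an assumption of this sketch.
    if binadm_dir is None:
        binadm_dir = os.environ.get("PCP_BINADM_DIR", "")
    if binadm_dir:
        program = os.path.join(binadm_dir, "pmlogreduce")
    else:
        program = "pmlogreduce"

    argv = [program]
    for flag, value in (("-S", start), ("-T", end),
                        ("-t", interval), ("-s", samples)):
        if value is not None:
            argv += [flag, str(value)]

    # input is one comma-separated argument naming archives or directories
    argv.append(",".join(inputs))
    argv.append(output)
    return argv
```

Calling build_pmlogreduce_argv(["archiveA", "archiveB"], "reduced", interval="3600") with these made-up archive names yields a list ending in "archiveA,archiveB" and "reduced", which could then be handed to a process launcher.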
OPTIONS¶The available command line options are:
- -A align, --align=align
- Specify a "natural" alignment of the output sample times; refer to PCPIntro(1).
- -s samples, --samples=samples
- The argument samples defines the number of samples to be written to output. If samples is 0 or -s is not specified, pmlogreduce will sample until the end of the set of PCP archives, or the end of the time window as specified by -T, whichever comes first. The -s option overrides the -T option if the sample count is reached first.
- -S starttime, --start=starttime
- Define the start of a time window to restrict the samples retrieved from the input archives; refer to PCPIntro(1).
- -t interval, --interval=interval
- Consecutive samples in the output archive will appear with a time delta defined by interval; refer to PCPIntro(1). Note the default value is 600 (seconds, i.e. 10 minutes).
- -T endtime, --finish=endtime
- Define the termination of a time window to restrict the samples retrieved from the input archives; refer to PCPIntro(1).
- -v volsamples
- The output archive is potentially a multi-volume data set, and the -v option causes pmlogreduce to start a new volume after volsamples log records have been written to the output archive.
Independent of any -v option, each volume of an archive is limited to no more than 2^31 bytes, so pmlogreduce will automatically create a new volume for the archive before this limit is reached.
- -z, --hostzone
- Use the local timezone of the host from the input archives when displaying the date and time, or interpreting the -S and -T options. The default is to initially use the timezone of the local host.
- -Z timezone, --timezone=timezone
- Use timezone when displaying the date and time, or interpreting the -S and -T options. Timezone is in the format of the environment variable TZ as described in environ(7).
- -?, --help
- Display usage message and exit.
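The two volume-switching rules described for -v (a record count, plus the fixed 2^31 byte ceiling) can be pictured with a small Python sketch. This is not pmlogreduce's implementation: the function assign_volumes, the idea of working from a list of record sizes, and the omission of any per-volume header overhead are all assumptions made to keep the example short.

```python
MAX_VOLUME_BYTES = 2**31  # per-volume size limit stated in the text


def assign_volumes(record_sizes, volsamples=None):
    """Return, for each record, the volume number it would be written to.

    Hypothetical model: a new volume starts once volsamples records have
    been written to the current one, or when adding the next record would
    push the current volume past MAX_VOLUME_BYTES.
    """
    volumes = []
    volume = 0
    records_in_volume = 0
    bytes_in_volume = 0
    for size in record_sizes:
        over_count = volsamples is not None and records_in_volume >= volsamples
        over_bytes = bytes_in_volume + size > MAX_VOLUME_BYTES
        # Never roll over an empty volume, or an oversized record would
        # trigger an endless chain of empty volumes.
        if records_in_volume > 0 and (over_count or over_bytes):
            volume += 1
            records_in_volume = 0
            bytes_in_volume = 0
        volumes.append(volume)
        records_in_volume += 1
        bytes_in_volume += size
    return volumes
```

With volsamples=2 and five small records, the sketch places two records in each of volumes 0 and 1 and the last record in volume 2; without volsamples, only the byte ceiling forces a switch.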
DATA REDUCTION¶The statistical and temporal reduction obeys the following rules:
- Consecutive records from input are read without interpolation, and at most one output record is written for each interval, summarizing the performance data over that period.
- If the semantics of a metric indicates it is instantaneous or discrete then the output value is computed as the arithmetic mean of the observations (if any) over each interval.
- If the semantics of a metric indicates it is a counter then the following transformations are applied:
- Metrics with 32-bit precision are promoted to 64-bit precision.
- Any counter wrap (overflow) is noted, and appropriate adjustment made in the value of the metric over each interval. This will be correct in the case of a single counter wrap, but will silently underestimate in the case where more than one counter wrap occurs between consecutive observations in the input archives, and silently overestimate in the case where a counter reset occurs between consecutive observations in the input archives; unfortunately these situations cannot be detected, but are believed to be rare events for the sort of production monitoring environments where pmlogreduce is most likely to be deployed.
- Any changes in instance domains, and indeed all metadata, are preserved.
- Any "mark" records in the input archives (as created by pmlogextract(1)) will be preserved in the output archive, so periods where no data is available are maintained, and data interpolation will not occur across these periods when the output archive is subsequently processed with PCP applications.
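Two of the rules above lend themselves to a short worked sketch in Python: the per-interval arithmetic mean for instantaneous or discrete metrics, and the single-wrap adjustment for counters. The function names, the (timestamp, value) tuple layout, and the choice to bucket by offset from a start time are assumptions for illustration, not a description of pmlogreduce internals.

```python
from collections import defaultdict


def mean_per_interval(observations, start, interval):
    """Map interval index -> arithmetic mean of the values seen in it.

    observations is an iterable of (timestamp, value) pairs; intervals
    with no observations produce no entry, mirroring "at most one output
    record ... for each interval".
    """
    buckets = defaultdict(list)
    for timestamp, value in observations:
        index = int((timestamp - start) // interval)
        buckets[index].append(value)
    return {index: sum(values) / len(values)
            for index, values in sorted(buckets.items())}


def counter_delta(previous, current, bits=32):
    """Increase between two raw counter readings, assuming at most one wrap.

    A reading lower than its predecessor is treated as a single wrap of a
    counter that is `bits` wide. A second wrap or a counter reset looks
    identical here, which is why those cases cannot be detected.
    """
    if current >= previous:
        return current - previous
    return current + 2**bits - previous
```

For instance, counter_delta(4294967290, 5) treats the drop as one 32-bit wrap and returns 11. If the counter had instead been reset to zero after a reading of 1000 and then counted up to 50, the same formula would report a value close to 2^32 rather than 50, which is the overestimate described above.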
CAVEATS¶The preamble metrics (pmcd.pmlogger.archive, pmcd.pmlogger.host, and pmcd.pmlogger.port), which are automatically recorded by pmlogger at the start of the archive, may not be present in the archive output by pmlogreduce. These metrics are only relevant while the archive is being created, and have no significance once recording has finished.
DIAGNOSTICS¶All error conditions detected by pmlogreduce are reported on stderr with textual (if sometimes terse) explanation.
Should the input archives be corrupted (this can happen if the pmlogger instance writing the archive suddenly dies), then pmlogreduce will detect and report the position of the corruption in the file, and any subsequent information from the input archives will not be processed.
If any error is detected, pmlogreduce will exit with a non-zero status.
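Because failures are signalled through a non-zero exit status plus text on stderr, a calling script can check both. The Python sketch below shows one way to do that for any command; run_and_report is a hypothetical helper name, and nothing here depends on the wording of pmlogreduce's messages.

```python
import subprocess
import sys


def run_and_report(argv):
    """Run a command and return (ok, stderr_text).

    ok is True only for a zero exit status. On failure the captured
    stderr is relayed unchanged, so the tool's own explanation is kept.
    """
    result = subprocess.run(argv, stderr=subprocess.PIPE, text=True)
    if result.returncode != 0:
        sys.stderr.write(result.stderr)
    return result.returncode == 0, result.stderr
```

A caller would pass the full pmlogreduce argument list as argv and stop further processing of the output archive when ok is False.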
FILES¶For each of the input and output archives, several physical files are used.
- metadata (metric descriptions, instance domains, etc.) for the archive log
- initial volume of metrics values (subsequent volumes have suffixes 1, 2, ...) - for input these files may have been previously compressed with bzip2(1) or gzip(1) and thus may have an additional .bz2 or .gz suffix.
- temporal index to support rapid random access to the other files in the archive log.
PCP ENVIRONMENT¶Environment variables with the prefix PCP_ are used to parameterize the file and directory names used by PCP. On each installation, the file /etc/pcp.conf contains the local values for these variables. The $PCP_CONF variable may be used to specify an alternative configuration file, as described in pcp.conf(5).
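The lookup order described above (use $PCP_CONF if set, otherwise /etc/pcp.conf) can be sketched in Python as follows. The parser assumes the file consists of NAME=value lines with blank lines and # comments ignored; consult pcp.conf(5) for the authoritative format. Both function names are hypothetical.

```python
import os


def pcp_conf_path(environ=None):
    """Return the configuration file to read, honouring $PCP_CONF."""
    environ = os.environ if environ is None else environ
    return environ.get("PCP_CONF", "/etc/pcp.conf")


def parse_pcp_conf(text):
    """Collect PCP_* settings from configuration file text.

    Assumed format: one NAME=value pair per line; blank lines and lines
    starting with '#' are skipped, as are names without the PCP_ prefix.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        name = name.strip()
        if name.startswith("PCP_"):
            settings[name] = value.strip()
    return settings
```

Reading the file at pcp_conf_path() and passing its contents to parse_pcp_conf would give a dictionary from which a value such as PCP_BINADM_DIR can be looked up.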