.TH "PAPI_profil" 3 "Thu Dec 14 2023" "Version 7.1.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_profil \- Generate a histogram of hardware counter overflows vs\&. PC addresses\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBC Interface:\fP
.RS 4
#include <\fBpapi\&.h\fP>
.br
int \fBPAPI_profil(void *buf, unsigned bufsiz, unsigned long offset, unsigned scale, int EventSet, int EventCode, int threshold, int flags )\fP;
.RE
.PP
\fBFortran Interface\fP
.RS 4
The profiling routines have no Fortran interface\&.
.RE
.PP
\fBParameters\fP
.RS 4
\fI*buf\fP -- pointer to a buffer of bufsiz bytes in which the histogram counts are stored as an array of unsigned short, unsigned int, or unsigned long long values ('buckets')\&. The size of the buckets is determined by values in the flags argument\&.
.br
\fIbufsiz\fP -- the size of the histogram buffer in bytes\&. It is computed from the length of the code region to be profiled, the size of the buckets, and the scale factor, as described below\&.
.br
\fIoffset\fP -- the start address of the region to be profiled\&.
.br
\fIscale\fP -- broadly and historically speaking, a contraction factor that indicates how much smaller the histogram buffer is than the region to be profiled\&. More precisely, scale is interpreted as an unsigned 16-bit fixed-point fraction with the binary point implied on the left\&. Its value is the reciprocal of the number of addresses in a subdivision, per counter of the histogram buffer\&. A table of representative values for scale is given below\&.
.br
\fIEventSet\fP -- the PAPI EventSet to profile\&. This EventSet is marked as profiling-ready, but profiling does not actually start until a \fBPAPI_start()\fP call is issued\&.
.br
\fIEventCode\fP -- code of the event in the EventSet to profile\&. This event must already be a member of the EventSet\&.
.br
\fIthreshold\fP -- minimum number of events that must occur before the PC is sampled\&.
If hardware overflow is supported for your component, this threshold will trigger an interrupt when reached\&. Otherwise, the counters will be sampled periodically and the PC will be recorded for the first sample that exceeds the threshold\&. If the value of threshold is 0, profiling will be disabled for this event\&.
.br
\fIflags\fP -- bit pattern to control profiling behavior\&. Defined values are listed below\&.
.RE
.PP
\fBReturn values\fP
.RS 4
\fIPAPI_OK\fP
.br
\fIPAPI_EINVAL\fP One or more of the arguments is invalid\&.
.br
\fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&.
.br
\fIPAPI_ENOEVST\fP The EventSet specified does not exist\&.
.br
\fIPAPI_EISRUN\fP The EventSet is currently counting events\&.
.br
\fIPAPI_ECNFLCT\fP The underlying counter hardware cannot count this event and other events in the EventSet simultaneously\&.
.br
\fIPAPI_ENOEVNT\fP The PAPI preset is not available on the underlying hardware\&.
.RE
.PP
\fBPAPI_profil()\fP provides hardware event statistics by profiling the occurrence of specified hardware counter events\&. It is designed to mimic the UNIX SVR4 profil call\&.
.PP
The statistics are generated by creating a histogram of hardware counter event overflows vs\&. program counter addresses for the current process\&. The histogram is defined for a specific region of program code to be profiled, and the identified region is logically broken up into a set of equal-size subdivisions, each of which corresponds to a count in the histogram\&.
.PP
With each hardware event overflow, the current subdivision is identified and its corresponding histogram count is incremented\&. These counts establish a relative measure of how many hardware counter events are occurring in each code subdivision\&.
.PP
The resulting histogram counts for a profiled region can be used to identify those program addresses that generate a disproportionately high percentage of the event of interest\&.
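.PP
The relationship between scale, bufsiz, and the bucket selected for a sampled PC can be sketched in C\&. This is a minimal illustration and not PAPI's internal code: the helper names sketch_bufsiz and sketch_bucket_index are hypothetical, and the index rule is an assumption chosen to agree with the bufsiz formula and the table of representative scale values given in this page\&.
.PP
.nf
```c
#include <assert.h>

/* Hypothetical helper, not part of PAPI: histogram size in bytes.
 * region_len  = end - start of the profiled region, in bytes
 * bucket_size = 2, 4, or 8, matching the PAPI_PROFIL_BUCKET_* flag
 * scale       = the scale argument, 2 through 131072
 * Evaluates region_len * (bucket_size / 2) * (scale / 65536), doing the
 * multiplications first so integer division does not truncate early.
 * Assumes the product fits in 64 bits. */
static unsigned long long
sketch_bufsiz(unsigned long long region_len, unsigned bucket_size,
              unsigned scale)
{
    assert(bucket_size == 2 || bucket_size == 4 || bucket_size == 8);
    return (region_len * bucket_size * scale) / (2ULL * 65536ULL);
}

/* Hypothetical helper, not part of PAPI: bucket index for a sampled pc.
 * Assumed rule: scale 131072 gives one address per bucket and
 * scale 65536 gives two, as in the table of representative values. */
static unsigned long long
sketch_bucket_index(unsigned long long pc, unsigned long long offset,
                    unsigned scale)
{
    return ((pc - offset) * scale) >> 17;
}
```
.fi
.PP
Under these assumptions, a 4096-byte region profiled with 16-bit buckets and a scale of 65536 needs a 4096-byte buffer, and a PC that lies 6 bytes past offset falls into bucket 3\&.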
.PP
Events to be profiled are specified with the EventSet and EventCode parameters\&. More than one event can be profiled simultaneously by calling \fBPAPI_profil()\fP several times with different EventCode values\&. Profiling can be turned off for a given event by calling \fBPAPI_profil()\fP with a threshold value of 0\&.
.PP
\fBRepresentative values for the scale variable\fP
.RS 4
.PP
.nf
HEX      DECIMAL  DEFINITION
0x20000  131072   Maps precisely one instruction address to a unique bucket in buf\&.
0x10000  65536    Maps precisely two instruction addresses to a unique bucket in buf\&.
0x0FFFF  65535    Maps approximately two instruction addresses to a unique bucket in buf\&.
0x08000  32768    Maps every four instruction addresses to a bucket in buf\&.
0x04000  16384    Maps every eight instruction addresses to a bucket in buf\&.
0x00002  2        Maps all instruction addresses to the same bucket in buf\&.
0x00001  1        Undefined\&.
0x00000  0        Undefined\&.
.fi
.RE
.PP
Historically, the scale factor was introduced to allow the allocation of buffers smaller than the code size to be profiled\&. Data and instruction sizes were assumed to be multiples of 16 bits\&. These assumptions are no longer necessarily true\&. \fBPAPI_profil()\fP has preserved the traditional definition of scale where appropriate, but has deprecated the definitions for 0 and 1 (which disabled scaling) and extended the range of scale to include 65536 and 131072, allowing exactly two addresses and exactly one address per profiling bucket, respectively\&.
.PP
The value of bufsiz is computed as follows:
.PP
bufsiz = (end \- start)*(bucket_size/2)*(scale/65536)
.br
where
.PD 0
.IP "\(bu" 2
bufsiz - the size of the buffer in bytes
.IP "\(bu" 2
end, start - the ending and starting addresses of the profiled region
.IP "\(bu" 2
bucket_size - the size of each bucket in bytes; 2, 4, or 8 as defined in flags
.PP
\fBDefined bits for the flags variable:\fP
.RS 4
.PD 0
.IP "\(bu" 2
PAPI_PROFIL_POSIX Default type of profiling, similar to profil(3)\&.
.br
.IP "\(bu" 2
PAPI_PROFIL_RANDOM Drop a random 25% of the samples\&.
.br
.IP "\(bu" 2
PAPI_PROFIL_WEIGHTED Weight the samples by their value\&.
.br
.IP "\(bu" 2
PAPI_PROFIL_COMPRESS Ignore samples as the values in the buckets get large\&.
.br
.IP "\(bu" 2
PAPI_PROFIL_BUCKET_16 Use unsigned short (16 bit) buckets\&. This is the default bucket size\&.
.br
.IP "\(bu" 2
PAPI_PROFIL_BUCKET_32 Use unsigned int (32 bit) buckets\&.
.br
.IP "\(bu" 2
PAPI_PROFIL_BUCKET_64 Use unsigned long long (64 bit) buckets\&.
.br
.IP "\(bu" 2
PAPI_PROFIL_FORCE_SW Force software overflow in profiling\&.
.br
.PP
.RE
.PP
\fBExample\fP
.RS 4
.PP
.nf
int retval;
unsigned long start, length;
const PAPI_exe_info_t *prginfo;
unsigned short *profbuf;

if ((prginfo = PAPI_get_executable_info()) == NULL)
    handle_error(1);

/* Profile the whole text segment; its bounds are in address_info\&. */
start = (unsigned long)prginfo\->address_info\&.text_start;
length = (unsigned long)prginfo\->address_info\&.text_end \- start;

/* With 16\-bit buckets and a scale of 65536, bufsiz equals length\&. */
profbuf = (unsigned short *)malloc(length);
if (profbuf == NULL)
    handle_error(1);
memset(profbuf, 0x00, length);

if ((retval = PAPI_profil(profbuf, length, start, 65536, EventSet,
     PAPI_FP_INS, 1000000, PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16)) != PAPI_OK)
    handle_error(retval);
.fi
.PP
.RE
.PP
.PP
\fBSee also\fP
.RS 4
\fBPAPI_overflow\fP
.PP
\fBPAPI_sprofil\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.