PMC(3) BSD Library Functions Manual PMC(3)
NAME
pmc — library for accessing hardware performance monitoring counters
LIBRARY
Performance Counters Library (libpmc, -lpmc)
SYNOPSIS
#include <pmc.h>
DESCRIPTION
Intel Pentium PMCs are present in Intel Pentium and Pentium MMX proces‐
sors. These PMCs are documented in the "Volume 3B: System Programming
Guide, Part 2", Intel 64 and IA-32 Intel(R) Architectures Software
Developer's Manual, Order Number 253669-024US, Intel Corporation, August
2007.
PMC Features
These CPUs contain two PMCs, each 40 bits wide. These PMCs support the
following capabilities:
Capability          Support
PMC_CAP_CASCADE     No
PMC_CAP_EDGE        No
PMC_CAP_INTERRUPT   No
PMC_CAP_INVERT      No
PMC_CAP_READ        Yes
PMC_CAP_PRECISE     No
PMC_CAP_SYSTEM      Yes
PMC_CAP_TAGGING     No
PMC_CAP_THRESHOLD   No
PMC_CAP_USER        Yes
PMC_CAP_WRITE       Yes
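Because the counters are 40 bits wide, a raw count wraps after 2^40 events. The following is a minimal sketch of the wraparound arithmetic, assuming two raw readings of the same counter and at most one wrap between them. The names P5_PMC_BITS, P5_PMC_MASK and p5_counter_delta are hypothetical and are not part of libpmc; whether a given read path returns raw 40-bit values is not covered here.

```c
#include <assert.h>
#include <stdint.h>

/* Width of a Pentium PMC as stated above (hypothetical macro names). */
#define P5_PMC_BITS 40
#define P5_PMC_MASK ((UINT64_C(1) << P5_PMC_BITS) - 1)

/*
 * Hypothetical helper: number of events between two raw readings of the
 * same counter, assuming the counter wrapped at most once in between.
 */
static uint64_t
p5_counter_delta(uint64_t before, uint64_t after)
{
	/* Unsigned subtraction wraps modulo 2^64; the mask reduces it to 2^40. */
	return ((after - before) & P5_PMC_MASK);
}
```

If more than 2^40 events can occur between readings, the difference is ambiguous and the counter must be sampled more often.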
Event Qualifiers
Event specifiers for Intel Pentium PMCs can have the following common
qualifiers:
duration
Count duration (in clocks) of events. The default is to count
events.
os Measure events at privilege levels 0, 1 and 2.
overflow
Assert the external processor pin associated with a counter on
counter overflow.
usr Measure events at privilege level 3.
If neither the “os” nor the “usr” qualifier is specified, the default is
to enable both.
Some events may only be used on specific counters and some events are
defined only on processors supporting the MMX instruction set. Note that
these PMCs do not have the ability to interrupt the CPU.
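The qualifiers above can be combined with an event name into a single specifier string. The sketch below assumes the comma-separated form "event,qualifier,..." used for libpmc event specifiers, and it also models the default privilege rule described above. The Q_* flag bits and the functions p5_effective_priv and p5_build_spec are hypothetical names invented for this illustration, not libpmc interfaces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bits for this sketch; they are not libpmc constants. */
#define Q_DURATION 0x1u
#define Q_OS       0x2u
#define Q_OVERFLOW 0x4u
#define Q_USR      0x8u

/* Privilege levels actually measured: both when neither "os" nor "usr" is given. */
static unsigned
p5_effective_priv(unsigned quals)
{
	unsigned p = quals & (Q_OS | Q_USR);

	return (p != 0 ? p : (Q_OS | Q_USR));
}

/*
 * Build "event[,qualifier]..." into buf.
 * Returns 0 on success, -1 if the result does not fit in len bytes.
 */
static int
p5_build_spec(char *buf, size_t len, const char *event, unsigned quals)
{
	static const struct { unsigned bit; const char *name; } tbl[] = {
		{ Q_DURATION, "duration" }, { Q_OS, "os" },
		{ Q_OVERFLOW, "overflow" }, { Q_USR, "usr" },
	};
	size_t i, used;
	int n;

	n = snprintf(buf, len, "%s", event);
	if (n < 0 || (size_t)n >= len)
		return (-1);
	used = (size_t)n;
	for (i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
		if ((quals & tbl[i].bit) == 0)
			continue;
		n = snprintf(buf + used, len - used, ",%s", tbl[i].name);
		if (n < 0 || (size_t)n >= len - used)
			return (-1);
		used += (size_t)n;
	}
	return (0);
}
```

For example, calling p5_build_spec() with the event "p5-data-read" and Q_USR yields "p5-data-read,usr", a specifier that counts data reads at privilege level 3 only.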
Intel Pentium Event Specifiers
The event specifiers supported by Intel Pentium PMCs are:
p5-any-segment-register-loaded
(Event 0FH) The number of writes to any segment register, includ‐
ing the LDTR, GDTR, TR and IDTR. Far control transfers and task
switches that involve privilege level changes will count this
event twice.
p5-bank-conflicts
(Event 0AH) The number of actual bank conflicts.
p5-branches
(Event 12H) The number of taken and not taken branches including
branches, jumps, calls, software interrupts and interrupt
returns.
p5-breakpoint-match-on-dr0-register
(Event 23H) The number of matches on the DR0 breakpoint register.
p5-breakpoint-match-on-dr1-register
(Event 24H) The number of matches on the DR1 breakpoint register.
p5-breakpoint-match-on-dr2-register
(Event 25H) The number of matches on the DR2 breakpoint register.
p5-breakpoint-match-on-dr3-register
(Event 26H) The number of matches on the DR3 breakpoint register.
p5-btb-false-entries
(Event 3AH, Pentium MMX) The number of false entries in the BTB.
This event is only allocated on counter 0.
p5-btb-hits
(Event 13H) The number of branches executed that hit in the
branch target buffer (BTB).
p5-btb-miss-prediction-on-not-taken-branch
(Event 3AH, Pentium MMX) The number of times the BTB predicted a
not-taken branch as taken. This event is only allocated on
counter 1.
p5-bus-cycle-duration
(Event 18H) The number of cycles while a bus cycle was in
progress.
p5-bus-ownership-latency
(Event 2AH, Pentium MMX) The time from bus ownership being
requested to ownership being granted. This event is only allo‐
cated on counter 0.
p5-bus-ownership-transfers
(Event 2AH, Pentium MMX) The number of bus ownership transfers.
This event is only allocated on counter 1.
p5-bus-utilization-due-to-processor-activity
(Event 2EH, Pentium MMX) The number of clocks the bus is busy due
to the processor's own activity. This event is only allocated on
counter 0.
p5-cache-line-sharing
(Event 2CH, Pentium MMX) The number of shared data lines in L1
cache. This event is only allocated on counter 1.
p5-cache-m-state-line-sharing
(Event 2CH, Pentium MMX) The number of hits to an M-state line
due to a memory access by another processor. This event is only
allocated on counter 0.
p5-code-cache-miss
(Event 0EH) The number of instruction reads that miss the inter‐
nal code cache. Both cacheable and uncacheable misses are
counted.
p5-code-read
(Event 0CH) The number of instruction reads to both cacheable and
uncacheable regions.
p5-code-tlb-miss
(Event 0DH) The number of instruction reads that miss the
instruction TLB. Both cacheable and uncacheable reads are
counted.
p5-d1-starvation-and-fifo-is-empty
(Event 33H, Pentium MMX) The number of times the D1 stage cannot
issue any instructions because the FIFO was empty. This event is
only allocated on counter 0.
p5-d1-starvation-and-only-one-instruction-in-fifo
(Event 33H, Pentium MMX) The number of times the D1 stage could
issue only one instruction because the FIFO had one instruction
ready. This event is only allocated on counter 1.
p5-data-cache-lines-written-back
(Event 06H) The number of data cache lines that are written back,
including those caused by internal and external snoops.
p5-data-cache-tlb-miss-stall-duration
(Event 30H, Pentium MMX) The number of clocks the pipeline is
stalled due to a data cache TLB miss. This event is only allo‐
cated on counter 1.
p5-data-read
(Event 00H) The number of memory data reads, counting internal
data cache hits and misses. I/O and data memory accesses due to
TLB miss processing are not included. Split cycle reads are
counted individually.
p5-data-read-miss
(Event 03H) The number of memory read accesses that miss the data
cache, counting both cacheable and uncacheable accesses. Data
accesses that are part of TLB miss processing are not included.
I/O accesses are not included.
p5-data-read-miss-or-write-miss
(Event 29H) The number of data reads and writes that miss the
internal data cache, counting uncacheable accesses. Data
accesses due to TLB miss processing are not counted.
p5-data-read-or-write
(Event 28H) The number of data reads and writes including inter‐
nal data cache hits and misses. Data reads due to TLB miss pro‐
cessing are not counted.
p5-data-tlb-miss
(Event 02H) The number of misses to the data cache translation
lookaside buffer.
p5-data-write
(Event 01H) The number of memory data writes, counting internal
data cache hits and misses. I/O is not included and split cycle
writes are counted individually.
p5-data-write-miss
(Event 04H) The number of memory write accesses that miss the
data cache, counting both cacheable and uncacheable accesses.
I/O accesses are not counted.
p5-emms-instructions-executed
(Event 2DH, Pentium MMX) The number of EMMS instructions exe‐
cuted. This event is only allocated on counter 0.
p5-external-data-cache-snoop-hits
(Event 08H) The number of external snoops to the data cache that
hit a valid line, or the data line fill buffer, or one of the
write back buffers.
p5-external-snoops
(Event 07H) The number of external snoop requests accepted,
including snoops that hit in the code cache, the data cache and
that hit in neither.
p5-floating-point-stalls-duration
(Event 32H, Pentium MMX) The number of cycles the pipeline is
stalled due to a floating point freeze. This event is only allo‐
cated on counter 0.
p5-flops
(Event 22H) The number of floating point adds, subtracts, multiplies,
divides and square roots. Transcendental instructions
trigger this event multiple times. Instructions generating
divide-by-zero, negative square root, special operand and stack
exceptions are not counted. Integer multiply instructions that
use the x87 FPU are counted.
p5-full-write-buffer-stall-duration-while-executing-mmx-instructions
(Event 3BH, Pentium MMX) The number of clocks the pipeline has
stalled due to full write buffers when executing MMX instruc‐
tions. This event is only allocated on counter 0.
p5-hardware-interrupts
(Event 27H) The number of taken INTR and NMI interrupts.
p5-instructions-executed
(Event 16H) The number of instructions executed. Repeat prefixed
instructions are counted only once. The HLT instruction is
counted only once, irrespective of the number of cycles spent in
the halted state. All hardware and software exceptions are
counted as instructions, and fault handler invocations are also
counted as instructions.
p5-instructions-executed-v-pipe
(Event 17H) The number of instructions that executed in the V
pipe.
p5-io-read-or-write-cycle
(Event 1DH) The number of bus cycles directed to I/O space.
p5-locked-bus-cycle
(Event 1CH) The number of locked bus cycles that occur on account
of the lock prefixes, LOCK instructions, page table updates and
descriptor table updates.
p5-memory-accesses-in-both-pipes
(Event 09H) The number of data memory reads or writes that are
paired in both pipes.
p5-misaligned-data-memory-or-io-references
(Event 0BH) The number of memory or I/O reads or writes that are
not aligned on natural boundaries. 2- and 4-byte accesses are
counted as misaligned if they cross a 4 byte boundary.
p5-misaligned-data-memory-reference-on-mmx-instructions
(Event 36H, Pentium MMX) The number of misaligned data memory
references when executing MMX instructions. This event is only
allocated on counter 0.
p5-mispredicted-or-unpredicted-returns
(Event 37H, Pentium MMX) The number of returns predicted incor‐
rectly or not at all, only counting RET instructions. This event
is only allocated on counter 0.
p5-mmx-instruction-data-read-misses
(Event 31H, Pentium MMX) The number of MMX instruction data read
misses. This event is only allocated on counter 1.
p5-mmx-instruction-data-reads
(Event 31H, Pentium MMX) The number of MMX instruction data
reads. This event is only allocated on counter 0.
p5-mmx-instruction-data-write-misses
(Event 34H, Pentium MMX) The number of data write misses caused
by MMX instructions. This event is only allocated on counter 1.
p5-mmx-instruction-data-writes
(Event 34H, Pentium MMX) The number of data writes caused by MMX
instructions. This event is only allocated on counter 0.
p5-mmx-instructions-executed-u-pipe
(Event 2BH, Pentium MMX) The number of MMX instructions executed
in the U pipe. This event is only allocated on counter 0.
p5-mmx-instructions-executed-v-pipe
(Event 2BH, Pentium MMX) The number of MMX instructions executed
in the V pipe. This event is only allocated on counter 1.
p5-mmx-multiply-unit-interlock
(Event 38H, Pentium MMX) The number of clocks the pipeline is
stalled because the destination of a prior MMX multiply is not
ready. This event is only allocated on counter 0.
p5-movd-movq-store-stall-due-to-previous-mmx-operation
(Event 38H, Pentium MMX) The number of clocks a MOVD/MOVQ
instruction stalled in the D2 stage of the pipeline due to a pre‐
vious MMX instruction. This event is only allocated on counter
1.
p5-noncacheable-memory-reads
(Event 1EH) The number of bus cycles for non-cacheable instruc‐
tion or data reads, including cycles caused by TLB misses.
p5-number-of-cycles-not-in-halt-state
(Event 30H, Pentium MMX) The number of cycles the processor is
not idle due to the HLT instruction. This event is only allo‐
cated on counter 0.
p5-pipeline-agi-stalls
(Event 1FH) The number of address generation interlock stalls.
An AGI that occurs in both the U and V pipelines in the same
clock signals the event twice.
p5-pipeline-flushes
(Event 15H) The number of pipeline flushes that occur. Pipeline
flushes are caused by branch mispredicts, exceptions, interrupts,
some segment register loads, and BTB misses. Prefetch queue
flushes due to serializing instructions are not counted.
p5-pipeline-flushes-due-to-wrong-branch-predictions
(Event 35H, Pentium MMX) The number of pipeline flushes due to
wrong branch predictions resolved in either the E- or WB- stage
of the pipeline. This event is only allocated on counter 0.
p5-pipeline-flushes-due-to-wrong-branch-predictions-resolved-in-wb-stage
(Event 35H, Pentium MMX) The number of pipeline flushes due to
wrong branch predictions resolved in the WB- stage of the pipeline.
This event is only allocated on counter 1.
p5-pipeline-stall-for-mmx-instruction-data-memory-reads
(Event 36H, Pentium MMX) The number of clocks during pipeline
stalls caused by waiting MMX data memory reads. This event is
only allocated on counter 1.
p5-predicted-returns
(Event 37H, Pentium MMX) The number of predicted returns, whether
correct or incorrect. This counter only counts RET instructions.
This event is only allocated on counter 1.
p5-returns
(Event 39H, Pentium MMX) The number of RET instructions executed.
This event is only allocated on counter 0.
p5-saturating-mmx-instructions-executed
(Event 2FH, Pentium MMX) The number of saturating MMX instruc‐
tions executed. This event is only allocated on counter 0.
p5-saturations-performed
(Event 2FH, Pentium MMX) The number of saturating MMX instructions
executed in which at least one result was actually saturated. This
event is only allocated on counter 1.
p5-stall-on-mmx-instruction-write-to-e-o-m-state-line
(Event 3BH, Pentium MMX) The number of clocks during stalls on
MMX instructions writing to E- or M- state cache lines. This
event is only allocated on counter 1.
p5-stall-on-write-to-an-e-or-m-state-line
(Event 1BH) The number of stalls on a write to an exclusive or
modified data cache line.
p5-taken-branch-or-btb-hit
(Event 14H) The number of events that may cause a hit in the BTB,
namely either taken branches or BTB hits.
p5-taken-branches
(Event 32H, Pentium MMX) The number of taken branches. This
event is only allocated on counter 1.
p5-transitions-between-mmx-and-fp-instructions
(Event 2DH, Pentium MMX) The number of transitions from MMX to
floating-point instructions and vice versa. This event is
only allocated on counter 1.
p5-waiting-for-data-memory-read-stall-duration
(Event 1AH) The number of clocks the pipeline was stalled waiting
for data memory reads. Data TLB miss processing is included in
this count.
p5-write-buffer-full-stall-duration
(Event 19H) The number of clocks while the pipeline was stalled
due to write buffers being full.
p5-write-hit-to-m-or-e-state-lines
(Event 05H) The number of writes that hit exclusive or modified
lines in the data cache.
p5-writes-to-noncacheable-memory
(Event 2EH, Pentium MMX) The number of writes to non-cacheable
memory, including write cycles caused by TLB misses and I/O
writes. This event is only allocated on counter 1.
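Several Pentium MMX events in the list above share an event number and are distinguished only by the counter they are allocated on, so two events restricted to the same counter cannot be measured at the same time. The sketch below models that constraint for a small subset of the events, assuming that events without a stated restriction may use either counter. The structure, the table and the functions p5_find and p5_can_pair are hypothetical illustrations; they do not describe how libpmc or hwpmc(4) schedule counters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bit i set means the event may be placed on counter i (hypothetical encoding). */
#define P5_C0  0x1u
#define P5_C1  0x2u
#define P5_ANY (P5_C0 | P5_C1)

struct p5_event {
	const char *name;
	unsigned code;		/* event number from the list above */
	unsigned counters;	/* counters the event may be allocated on */
};

/* A small subset of the events listed above, for illustration only. */
static const struct p5_event p5_events[] = {
	{ "p5-data-read",				0x00, P5_ANY },
	{ "p5-instructions-executed",			0x16, P5_ANY },
	{ "p5-bus-ownership-latency",			0x2A, P5_C0 },
	{ "p5-bus-ownership-transfers",			0x2A, P5_C1 },
	{ "p5-floating-point-stalls-duration",		0x32, P5_C0 },
	{ "p5-taken-branches",				0x32, P5_C1 },
	{ "p5-btb-false-entries",			0x3A, P5_C0 },
	{ "p5-btb-miss-prediction-on-not-taken-branch",	0x3A, P5_C1 },
};

/* Look up an event by name; returns NULL if it is not in the subset. */
static const struct p5_event *
p5_find(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(p5_events) / sizeof(p5_events[0]); i++)
		if (strcmp(p5_events[i].name, name) == 0)
			return (&p5_events[i]);
	return (NULL);
}

/* 1 if a and b can occupy the two counters at once, 0 if not, -1 if unknown. */
static int
p5_can_pair(const char *a, const char *b)
{
	const struct p5_event *ea = p5_find(a), *eb = p5_find(b);

	if (ea == NULL || eb == NULL)
		return (-1);
	return (((ea->counters & P5_C0) && (eb->counters & P5_C1)) ||
	    ((ea->counters & P5_C1) && (eb->counters & P5_C0)));
}
```

With this table, "p5-btb-false-entries" and "p5-bus-ownership-latency" cannot be paired because both require counter 0, while "p5-btb-false-entries" and "p5-taken-branches" can.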
Event Name Aliases
The following table shows the mapping between the PMC-independent aliases
supported by Performance Counters Library (libpmc, -lpmc) and the under‐
lying hardware events used.
Alias               Event
branches            p5-taken-branches
branch-mispredicts  (unsupported)
dc-misses           p5-data-read-miss-or-write-miss
ic-misses           p5-code-cache-miss
instructions        p5-instructions-executed
interrupts          p5-hardware-interrupts
unhalted-cycles     p5-number-of-cycles-not-in-halt-state
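The alias table above amounts to a simple name lookup. The following sketch reproduces it as a static table, assuming an unsupported alias is represented by a NULL event name. The table p5_aliases and the function p5_resolve_alias are hypothetical and only illustrate the mapping; they are not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The alias table above; a NULL event marks an unsupported alias. */
static const struct { const char *alias; const char *event; } p5_aliases[] = {
	{ "branches",		"p5-taken-branches" },
	{ "branch-mispredicts",	NULL },
	{ "dc-misses",		"p5-data-read-miss-or-write-miss" },
	{ "ic-misses",		"p5-code-cache-miss" },
	{ "instructions",	"p5-instructions-executed" },
	{ "interrupts",		"p5-hardware-interrupts" },
	{ "unhalted-cycles",	"p5-number-of-cycles-not-in-halt-state" },
};

/*
 * Hypothetical helper: return the Pentium event behind an alias, or NULL
 * if the alias is unsupported on these CPUs or is not a known alias.
 */
static const char *
p5_resolve_alias(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(p5_aliases) / sizeof(p5_aliases[0]); i++)
		if (strcmp(p5_aliases[i].alias, name) == 0)
			return (p5_aliases[i].event);
	return (NULL);
}
```

Note that, per the event list, "p5-taken-branches" and "p5-number-of-cycles-not-in-halt-state" are Pentium MMX events tied to counter 1 and counter 0 respectively, so the "branches" and "unhalted-cycles" aliases inherit those restrictions.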
SEE ALSO
pmc(3), pmc.atom(3), pmc.core(3), pmc.core2(3), pmc.iaf(3), pmc.k7(3),
pmc.k8(3), pmc.p4(3), pmc.p6(3), pmc.tsc(3), pmclog(3), hwpmc(4)
HISTORY
The pmc library first appeared in FreeBSD 6.0.
AUTHORS
The Performance Counters Library (libpmc, -lpmc) was written by
Joseph Koshy ⟨jkoshy@FreeBSD.org⟩.
BSD October 4, 2008 BSD