\documentclass[14pt]{article}
\usepackage{hyperref}
\setlength{\paperwidth}{12cm}
\setlength{\textwidth}{11cm}
\setlength{\oddsidemargin}{-2.04cm}
\title{Possible Intel PMU Bug}
\date{Mar 1, 2014}
\author{Mehmet Kayaalp}
\begin{document}
\maketitle\begin{center}
Tags: %
\href{/tags/#PMU}{PMU}, \href{/tags/#x86}{x86}\end{center}
Intel provides a bunch of (non-architectural) performance monitoring events to count uops.
I have been trying to use the ones related to cache behavior of loads in a loadable kernel module.
The odd thing is that, even though the LKM runs in ring 0,
the counters that I program to count ring-0 events do not work
for the uop-related events.
Setting the counter to count user-level events instead, however, gives (somewhat) meaningful data.
I suspect this is a bug in Sandy Bridge processors.
The assembly sequence below should work, but it does not: PMC0 always reads 0.
If the value written to the \verb|PERFEVTSEL0| is changed to \verb|0x4181D0|,
then the results become non-zero.
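
The two select values above differ only in their privilege-level bits.
As a rough illustration, the C sketch below composes both values from their fields,
assuming the \verb|IA32_PERFEVTSELx| layout from the Intel SDM (USR at bit 16, OS at bit 17, EN at bit 22).
The helper \verb|perfevtsel_encode| and the macro names are hypothetical and are not part of any kernel API.

\begin{verbatim}
#include <stdint.h>
#include <stdio.h>

/* Flag bits of IA32_PERFEVTSELx (assumed from the Intel SDM layout). */
#define PERFEVTSEL_USR (1u << 16)  /* count while running above ring 0 */
#define PERFEVTSEL_OS  (1u << 17)  /* count while running in ring 0 */
#define PERFEVTSEL_EN  (1u << 22)  /* enable the counter */

/* Hypothetical helper: event number in bits 0-7, umask in bits 8-15. */
static uint32_t perfevtsel_encode(uint8_t event, uint8_t umask,
                                  uint32_t flags)
{
    return (uint32_t)event | ((uint32_t)umask << 8) | flags;
}

int main(void)
{
    /* MEM_UOP_RETIRED.ALL_LOADS: event 0xD0, umask 0x81 */
    uint32_t ring0 = perfevtsel_encode(0xD0, 0x81,
                                       PERFEVTSEL_OS | PERFEVTSEL_EN);
    uint32_t user  = perfevtsel_encode(0xD0, 0x81,
                                       PERFEVTSEL_USR | PERFEVTSEL_EN);

    printf("ring-0 select: 0x%06X\n", (unsigned)ring0);
    printf("user select:   0x%06X\n", (unsigned)user);
    return 0;
}
\end{verbatim}

Under these assumptions the first value comes out as \verb|0x4281D0| and the second as \verb|0x4181D0|,
so the only difference between the failing and the working configuration is OS versus USR.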
\begin{verbatim}
; write 0 to PMC0
xor %edx,%edx
xor %eax,%eax
mov $0xC1,%ecx
wrmsr

; set PERFEVTSEL0 to count MEM_UOP_RETIRED.ALL_LOADS
; 0x4----- means enable counting for PMC0
; 0x-2---- means ring-0
; 0x--81-- is the umask value
; 0x----D0 is the event number
mov $0x4281D0,%eax
mov $0x186,%ecx
wrmsr

; do a bunch of loads here...

; read PMC0 into %edx:%eax
xor %ecx,%ecx
rdpmc
\end{verbatim}
\end{document}