From Collaborative RCE Knowledge Library
Super-secret debug capabilities of AMD processors !
| Item name: | Super-secret debug capabilities of AMD processors ! |
|---|---|
| Author: | Czernobyl aka Czerno |
| Home URL: | http://www.czerno.tk/ |
| Last updated: | December 26, 2010 |
| Version (if appl.): | 0.71 + minor edit |
| Direct D/L link: | N/A |
| Description: | ** Super-secret debug capabilities of Athlon XP and better AMD processors ! **

Here unveiled by Czernobyl aka Czerno - Mail : <me AT czerno.tk>

Article /in fieri/, as Fravia+ liked to say (http://en.wikipedia.org/wiki/Fravia). To be released at a later date under liberal copyright/copyleft options. Meanwhile please do NOT republish; rather, link to the original document! I shall retain the 'moral' rights stemming from authorship.

°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

Click the "more details" button or link downpage to view additional notes!

°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

Hardly was this page released when news of it spread like wildfire : as a result this page and the entire site disappeared from the web for a few hours, /slash-dotted/ :=) Speculation about the hidden feature is all over the web now, and that is good in the end : hopefully we'll see it used in application and/or kernel debuggers - Linux anybody ?

Amidst a ton of comments were somber questions about security. IMO what is described herein does not pose security problems; after all, MSRs and Control Registers aren't accessible except from ring zero. Nor are the Host's CRs and MSRs accessible from a properly designed virtual machine. I doubt the newly disclosed features will open security risks that were not already present due to poor OS and/or virtualization system designs...

°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

[Nov 18] AMD's public relations have commented, denying security risks (me too!), saying the features were /not/ "secret" (semantics...), that they may vary as processors evolve (alright, provided they get better and not buried), and finally that those are for factory testing - which I /don't/ quite buy !
Processors have many features embedded to that end, including JTAG; expanded conditional breakpoints do not make much sense in chip testing, but make a lot of sense for ordinary and systems software debugging. Additionally the expanded debug logic is /always/ active as we shall see, so much for the fiction of "for post-factory testing only".

°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

Very important : you can help! Yes, YOU!

- This document was published before completing proper testing, to create public awareness and elicit collaborative discovery and testing. Not only is the information given incomplete at the moment, it may contain *errors*. The page gets updated as we learn more in common.

- If you have the know-how and can donate time, please run your own trials of the features and contact me about any errors, inaccuracies or additions you find! What's needed now is thorough and tedious combinatorial cross-checking of the effect of the _Control_ MSR bits vs DR7 vs null/non-null address and data masks...

- We also want to ascertain whether the features are consistent among generations of AMD CPUs. Send your observations with the type of CPU, preferably CPUID info. 64-bit capable processors should be tested for operation in "long mode" too.

°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

_Summary_ :

Athlon XP and later processors from AMD include firmware-based debugging features that greatly expand on the standard, architecturally defined capabilities of X86. For some reason though, AMD has been tightly secretive about these features; their existence was first guessed after considering a list of undocumented MSRs found on CBID's page (URL below). Herein we report the results of our ongoing experiments, in the hope they may be useful to software developers, & possibly be included in future debuggers or debugger plug-ins.
I call the new capabilities "expanded", since the term "debug extensions" was already used and refers to other features in Pentium and later processors. The author can be contacted by email, or PM, or in the thread on the reversing forum.

°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

_New MSRs_ :

Four new machine specific registers (MSR) are involved in the expanded debug facilities. Those MSRs are "password" protected against casual access : read/write access (RDMSR/WRMSR) to the registers is granted only if EDI holds the correct password value, viz. EDI=9C5A203A. Otherwise, a general protection fault (GPF) occurs.

_Control_       @ C001_1024   width: 8 bits
_Data_Match_    @ C001_1025   width: 32 bits [AMD64: 64 bits?]
_Data_Mask_     @ C001_1026   - ditto -
_Address_Mask_  @ C001_1027   width: 12 bits

The default/reset value of each of these registers is zero.

[In the "BIOS and Kernel developer's guide" for AMD NPT Family 0Fh, the Control Register is acknowledged as "enables AMD debug extensions", having the following layout :

31-8  reserved, SBZ
7     EHM  Enter HDT Mode (R/W)
6-0   reserved, SBZ

...where "HDT mode" is the AMD hardware debug tool, using the JTAG bus. We'll leave the EHM bit alone and focus on the "reserved, SBZ" ones... ]

°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

_Operational details_

The operation of breakpoint *BP0* (using DR0) is expanded as will be described (breakpoints BP1 to BP3 are unaffected). Breakpoint *BP0* is controlled by the new MSRs in addition to good old DR7.

It should be noted that the expanded operation of BP0 *cannot be switched off* as far as we know : as soon as the address in DR0 is validated by DR7 (bits 0 and/or 1), it operates as will be explained; there is no further enabling bit. The AMD engineers presumably managed to hide the new steps in the pipeline such that they don't add visible latencies...
°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

1) General notions about the "Mask" MSRs :

Both in Data_Mask and Address_Mask, _set_ (=1) bits are interpreted as "don't care". Formally, when a comparison of two values/addresses needs to occur "under" Mask, the Mask is OR-ed into both and then the compare is done. A match occurs if and only if the masked values are equal.

A mask value of zero thus requires an exact match (considering compare length). Conversely, a data mask consisting of all ones in effect disables data matching : the comparison will succeed for any value. The largest allowable address mask value, 00000FFF, matches a page-aligned block of 4096 consecutive memory or port addresses.

As for Address_Mask, it should be a string of zeroes terminated by (zero or more) contiguous binary ones, IOW represent a power of two minus one. However that logical requirement is not enforced by the AMD firmware.

°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

2) The Address_Mask:

It is used *unconditionally* for all types of BP, whether instruction execution, memory or IO data. A mask of zero in effect switches address expansion off, mimicking legacy breakpoint behavior.

°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

3) Instruction breakpoint (DR7 type =0):

Triggered at _any_ address matching DR0 under Address_Mask. The _Control_ MSR has no effect (should be zero). Data_Match and Data_Mask are not used for this type of breakpoint.

°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

4) Memory Data breakpoint (DR7 types 1 and 3):

The DR0 address is expanded according to Address_Mask, as always. Then,

- Case: *Control = 0* (default) : no additional check is performed. Break occurs either on Write only or on Any_Access, according to the BP type selected in DR7.
- Case: *Control = 2* : breaks will occur on WRITE to memory (even if the DR7 type is RW, no breaks happen on Read). The Data_Match check is performed; the CPU break (trap 01) will happen on MATCH.
- Case: *Control = 3* : same as Control = 2, except the data condition is reversed, viz. traps on NON-MATCH.
- Case: *Control = 1* : break on ACCESS to ANY memory (address mask notwithstanding!) on Data Match. This last case is powerful and... dangerous. You'd better not rely on the int 1 routine in your vanilla debugger to sort it out! [Checked only for Write type breakpoints at the moment of this writing.]

°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

5) IN/OUT data breakpoint :

The address (i.e., port number) is subject to expansion based upon Address_Mask. Additionally, data read/written *may be* compared with Data_Match under Data_Mask, as decided by the _Control_ MSR :

= Control = 2 : break on OUT, on *Data Match*.
= Control = 3 : break on OUT, on *Data NON-Match*.
= Control = 8 : break on IN, on *Data Match*.
= Control = 9 : break on IN, on *Data NON-Match*.
= Control = 0 (default) : break on *any IO*, *ignore data matching*.
= Control = 1 : break on *OUT* to ANY port (address mask bypassed!), *on Data Match*.

Reminder: I/O breakpoints need CR4 bit 3 (DE) set.

°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

Copyright (c) 2010 by Czerno. Knowledge is free. |
| Related URLs: | |
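The "compare under Mask" rule from section 1 of the description can be sketched in a few lines of Python. This is only a model of the rule as stated there (OR the mask into both operands, then test for equality); the function names are made up for illustration, the compare length is ignored, and nothing here touches real hardware.

```python
def matches_under_mask(a: int, b: int, mask: int) -> bool:
    """True if a and b are equal once every set ("don't care") mask bit is forced to 1."""
    return (a | mask) == (b | mask)


def is_well_formed_address_mask(mask: int) -> bool:
    """True if mask is zeroes followed by contiguous ones (2**n - 1) and fits in 12 bits."""
    return 0 <= mask <= 0xFFF and (mask & (mask + 1)) == 0


def address_range(dr0: int, mask: int) -> range:
    """Addresses matching dr0 under a well-formed mask: one aligned block of mask + 1 addresses."""
    if not is_well_formed_address_mask(mask):
        # Ill-formed masks are reportedly accepted by the CPU; this sketch does not model them.
        raise ValueError("sketch only covers masks of the form 2**n - 1, at most 0xFFF")
    base = dr0 & ~mask
    return range(base, base + mask + 1)


if __name__ == "__main__":
    block = address_range(0x401234, 0xFFF)
    print(hex(block.start), len(block))
    print(matches_under_mask(0x401234, 0x401FFF, 0xFFF))
```

With a mask of zero, `matches_under_mask` degenerates to plain equality, which is the legacy breakpoint behaviour; with an all-ones data mask it accepts any value.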
Here below you will find useful notes about this tool, left by other users.
Feature orthogonality...
When defining processor features, orthogonality is always desirable, although it may sometimes conflict with other design goals.
While rev-engineering unknown / undocumented features, orthogonality is even more desirable, often allowing us to successfully guess our way to the solution.
Unfortunately the individual features of AMD's expanded debug don't satisfy the orthogonality criterion, making the reverser's road all the harder !
More concretely, it would be nice if every Control_MSR bit controlled some independent feature (for instance, one bit might turn Address masking on or off, another bit would do the same for Data masking, and so on...) but it turns out it is not so. There is some degree of orthogonality in the design; for an example cf. cases Ctl=2,3,8,9 for expanded IO breakpoints : one would be glad to conclude that "bit 0" means "reverse the condition on data compare". Unfortunately this interpretation no longer holds if you consider the cases Ctl=0 vs. Ctl=1.
However, some cases of apparent non-orthogonality might stem from incomplete implementation of planned features : by setting Ctl=8 or 9, I/O breakpoints can be defined to trigger on READs only (+data and address matching), whereas MEMORY data BPs do not have the corresponding possibility (on my test processor at least, Ctl=8/9 for data BPs act the same as Ctl=0/1, i.e. bit 3 has no effect.)
It would be desirable to have Memory DATA BPs that fire only on READs. Maybe on newer AMD processors that is possible too ? Can you check it, and other unknown cases too, using the principle of orthogonality as far as it applies ? Your comments are welcome! -- Czerno on Dec 1st, 2010
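For anyone cross-checking, the I/O breakpoint cases listed in the description can be written down as a small Python table, which also makes the "bit 0" observation mechanical to test. This is a sketch only: the table is a transcription of the six Control values reported for I/O breakpoints, the names (`IO_CONTROL`, `io_breakpoint_fires`, `bit0_only_inverts_data_condition`) are invented here, the data compare length is ignored, and it assumes BP0 is already set up as an I/O breakpoint in DR7 with CR4.DE set.

```python
# control value -> (direction that can trigger, data condition, address mask honoured?)
# Transcribed from the I/O breakpoint list; values not reported there are left out.
IO_CONTROL = {
    0: ("any", "ignore", True),
    1: ("out", "match", False),
    2: ("out", "match", True),
    3: ("out", "non-match", True),
    8: ("in", "match", True),
    9: ("in", "non-match", True),
}


def io_breakpoint_fires(control, direction, port, data,
                        dr0, address_mask, data_match, data_mask):
    """Model of whether BP0 would fire for one IN/OUT access, per the table above."""
    wanted_direction, condition, use_address = IO_CONTROL[control]
    if wanted_direction != "any" and direction != wanted_direction:
        return False
    if use_address and (port | address_mask) != (dr0 | address_mask):
        return False
    if condition == "ignore":
        return True
    equal = (data | data_mask) == (data_match | data_mask)
    return equal if condition == "match" else not equal


def bit0_only_inverts_data_condition(even_control):
    """Test the guess 'bit 0 = reverse the data compare' for the pair (n, n | 1)."""
    dir_a, cond_a, masked_a = IO_CONTROL[even_control]
    dir_b, cond_b, masked_b = IO_CONTROL[even_control | 1]
    same_otherwise = dir_a == dir_b and masked_a == masked_b
    inverted = {cond_a, cond_b} == {"match", "non-match"}
    return same_otherwise and inverted


if __name__ == "__main__":
    for even in (0, 2, 8):
        print(even, even | 1, bit0_only_inverts_data_condition(even))
```

The guess holds for the pairs 2/3 and 8/9 and fails for 0/1, which is the irregularity discussed in this note. Extending the table with newly observed Control values (or a second table for memory data breakpoints) is left to whoever runs the tests.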