From bc3dd83c59b1a5d2b8ed0521b983447068cacbbc Mon Sep 17 00:00:00 2001 From: Derek Selander Date: Wed, 6 Nov 2019 09:28:13 -0700 Subject: [PATCH] Morning writing time is up, time to start real work --- docs/index.md | 102 +++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 93 insertions(+), 9 deletions(-) diff --git a/docs/index.md b/docs/index.md index 6f167c08..27cb288a 100644 --- a/docs/index.md +++ b/docs/index.md @@ -6,7 +6,7 @@ Building out a "class-dump"-like introspection tool for Apple platforms has changed considerably since the original [class-dump](http://stevenygard.com/projects/class-dump/) came out. Learning these new (and old) technologies can be quite intimidating due to the steep learning curve and somewhat hard to find documentation. -This article *attempts* to explain the complete process of programmatically inspecting a [Mach-O](https://en.wikipedia.org/wiki/Mach-O) (Apple) binary to display the compiled types with Swift and Objective-C by discussing the following: +This article *attempts* to explain the complete process of programmatically inspecting a [Mach-O](https://en.wikipedia.org/wiki/Mach-O) (Apple) binary to display the compiled Swift types and Objective-C classes by discussing the following: * [Mach-O File Format](#apples_mach-o_file_format) * [Load Commands](#load_commands) @@ -115,7 +115,7 @@ Keep an eye on that `ncmds` with the value 20; this is what's going to be discus --- -## Mach-O File Format.Load Commands +## Load Commands --- It's these load commands (whose count is given by the `ncmds` from the `mach_header_64`) that can be interesting when exploring a compiled executable. @@ -268,7 +268,7 @@ There are many, many interesting Mach-O sections. 
One could write a novel on jus --- -## Mach-O File Format.File Offsets => Virtual Addresses (and back) +## File Offsets => Virtual Addresses (and back) --- The Mach-O segment/section load command info provide a translation into the virtual address of stuff loaded into memory and to the file offsets on disk for an image. @@ -356,7 +356,7 @@ This trick is used extensively in [dsdump](https://github.com/DerekSelander/dsdu --- -## Mach-O File Format.PIE +## PIE --- Ohhhh but it gets a bit more confusing than that. In addition to the vritual load address, the OS can shift a loaded image's virtual addresses on runtime to a different starting base address to help mitigate attacks. This is called **Address Space Layout Randomization** or simply **ASLR**. @@ -770,7 +770,7 @@ Objective-C still plays quite a relevant role——even in Swift. A pure Swift c In addition, Swift methods *can* be stripped out of the symbol table, but Objective-C methods can still be resolved via other methods (as you'll see shortly). If a Swift class overrides an Objective-C method (i.e. `viewDidLoad`), there'll be a compiler generated Objective-C bridging method (called a [thunk](https://en.wikipedia.org/wiki/Thunk)) which retains and rearranges assembly registers to the Swift calling convention. The thunk method is visible on the Objective-C side, while the actual Swift method can be stripped out. You'll see at the end of this writeup that you can infer the stripped Swift method by using this knowledge and the Swift reflection type knowledge introduced later. --- -## Objective-C.Class List +## Class List --- Using the Mach-O knowledge you've built up earlier, it's quite easy to hunt for Objective-C classes that are built into an image. All you have to do is look for the **`__DATA_CONST.__objc_classlist`** Mach-O section in an executable. 
@@ -789,7 +789,7 @@ Build up an executable with Objective-C code and name it **ex5.m**: int main() { return 0; } ``` -Compile ex6.m with the debugging information flag (`-g`): +Compile ex5.m with the debugging information flag (`-g`): ```bash clang ex5.m -fmodules -o ex5 -g @@ -844,7 +844,7 @@ It's the dereferenced values, `0x0000000100002148` and the `0x0000000100002198` > **Note:** As of around clang version `clang-1100.0.33.8` (in Xcode 11), the default configuration for compiling the Objective-C `__objc_class_list` Mach-O section was moved from the `__DATA` Mach-O segment to the `__DATA_CONST` Mach-O segment. This change is discussed in the DYLD opcodes part of the writeup, but just be aware that if you have an older version of clang, you'll see `__objc_class_list` in the `__DATA` Mach-O segment. --- -## Objective-C.Objc4 +## Objc4 --- The most recent opensource Objective-C class layout (at the time of writing this) can be found in a header named **[objc-runtime.new.h](https://opensource.apple.com/source/objc4/objc4-756.2/runtime/objc-runtime-new.h.auto.html)** @@ -877,7 +877,7 @@ struct objc_class { If you want to resolve the pointer from the `bits` value, you'd have to bitwise AND it with **0x00007ffffffffff8UL**. This is defined as the **`FAST_DATA_MASK`** in the objc-runtime-new.h header. --- -## Objective-C.How to Disappoint Swift Developers +## How to Disappoint Swift Developers --- Earlier, I said all pure Swift classes on Apple platforms are really just Objective-C classes underneath, so let's prove that. @@ -971,9 +971,93 @@ Oh no! There's no 2 in that 0x00007fff811c6c50 value! All your Swift classes on I anticipate this will change in a couple years, but for now, it's always fun to bring Swift developers down to my level 😈 --- -## Objective-C.The `class_ro_t` struct +## The `class_ro_t` struct --- +The `class_ro_t` struct is the "key" value to exploring an Objective-C class. 
It's the gateway to the class's name, its methods, its properties, its instance variables, etc. + +Here's the *simplified* `class_ro_t` layout: + +```c +struct class_ro_t { + uint32_t flags; + uint32_t instanceStart; + uint32_t instanceSize; + uint32_t reserved; + + const uint8_t * ivarLayout; + const char * name; + method_list_t * baseMethodList; // An array for method_t + protocol_list_t * baseProtocols; // An array for protocol_t + const ivar_list_t * ivars; // An array for ivar_t + const uint8_t * weakIvarLayout; + property_list_t *baseProperties; // An array for property_t +}; +``` + +Using this knowledge, find the `const char* name` of this Swift class. Continue using LLDB. Earlier, you obtained the `objc_class*` of the `APureSwiftClass` via the `__objc_classlist` Mach-O section. This time, use Apple's **`NSClassFromString`** API to get the same address. + +While LLDB is still paused (if not, run it again and break on `main`), in the `main` function of ex6, execute the following Swift code: + +```bash +(lldb) e import Foundation # Needed to reference the NSClassFromString API in Swift +(lldb) p/x NSClassFromString("ex6.APureSwiftClass") # print the result in hexadecimal +(AnyClass?) $R18 = 0x0000000100002138 ex6.APureSwiftClass +``` + +Note how that `0x0000000100002138` (or equivalent on your computer) address matches the dereferenced value obtained from the `__objc_classlist` Mach-O section you found earlier. + + +Now remember, whenever you reference a class, you are initializing it, and changing the `bits` value from a `class_ro_t*` to a `class_rw_t*`. + +Rerun the program with the `run` command. + +```bash +(lldb) run +``` + +The program should have reset itself to the start of the `main` function. Since ASLR is disabled, dump the memory at the address of `APureSwiftClass` (mine was 0x0000000100002138) **without** initializing it. 
+ + +```bash +(lldb) x/5gx 0x0000000100002138 +0x100002138: 0x0000000100002100 0x00007fff91b6d478 +0x100002148: 0x00007fff63aa1400 0x0000000000000000 +0x100002158: 0x00000001000020b2 <- bits, AKA (class_ro_t* | FAST_IS_SWIFT_STABLE) +``` + +Keep a record of the `bits` value as it will change in a second. My value is **`0x00000001000020b2`**. + +Initialize the `APureSwiftClass` via Swift: + +```bash +(lldb) e APureSwiftClass.self +(ex6.APureSwiftClass.Type) $R16 = ex6.APureSwiftClass +``` + +Rerun the earlier command and inspect the `objc_class` struct's `bits` value: + +```bash +(lldb) x/5gx 0x0000000100002138 +0x100002138: 0x0000000100002100 0x00007fff91b6d478 +0x100002148: 0x00007fff63aa1400 0x0000001800000000 +0x100002158: 0x0000000100501682 <- bits, AKA (class_rw_t* | FAST_IS_SWIFT_STABLE) +``` + +The `bits` value now holds the `class_rw_t` pointer plus the `FAST_IS_SWIFT_STABLE` flag. + +> *If you are building out an Objective-C runtime introspection tool, and you're testing the tool on itself, make sure you know which struct the `bits` value actually points to*. I burned *a lot* of hours working with the wrong struct whenever I accidentally initialized an Objective-C class by `po`'ing it in LLDB. + +Fortunately, both the `class_ro_t` struct and the `class_rw_t` struct begin with a `uint32_t` flags value which, among other things, tells you whether the class is initialized. The bit to check is `1 << 31` (AKA 0x80000000, named `RW_REALIZED` in objc-runtime-new.h). + +In the above example, if I didn't know whether a class was initialized, I'd start with the `bits` value, 0x0000000100501682. I'd remove the Swift bit-packed flags, turning the value into **0x0000000100501680**. Then, I'd read 32 bits at this address in LLDB: + +```bash +(lldb) x/wx 0x0000000100501680 +0x100501680: 0x80080000 +``` + +From that 0x8 in the most significant nibble (bit 31 is set), I can see that this class has already been initialized, meaning I am working with a `class_rw_t` struct. ### Source Version