Mach-O symbol stubs (iOS)

I am trying to understand how Mach-O files work, and have made a good deal of progress with the online resources available (in particular, the Apple page here: http://developer.apple.com/library/mac/#documentation/developertools/conceptual/MachORuntime/Reference/reference.html), but I have hit a roadblock in understanding how symbol stubs work.

Using “otool -l” I see the following section:

    Section
      sectname __symbolstub1
       segname __TEXT
          addr 0x00005fc0
          size 0x00000040
        offset 20416
         align 2^2 (4)
        reloff 0
        nreloc 0
         flags 0x80000408
    

    However, when I look at the data from the binary file in a hex editor, I see the following 4 bytes repeated again and again:

    00005FC0  38 F0 9F E5 38 F0 9F E5  38 F0 9F E5 38 F0 9F E5  88
    00005FD0  38 F0 9F E5 38 F0 9F E5  38 F0 9F E5 38 F0 9F E5  88
    00005FE0  38 F0 9F E5 38 F0 9F E5  38 F0 9F E5 38 F0 9F E5  88  
    00005FF0  38 F0 9F E5 38 F0 9F E5  38 F0 9F E5 38 F0 9F E5  88
    

    This looks something like an LDR that increases the PC by a fixed amount, but I don’t see why the amount is the same for each entry in the symbol stub section.

    If someone can shed light on why this is so, or provide any resources that get this low level, please let me know.

    Thanks!

    Answer:

    I will describe the situation with current iOS; it is somewhat different in older versions.

    The symbol stubs do indeed load a function pointer into the PC. For the standard “lazy” (on-demand) imports, the pointer resides in the __lazy_symbol section and initially points to a helper routine in the __stub_helper section, e.g.:

    __symbolstub1 _AudioServicesAddSystemSoundCompletion
    __symbolstub1 LDR  PC, _AudioServicesAddSystemSoundCompletion$lazy_ptr
    __symbolstub1 ; End of function _AudioServicesAddSystemSoundCompletion
    
    __lazy_symbol _AudioServicesAddSystemSoundCompletion$lazy_ptr DCD _AudioServicesAddSystemSoundCompletion$stubHelper
    
    __stub_helper _AudioServicesAddSystemSoundCompletion$stubHelper
    __stub_helper LDR R12, =nnn ; symbol info offset in the lazy bind table
    __stub_helper B   dyld_stub_binding_helper
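
To check that the repeated word from the question really is such a PC-relative load, it can be decoded by hand. The following is a minimal sketch in Python, assuming the bytes are a little-endian ARM (not Thumb) instruction; the function name `decode_arm_ldr_imm` is made up for this illustration.

```python
import struct

def decode_arm_ldr_imm(raw: bytes, insn_addr: int):
    """Decode an ARM 'LDR Rt, [Rn, #+/-imm12]' word (offset addressing).

    Returns (rt, rn, signed_offset, load_address). The load address is
    only meaningful when Rn is the PC (r15).
    """
    (word,) = struct.unpack("<I", raw)        # ARM words are little-endian here
    # Bits 27..25 = 010, P = 1, B = 0, W = 0, L = 1  ->  LDR with immediate offset
    if (word & 0x0F700000) != 0x05100000:
        raise ValueError("not an LDR (immediate offset) instruction")
    rn = (word >> 16) & 0xF                   # base register
    rt = (word >> 12) & 0xF                   # destination register
    imm12 = word & 0xFFF
    offset = imm12 if word & (1 << 23) else -imm12   # U bit selects add/subtract
    # In ARM state the PC reads as the instruction address + 8.
    load_address = insn_addr + 8 + offset
    return rt, rn, offset, load_address

# First two stubs from the hex dump in the question.
first = decode_arm_ldr_imm(bytes.fromhex("38F09FE5"), 0x5FC0)
second = decode_arm_ldr_imm(bytes.fromhex("38F09FE5"), 0x5FC4)
print([hex(v) for v in first])    # ['0xf', '0xf', '0x38', '0x6000']
print([hex(v) for v in second])   # ['0xf', '0xf', '0x38', '0x6004']
```

Both registers are r15, so each stub is `LDR PC, [PC, #0x38]`. Each stub is 4 bytes long and each pointer slot is 4 bytes long, so stub number i always sits at the same distance from slot number i; that is why the encoded offset is identical in every entry.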
    

    The function dyld_stub_binding_helper is the first one in the __stub_helper section and is essentially just a trampoline to the dyld_stub_binder function in dyld, passing it what I call the “symbol info offset” value. That value is an offset inside the lazy binding info stream (pointed to by the LC_DYLD_INFO or LC_DYLD_INFO_ONLY load command), which is a sort of bytecode stream with commands for dyld. A typical sequence for a lazy import looks like this:

    72: BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB(M, 0xYYYYY)
    19: BIND_OPCODE_SET_DYLIB_ORDINAL_IMM(NNNN)
    40: BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM(0x00, '_AudioServicesAddSystemSoundCompletion')
    90: BIND_OPCODE_DO_BIND()
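
The leading numbers in that listing appear to be the opcode bytes in hex: the high nibble selects the opcode and the low nibble carries the immediate, so 0x72 would mean segment 2 and 0x19 dylib ordinal 9. Below is a rough Python sketch of a decoder for one such entry. It handles only the four opcodes shown, and the sample offset 0x1234 is a made-up value, not taken from any real binary.

```python
# Opcode values as defined in <mach-o/loader.h>.
BIND_OPCODE_MASK = 0xF0
BIND_IMMEDIATE_MASK = 0x0F
BIND_OPCODE_SET_DYLIB_ORDINAL_IMM = 0x10
BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM = 0x40
BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB = 0x70
BIND_OPCODE_DO_BIND = 0x90

def read_uleb128(data: bytes, pos: int):
    """Read an unsigned LEB128 value; return (value, next_position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos

def decode_lazy_bind_entry(data: bytes, pos: int = 0) -> dict:
    """Decode opcodes starting at pos until DO_BIND; return the bind state."""
    state = {}
    while True:
        byte = data[pos]
        pos += 1
        opcode = byte & BIND_OPCODE_MASK
        imm = byte & BIND_IMMEDIATE_MASK
        if opcode == BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB:
            state["segment"] = imm
            state["offset"], pos = read_uleb128(data, pos)
        elif opcode == BIND_OPCODE_SET_DYLIB_ORDINAL_IMM:
            state["dylib_ordinal"] = imm
        elif opcode == BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM:
            end = data.index(b"\x00", pos)    # symbol name is NUL-terminated
            state["flags"] = imm
            state["symbol"] = data[pos:end].decode("ascii")
            pos = end + 1
        elif opcode == BIND_OPCODE_DO_BIND:
            return state
        else:
            raise ValueError(f"opcode 0x{opcode:02x} not handled in this sketch")

# Hypothetical entry: segment 2, offset 0x1234 (ULEB128: B4 24), dylib ordinal 9.
entry = (b"\x72\xb4\x24" + b"\x19" + b"\x40"
         + b"_AudioServicesAddSystemSoundCompletion\x00" + b"\x90")
print(decode_lazy_bind_entry(entry))
```

In a real binary, `pos` would be the “symbol info offset” that the stub helper loads into R12, applied to the lazy bind stream located via the LC_DYLD_INFO or LC_DYLD_INFO_ONLY load command.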
    

    Here dyld would do the following:

    1. Look up the function named ‘_AudioServicesAddSystemSoundCompletion’ in
      dylib number NNNN from the list of dylibs given in the load
      commands.
    2. Look up the executable’s segment number M (most likely __DATA).
    3. Write the function pointer at offset YYYYY in that segment.
    4. Jump to the looked-up address so that the actual function does its job.

    The address written to happens to be the _AudioServicesAddSystemSoundCompletion$lazy_ptr slot. So, the next time the _AudioServicesAddSystemSoundCompletion is called, it will jump directly to the imported function, without going via dyld.
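
The patch-on-first-call behaviour can be modelled in a few lines. This is only a toy sketch in Python, not how dyld is implemented: a dict stands in for the lazy pointer slots, a closure stands in for each stub helper, and every name in it (`LazyImports`, `call_stub`, `_double`) is invented for the illustration.

```python
class LazyImports:
    """Toy model of lazy binding: one pointer slot per imported symbol."""

    def __init__(self, exports):
        self.exports = exports      # stands in for the dylib's exported symbols
        self.binder_calls = 0       # how often the "dyld_stub_binder" path ran
        # Every slot initially points at its own stub helper.
        self.slots = {name: self._make_helper(name) for name in exports}

    def _make_helper(self, name):
        def stub_helper(*args):
            self.binder_calls += 1
            target = self.exports[name]    # look up the real function
            self.slots[name] = target      # overwrite the lazy pointer slot
            return target(*args)           # jump to the real function
        return stub_helper

    def call_stub(self, name, *args):
        # Equivalent of the symbol stub: jump through the slot, whatever it holds.
        return self.slots[name](*args)


imports = LazyImports({"_double": lambda x: 2 * x})
print(imports.call_stub("_double", 21), imports.binder_calls)  # 42 1
print(imports.call_stub("_double", 21), imports.binder_calls)  # 42 1
```

The second call returns the same result without going through the helper again, because the slot now holds the resolved function.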

    N.B.: you should not simply look at offset 0x5fc0 in the file. The addr field is the virtual address; to find the data, look up the containing segment command, see at what VA it starts and what its file offset is, then do the math. Usually the __TEXT segment starts at 0x1000.

    However, the actual symbol stubs do look like what you pasted; you probably have a fat Mach-O with the fat header taking up the first 0x1000 bytes, so the offsets happen to line up.
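
The address arithmetic can be written out as a short Python sketch. The __TEXT values used here (vmaddr 0x1000, fileoff 0) and the 0x1000-byte slice offset inside the fat file are assumptions based on the usual layout, not values read from the binary in question; the helper name is made up.

```python
def section_file_offset(addr, seg_vmaddr, seg_fileoff, slice_offset=0):
    """Translate a section's virtual address into an offset in the file.

    slice_offset is where the Mach-O image starts inside a fat file
    (0 for a thin file).
    """
    return addr - seg_vmaddr + seg_fileoff + slice_offset

# Thin Mach-O, assuming __TEXT has vmaddr 0x1000 and fileoff 0.
thin = section_file_offset(0x5FC0, seg_vmaddr=0x1000, seg_fileoff=0)
print(hex(thin), thin)   # 0x4fc0 20416

# Same image placed 0x1000 bytes into a fat file (assumed slice offset).
fat = section_file_offset(0x5FC0, 0x1000, 0, slice_offset=0x1000)
print(hex(fat))          # 0x5fc0
```

The thin-file result, 20416, matches the `offset` field that otool printed for the section, and adding the assumed fat slice offset lands back on 0x5fc0, which would explain why the hex editor showed the stubs at that position.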