Miasm – Reverse Engineering Framework In Python

Miasm is a free and open source (GPLv2) reverse engineering framework. Miasm aims to analyze / modify / generate binary programs. Here is a non-exhaustive list of features:

- Opening / modifying / generating PE / ELF 32 / 64 LE / BE using Elfesteem
- Assembling / Disassembling X86 / ARM / MIPS / SH4 / MSP430
- Representing assembly semantics using an intermediate language
- Emulating using JIT (dynamic code analysis, unpacking, …)
- Expression simplification for automatic de-obfuscation
- …

See the official blog for more examples and demos.

Basic examples

Assembling / Disassembling

Import Miasm x86 architecture:

>>> from miasm2.arch.x86.arch import mn_x86
>>> from miasm2.core.locationdb import LocationDB

Get a location db:

>>> loc_db = LocationDB()

Assemble a line:

>>> l = mn_x86.fromstring('XOR ECX, ECX', loc_db, 32)
>>> print l
XOR ECX, ECX
>>> mn_x86.asm(l)
['1\xc9', '3\xc9', 'g1\xc9', 'g3\xc9']

Modify an operand:

>>> l.args[0] = mn_x86.regs.EAX
>>> print l
XOR EAX, ECX
>>> a = mn_x86.asm(l)
>>> print a
['1\xc8', '3\xc1', 'g1\xc8', 'g3\xc1']

Disassemble the result:

>>> print mn_x86.dis(a[0], 32)
XOR EAX, ECX

Using Machine abstraction:

>>> from miasm2.analysis.machine import Machine
>>> mn = Machine('x86_32').mn
>>> print mn.dis('\x33\x30', 32)
XOR ESI, DWORD PTR [EAX]

For Mips:

>>> mn = Machine('mips32b').mn
>>> print mn.dis('97A30020'.decode('hex'), "b")
LHU V1, 0x20(SP)

Intermediate representation

Create an instruction:

>>> machine = Machine('arml')
>>> instr = machine.mn.dis('002088e0'.decode('hex'), 'l')
>>> print instr
ADD R2, R8, R0

Create an intermediate representation object:

>>> ira = machine.ira(loc_db)

Create an empty ircfg:

>>> ircfg = ira.new_ircfg()

Add instruction to the pool:

>>> ira.add_instr_to_ircfg(instr, ircfg)

Print current pool:

>>> for lbl, irblock in ircfg.blocks.items():
...     print irblock.to_string(loc_db)
loc_0:
R2 = R8 + R0
IRDst = loc_4

Working with IR, for instance by getting side effects:

>>> for lbl, irblock in ircfg.blocks.iteritems():
...     for assignblk in irblock:
...         rw = assignblk.get_rw()
...         for dst, reads in rw.iteritems():
...             print 'read: ', [str(x) for x in reads]
...             print 'written:', dst
...             print
...
read: ['R8', 'R0']
written: R2
read: []
written: IRDst

Emulation

Given a shellcode:

00000000 8d4904 lea ecx, [ecx+0x4]
00000003 8d5b01 lea ebx, [ebx+0x1]
00000006 80f901 cmp cl, 0x1
00000009 7405 jz 0x10
0000000b 8d5bff lea ebx, [ebx-1]
0000000e eb03 jmp 0x13
00000010 8d5b01 lea ebx, [ebx+0x1]
00000013 89d8 mov eax, ebx
00000015 c3 ret

>>> s = '\x8dI\x04\x8d[\x01\x80\xf9\x01t\x05\x8d[\xff\xeb\x03\x8d[\x01\x89\xd8\xc3'

Import the shellcode thanks to the Container abstraction:

>>> from miasm2.analysis.binary import Container
>>> c = Container.from_string(s)
>>> c

Disassembling the shellcode at address 0:

>>> from miasm2.analysis.machine import Machine
>>> machine = Machine('x86_32')
>>> mdis = machine.dis_engine(c.bin_stream)
>>> asmcfg = mdis.dis_multiblock(0)
>>> for block in asmcfg.blocks:
...     print block.to_string(asmcfg.loc_db)
...
loc_0
LEA ECX, DWORD PTR [ECX + 0x4]
LEA EBX, DWORD PTR [EBX + 0x1]
CMP CL, 0x1
JZ loc_10
-> c_next:loc_b c_to:loc_10
loc_10
LEA EBX, DWORD PTR [EBX + 0x1]
-> c_next:loc_13
loc_b
LEA EBX, DWORD PTR [EBX + 0xFFFFFFFF]
JMP loc_13
-> c_to:loc_13
loc_13
MOV EAX, EBX
RET

Initializing the Jit engine with a stack:

>>> jitter = machine.jitter(jit_type='python')
>>> jitter.init_stack()

Add the shellcode in an arbitrary memory location:

>>> run_addr = 0x40000000
>>> from miasm2.jitter.csts import PAGE_READ, PAGE_WRITE
>>> jitter.vm.add_memory_page(run_addr, PAGE_READ | PAGE_WRITE, s)

Create a sentinel to catch the return of the shellcode:

def code_sentinelle(jitter):
    jitter.run = False
    jitter.pc = 0
    return True

>>> jitter.add_breakpoint(0x1337beef, code_sentinelle)
>>> jitter.push_uint32_t(0x1337beef)

Activate logs:

>>> jitter.set_trace_log()

Run at arbitrary address:

>>> jitter.init_run(run_addr)
>>> jitter.continue_run()
RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000000 RDX 0000000000000000
RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
RIP 0000000040000000
40000000 LEA ECX, DWORD PTR [ECX+0x4]
RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000004 RDX 0000000000000000
RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
....
4000000e JMP loc_0000000040000013:0x40000013
RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000004 RDX 0000000000000000
RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
RIP 0000000040000013
40000013 MOV EAX, EBX
RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000004 RDX 0000000000000000
RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
RIP 0000000040000013
40000015 RET
>>>

Interacting with the jitter:

>>> jitter.vm
ad 1230000 size 10000 RW_ hpad 0x2854b40
ad 40000000 size 16 RW_ hpad 0x25e0ed0
>>> hex(jitter.cpu.EAX)
'0x0L'
>>> jitter.cpu.ESI = 12

Symbolic execution

Initializing the IR pool:

>>> ira = machine.ira(loc_db)
>>> ircfg = ira.new_ircfg_from_asmcfg(asmcfg)

Initializing the engine with default symbolic values:

>>> from miasm2.ir.symbexec import SymbolicExecutionEngine
>>> sb = SymbolicExecutionEngine(ira)

Launching the execution:

>>> symbolic_pc = sb.run_at(ircfg, 0)
>>> print symbolic_pc
((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)

Same, with step logs (only changes are displayed):

>>> sb = SymbolicExecutionEngine(ira, machine.mn.regs.regs_init)
>>> symbolic_pc = sb.run_at(ircfg, 0, step=True)
Instr LEA ECX, DWORD PTR [ECX + 0x4]
Assignblk:
ECX = ECX + 0x4
________________________________________________________________________________
ECX = ECX + 0x4
________________________________________________________________________________
Instr LEA EBX, DWORD PTR [EBX + 0x1]
Assignblk:
EBX = EBX + 0x1
________________________________________________________________________________
EBX = EBX + 0x1
ECX = ECX + 0x4
________________________________________________________________________________
Instr CMP CL, 0x1
Assignblk:
zf = (ECX[0:8] + -0x1)?(0x0,0x1)
nf = (ECX[0:8] + -0x1)[7:8]
pf = parity((ECX[0:8] + -0x1) & 0xFF)
of = ((ECX[0:8] ^ (ECX[0:8] + -0x1)) & (ECX[0:8] ^ 0x1))[7:8]
cf = (((ECX[0:8] ^ 0x1) ^ (ECX[0:8] + -0x1)) ^ ((ECX[0:8] ^ (ECX[0:8] + -0x1)) & (ECX[0:8] ^ 0x1)))[7:8]
af = ((ECX[0:8] ^ 0x1) ^ (ECX[0:8] + -0x1))[4:5]
________________________________________________________________________________
af = (((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[4:5]
pf = parity((ECX + 0x4)[0:8] + 0xFF)
zf = ((ECX + 0x4)[0:8] + 0xFF)?(0x0,0x1)
ECX = ECX + 0x4
of = ((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1))[7:8]
nf = ((ECX + 0x4)[0:8] + 0xFF)[7:8]
cf = (((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1)) ^ ((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[7:8]
EBX = EBX + 0x1
________________________________________________________________________________
Instr JZ loc_key_1
Assignblk:
IRDst = zf?(loc_key_1,loc_key_2)
EIP = zf?(loc_key_1,loc_key_2)
________________________________________________________________________________
af = (((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[4:5]
EIP = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
pf = parity((ECX + 0x4)[0:8] + 0xFF)
IRDst = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
zf = ((ECX + 0x4)[0:8] + 0xFF)?(0x0,0x1)
ECX = ECX + 0x4
of = ((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1))[7:8]
nf = ((ECX + 0x4)[0:8] + 0xFF)[7:8]
cf = (((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1)) ^ ((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[7:8]
EBX = EBX + 0x1
________________________________________________________________________________
>>>

Retry execution with a concrete ECX.
Here, the symbolic / concolic execution reaches the shellcode's end:

>>> from miasm2.expression.expression import ExprInt
>>> sb.symbols[machine.mn.regs.ECX] = ExprInt(-3, 32)
>>> symbolic_pc = sb.run_at(ircfg, 0, step=True)
Instr LEA ECX, DWORD PTR [ECX + 0x4]
Assignblk:
ECX = ECX + 0x4
________________________________________________________________________________
af = (((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[4:5]
EIP = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
pf = parity((ECX + 0x4)[0:8] + 0xFF)
IRDst = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
zf = ((ECX + 0x4)[0:8] + 0xFF)?(0x0,0x1)
ECX = 0x1
of = ((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1))[7:8]
nf = ((ECX + 0x4)[0:8] + 0xFF)[7:8]
cf = (((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1)) ^ ((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[7:8]
EBX = EBX + 0x1
________________________________________________________________________________
Instr LEA EBX, DWORD PTR [EBX + 0x1]
Assignblk:
EBX = EBX + 0x1
________________________________________________________________________________
af = (((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[4:5]
EIP = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
pf = parity((ECX + 0x4)[0:8] + 0xFF)
IRDst = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
zf = ((ECX + 0x4)[0:8] + 0xFF)?(0x0,0x1)
ECX = 0x1
of = ((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1))[7:8]
nf = ((ECX + 0x4)[0:8] + 0xFF)[7:8]
cf = (((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1)) ^ ((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[7:8]
EBX = EBX + 0x2
________________________________________________________________________________
Instr CMP CL, 0x1
Assignblk:
zf = (ECX[0:8] + -0x1)?(0x0,0x1)
nf = (ECX[0:8] + -0x1)[7:8]
pf = parity((ECX[0:8] + -0x1) & 0xFF)
of = ((ECX[0:8] ^ (ECX[0:8] + -0x1)) & (ECX[0:8] ^ 0x1))[7:8]
cf = (((ECX[0:8] ^ 0x1) ^ (ECX[0:8] + -0x1)) ^ ((ECX[0:8] ^ (ECX[0:8] + -0x1)) & (ECX[0:8] ^ 0x1)))[7:8]
af = ((ECX[0:8] ^ 0x1) ^ (ECX[0:8] + -0x1))[4:5]
________________________________________________________________________________
af = 0x0
EIP = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
pf = 0x1
IRDst = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
zf = 0x1
ECX = 0x1
of = 0x0
nf = 0x0
cf = 0x0
EBX = EBX + 0x2
________________________________________________________________________________
Instr JZ loc_key_1
Assignblk:
IRDst = zf?(loc_key_1,loc_key_2)
EIP = zf?(loc_key_1,loc_key_2)
________________________________________________________________________________
af = 0x0
EIP = 0x10
pf = 0x1
IRDst = 0x10
zf = 0x1
ECX = 0x1
of = 0x0
nf = 0x0
cf = 0x0
EBX = EBX + 0x2
________________________________________________________________________________
Instr LEA EBX, DWORD PTR [EBX + 0x1]
Assignblk:
EBX = EBX + 0x1
________________________________________________________________________________
af = 0x0
EIP = 0x10
pf = 0x1
IRDst = 0x10
zf = 0x1
ECX = 0x1
of = 0x0
nf = 0x0
cf = 0x0
EBX = EBX + 0x3
________________________________________________________________________________
Instr LEA EBX, DWORD PTR [EBX + 0x1]
Assignblk:
IRDst = loc_key_3
________________________________________________________________________________
af = 0x0
EIP = 0x10
pf = 0x1
IRDst = 0x13
zf = 0x1
ECX = 0x1
of = 0x0
nf = 0x0
cf = 0x0
EBX = EBX + 0x3
________________________________________________________________________________
Instr MOV EAX, EBX
Assignblk:
EAX = EBX
________________________________________________________________________________
af = 0x0
EIP = 0x10
pf = 0x1
IRDst = 0x13
zf = 0x1
ECX = 0x1
of = 0x0
nf = 0x0
cf = 0x0
EBX = EBX + 0x3
EAX = EBX + 0x3
________________________________________________________________________________
Instr RET
Assignblk:
IRDst = @32[ESP[0:32]]
ESP = {ESP[0:32] + 0x4 0 32}
EIP = @32[ESP[0:32]]
________________________________________________________________________________
af = 0x0
EIP = @32[ESP]
pf = 0x1
IRDst = @32[ESP]
zf = 0x1
ECX = 0x1
of = 0x0
nf = 0x0
cf = 0x0
EBX = EBX + 0x3
ESP = ESP + 0x4
EAX = EBX + 0x3
________________________________________________________________________________
>>>

How does it work?

Miasm embeds its own disassembler, intermediate language and instruction semantics. It is written in Python.

To emulate code, it uses LLVM, GCC, Clang or Python to JIT the intermediate representation. It can emulate shellcodes and all or parts of binaries. Python callbacks can be executed to interact with the execution, for instance to emulate the effects of library functions.

Documentation

TODO

An auto-generated documentation is available here.

Obtaining Miasm

- Clone the repository: Miasm on GitHub
- Get one of the Docker images at Docker Hub

Software requirements

Miasm uses:

- python-pyparsing
- python-dev
- elfesteem from Elfesteem
- optionally python-pycparser (version >= 2.17)

To enable code JIT, one of the following modules is mandatory:

- GCC
- Clang
- LLVM with Numba llvmlite, see below

'optional' Miasm can also use:

- Z3, the Theorem Prover

Configuration

Install elfesteem:

git clone https://github.com/serpilliere/elfesteem.git elfesteem
cd elfesteem
python setup.py build
sudo python setup.py install

To use the jitter, GCC or LLVM is recommended:

- GCC (any version)
- Clang (any version)
- LLVM
  - Debian (testing/unstable): Not tested
  - Debian stable/Ubuntu/Kali/whatever: pip install llvmlite or install from llvmlite
  - Windows: Not tested

Build and install Miasm:

$ cd miasm_directory
$ python setup.py build
$ sudo python setup.py install

If something goes wrong while compiling one of the jitter modules, Miasm will skip the error and disable the corresponding module (see the compilation output).

Windows & IDA

Most of Miasm's IDA plugins use a subset of Miasm functionality. A quick way to have them working is to add:

- elfesteem directory and pyparsing.py to C:\…\IDA\python\ or pip install pyparsing elfesteem
- miasm2/miasm2 directory to C:\…\IDA\python\

All features except the JITter-related ones will be available. For a more complete installation, please refer to the paragraphs above.

Testing

Miasm comes with a set of regression tests.
To run all of them:

cd miasm_directory/test
python test_all.py

Some options can be specified:

- Mono threading: -m
- Code coverage instrumentation: -c
- Only fast tests: -t long (excludes the long tests)

They already use Miasm

Tools

- Sibyl: A function divination tool
- R2M2: Use miasm2 as a radare2 plugin
- CGrex: Targeted patcher for CGC binaries
- ethRE: Reversing tool for Ethereum EVM (with corresponding Miasm2 architecture)

Blog posts / papers / conferences

- Deobfuscation: recovering an OLLVM-protected program
- Taming a Wild Nanomite-protected MIPS Binary With Symbolic Execution: No Such Crackme
- Génération rapide de DGA avec Miasm: Quick computation of DGA (French article)
- Enabling Client-Side Crash-Resistance to Overcome Diversification and Information Hiding: Detect undirected call potential arguments
- Miasm: Framework de reverse engineering (French)
- Tutorial miasm (French video)
- Graphes de dépendances : Petit Poucet style: DepGraph (French)

Books

- Practical Reverse Engineering: X86, X64, Arm, Windows Kernel, Reversing Tools, and Obfuscation: Introduction to Miasm (Chapter 5 "Obfuscation")
- BlackHat Python – Appendix: Japan security book's samples

Misc

Man, does miasm have a link with rr0d?
Yes! crappy code and ugly documentation.
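
As a cross-check on the Emulation and Symbolic execution examples, the shellcode's control flow is small enough to model in plain Python without Miasm. The sketch below is written for Python 3 (unlike the Python 2 sessions in this article); the name run_shellcode and the 32-bit masking are illustrative assumptions, not part of the Miasm API.

```python
MASK32 = 0xFFFFFFFF  # registers are modelled as unsigned 32-bit values


def run_shellcode(ecx, ebx):
    """Hypothetical hand-written model of the example shellcode.

    Takes the initial ECX and EBX and returns the value left in EAX.
    """
    ecx = (ecx + 4) & MASK32      # 0x00: lea ecx, [ecx+0x4]
    ebx = (ebx + 1) & MASK32      # 0x03: lea ebx, [ebx+0x1]
    if (ecx & 0xFF) == 1:         # 0x06: cmp cl, 0x1 ; 0x09: jz 0x10
        ebx = (ebx + 1) & MASK32  # 0x10: lea ebx, [ebx+0x1]
    else:
        ebx = (ebx - 1) & MASK32  # 0x0b: lea ebx, [ebx-1] ; 0x0e: jmp 0x13
    return ebx                    # 0x13: mov eax, ebx ; 0x15: ret


# All registers zero, as in the jitter run: the JZ is not taken.
print(hex(run_shellcode(ecx=0, ebx=0)))
# Concrete ECX = -3, as in the second symbolic run: the JZ is taken.
print(hex(run_shellcode(ecx=-3 & MASK32, ebx=0)))
```

Started from a fresh state with ECX = -3, this model returns EBX + 2. The symbolic log ends with EAX = EBX + 0x3 instead, which appears to be because the same engine object is reused and its state already holds EBX = EBX + 0x1 from the previous run.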
