Hi All,
I'm collecting little tricks that will stymie a disassembler (that is,
prevent it from disassembling the code correctly) to use in a book
project I'm working on ("The Art of Disassembly"). I've collected a
bunch of tricks over the years (my gosh, it's getting to be decades
now), but chances are good that I've missed some worthwhile
ones.
Here are some of the ideas I'm using in the book:
1. Burying data in the code stream
2. Placing code in the middle of data objects (a variant of [1]).
3. Arithmetic expressions involving two relocatable addresses (e.g.,
lbl1-lbl2)
4. Burying instructions within the opcodes of other instructions
5. Using alignment operations in code and data
6. Writing code that does not have well-defined procedure/function
boundaries
7. data tables and, in general, making data boundaries
fuzzy.
8. Using unions and variant types to make it difficult to infer a data
object's type
9. Writing interpreters that allow a mixture of 80x86 and interpreted
code in the code stream
10. Using the breakpoint (int 3) and trace flag facilities within the
application
11. Using the machine instructions that correspond to a copyright
notice (or other string) to do useful computations within the program.
12. Using the data at some location as both program data and executable
machine code (a generalization of [11]). This includes, for example,
self-modifying code.
13. Using lots of dynamically-linked libraries to make it difficult (or
even impossible) for a disassembler to infer much about the external
code.
14. Creating wrappers for system APIs to make it difficult for
heuristic analysis to make any headway processing those calls.
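
To make [1] and [4] a bit more concrete, here is a rough sketch in Python
(a toy, not a real disassembler) of why a data byte planted after a jump
throws off a straight linear pass. The names decode, linear_sweep,
follow_flow and CODE are made up for this illustration, and the decoder
only knows the four 32-bit 80x86 opcodes that appear in the sample bytes.
The sample is "jmp short" over a single 0B8h byte, followed by four
"inc eax" instructions and a "ret".

```python
# Toy decoder: handles only the few 32-bit 80x86 opcodes used in CODE below.
def decode(code, off):
    """Return (length, text, jump_target_or_None) for the bytes at off."""
    op = code[off]
    if op == 0xEB:                      # jmp short rel8
        rel = code[off + 1]
        if rel >= 0x80:                 # sign-extend the 8-bit displacement
            rel -= 0x100
        target = off + 2 + rel
        return 2, f"jmp short {target:#x}", target
    if op == 0xB8:                      # mov eax, imm32
        imm = int.from_bytes(code[off + 1:off + 5], "little")
        return 5, f"mov eax, {imm:#x}", None
    if op == 0x40:                      # inc eax
        return 1, "inc eax", None
    if op == 0xC3:                      # ret
        return 1, "ret", None
    return 1, f"db {op:#x}", None       # anything else: treat as a data byte


def linear_sweep(code):
    """Decode one instruction after another, ignoring control flow."""
    out, off = [], 0
    while off < len(code):
        length, text, _ = decode(code, off)
        out.append((off, text))
        off += length
    return out


def follow_flow(code, start=0):
    """Decode along the execution path: take jumps, stop at ret."""
    out, off, seen = [], start, set()
    while off < len(code) and off not in seen:
        seen.add(off)
        length, text, target = decode(code, off)
        out.append((off, text))
        if text == "ret":
            break
        off = target if target is not None else off + length
    return out


# jmp short over one junk byte (0B8h), then inc eax x4, then ret.
CODE = bytes([0xEB, 0x01, 0xB8, 0x40, 0x40, 0x40, 0x40, 0xC3])

if __name__ == "__main__":
    for name, pass_fn in (("linear", linear_sweep), ("flow", follow_flow)):
        print(name)
        for off, text in pass_fn(CODE):
            print(f"  {off:02x}: {text}")
```

The linear pass takes the junk 0B8h byte as the start of a "mov eax, imm32"
and swallows the four real instructions as its immediate operand; the pass
that follows the jump lands on the first "inc eax" and recovers them. A
real tool has to cope with conditional and computed jumps as well, which
this sketch does not attempt.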
My interest in this subject is twofold. I want to be able to discuss
how to overcome these problems when using (or writing) a disassembler;
I also want to discuss how to help obfuscate object code to make it
difficult to disassemble. Any and all constructive comments,
suggestions, and examples are welcome.
Cheers,
Randy Hyde