In the last post we looked at how to manually remove invalid opcodes from an obfuscated assembly: we decompiled the assembly, replaced each invalid opcode with the nop opcode and then recompiled. We did this by hand because Mono.Cecil crashed at the sight of some of the invalid opcodes. In this post we take a look at a tiny "hack" to Mono.Cecil which allows us to do the same thing in an automated manner.
Paul Mason
Recap: What needs fixing?
Please note: An assumption is made in this article that all
invalid opcodes are single byte opcodes; this example does not cater for
invalid double byte opcodes.
Well, to work out what needs fixing, we'll firstly write some code
that we'll use to break Mono.Cecil (and for testing):
    using System.Collections.Generic;
    using Mono.Cecil;
    using Mono.Cecil.Cil;

    // Load the obfuscated assembly
    var assembly = AssemblyFactory.GetAssembly(
        @"D:\temp\Obfuscated\SimpleLibrary.dll");

    // Walk every type and every method in the main module
    foreach (TypeDefinition type in assembly.MainModule.Types)
    {
        foreach (MethodDefinition def in type.Methods)
        {
            if (def.HasBody)
            {
                CilWorker worker = def.Body.CilWorker;

                // Collect first and replace afterwards, so the instruction
                // list is not modified while it is being enumerated
                List<Instruction> instructionsToFix = new List<Instruction>();

                foreach (Instruction instr in def.Body.Instructions)
                {
                    // Detection of invalid opcodes goes here
                }

                foreach (Instruction instr in instructionsToFix)
                {
                    Instruction newInstr = worker.Create(OpCodes.Nop);
                    worker.Replace(instr, newInstr);
                }
            }
        }
    }

    // Save the result under a new name
    AssemblyFactory.SaveAssembly(assembly,
        @"D:\temp\Obfuscated\SimpleLibrary.new.dll");
This is some pretty basic code which simply goes through each type
and each method inside an assembly and replaces all invalid opcodes with
a nop.
When we run this code using the default version of Mono.Cecil we
unfortunately come across an error:
[Screenshot: Mono.Cecil didn't like an opcode]
Now we know what we're fixing!
Getting the source
First of all, we need to get the source for Mono.Cecil to start
working with it. Rather than get the entire Mono system, I decided to
just check out the project that I needed via SVN:
svn co svn://anonsvn.mono-project.com/source/trunk/mcs/class/Mono.Cecil
Unfortunately the project won't compile by itself due to the .snk
file being located in a directory one up from Mono.Cecil. For this
example I simply turned off assembly signing to get this compiling,
however please feel free to download the .snk file and place it in the
appropriate location to have a fully signed version of Mono.Cecil.
Hacking Mono.Cecil
Now that we've got the source and it's compiling, let's hack it. From
the screenshot you'll see that the error originates in the CodeReader
class on line 207 (in my copy anyway). Taking a look at the code around
that line we see the following switch statement:
    if (cursor == 0xfe)
        op = OpCodes.TwoBytesOpCode [br.ReadByte ()];
    else
        op = OpCodes.OneByteOpCode [cursor];

    Instruction instr = new Instruction ((int) offset, op);
    switch (op.OperandType) {
    case OperandType.InlineNone :
        break;
    ...
    case OperandType.InlineTok :
        MetadataToken token = new MetadataToken (br.ReadInt32 ());
        switch (token.TokenType) {
        ...
        default :
            throw new ReflectionException ("Wrong token: " + token);
        }
        break;
    }
That's our error message alright, and it seems to be happening
because the code is going into OperandType.InlineTok. Hmmm... well,
ideally we'd like to go into InlineNone, since an invalid opcode has no
subsequent operand. As you can see, the OperandType comes from the
variable op which is defined by the lines:
    if (cursor == 0xfe)
        op = OpCodes.TwoBytesOpCode [br.ReadByte ()];
    else
        op = OpCodes.OneByteOpCode [cursor];
Well, since we're only working with one byte opcodes in this
example, let's concentrate on that. The OpCodes.OneByteOpCode variable
is actually an array which holds each opcode at the index given by its
byte value; for example: index 0 = 0x00 = nop, index 1 = 0x01 = break
... etc. In one of our previous articles, we placed several invalid
opcode bytes throughout the code, all within a certain subset: 0xbe,
0xc0, 0xc1... etc. Therefore, our invalid opcodes should be at the
corresponding indexes of OneByteOpCode; i.e. 190, 192, 193... etc.
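To make the indexing concrete, here is a minimal sketch of a table keyed by opcode byte. It is not Mono.Cecil's code: the class OpcodeTableSketch and its Describe method are hypothetical names, and only the two opcodes mentioned above (nop and break) are filled in, so every other slot is simply empty.

```csharp
using System;

// Hypothetical stand-in for a one-byte opcode table; not part of Mono.Cecil.
public static class OpcodeTableSketch
{
    // One slot per possible byte value; slots that are never assigned stay null.
    private static readonly string[] Names = BuildTable();

    private static string[] BuildTable()
    {
        var names = new string[256];
        names[0x00] = "nop";    // index 0
        names[0x01] = "break";  // index 1
        return names;
    }

    // The opcode byte itself is the array index.
    public static string Describe(byte opcode)
    {
        string name = Names[opcode];
        if (name == null)
            name = "(no opcode defined)";
        return string.Format("0x{0:x2} -> index {1}: {2}", opcode, (int)opcode, name);
    }

    public static void Main()
    {
        // The bytes named in the text above.
        foreach (byte b in new byte[] { 0x00, 0x01, 0xbe, 0xc0, 0xc1 })
            Console.WriteLine(Describe(b));
    }
}
```

For example, Describe(0xbe) returns "0xbe -> index 190: (no opcode defined)", which is the slot the reader lands on when it meets that byte.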
Still following? Essentially, to solve this problem we need to see
what opcodes are defined at these indexes in Mono.Cecil at runtime.
Well, as we all know, a struct is never null, therefore the object at
each of those "unused" opcode indexes is an empty struct (i.e. all
fields left at their default values). Due to the way that the Mono.Cecil
OpCode object works, this gives us a confusing result stating that the
size of the OpCode is two bytes - even though it is in the one byte
array (check out the OpCode.Size property to see why).
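The effect of an empty struct can be reproduced outside Mono.Cecil. The sketch below uses a hypothetical FakeOpCode struct; its Size rule is an assumption, modelled on the 0xff first byte that the fix further down passes for one-byte opcodes. Check the real OpCode.Size property for the actual logic.

```csharp
using System;

// Hypothetical, simplified stand-in for an opcode struct; not Mono.Cecil's type.
public struct FakeOpCode
{
    public readonly byte Op1;
    public readonly byte Op2;

    public FakeOpCode(byte op1, byte op2)
    {
        Op1 = op1;
        Op2 = op2;
    }

    // Assumed rule: a first byte of 0xff marks a one-byte opcode.
    public int Size
    {
        get { return Op1 == 0xff ? 1 : 2; }
    }
}

public static class DefaultStructSketch
{
    public static void Main()
    {
        var table = new FakeOpCode[256];
        table[0x00] = new FakeOpCode(0xff, 0x00); // a properly initialised slot

        // Slot 0xbe was never assigned, so it holds default(FakeOpCode):
        // every field is zero, and under the assumed rule Size reports 2.
        FakeOpCode unused = table[0xbe];
        Console.WriteLine("Op1={0} Op2={1} Size={2}", unused.Op1, unused.Op2, unused.Size);
        Console.WriteLine("nop Size={0}", table[0x00].Size);
    }
}
```

Under that assumption the unassigned slot reports a size of 2 while the initialised slot reports 1, which matches the confusing result described above.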
No wonder it causes problems! So how do we fix this? Well, for a
start we should initialise the array inside the OpCodes class to avoid
this issue:
    static OpCodes()
    {
        // Start at 1: index 0 is nop, which legitimately has Op2 == 0x00
        for (int i = 1; i < OneByteOpCode.Length; i++)
        {
            // A slot that was never initialised still has an empty Op2
            if (OneByteOpCode[i].Op2 == 0x00 && OneByteOpCode[i].Code != Code.Arglist)
            {
                OneByteOpCode[i] = new OpCode(0xff, (byte) i,
                    Code.Unused, FlowControl.Next, OpCodeType.Primitive,
                    OperandType.InlineNone, StackBehaviour.Pop0,
                    StackBehaviour.Push0);
            }
        }
    }
Basically we are looking for all OpCodes that haven't been
initialised properly; that is, those with Op2 == 0x00. We have to be
careful however: both Nop and Arglist use an empty Op2 correctly -
therefore we intentionally skip these (the loop starts at index 1 to
step over Nop). Now, if you copied and pasted this into your code it
will complain about the variable Code.Unused. To make things cleaner I
simply added a new member to the Code enum so that identification of
invalid OpCodes is nice and easy. The reason I use the word "unused" is
really so that it is in line with how ILDASM sees an invalid OpCode.
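The marker idea can be tried out in isolation. The following is a toy sketch under loud assumptions: FakeCode, BuildTable and FindUnused are hypothetical names, only nop and break are filled in (so every other byte counts as unused in this miniature table), and every byte is treated as an opcode with no operand. The real Code enum has many more members.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature of the Code enum with the extra "Unused" member.
public enum FakeCode { Nop, Break, Unused }

public static class UnusedMarkerSketch
{
    // Build a 256-slot table and mark every slot that was never assigned.
    public static FakeCode?[] BuildTable()
    {
        var table = new FakeCode?[256];
        table[0x00] = FakeCode.Nop;
        table[0x01] = FakeCode.Break;

        for (int i = 0; i < table.Length; i++)
        {
            if (table[i] == null)
                table[i] = FakeCode.Unused;
        }
        return table;
    }

    // Return the offsets in 'il' whose byte maps to the Unused marker.
    public static List<int> FindUnused(byte[] il, FakeCode?[] table)
    {
        var offsets = new List<int>();
        for (int offset = 0; offset < il.Length; offset++)
        {
            if (table[il[offset]] == FakeCode.Unused)
                offsets.Add(offset);
        }
        return offsets;
    }

    public static void Main()
    {
        byte[] il = { 0x00, 0xbe, 0x01, 0xc0, 0xc1 };
        List<int> hits = FindUnused(il, BuildTable());
        Console.WriteLine(string.Join(",", hits)); // prints "1,3,4"
    }
}
```

Once every empty slot carries an explicit marker, spotting an invalid opcode becomes a single equality test instead of guesswork about uninitialised fields.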
Before we finish hacking Mono.Cecil, there is one more "aesthetic"
change that I thought I'd make. Technically, the change above fixes the
issue for us; however, being the pedantic guy that I am, I also wanted
to fix the "ToString()" method so that it'd display "unused" instead of
"arglist" when an invalid OpCode is present. Well, it actually isn't a
hard fix to make. Simply find the Name property in the OpCode class, and
use the following:
    public string Name {
        get {
            int index = (Size == 1) ? Op2 : (Op2 + 256);
            return OpCodeNames.names [index] ?? "unused";
        }
    }
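A small sketch shows how that index expression behaves. The layout of the names table here is an assumption inferred from the expression itself (one-byte names in slots 0 to 255, two-byte names from 256 up), and NameLookupSketch is a hypothetical class, not Mono.Cecil's OpCodeNames.

```csharp
using System;

public static class NameLookupSketch
{
    // Hypothetical names table: one-byte opcodes in 0..255, two-byte in 256..511.
    private static readonly string[] Names = CreateNames();

    private static string[] CreateNames()
    {
        var names = new string[512];
        names[0x00] = "nop";
        names[0x01] = "break";
        names[256 + 0x00] = "arglist"; // two-byte opcode 0xfe 0x00
        return names;
    }

    // Same index expression as the Name property, with the fallback applied.
    public static string NameOf(int size, byte op2)
    {
        int index = (size == 1) ? op2 : (op2 + 256);
        return Names[index] ?? "unused";
    }

    public static void Main()
    {
        Console.WriteLine(NameOf(2, 0x00)); // size 2, empty Op2: prints "arglist"
        Console.WriteLine(NameOf(1, 0xbe)); // initialised unused slot: prints "unused"
        Console.WriteLine(NameOf(1, 0x00)); // prints "nop"
    }
}
```

With this layout, a size of 2 and an empty Op2 land on slot 256, which would explain why an invalid opcode used to show up as "arglist"; a one-byte unused slot has no name, so the ?? operator supplies "unused".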
Now to test it all...
Testing our results
As you'll remember, I declared a new enum member: Code.Unused.
It comes into use when we rewrite our testing program:
    using System.Collections.Generic;
    using Mono.Cecil;
    using Mono.Cecil.Cil;

    // Load the obfuscated assembly
    var assembly = AssemblyFactory.GetAssembly(
        @"D:\temp\Obfuscated\SimpleLibrary.dll");

    // Walk every type and every method in the main module
    foreach (TypeDefinition type in assembly.MainModule.Types)
    {
        foreach (MethodDefinition def in type.Methods)
        {
            if (def.HasBody)
            {
                CilWorker worker = def.Body.CilWorker;

                // Collect first and replace afterwards, so the instruction
                // list is not modified while it is being enumerated
                List<Instruction> instructionsToFix = new List<Instruction>();

                foreach (Instruction instr in def.Body.Instructions)
                {
                    // Invalid opcodes now carry the Code.Unused marker
                    if (instr.OpCode.Code == Code.Unused)
                        instructionsToFix.Add(instr);
                }

                // Swap each invalid instruction for a nop
                foreach (Instruction instr in instructionsToFix)
                {
                    Instruction newInstr = worker.Create(OpCodes.Nop);
                    worker.Replace(instr, newInstr);
                }
            }
        }
    }

    // Save the result under a new name
    AssemblyFactory.SaveAssembly(assembly,
        @"D:\temp\Obfuscated\SimpleLibrary.new.dll");
We use Code.Unused to test for an invalid opcode to replace.
What are the results? Well, Reflector can now decompile the code as per
usual (again):
[Screenshot: Reflector now works ok again]
Conclusion
This week we took a look at "fixing" the problem with Mono.Cecil when
it reached an invalid OpCode. Essentially, fixing the problem in
Mono.Cecil involved:
- Creating a new enum member Code.Unused so that we can identify
invalid opcodes
- Initialising the unused positions of the static array
OpCodes.OneByteOpCode with our invalid opcodes. This gave us accurate
opcode descriptions in those positions.
- (Optional) Changing OpCode.Name to return an accurate friendly name
for invalid opcodes.
Once Mono.Cecil could handle these opcodes, we had no problem
whatsoever writing an automated tool to "fix" the assembly for us. It
certainly doesn't take much to reverse some of the "value added"
obfuscation techniques, does it!?
Next time
Well, that's all for this week. If you have any
questions/suggestions/notes, then please let me know. Not sure what the
next article will be about yet, however I'll be sure to make it
something interesting (perhaps tamper proofing?). What are your
thoughts?