Overload-Induction Method Renaming

Dotfuscator implements patented technology for method renaming called Overload-Induction™. Whereas most renaming systems simply assign one new name per old name (e.g. “getX()” becomes “a()” and “getY()” becomes “b()”), Overload-Induction induces method overloading maximally: the algorithm attempts to rename as many methods as possible to exactly the same name. Many customers report a full thirty-three percent of all methods being renamed to “a()”. In these cases, that is a full fifty percent of renameable methods! We say renameable because many methods inherently cannot be renamed; these include constructors, class constructors (static constructors), and methods intended to be called by the runtime.

After this deep obfuscation, the logic, while not destroyed, is beyond comprehension. The following simple example illustrates the power of the Overload-Induction technique:

Original Source Code Before Obfuscation
private void CalcPayroll(SpecialList employeeGroup) {
    while (employeeGroup.HasMore()) {
        employee = employeeGroup.GetNext(true);
        employee.UpdateSalary();
        DistributeCheck(employee);
    }
}
Reverse-Engineered Source Code After Overload-Induction Dotfuscation
private void a(a b) {
    while (b.a()) {
        a = b.a(true);
        a.a();
        a(a);
    }
}

The example shows that the obfuscated code is more compact: a positive side effect of renaming is size reduction. For example, renaming a 20-character name to “a” saves 19 characters. Renaming also saves space by conserving string heap entries. Renaming everything to “a” means that “a” is stored only once, and each method or field renamed to “a” can point to it. Overload-Induction enhances this effect because the shortest identifiers are continually reused.
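The size effect can be estimated with a rough sketch. The C# snippet below assumes that each distinct name is stored once as a null-terminated UTF-8 string, which is a simplification of how the .NET metadata string heap works. The helper HeapSizeEstimate.Bytes is hypothetical and is not part of Dotfuscator; the method names are taken from the payroll example above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

static class HeapSizeEstimate
{
    // Assumption: every distinct name occupies its UTF-8 bytes plus one
    // terminating null byte, and duplicate names share a single entry.
    public static int Bytes(IEnumerable<string> names) =>
        names.Distinct().Sum(n => Encoding.UTF8.GetByteCount(n) + 1);
}

static class Demo
{
    static void Main()
    {
        var original = new[] { "CalcPayroll", "HasMore", "GetNext", "UpdateSalary", "DistributeCheck" };
        var renamed = new[] { "a", "a", "a", "a", "a" };

        Console.WriteLine(HeapSizeEstimate.Bytes(original)); // 57
        Console.WriteLine(HeapSizeEstimate.Bytes(renamed));  // 2
    }
}
```

Under these assumptions the five original names need 57 bytes, while the five renamed methods share a single 2-byte entry.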

There are several distinct advantages to this methodology:

  1. Renaming has long been a way to make decompiled output harder to understand. Renaming to unprintable characters (or to names that are illegal in the target source language) is futile, since decompilers are easily equipped to re-rename such identifiers. Considering that Overload-Induction could rename one in three methods to “a()”, understanding decompiled output is more difficult, to say the least.
  2. Overload-Induction introduces no limitations beyond those shared by all renaming systems (such limitations are discussed later).
  3. Since Overload-Induction tends to reuse the same letter more often, it reaches longer names (e.g. aa, aaa, etc.) more slowly. This also saves space.

Overload-Induction’s patented algorithm determines all possible renaming collisions and induces method overloading only when it is safe to do so. The procedure is provably irreversible: it is impossible, even by running Overload-Induction again, to reconstruct the original method name relationships.
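The collision rule can be illustrated with a deliberately simplified sketch. It is not the patented algorithm: it looks at a single class only, ignores inheritance, interfaces, and reflection, and treats two methods as colliding only when their parameter signatures are identical. The class OverloadInductionSketch, its methods, and the names Employee and Promote are hypothetical and chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

static class OverloadInductionSketch
{
    // Produces a, b, ..., z, aa, ab, ... for index 0, 1, 2, ...
    public static string NameAt(int index)
    {
        var name = "";
        for (int i = index; i >= 0; i = i / 26 - 1)
            name = (char)('a' + i % 26) + name;
        return name;
    }

    // Maps each method (original name + parameter signature) to a new name.
    // Methods with different signatures may share a name, because the
    // signature still tells them apart; methods with the same signature
    // must receive different names.
    public static Dictionary<string, string> Rename(
        IEnumerable<(string Name, string Signature)> methods)
    {
        var usedPerSignature = new Dictionary<string, int>();
        var result = new Dictionary<string, string>();
        foreach (var (name, signature) in methods)
        {
            // next is 0 when this signature has not been seen yet
            usedPerSignature.TryGetValue(signature, out int next);
            result[name + signature] = NameAt(next);
            usedPerSignature[signature] = next + 1;
        }
        return result;
    }
}

static class Demo
{
    static void Main()
    {
        var methods = new[]
        {
            (Name: "CalcPayroll", Signature: "(SpecialList)"),
            (Name: "DistributeCheck", Signature: "(Employee)"),
            (Name: "UpdateSalary", Signature: "()"),
            (Name: "Promote", Signature: "()"),
        };

        foreach (var pair in OverloadInductionSketch.Rename(methods))
            Console.WriteLine($"{pair.Key} -> {pair.Value}");
    }
}
```

In this sketch CalcPayroll, DistributeCheck, and UpdateSalary all become “a” because their signatures differ, while Promote becomes “b” because it would otherwise collide with UpdateSalary. Once three methods share the name “a”, the renamed output alone no longer shows which of them originally had related names.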

© 2002-2007 PreEmptive Solutions. All rights reserved.