Garrett (garote) wrote,

Reverse-engineering machine-minified Javascript

1. Literals are substituted

"void 0", "!1" and "!0" are turned into variables and then substituted all through the code. These three are actually "undefined", "false" and "true".
Also look for a "null" variable, and a variable assigned "this" in the global context.
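It may help to search and replace using regular expressions, so that you skip property references that follow a dot. You only want to swap globals. For example, to avoid matching property references when searching for global "Xx", use "[^\w\.]Xx[^\w]". Note that this pattern also matches the character on either side of the name, so capture those characters and put them back in the replacement.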
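As a rough sketch of that search-and-replace, the snippet below renames a global while leaving property references alone. The helper name renameGlobal and the sample source are made up for illustration, and it assumes the old name contains no regex special characters. A regex like this will still match inside string literals and comments, so treat it as a first pass rather than a real parser.

```javascript
// Hypothetical helper: rename a global identifier without touching ".Xx" property references.
function renameGlobal(source, oldName, newName) {
	// (^|[^\w.$]) : start of input, or a character that is neither part of an identifier nor a dot
	// (?![\w$])   : the name must not continue into a longer identifier
	var pattern = new RegExp("(^|[^\\w.$])" + oldName + "(?![\\w$])", "g");
	return source.replace(pattern, function (match, before) {
		// Put back the character that was matched in front of the name.
		return before + newName;
	});
}

var minified = "if(a.Xx===Xx&&Xxy!==Xx){b(Xx)}";
var readable = renameGlobal(minified, "Xx", "undefined");
console.log(readable);
```

The property "a.Xx" and the longer name "Xxy" are left alone, while the three bare references to "Xx" are swapped.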

2. Numbers are shortened

"1E9" is actually the integer "1000000000". Swap away.

3. Any string used more than once is replaced with a variable

Feel free to swap out any variable that appears to be assigned a value only once. This is most often done with the common strings used to work with DOM attributes. For example, if you see this:
var ba = 'hidden';
... And then 2000 lines of code later you see this:
if (ba==j.visibility&&ba==k.visibility) {
	doSomething();
}
Feel free to make it more readable by changing it to this:
if (j.visibility == "hidden" && k.visibility == "hidden") {
	doSomething();
}

4. Substituted library references

Library references like "Math.max" and "Math.PI" are assigned to short variables, e.g. "var Ng = Math.PI". Swap all those out.
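For instance, aliases like the following can be undone by substituting the original reference at each place they are used. This is only a sketch: the alias "Ng" comes from the example above, while "Mx" is invented for illustration.

```javascript
// Minified style: library references aliased to short names.
var Ng = Math.PI, Mx = Math.max;
var packed = Mx(2, 7) * Ng;

// After swapping the aliases back out.
var unpacked = Math.max(2, 7) * Math.PI;

console.log(packed === unpacked);
```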

5. Spacing and brackets are eliminated

Time to fix up some spacing. Consider the following:
if ((g!=e||l!=d)&&b)return 0
This will read a bit better when you re-insert some spaces, and perhaps some parentheses:
if (((g != e) || (l != d)) && b) {
	return 0;
}
Now that lowercase L has less chance of getting lost in the shuffle, hmm?

6. If statements are inlined

Obfuscated code heavily leverages order-of-operations and conditional execution.
For example, "if" statements take up space. "if (a==3) {alert('yes');}" can actually be re-written as "a==3&&alert('yes')".
There are two tricks at work here.
First, the "==" operator has higher precedence than "&&", so the interpreter is guaranteed to do that comparison, which yields either "true" or "false", before looking at the "&&".
Second, the logical "&&" operator will always evaluate its left-hand-side first. If the left-hand-side evaluates to the equivalent of boolean "false", then the "&&" operator actually avoids evaluating - and therefore executing - the code on the right-hand-side. Why? Because the interpreter knows that both sides of "&&" must evaluate to "true" in order for the conditional to be "true". If one side does not, the interpreter doesn't need to check the other side to determine the final outcome. The outcome must be "false" no matter what the other side is. So it skips the code.
So, in effect, "a == 3 && alert('yes')" behaves exactly the same as "if (a==3) {alert('yes');}"

(Nevertheless, it is considered bad form to compose code this way. A general rule for clarity is, if you intend to skip execution of code under certain conditions, you should use something more obvious than order-of-operations to point that out.)
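
To see the equivalence for yourself, here is a small sketch that runs both forms against the same inputs. The function names and the log array are stand-ins, since "alert" only exists in a browser.

```javascript
var log = [];
function say(msg) { log.push(msg); }

// The inlined form, as a minifier would write it.
function packed(a) { a == 3 && say("packed yes"); }

// The same logic as an ordinary if statement.
function unpacked(a) {
	if (a == 3) {
		say("unpacked yes");
	}
}

packed(3); unpacked(3);   // both record a message
packed(4); unpacked(4);   // neither does: the right-hand side of "&&" is skipped
console.log(log);
```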

Anyway, it is possible to unpack "&&" into an "if" statement directly. Other operators can be unpacked in similar ways. Here's a more complicated example:

Packed:
testMe()&&4==EnvInfo.type&&3<=EnvInfo.version||goForIt(d)
Unpacked:
if (!(testMe() && (EnvInfo.type == 4) && (EnvInfo.version >= 3))) {
	goForIt(d);
}
There are several things going on here:
Numerical comparisons are usually swapped, from left-to-right, in obfuscated code, just to mess with your brain, so they've been un-swapped here.
The last operator is an "||". When unpacking this, you may get confused about exactly when the right-hand-side of the "||" should be executed, and when it should be skipped. In Javascript, "&&" has slightly higher precedence than "||", so everything to the immediate right or left of the "&&" operators should be evaluated before dealing with the "||".
Furthermore, the "||" operator follows a non-execution rule similar to "&&". If the left-hand-side of the "||" evaluates to "true", the interpreter knows that the whole "||" evaluates to true and therefore skips the right-hand-side. But if the left-hand-side is "false", the interpreter must then execute the right-hand-side to determine whether it is "false" as well, which is the only way the entire "||" could evaluate to "false".
To reflect this reversed logic of the "||" relative to the "&&", we unpack it into an if statement with the entire contents of the if test negated.
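If the negation feels suspicious, one way to convince yourself is to run both forms against every combination of inputs and compare. In this sketch testMe, EnvInfo and goForIt are stubbed out with made-up stand-ins, since the real ones are not shown.

```javascript
// Stand-ins for the real function and object definitions, which are not shown.
var testResult, EnvInfo, calls;
function testMe() { return testResult; }
function goForIt(d) { calls.push(d); }

function packed(d) {
	testMe()&&4==EnvInfo.type&&3<=EnvInfo.version||goForIt(d);
}
function unpacked(d) {
	if (!(testMe() && (EnvInfo.type == 4) && (EnvInfo.version >= 3))) {
		goForIt(d);
	}
}

// Try every combination of passing and failing conditions.
var mismatches = 0;
[true, false].forEach(function (t) {
	[4, 5].forEach(function (type) {
		[2, 3].forEach(function (version) {
			testResult = t;
			EnvInfo = { type: type, version: version };
			calls = [];
			packed("p");
			unpacked("u");
			// Both forms should call goForIt, or neither should.
			if (calls.length !== 0 && calls.length !== 2) { mismatches++; }
		});
	});
});
console.log("mismatches:", mismatches);
```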

(As an aside, the same left-to-right trick is used in assignment operations. "var t = fred || barney" will assign the value of "barney" to "t" if "fred" evaluates to "false". More specifically, if the result of type-converting "fred" to "boolean" results in "false". That means if "fred" is integer 0, or an empty string, or undefined, or null, then "t" gets assigned "barney" - but if "fred" is the STRING "0", it will go straight to "t" and "barney" will be ignored.)
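
The aside is easy to check in a console. The values below are just samples chosen for the demonstration; "fred", "barney" and "t" are the names from the example above.

```javascript
var barney = "barney";
var samples = [0, "", undefined, null, "0", "fred"];

// For each sample value of fred, record what ends up in t.
var results = samples.map(function (fred) {
	var t = fred || barney;
	return t;
});
console.log(results);
```

The first four samples fall through to "barney", while the string "0" and the string "fred" are kept as they are.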

7. Statements are chained using commas

The comma operator is an ugly and bizarre Javascript feature that you've probably never heard of. Its most intriguing trait is that it is dead last in the order of operations. It's called the "multiple evaluation" operator, and it can be combined with parentheses to enclose what would otherwise be multiple statements entirely inside boolean operator logic.

An example will clarify things:
item&&(item=item.toUpperCase().replace("6","7"),doSomething(item),t=5);
Using what we know already, we can unpack this into an if statement:
if (item) {
	(item=item.toUpperCase().replace("6","7"),doSomething(item),t=5);
}
At this point we have a chunk of stuff in parentheses. Since there is nothing happening outside the parentheses - no assignment, no further logic - we can safely remove the parentheses.
if (item) {
	item=item.toUpperCase().replace("6","7"),doSomething(item),t=5;
}
In this context, the comma operator is the equivalent of the semicolon, and we can rewrite it.
if (item) {
	item = item.toUpperCase().replace("6","7");
	doSomething(item);
	t = 5;
}
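One more property is worth knowing when you unpack these: a parenthesized comma expression evaluates its parts left to right and yields the value of the last one. That matters when the parentheses are not standing alone, for example when the result is being assigned or returned. A minimal sketch, with made-up names:

```javascript
var order = [];
function step(n) { order.push(n); return n; }

// The whole parenthesized group evaluates to its last part.
var result = (step(1), step(2), step(3));

console.log(order);
console.log(result);
```

So a packed "return a=b,c" unpacks to "a = b;" followed by "return c;".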

8. If-else statements all become "?:" shorthand

Time to get familiar with the inline conditional operator, "?" and ":". These are logically equivalent to "if/then" and "else" without the braces and parentheses.

This chunk:
this.G&&(this.loaded()?this.L=true:load(this.G),this.armed()?fire(this.G):arm(this.G))
Unfolds to:
if (this.G) {
	if (this.loaded()) {
		this.L = true;
	} else {
		load(this.G);
	}
	if (this.armed()) {
		fire(this.G);
	} else {
		arm(this.G);
	}
}
Sometimes these get nested. The trick to interpreting them is to read them left-to-right and think of each "?" as opening a new block, and each ":" as closing one. Also, keep in mind that "?:" has lower precedence than the comparison and logical operators, so everything joined by "==", "&&" or "||" to the left of the "?" is part of the test. That will affect what ends up in each block.

This nested series:
this.G&&(this.loaded()?this.armed()?fire(this.G):arm(this.G):load(this.G))
Unfolds to:
if (this.G) {
	if (this.loaded()) {
		if (this.armed()) {
			fire(this.G);
		} else {
			arm(this.G);
		}
	} else {
		load(this.G);
	}
}
If you get lost, use the "highlight block" feature in your editor to verify what's inside what. (In TextWrangler it's command-B.)
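
A quick way to check an unfolding like this is to stub out the functions and run both versions over every state. In this sketch the Thing constructor, its flags and the bodies of fire, arm and load are all hypothetical stand-ins for whatever the real code defines.

```javascript
var calls;
function fire(g) { calls.push("fire"); }
function arm(g) { calls.push("arm"); }
function load(g) { calls.push("load"); }

// Hypothetical object with the same shape the packed code expects.
function Thing(g, loaded, armed) {
	this.G = g;
	this.isLoaded = loaded;
	this.isArmed = armed;
}
Thing.prototype.loaded = function () { return this.isLoaded; };
Thing.prototype.armed = function () { return this.isArmed; };

Thing.prototype.packed = function () {
	this.G&&(this.loaded()?this.armed()?fire(this.G):arm(this.G):load(this.G));
};
Thing.prototype.unpacked = function () {
	if (this.G) {
		if (this.loaded()) {
			if (this.armed()) {
				fire(this.G);
			} else {
				arm(this.G);
			}
		} else {
			load(this.G);
		}
	}
};

var mismatches = 0;
[null, "g"].forEach(function (g) {
	[true, false].forEach(function (loaded) {
		[true, false].forEach(function (armed) {
			var thing = new Thing(g, loaded, armed);
			calls = [];
			thing.packed();
			thing.unpacked();
			// Expect either no calls at all, or the same call made twice.
			var same = calls.length === 0 || (calls.length === 2 && calls[0] === calls[1]);
			if (!same) { mismatches++; }
		});
	});
});
console.log("mismatches:", mismatches);
```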

9. Generic accessor functions are constructed on-the-fly

In the beginning of a block of compressed code you may find something like:
function Aa(a) {return function(b) {this[a]=b}}
Unpacked and given a name, this turns into:
function makeFunctionThatSetsValOfProperty(a) {
	return function(b) {
		this[a] = b;
	}
}
This can be used farther along to cheaply construct a generic accessor function, like so:
thingy.prototype.setValueOfF = makeFunctionThatSetsValOfProperty("f");
Whether you swap these out is up to you. They're a pretty handy optimization. Just keep in mind that if you decide to rename the property "f" in object "thingy" to something else, you need to alter this function call as well.
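Here is a small runnable sketch of how such a generated setter behaves. The constructor Thingy and the second property "g" are invented for the demonstration.

```javascript
function makeFunctionThatSetsValOfProperty(a) {
	return function (b) {
		// "this" is whatever object the generated function is called on.
		this[a] = b;
	};
}

// Hypothetical object type with two generated setters.
function Thingy() {}
Thingy.prototype.setValueOfF = makeFunctionThatSetsValOfProperty("f");
Thingy.prototype.setValueOfG = makeFunctionThatSetsValOfProperty("g");

var thing = new Thingy();
thing.setValueOfF(42);
thing.setValueOfG("hello");
console.log(thing.f, thing.g);
```

Each setter remembers its own property name through the closure, which is why the string passed in has to change if you rename the property.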

10. Look at the objects the code builds in the browser

If your aim is to reverse-engineer something, you will eventually run into a problem where the properties of various objects all look alike, getting you lost in the code. It will help you immensely if you can get the code you're looking at running in a web browser, where you can use the console to inspect the contents of objects in memory. This will tell you exactly what type of object has been assigned to each property, helping you to correlate references in one part of the code with objects in another.

At first you will probably only be able to get the fully obfuscated version to run. To keep track of things, make a list of all the strings you've renamed while reading the code, and their old equivalents, like so:

ot = smartSplitOnEscaper
ov = deEscapeAsterisks
pr = asteriskEscapeSomeChars
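
If you keep that list as data rather than as notes, you can use it from the console too. The sketch below is one hypothetical way to do it: the table holds the three example names above, and describeKeys is a made-up helper that relabels an object's obfuscated property names so a console dump is easier to read. It assumes the renamed things show up as property names, which may not match your code.

```javascript
var renames = {
	ot: "smartSplitOnEscaper",
	ov: "deEscapeAsterisks",
	pr: "asteriskEscapeSomeChars"
};

// Hypothetical helper: copy an object, swapping known obfuscated keys for readable ones.
function describeKeys(obj) {
	var out = {};
	Object.keys(obj).forEach(function (key) {
		var readable = Object.prototype.hasOwnProperty.call(renames, key) ? renames[key] : key;
		out[readable] = obj[key];
	});
	return out;
}

// "zz" is not in the table, so it is passed through unchanged.
var sample = { ot: 1, zz: 2 };
var described = describeKeys(sample);
console.log(described);
```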

Eventually, with luck, brains, and a huge amount of perseverance, you'll understand what's going on, and learn the secrets of the universe.