JavaScript Source Compression versus Obfuscation

Posted: July 6th, 2005


If you didn’t know it before, Port80 Software does dabble in client-side tools.  One such little project of ours called w3compiler brings up lots of questions about JavaScript compression and obfuscation. 

The basic premise of w3compiler is to prepare HTML, JavaScript, CSS and other source files for optimal Web site/application delivery.  To optimize core Web site files for fast download, a variety of techniques can be employed, depending on the file type.  For example, in (X)HTML files, you can remove most white space, comments, and unused or redundant tags.  CSS is fairly similar.  JavaScript, however, presents a totally different set of possibilities and challenges.  Of course, you can remove white space and some dead code statements here like you can in HTML and CSS.  You can also shorthand statements (for example, a standalone x=x+1 on a numeric x would become x++, with a whopping 2 byte savings; hey, every 1 and 0 counts on the wire, right?).  It may not seem like much, but it adds up, particularly with more complex AJAX-style applications.
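
To put numbers on the shorthand example, here is a small sketch. The variable names are made up for illustration, and this is not how w3compiler itself measures anything. It counts the bytes saved and also shows why a tool should only apply this rewrite when it knows x is a number.

```javascript
// Byte counts for the two spellings of the same statement.
const longForm = "x=x+1";
const shortForm = "x++";
const saved = longForm.length - shortForm.length; // 2 characters

// As a standalone statement on a number, both leave the same value behind.
let a = 1; a = a + 1;
let b = 1; b++;

// Caveat: on a string, + concatenates while ++ converts to a number first,
// so the rewrite is only safe when the tool knows the variable is numeric.
let s1 = "1"; s1 = s1 + 1; // "11"
let s2 = "1"; s2++;        // 2
```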
 
A challenge we sometimes face here at Port80 is explaining exactly what we are doing with JavaScript in the w3compiler. Well, first off, w3compiler does safely remove white space, which indeed can make code look a little harder to decipher and make it much smaller.  Yet, the software does a number of other things, namely variable renaming and object remapping.
 
In the case of variable renaming, you might have a variable like usersFirstName, and the tool might remap that variable everywhere across the entire Web site/app to uF or something similar.  Variables include function names and such, so there could be quite a bit to rename across an entire site…
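
As a rough illustration of the renaming idea, here is a minimal sketch. The function renameIdentifiers and its rename map are hypothetical and are not w3compiler's implementation. It simply swaps whole identifiers using a lookup table. A real tool has to parse the code, because this naive version would also rewrite matching words inside strings, comments, and property names.

```javascript
// Naive identifier renamer: replaces whole identifiers found in renameMap.
// Assumption: the source has no strings, comments, or property names that
// happen to match a key in the map. A real tool would parse the code instead.
function renameIdentifiers(source, renameMap) {
  return source.replace(/[A-Za-z_$][\w$]*/g, (name) =>
    Object.prototype.hasOwnProperty.call(renameMap, name) ? renameMap[name] : name
  );
}

const before = "var usersFirstName='Ada';alert(usersFirstName);";
const after = renameIdentifiers(before, { usersFirstName: "uF" });
// after is "var uF='Ada';alert(uF);"
```

The same map would have to be applied to every file on the site that touches the variable, which is why the renaming is done across the whole site/app rather than file by file.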
 
Object remapping is a bit different. For example, you might have repetition of:
 
document.write('hello '); 
document.write('world');
document.write('!');

 
w3compiler might transform the above to:
 
var d=document;
d.write('hello ');
d.write('world');
d.write('!');

 
The remapping of objects is done intelligently so as not to increase code size or break the site, so the w3compiler has to understand usage of the code, much like a Web browser would parse the file.  One downside: you do pay a slight browser memory penalty and potentially a performance penalty at run time, since the alias is one more variable (a reference to the same object, not a second copy of it) for the browser to hold and look up.  It is negligible, though, and offset in most cases by the reduction in code size from optimization.
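
The "not to increase code size" part comes down to simple arithmetic, sketched below. The helper aliasSavings is a hypothetical name, and the formula only counts characters for a declaration of the form var d=document; it is not a description of w3compiler's actual heuristics.

```javascript
// Net characters saved by aliasing `name` to `alias` when it is used `uses` times.
// Each use shrinks by the length difference, but the declaration itself costs bytes.
function aliasSavings(name, alias, uses) {
  const declaration = "var " + alias + "=" + name + ";";
  return uses * (name.length - alias.length) - declaration.length;
}

const threeUses = aliasSavings("document", "d", 3); // 3*7 - 15 = 6
const twoUses = aliasSavings("document", "d", 2);   // 2*7 - 15 = -1
```

So with three document.write calls the alias pays for itself, while with only two it would make the file one byte larger and is better left alone.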
 
Delving deeper into the balance between JS speed optimization and obfuscation, let’s review a more complex example.  Given JavaScript like:
 
var longVar = 5;
var longVar2 = 10;
 
function foo( )
 {
  document.write('foo ');
  document.write('is');
  document.write(' a ');
  document.write('silly ');
  document.write('function!');
 }
 
foo();
alert(longVar * longVar2);

 
You might w3compile it to something like:
 
l=5,l2=10;
d=document;
function f()
 {
  d.write('foo ');
  d.write('is');
  d.write(' a ');
  d.write('silly ');
  d.write('function!');
}
f();
alert(l*l2);
 
In fact, it would really look like:
 
l=5,l2=10;d=document; function f(){d.write('foo ');d.write('is');d.write(' a ');d.write('silly ');d.write('function!');}f();alert(l*l2);
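
One way to convince yourself that a compressed script still does the same thing is to run both versions against stand-ins for document and alert and compare what they produce. The harness below is only a sketch: runWithStubs and the fake objects are made-up names, and it assumes the script touches nothing beyond document.write and alert.

```javascript
// Hypothetical harness: runs a script with stand-ins for document and alert
// and records everything the script writes or alerts, in order.
function runWithStubs(source) {
  const log = [];
  const fakeDocument = { write: (text) => log.push("write:" + text) };
  const fakeAlert = (value) => log.push("alert:" + value);
  // new Function compiles the source as a non-strict function body, so the
  // undeclared assignments (l=5, d=document) work as they would in a page.
  new Function("document", "alert", source)(fakeDocument, fakeAlert);
  return log;
}

const original =
  "var longVar=5;var longVar2=10;" +
  "function foo(){document.write('foo ');document.write('is');document.write(' a ');" +
  "document.write('silly ');document.write('function!');}" +
  "foo();alert(longVar*longVar2);";

const compiled =
  "l=5,l2=10;d=document; function f(){d.write('foo ');d.write('is');d.write(' a ');" +
  "d.write('silly ');d.write('function!');}f();alert(l*l2);";

const originalLog = runWithStubs(original);
const compiledLog = runWithStubs(compiled);
const sameBehavior = JSON.stringify(originalLog) === JSON.stringify(compiledLog);
```

If sameBehavior comes back false, comparing the two logs entry by entry shows which call changed.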
 
Now, you might look at this code and exclaim, “Great, now it’s real difficult for people to steal my hard work in JavaScript!”  Well, that’s somewhat true.  The reality is that if you really wanted to make things harder, you might name your variables not l and l2, but l1l1ll0l0l and l11l1ll0101, which are, as they say, a bit difficult to eyeball when you are trying to unroll the code.  You might go further and run some simple obfuscation against the code, like escaping it:
 

eval(unescape("l%3D5%2Cl2%3D10%3Bd%3Ddocument%3B%20function%20f%28%29%7Bd.write%28%27foo%20%27%29" +
"%3Bd.write%28%27is%27%29%3Bd.write%28%27%20a%20%27%29%3Bd.write%28%27silly%20%27%29%3Bd.write" +
"%28%27function%21%27%29%3B%7Df%28%29%3Balert%28l*l2%29%3B"));

 
Combine the two or do some other types of encodings, and you make it harder and harder to figure out what is going on with your JavaScript.  This is the idea of obfuscation: making it so much of a pain for the potential thief that they move on to easier targets.  Of course, with those digital assets that people want (like free video games, serial numbers, etc.), programmers can really add amazing obfuscation and copy protection, and folks will still spend quite a lot of time cracking away at this stuff.  This is not to say that you shouldn’t attempt to obfuscate with a tool like w3compiler, but let’s be realistic about the protection JavaScript obfuscation provides…
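
To see how thin the escaping layer is, here is a short sketch using a trimmed-down script (the variable names source, hidden, and recovered are just for illustration). The legacy escape and unescape functions are exact inverses, so anyone who swaps eval for a print statement gets the original code back in one step.

```javascript
// A trimmed-down script to hide.
const source = "l=5,l2=10;alert(l*l2);";

// escape() percent-encodes punctuation such as = , ; ( ) but leaves * alone.
const hidden = escape(source);
// hidden is "l%3D5%2Cl2%3D10%3Balert%28l*l2%29%3B"

// Undoing it takes a single call, with no key or secret involved.
const recovered = unescape(hidden);
```

Because recovered is identical to source, this kind of encoding only slows down someone glancing at View Source, not someone who actually wants the code.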
 
In the case of w3compiling, we balance the need for speed with the need for protection — so far, we favor speed.
 
– J.A. @ Port80
 
P.S.  The new version of w3compiler 2.0 is in internal BETA. If you want to be on the public beta team, just send an e-mail to beta@port80software.com.


One Comment on “JavaScript Source Compression versus Obfuscation”

  • Too many new functions are very bad for performance.

    Such designs should be used only in unusual cases, for example for getting an element by id:

    function $(v){return document.getElementById(v)}

    Posted by: iFrame at 8:09 am on April 29th, 2008