Microsoft Humor: Blue Screens of Death

Posted: November 9th, 2006
Filed under: Uncategorized


As they say, “amaze your friends and scare your enemies” with this nostalgic screen saver:


http://www.microsoft.com/technet/sysinternals/Miscellaneous/BlueScreen.mspx


: )


– Port80

1 Comment »

On Streaming, Chunking, and Finding the End

Posted: November 8th, 2006
Filed under: Uncategorized


As with most data transfer events, there are two basic ways to send an HTTP message:  All at once, or in pieces.  In other words, HTTP provides the option of streaming data until there is no more to send, rather than sending all the data in a single logical unit with a known size. 

If you are a long-time Web developer, you are probably familiar with how to do flushes in server-side scripts.  This technique allows a script to start sending some data down to the user, while it goes on and does some relatively slow processing (say, a costly database query).  If you have ever written such a thing, then you have also taken advantage of the underlying HTTP streaming mechanism, whether or not you knew the details of how HTTP was getting the job done.
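For the sake of illustration, here is roughly what that flush-as-you-go pattern looks like in Python WSGI terms (this sketch is ours, not from the original post — the same idea applies to Response.Flush() in classic ASP or flush() in PHP):

import time
from wsgiref.simple_server import make_server

def app(environ, start_response):
    # We set no Content-Length ourselves: the pieces are handed to the
    # server as they become ready, and it is up to HTTP to mark the end
    # (close the connection in 1.0, or chunk the body in 1.1).
    start_response('200 OK', [('Content-Type', 'text/plain')])
    yield b'First piece of the page, sent right away...\n'
    time.sleep(2)  # stand-in for a slow database query
    yield b'...and the rest, once the slow work is done.\n'

if __name__ == '__main__':
    make_server('localhost', 8080, app).serve_forever()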

But this is not just ancient history.  If you are a contemporary Web developer or an administrator responsible for taking care of the code written by such developers, then HTTP streaming is probably going to end up being an increasingly important part of your professional life — whether you realize it or not — due to the growing importance of Ajax in contemporary Web applications.  The statefulness and responsiveness that developers are turning to Ajax to achieve partly depend on the fact that HTTP (among other neat tricks) can send and receive data in stream-wise fashion.

So, at this point, you might be thinking, “Well it’s cool that HTTP does this for me, but I don’t really want to know how it all happens.  Why should I?  I just want to use the darn thing, not become an expert in it.”

Here is where we get to trot out one of our favorite Spolskyisms — Spolsky‘s Law of Leaky Abstractions — which says this:  All abstractions leak.  And there is also an important corollary to the law, which we would phrase like this:  When an abstraction starts leaking on you, if you don’t understand how that abstraction works, you are going to get very wet.

Now it’s true that, most of the time, whether you are a developer or administrator, you don’t have to worry about any of this.  It is all mercifully abstracted away for you by the HTTP implementation in your development environment’s libraries, or in your end-user’s browser, or on your Web server.  If it weren’t for such lovely abstractions that hide all the sausage-making we don’t really care about from us, we would never get anything done.  Instead, we would be spending all our time googling for things like this article.

But now suppose you are administering a Web application that uses Ajax, and you change something in the Web server’s configuration, and a particular widget in the application suddenly starts hanging — even though the Web server and the server-side of the application appear to be perfectly healthy, and even though the data seems to be getting to the client, according to your network traces.  Or suppose you are implementing a Web service, and it works fine when the consumer asks the provider for the data using HTTP 1.0, but when it asks using HTTP 1.1, suddenly the message no longer parses correctly.  Cases like these, and many others of similar weirdness, could very well be the result of the HTTP streaming abstraction breaking down.

It just might be worth understanding some of the underlying concepts and complexities after all!

For instance, did you know that there are two different ways in HTTP to stream data?  All together, that makes three different ways that an HTTP message can get from its provider to its consumer.  And did you know that each of these ways that an HTTP message can be sent has different implications, in turn, for what happens on the next lowest layer in the protocol stack, down in TCP?  In particular, that each way of sending a message in HTTP affects what happens with the duration of the underlying TCP connection?

Lost yet?  If so, don’t feel bad.  Whenever we at Port80 find ourselves having to bring this kind of thing up around fellow Web professionals, we are always a little surprised by how few of them — even very seasoned and accomplished ones — have any clue what we are talking about.  Fortunately, HTTP streaming is really not that hard to understand.  All you have to do is start at the end.

No, that wasn’t a typo.  The end really is the place to start.  It’s like this:

HTTP messages can contain a payload of data: an entity body in HTTP-speak (not that this is always the case: think of HEAD requests). When they do, the HTTP implementation that has to handle that data needs to know when it has finished receiving the last of it.  This is a requirement whether we are talking about a client like a browser, a bot, a Web service consumer, or a server handling data from the client, as in the case of a POST.  The key to understanding HTTP streaming is to focus on how it is that the process at the receiving end of an HTTP message knows, in the course of handling the data contained in that message, that it has reached the end of the data.

There are only three possible ways the process on the receiving end of the message can know this.  As you might have guessed, these three ways happen to correspond to the one non-streaming and the two stream-wise ways of sending data in HTTP:

1) The non-streaming case.  The most straightforward way of telling the receiving process where the HTTP message ends is to use the Content-Length header.  When this header is present, an HTTP client or server can count on the message size matching the number of bytes indicated in the header value.  Content served from static files almost always contains this header, in which case the value is the size (in bytes) of the file itself — which in the HTTP message becomes the entity body (the data minus the headers).  Likewise, POST requests from a client typically contain a Content-Length header.

Use of Content-Length makes knowing where the end of the data is pretty efficient and foolproof.  Unfortunately, it also precludes streaming the data by sending it in an arbitrary number of pieces of arbitrary size.  The server streaming data to a client won’t know, at the beginning of a stream, how many pieces are to come, or how large each one will be.  If it did, it probably wouldn’t need to stream the data in the first place.  You see the dilemma.
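In code, the receiving side of the non-streaming case amounts to counting bytes.  A rough sketch in Python (ours, not from the post), assuming the headers have already been parsed into a dict:

def read_body_with_content_length(sock, headers):
    length = int(headers['content-length'])
    body = b''
    while len(body) < length:
        piece = sock.recv(min(4096, length - len(body)))
        if not piece:
            raise ConnectionError('connection closed before the full body arrived')
        body += piece
    return body  # exactly Content-Length bytes: we know we are done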

2) Enter HTTP 1.0 style streaming.  If an HTTP message has an entity body, but it lacks a known content length at the point when the message starts to be sent, then the simplest way to tell the HTTP client implementation on the receiving end that it has got all the data is to close the underlying TCP connection.  Nothing works like a closed connection to say, “That’s it, no more!”  If you use an HTTP protocol analyzer (like HttpWatch) to look at the header section of an HTTP response that is streamed in this fashion, you will usually see this header:

Connection: close

indicating the intention of the server to close the underlying connection as soon as it is done streaming the data (technically, the header need not be present in an HTTP 1.0 context, since the default connection type for HTTP 1.0 is non-persistent, meaning 1.0 implementations assume that the connection will be closed unless the message includes a Connection: Keep-Alive header).  If you look at this same response at the TCP/IP level (by using Wireshark or an equivalent tool), you will see that immediately after the server sends the last of the data to which the Connection: close header applies, it does in fact initiate a close at the TCP layer, by sending a packet with the FIN and ACK bits set.  All this works very well as a way of telling the client where the end of the stream is.
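On the receiving side, “read until the connection closes” is about as simple as it sounds.  Another rough Python sketch of ours:

def read_body_until_close(sock):
    body = b''
    while True:
        piece = sock.recv(4096)
        if not piece:  # the server sent its FIN: no more data is coming
            return body
        body += piece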

There is obviously a rather nasty trade-off here, though.  Yes, by this means a server can tell a client that it is done sending data.  But the price is that the TCP connection over which the data was sent is torn down, and it will need to be rebuilt if any more messages are going to change hands.  This vitiates the whole idea of what HTTP 1.0 used to call keep-alives (what in HTTP 1.1 we call persistent connections).

Is that a big deal?  Well, yes, it can be, depending on the circumstances.  The reuse of existing TCP connections, when possible, is a major performance optimization.  It not only saves the time required to set up a new pair of TCP/IP sockets, it also makes for much more efficient use of networking resources.  This is true both for browsers, which can send numerous requests over a few persistent connections when loading a page with many dependencies, and for servers, which would rapidly accumulate connections in the TIME_WAIT state if clients were not able to keep reusing existing connections.  Optional in HTTP 1.0 (where it is indicated by the Connection: Keep-Alive header), persistent connections were made the default behavior in HTTP 1.1 — precisely to address these types of concerns.

Unfortunately, there was no way in HTTP 1.0 to avoid a sharp trade-off between streaming data and keeping the TCP connection alive.  You had to choose, because there was no other way besides cutting the connection to tell the client where the end of the data stream was…

3) You say you would like to be able to avoid this trade-off?  So did the folks who wrote the HTTP 1.1 specification.  They were keenly aware that the “Keep-Alive” mechanism had been something of an afterthought bolted onto HTTP 1.0, and that it made no provision for streaming.  And so we have two big changes in HTTP 1.1:  First, persistent connections were made the default connection type (what you get unless you tell the program on the other side that you want to close the connection); second, something called “chunked transfer encoding” was added to the mix. 

The presence of this header in the header section of an HTTP 1.1 response:

Transfer-Encoding: Chunked

means that the message body was sent using chunked transfer encoding.  If you examine such a response at the TCP/IP level, you will notice that the connection normally remains open after all the data associated with this header has been sent and received — there is no FIN/ACK coming immediately from the server after the last chunk.  Instead, the socket usually gets reused for the next request in line.

How is this possible?  Meaning:  How does the browser know where the end of the data stream is?  After all, it can’t deduce this any more from the fact that the connection is being torn down.

As it happens, chunked encoding provides a way to tell the browser where it is in the stream at all times, as well as when it has gotten to the end.  First, each chunk of data is preceded by a field that gives the length of that particular chunk in bytes, expressed as a hexadecimal number in ASCII characters.  The length field and the chunk of data corresponding to it are each terminated with a CRLF sequence.  Then comes the next length/data pair, and so on, until all the data is received.  The last chunk is the exception: it consists of nothing but a length field of zero, followed by the terminating CRLF, with no data after it.  This is how the client is able to determine that the stream has run its course…
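To make that concrete, here is a rough sketch of a chunked-body decoder in Python (ours, for illustration only — it skips error handling and optional trailer headers), working on a body that has already been read into a bytes buffer:

def decode_chunked(buf):
    body = b''
    pos = 0
    while True:
        line_end = buf.index(b'\r\n', pos)
        size = int(buf[pos:line_end].split(b';')[0], 16)  # hex length field
        pos = line_end + 2
        if size == 0:                # the zero-length chunk marks the end
            break                    # (any trailer headers would follow here)
        body += buf[pos:pos + size]  # the chunk data itself
        pos += size + 2              # skip the CRLF that terminates the chunk
    return body

Feed it something like b'7\r\nChunked\r\n0\r\n\r\n' and you get b'Chunked' back — and, more to the point, you know exactly when you are done without anyone having to hang up the connection.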

Hopefully, this walking tour of streaming and chunked transfer encoding will make it a bit easier to understand these HTTP fundamentals and will help in Web site and application debugging.  This topic often comes up at Port80 in relation to HTTP compression with httpZip and ZipEnable, so if we can clarify anything for you, just let us know.

– Port80 Software

1 Comment »

The “Truth” About Google Search Results Stats

Posted: November 2nd, 2006
Filed under: Uncategorized


I bet you love Google. We certainly do at Port80.  Yet, like it or not, at times in any relationship you can start questioning your love, and something we noticed recently just didn’t sit well with us:  Google appears to put a bit of spin on its search results stats.

For example, look at the stats provided for this quick Google query on “Ajax library” (run it yourself: http://www.google.com/search?hl=en&q=Ajax+library):

 
A whopping 12,400,000 references were found by Google in a mere .09 seconds. Hot damn, that’s fast! And double damn, that’s a lot of info on Ajax libraries — hopefully these aren’t mentions of cleaning your library with Ajax, but of writing code.

Anyway, jests aside, we tried to really dig into the results — you know, go beyond the first page.  Away we go, starting to page through, and we get to only 661 results… where Google politely informs us that it has omitted stuff that we shouldn’t care about.

So, we try again with the omitted results, and this time we get to 1000 results and no further.

 
You can try as you like with the Advanced Results, even changing the number of results to 100 per page, and you will get no farther than 1000.  Try it yourself here.

So, it turns out there might be 12 million or more references to “Ajax library” that Googlebots have discovered on the Web, but you can only have 1000 — so much for the long tail idea.  In fact, if you think about it, the 1000 results you can actually access are roughly 0.008% of the total reported for the query “Ajax library”.  So, if you query for something super common like “html”:

http://www.google.com/search?hl=en&q=html&btnG=Google+Search

… you will get about 9,430,000,000 results — but again only 1000 available for you to actually search!  In this case, it is a whopping 0.00001% of the search results that Google says are available.

However, to be fair, if your result set is below 1000, you may be able to get them all — but watch a bit of sleight of hand happen here:  Query for “fie foo fum 123”, and you may see 520 results available.  Try to access the results, and you get to 291, a bit over half.  If you retry with omitted results included, you now get 517, but as you page through the result set, the total magically drops to a number you can actually reach.  It seems really fishy how these numbers are calculated — and whether they have any value.

If you go looking around on the Web, you will find that, lo and behold, other folks have seen this (http://www.googleguide.com/last_results_page.html), but it isn’t really as commonly known as you’d expect.

So, we still love Google, but let’s call that bar on the right what it really is: marketing and branding!  Yes, Mr. Google, you are fast, and yes, you are big, but since you really can’t do anything with it, the “data” is about as useful as the signs where McDonald’s tells you how many burgers they have served.

Sincerely,
Graham A.
300 Millionth American Citizen and Future Port80 Employee

PS: Not to pick on Google exclusively — it is similar at Yahoo.  If you query for “Ajax library”, you find 5.9 million results but are likewise limited to 1000 viewable results. Check it out here.

3 Comments »

Microsoft and Zend: PHP to be a first class app on IIS

Posted: October 31st, 2006
Filed under: Uncategorized


Microsoft continues to offer the olive branch to the open source community with its recent announcement that it will work with Zend to make PHP run faster on IIS and Windows, which should effectively erase a major Apache Web server performance advantage:

http://www.informationweek.com/management/showArticle.jhtml?articleID=193500750

This is great news for the whole Internet.

Cheers,
Port80

PS: httpZip 3.7 is coming soon with kernel mode caching support, Ajax features, and PDF compression…  Now that will make for even faster PHP (and whatever you got) sites and apps on IIS!

1 Comment »

Phishing and Privacy

Posted: October 27th, 2006
Filed under: Uncategorized


Both of the new browsers (IE and FF) want to combat the dreaded specter of phishing. 


You know those Citibank and eBay e-mails that have plagued your Inbox? Those weren’t from real sites, we hope you know.  Anyway, the two new releases of the world’s most popular Web browsers attempt to combat phishing with something that is, in reality, quite disturbing.  The basic idea is that, as you browse, the browser sends the address of each site you visit to a service (Google or Microsoft) to be compared against a list of known phishing sites.  If things are clear, nothing happens, but if there is a match, the service reports back that you might want to watch out.  If you are at a site that looks suspicious, you can even report it to the anti-phishing authorities.


So, kind readers, we shouldn’t have to tell you, but this stuff scares the living daylights out of us. Let’s summarize: we’ll just send our entire browsing history to some third party for analysis (and no data mining will happen there whatsoever, right?).  No problem, right?  Oh well, maybe we have been doing that all along anyway with our browser toolbars, so who cares…


Now, surfers out there on the Internets can even nominate a Web site as a Phisher King, whether it is one or not (please, please, please don’t brand Port80’s sites with a scarlet P — instead, we would be happy to share with you our competitors’ URLs and our enemies list, and you can have big time fun reporting them as phishing sites). The mob can be smart, we’ll give you that, but the mob also burns witches and throws people into rivers with rocks to see if they float — do you really think that won’t happen online? Let’s sincerely hope that Microsoft and Google add human editors to their phishing services; otherwise, in the short term, someone or some organization is bound to get burned here.


– Port80


PS: We turned our browsers’ phishing filters off.  We’ll just opt (yeah, right) to give our privacy away to our ISP instead!

No Comments »

Internet Explorer Turned 7 This Month

Posted: October 26th, 2006
Filed under: Uncategorized


So, IE 7 is out, and not a moment too soon, as it has only been five years of browser bugs, CSS problems, and security concerns. 

We are happy to say that IE 7 indeed is a MAJOR improvement! 

Microsoft developers addressed a number of security concerns, including the clipboard problem we mentioned a while back.  The browser also is much more restrictive in its security settings, and it really tries to inform users of anything suspect.

Unfortunately, if you are a Web developer, you aren’t going to get everything you want.  CSS support is indeed improved. Fixed positioning, max-width, min-width, direct child selectors — yes, these are there, but don’t expect full conformance.  PNG support is in place, but don’t look for every little detail to be handled.  Ajax is native, but JavaScript is still slow, and there are few new additions.  If you liked tabs in Firefox, you’ll like IE 7, and RSS is in there too.  We looked under the hood and saw little new.  There is a new header flying around for CPU for some reason (the UA-CPU HTTP header, with a default value of x86) — maybe animation concerns under Vista?  Just a guess.  We saw no differences in HTTP compression and no inclusion of q-values for content negotiation outside of language.

In short, there were no surprises, but that’s OK. We are still happy. It’s been five years, and we are all excited at Port80 Software that IE can blow out its seven candles now!

– Port80

1 Comment »

Ajax Leaks Memory (and Other “Cause and Effects”)

Posted: October 25th, 2006
Filed under: Uncategorized


Very often, we at Port80 have noticed that causality and correlation are not well understood in the Web development community — case in point being memory leaks in JavaScript and Ajax.  Yes, folks, it is true with many browsers (read Internet Explorer/IE 6) that you can have major problems with memory and JavaScript.

However, is this type of leak an Ajax-exclusive issue or just correlated with the use of Ajax?  We would like to propose that it is not related directly to Ajax but simply to the use of JavaScript itself in a certain way.

The situation is this: IE 6 does not manage memory well (though IE 7 does exceptionally well — more on that in our next post).  However, you are unlikely to see major problems with IE 6 and memory because, in traditional Web apps, you move to other pages and the memory is recovered before you start to encounter problems.  In an Ajax app, though, you may stay on a page long enough to have some real issues with memory consumption. So, yes indeed, under an Ajax app you may have memory problems, but Ajax is not the cause — the problems are simply exposed by the Ajax pattern.  The real issue in IE 6 is that you need to manage your references, watch your use of closures, and simply acknowledge that at times IE 6 does leak memory like a sieve.

We can repeat this type of cause and effect confusion in Web development over and over.  Another example: have you heard that site maps improve page ranking?  Well, maybe, you think — but we say maybe your problem is actually that your nifty JavaScript or Flash navigation system just couldn’t be crawled by a search engine bot (whoops, it looked cool). Thus, by adding plain text links in a classic site map, you simply got indexed, and the rank goes up!?!?

For our last object lesson of the day, look at this Web developer truism: if you reuse images, CSS, and JavaScript files across pages, the browser caches those objects and access improves dramatically.  Yes, that is true… sort of.  Browsers by default will revalidate an object that was previously accessed/downloaded to see if it has changed on the origin Web server. If it has changed, the modified file is served — if the file has not changed since it was first accessed by the browser, you get a 304 Not Modified response from the server.  Of course, this leads to browsers and proxy caches pounding the heck out of origin Web servers for files that have not changed since first access, which slows everything down for your site users. Most browsers will not recheck objects during a browsing session after that first check for already-accessed content, so reuse does indeed improve performance to a certain degree.  The real cause of the speed bumps with previously downloaded files is that most sites are not telling browsers how long to consider an accessed file valid or “fresh”.  Unfortunately, even with only one wasted 304 round trip for revalidation per session, once you end that browser session, you will still see a validation check the next time the same browser goes online to access that file in a new session — and the look-up performance hit happens all over again. This effect is easily remedied through expiration-based cache control headers, which are rarely used (see Port80’s study on Fortune 1000 sites and cache control). Hopefully, you now get the whole point of CacheRight.
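To make the round trip concrete, the revalidation exchange described above looks roughly like this on the wire (the URL and date are made up for illustration):

GET /images/logo.gif HTTP/1.1
Host: www.example.com
If-Modified-Since: Mon, 24 Jul 2006 18:30:00 GMT

HTTP/1.1 304 Not Modified

That 304 is a whole request/response cycle spent telling the browser something it could have known in advance, had the original response carried an expiration date.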

So, kind readers, please remember: if you see an effect on the Web, try to find the real cause, as it may not be what you think.

– Thomas @ Port80

No Comments »

IIS 6 Beats Out Apache in New Port80 Fortune 1000 Surveys

Posted: October 11th, 2006
Filed under: Uncategorized


After a hiatus of almost a year, Port80 Software has released new surveys today on the Web server and application server technologies used on the main corporate Web sites of Fortune 1000 companies:


http://www.port80software.com/surveys


There is some interesting news in this latest survey:  IIS 6 penetration in Web server deployments has doubled since 2005 and has overtaken Apache to serve more Fortune 1000 sites.  A press release exploring this development and the surveys historically, including Microsoft’s take on the surveys, is available at http://www.port80software.com/about/press/101106.


These new survey results come at a time when Microsoft IIS has been gaining market share in the Netcraft surveys against Apache as well.  ComputerWeekly.com recently pointed out that Microsoft IIS has reduced Apache’s lead in the Netcraft survey from 48.2% to 31.5% in the period from March to June 2006 alone:


http://www.computerweekly.com/Articles/2006/10/03/218751/IIS+closes+gap+on+Apache+as+it+boosts+net+integration.htm


ASP.NET also continues to lead in Fortune 1000 application server deployments in the second edition of that Port80 Software survey, also released today.  Check out the surveys for detailed results and graphs, check on Fortune 1000 site technology, and scan your own site with Port80’s online tools at http://www.port80software.com/surveys.


Best regards,
Port80 Software

No Comments »

Ajax Articles @ Network World from Port80’s Thomas Powell

Posted: September 7th, 2006
Filed under: Uncategorized


Port80 Software thought leader Thomas Powell has expanded on this year’s series of Ajax articles on the blog with these Network World articles on the technology:

AJAX is the future of Web app development

If you’ve used Google Maps, Gmail or Microsoft’s Outlook Web Access, you’re familiar with the power of AJAX, which gives Web applications the responsiveness users associate with desktop applications.

http://www.networkworld.com/research/2006/071706-ajax-future.html


How an AJAX application works

The key to AJAX is the XMLHttpRequest (XHR) object, which can be invoked to send and receive XML and plain text data synchronously and asynchronously via HTTP.

http://www.networkworld.com/research/2006/071706-ajax-future-how.html

Cheers,
Port80

 

No Comments »

Error Messages: How to frame issues amidst hacker reconnaissance

Posted: September 5th, 2006
Filed under: Uncategorized


Yes, you should be trapping errors that occur in Web sites and applications — and reporting back to the parties concerned with the error condition.


It is good for your users, whose expectations should be managed and whose patience may be slim, even if you have a great site architecture/navigation/search/etc (older research suggests a 95% site abandon rate on an HTTP error, which feels right for an e-commerce site but is probably high for a B2B site or business application; of course, keeping all users on track is never a bad thing, even if the abandon rate is half that). 


It is equally good for you to track these errors on the Web server side and feed this info back into your development process to continually improve user experience and increase application efficiency.


But it ain’t good if your displayed error messages tell hackers what you are doing from a security perspective.  Don’t be too nice or too descriptive in error handling messages on the public side, or you may be exposing a larger attack surface to hackers…


This excellent article by SPI Dynamics explores the topic in detail:
http://www.securitypark.co.uk/article.asp?articleid=25746&CategoryID=1

– Port80

1 Comment »

Microsoft, can we get a little QA please?

Posted: August 17th, 2006
Filed under: Uncategorized


You know we love Microsoft here at Port80.  Deep down, Redmond always means well, but with a platform like Windows, IIS, Internet Explorer, etc., that’s a heck of a lot of moving parts, and things can break.  We even try to help with our products, but folks, they really broke IE6 with a recent patch in conjunction with HTTP compression — check this out:

http://www.computerweekly.com/Home/Articles/2006/08/16/217755/Microsoft+security+patch+crashes+browsers.htm


It makes you wonder whether the Microsoft Internet Explorer team tested compatibility with HTTP compression, something the largest, highest-traffic sites like Google, eBay, and Amazon use — hey, I guess not.  It’s like the change was made without anybody browsing the damn Internet with the patch applied!

So, for the next week or so, you can imagine the annoyances you might hear from users whose patched IE browsers crash on ANY HTTP compressed site or Web-based app using mod_gzip, native IIS 6 compression, our products, our competitors’ products, appliances, and on and on…


Our suggestion to admins with compressed sites for the next week: just flip the switch OFF on HTTP compression for one week if you have a production site with lots of traffic, as some folks may crash in IE6 right now.  You can add an exclusion rule for a week in our ZipEnable or httpZip products for IE6.  I wouldn’t be surprised to see some sites, particularly in the open source community, browser sniffing and then sending IE6 browsers to a page that says: “Microsoft broke compression. Try Firefox instead”.  Now, that would be mean, but we are sorry to say in this case maybe not uncalled for. 

Message to Redmond: someone or many someones in the IE QA department need to be fired, as their oversight has messed the entire Web up here… Until the patch is out next Tuesday, August 22!

– Port80


 


 

2 Comments »

Prefetching Ills with Firefox’s FasterFox

Posted: August 15th, 2006
Filed under: Uncategorized


In the past, we have reviewed the concept of “prefetching” content before the browser end user requests it — a fascinating and very Ajaxian approach to Web site and app performance.  Firefox browsers can take full advantage of this with a relatively new browser plug-in called Fasterfox (catchy) that handles prefetching for FF.

Attack of the Prefetchers

As Firefox usage continues to expand, some Webmasters are seeing bandwidth shoot through the roof on even non-Ajax sites as users with Fasterfox prefetch content galore.

What’s an admin to do?

Fighting Back

Thank goodness for robots.txt.  Add a few lines to your “robots.txt”, and you can block FasterFox prefetches:

User-agent: Fasterfox
Disallow: /

Your Firefox users will still get content, but you will not have to pre-serve it.

Best,
Port80

 

No Comments »

Using CacheRight to Trap and Modify Unwanted Cache Control HTTP Headers

Posted: August 9th, 2006
Filed under: Uncategorized


Sometimes, we inherit code and just plain have to deal with it.  A legacy app coded by previous developers.  A Web-based application or library integrated into a site.

What happens if an application sends HTTP cache control headers you did not expect in a Web response, causing a file to be cached longer than you want in browser and proxy caches?

This could be especially problematic if the file being cached is a dynamic ASP, ASPX, CFM, PHP or JSP file, and you may really need to kill the headers that are caching these files on the browser.  That could leave you searching through the code to find where the header is injected, which may be time consuming, if not impossible.

On IIS, you can solve this challenge with CacheRight and trap/modify/remove unwanted cache control headers.  Here are a few steps to try:

1. In the CacheRight rules file, set the default rule to:


ExpiresDefault : immediately public

Or, set up a rule for how you want the dynamic files to be cache controlled, depending on the cache policy you want to enforce…

2. Then use the BlockByPath rule to exclude your dynamic files (.JSP files in a particular directory, in this example; wildcards work to control by extension, if you need that much control versus blocking a whole directory — see the combined rules file sketched after step 4):
BlockByPath /JSP Folder/* :


3. Additionally, make sure the “Override existing cache control headers” box in the CacheRight Settings Manager is checked.

4. This will force CacheRight to remove all of the cache control headers in your dynamic files, thus allowing for the user’s browser or a proxy cache to request the standard 304 modification check.
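Putting steps 1 and 2 together, the rules file for this example would contain just these two lines (using only the directives shown above — your paths and default policy will of course differ):

ExpiresDefault : immediately public
BlockByPath /JSP Folder/* :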

The result? No uncontrolled caching headers that break dynamic pages from the code base you inherited or integrated.

Best regards,
Port80

 

No Comments »

The Internet turns 15 this month

Posted: August 7th, 2006
Filed under: Uncategorized


As they say, it is an arbitrary date, but Tim Berners-Lee considers August 6 to be the Web’s birthday, so who are we to argue?:


http://news.bbc.co.uk/1/hi/technology/5245080.stm


We have come a long way since 1991, but even Tim says the real work is still to come…

And many more,
Port80

1 Comment »

Fear, Uncertainty and Doubt in Web 2.0

Posted: August 4th, 2006
Filed under: Uncategorized


The more we read articles like today’s mention in eWEEK about Ajax (http://www.eweek.com/article2/0,1895,1998795,00.asp), the more we see the natural technology cycle revealing itself. 


After a new product launch, it would seem that most technology is adopted by those of us on the cutting edge (the innovators and early adopters; stand tall, be proud) and is often raved about for its new possibilities or just its plain newness.  However, it is just newness — and because of that newness, a technology is not well understood at the outset by the market, leading to the influence of FUD factors (fear, uncertainty and doubt).  Also, because it is new and all the details/vectors for issues have not been worked out, the technology may be rough around the edges, feature incomplete and green, thus contributing to more FUD.  Finally, in the rush to produce new technologies, security is often not thought out because of the impulse to gain that coveted “first mover” status in the market…  All of this adds up to MAJOR FUD. 


Now, here come these new eWEEK quotes in print, straight out of the mouths of hackers trying to make names for themselves at Black Hat in Vegas this week, ringing the “Ajax is insecure, Batman!” alarm bell.  OK, folks.  Ajax, like all new stuff, has its problems.  We get it.  New technologies and new approaches, standing on the shoulders of giants, often mean that the implications of new features have not been fully fleshed out, and there is the real possibility of a new hole or attack vector being introduced.  But where is the new data here from eWEEK?  What is scary or groundbreaking in this prognostication about the coming Ajax Armageddon?


I guess some questions beg answers. What are we really protecting?  What is less secure now with Ajax in the mix?


Backend FUD with Ajax


If you are protecting access to your backend database, stored credit cards, application or personal data, etc., then we say Ajax is no more insecure than your server-side app, because all the things you are protecting are on the server in the first place.  Someone still has to inject SQL, dump your source files, exploit a server bug, exploit a logic bug, etc. to get at your data treasure-trove.


However, how does Ajax make that more likely?  It’s only more likely if you are not sanitizing your input strings, not checking your referers, not doing all the things you were supposed to do when the application logic was being done server-side in the first place.  If you just decided to forgo these things because of Ajax, well then, yes, you are more insecure right now: but did the technology make you do that?  If you have no idea what I am talking about here, could it be your server app is just as vulnerable?  Security overlooked is insecurity, no matter the implementation details.
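A quick sketch of what “doing the things you were supposed to do” looks like on the server side — ours, in Python with the standard sqlite3 module, purely for illustration; the same idea applies whether the request arrives from an Ajax call or a plain form post:

import sqlite3

def lookup_user(db, user_id_raw):
    if not user_id_raw.isdigit():  # validate the input before trusting it
        raise ValueError('user id must be numeric')
    # Parameterized query: the driver handles escaping, so a crafted
    # string cannot be smuggled in as SQL.
    cur = db.execute('SELECT name, email FROM users WHERE id = ?',
                     (int(user_id_raw),))
    return cur.fetchone()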


Frontend FUD with Ajax


If you are protecting the intellectual property of your Web development work — your code out there on the Net — well then, yes, Ajax is certainly more insecure.  In a server-side app, you hide much of your hard work in .PHP, .ASPX, .JSP, etc., files that are executed before the result is delivered over the network.  Unless something explicitly dumps unexecuted server code to the browser, your hard work is relatively safe.  But, by employing the Ajax pattern and moving the majority of code to the client side in the form of JavaScript, you are indeed presenting your hard work for inspection or theft to anyone who can view source or explore the cache.  Yet, before you consider source reengineering your worst security challenge, remember that some sites (hopefully not yours) just plain give away secrets of execution in code comments or other security factoids that lead to blatant insecurity — a password, a backdoor URL, a “remove before launch” note.  Don’t sweat Ajax attack vectors if your forms ship with a comment like that.



(Yes, it does happen.)  If so, your page has larger security issues to deal with first!


Ajax and any technology can always be made more secure, and malicious hackers will always try to break the locks. Your best solution then is to make life rougher for source sifters, digital thieves, and other dastardly villains of this sort.  Obfuscate, remove white space, dump comments… do what the w3compiler does.  That’s one reason why we built the tool.  At first, Port80 Software noticed customers were all about buying the client-side, pre-deployment code optimization tool for speed.  Now, with the rise of Ajax, we see more w3compiler real-world value in source code and related intellectual property protection. Unfortunately, these benefits are a bit at odds with each other, as more IP obfuscation may bloat your code footprint, reducing delivery speed.  It is a balancing act, depending on whether performance or security matter most to your Web site or application.


Yes, there is some truth to the Ajax insecurity claims.  Truth in that any new technology exposes unintended consequences/challenges and also truth in the sense that, if you are already securing your code, the “newness” of Ajax may mean you simply have to port some server-side security logic/best practices to the new Ajax implementation…  which leads us back to eWEEK:  that article doesn’t really speak to the reader about these truths:



< RANT time >


eWEEK should be ashamed of statements like:

“By exploiting shortcomings in Ajax programmers’ work, hackers may also be able to gain access to Web applications themselves and wreak havoc with online businesses.” 



OK, I know fear mongering sells magazines, but let’s rephrase that to ridicule eWEEK properly:


“Bad guys can do bad things if they can get into your site because you didn’t do things right.”



Could you not say this about ANYTHING in the world, technology or otherwise?!?!?!?


The reality is that the attack surface Ajax exposes is no larger than that of most Web sites and apps, unless there is logic in the JavaScript.  If your JavaScript is all UI and you don’t divulge information via comments or other things in that JavaScript, there is no more attack surface than in a non-Ajax implementation.  You still have a URL, a query string or posted values, and cookies (as well as other headers) as your attack surface.  Ajax does not change the fundamentals of the technology at that level.


You can see eWEEK’s clear misunderstanding here:


“Now [an attacker] is inside your application and can create a pipeline that allows them to see all the function names, variables and parameters of your site,” Hoffman said.



Hello? Inside the application?  When I view Amazon.com or any other site, I am inside their application, by this way of thinking.  What does http://www.fakesite.com/doit.aspx?id=5&view=new&userReturn=false do?  They see your parameters!!!  They can fiddle with them!!! Your URLs act as function names in most styles of Web pages anyway.  If you provide raw access to private, server-side variables using a JavaScript library, you, my friend, need to be ridiculed as a Web developer newbie.  This honestly is no different than using posted variables with no checks, keeping register_globals on in PHP, or all sorts of other dumb things you can do to any simple server-side app.  If someone is that green, ANY TECHNOLOGY THEY USE WILL BE INSECURE (you are a pro, so I am sure this does not apply to you, fair reader).


You may change the style of the attack surface, you may add more moving parts, but it is still the same… it’s all about inputs.  You must know them.  However they are implemented, you must sanitize your inputs, because you can never, ever trust them — Ajax or not.


< /RANT over >



We kid eWEEK, and there is no harm intended here: you guys are just part of the cycle of technology.  You are helping evolve Ajax with the tough love of FUD. 


If the technology is legitimate, people will enter the space and try to clean up the FUD spillage by offering new products, utilities, articles, or new versions of the technology that rectify the problems. We sell a few ourselves and will sell some new ones quite soon.  After a while, these band-aids become less needed as the ideas are incorporated into best practices and the core technology itself.  However, usually by then, the technology is no longer new but old, tried and true — and there should be less FUD smog in the air.


Of course, then that cycle begins over and over again because we can’t really be using that LAME OLD stuff, can we?


– Port80


 


 

2 Comments »

A Caveat RE: Online Tools to Test Compression/Examine HTTP Header Responses

Posted: August 1st, 2006
Filed under: Uncategorized


We have received a bunch of requests of late regarding our online tools to test your Web sites and applications for the impact of Port80 Software tools, especially for HTTP compression:

http://www.port80software.com/support/p80tools

In general, online compression testers may or may not capture the truth about compression, depending on how well they were written to emulate browser functionality.  For instance, some online tools cannot handle chunked encoded data properly, and there are a number of other such limitations that are quite common.  Programming for browser emulation is still, shall we say, a work in progress (as are most browsers themselves).

In fact, our own “Compress Check” is intended primarily as an online sales tool and has often failed to show compression when we know independently that it is taking place (send us a bad link to a report, and we will work to get it fixed).  Our Python code is good, but it ain’t perfect for online HTTP analysis.  Also, intermittent network transience can get in the way of any 100% perfect HTTP analysis session…

This is why, as a rule, Port80 recommends that you examine the HTTP traffic directly — with independent, third party tools — in order to verify compression or the impact of any IIS or HTTP technology.  We document how to do this in our httpZip evaluation guide, the second section of which contains links to the kind of tools you will need in order to do an independent evaluation of any compression solution.  It will also point you to a good 200 OK blog post covering the benefits of various third party HTTP analysis tools:

Free support is a good thing.

Best regards,
Port80

 

No Comments »

CacheRight Tips for Better Microsoft IIS Web Performance, Availability

Posted: July 31st, 2006
Filed under: Uncategorized


With the launch of CacheRight 3.0 last week, Port80 Software delivers enhanced cache control for Microsoft IIS servers and your Web users/visitors.  The new version has an updated GUI and a new BlockByPath rule to make cache control easier to manage, and the entire filter has been redesigned for better server-side CPU and memory utilization, leveraging the work done to tune httpZip 3.6 for high-speed ISAPI filter performance over the past six months.


Cache control is not to be confused directly with caching itself, which is generally a special high-speed storage mechanism that can be either a reserved section of main memory or an independent high-speed storage device (two types of caching are commonly used in personal computers: memory caching and disk caching — read more on caching here). Cache control is also different from caching in ASP.NET or PHP, Web development technologies that focus on the pre-generation of DB queries so dynamic pages load faster for browsers.


Cache control is about making a cache smarter on the Web, essentially turning browsers and proxy servers on the Net into your own content distribution network.  The idea is to send less data, less often, by making sure repeat visitors and Internet caches from the AOLs of the world (any proxy servers) do not re-validate static, unmodified content like an image (GIF, JPEG, PNG, Bitmap, whatever), CSS style sheets, JavaScript, a video, a PDF, an .exe, etc.  While this can be controlled manually in Microsoft IIS Web servers, there is no developer “easy access” (developers and designers know what is fresh and what does not change on a site, not the admin) — and this method works every time with every cache, unlike the popular and quite sketchy reliance on the Last-Modified value.  Also, IIS does not yet provide a mechanism for developers or admins to manage cache control centrally for a Web server and its virtual servers or Web sites, so CacheRight makes sense for Web professionals with standards, budget and speed in mind…


Once you know what content you would like to cache control, using CacheRight is straightforward:  write a rule, save the rules file, request a resource, and examine the headers to see if the right expiration-based cache control headers were applied (the Expires: header and other headers for caches).  Browsers and proxy caches then rely on this data to decide whether to bug your Web server for new content, avoiding 304s at the Web server — this means your Web server answers higher priority requests, and repeat visitors see your Web site or application as faster every time they use it, as images and non-changing files pop quickly from the browser’s cache on their own machine — or from a closer proxy server.
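For example, a response governed by a one-month expiration policy might carry headers like these (the values are illustrative, not literal CacheRight output):

Expires: Thu, 31 Aug 2006 20:00:00 GMT
Cache-Control: public, max-age=2592000

Any cache holding that object can serve it for the next month without asking your server whether it is still good.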


The hard part is thinking of your Web sites and applications globally and then determining a good cache control policy.  This will help you to craft CacheRight rules that reflect the cache control policies you want to enforce. 
 
The first step is to categorize your objects (files) by MIME type and/or location (path), and decide how long each category of object should remain fresh in browser and/or proxy caches.  Log file analysis tools will help here.  This might be very simple (one or two policy statements) or quite complex (many policy statements), depending on your requirements. 
 
Your cache control policies might be as simple as a statement that says “cache all images as long as possible in both browser and proxy caches, and don’t cache anything that is not an image.”   Usually, there is a little more variation than this…
 
Once you know the cache control policies you want to implement, you can write the actual CacheRight rules that will enforce those policies.  Here is a CacheRight sample rules file so you can see the syntax…  The evaluation guide overall is a good read before testing CacheRight.


Do not let the sample rules file’s length scare you! Most sites end up with 10-20 rules for the whole site — your rules set will vary, depending on how granular you want to control cache freshness and how often your content changes on your Web site/app.


Compression is usually an easier IIS performance technology to start with, as you can just plug in httpZip and get good results with its safe defaults (ZipEnable, like CacheRight, requires you to know what files you want to compress, controlled in that IIS 6 compression tool by file extension).  Cache control often has a bigger impact on site speed, however, because of the process of checking whether files without expiration-based cache control headers are still fresh.  While the user waits, your Web server answers 304 Not Modified to each of these requests that are essentially asking the server, “Hey, I have this file in my IE or Firefox cache.  Is it still good?”  Why make users wait for this information?  Why make your server answer needless requests for content you already served?  If we at Port80 did not use cache control, our homepage would generate 64 image requests from each visitor every time they accessed it; we would be forcing our users/visitors to wait longer for content they already have locally, and we would still have to deal with those repeat requests under load when we could be serving new visitors the files they need — in priority.  Test your site and ours for the impact of cache control with the Cache Check tool online right now…


Try out cache control, the best-kept secret of fast sites, and let Port80 Software know where we can help with the technology.


Best regards,
Port80


 

1 Comment »

Hug a SysAdmin today.

Posted: July 28th, 2006
Filed under: Uncategorized


Yes, it is your day, O defenders of networks, explainers of power outages, helpers of help desks, auto update tuners, driver diviners, two-fisted certification dangling, USB memory drive masterin’ SysAdmins (that’s system administrators for the non-industry folks):


http://www.sysadminday.com/

We at Port80 thank you for your valiant efforts to tame the wild wildebeest of IT — and for keeping our lights on.

Cheers,
Port80 Software

No Comments »

IE7: Auto Update Coming, Blocker Tool Provided

Posted: July 27th, 2006
Filed under: Uncategorized


Internet Explorer 7 has tabbed browsing a la Firefox, and a host of new features.  Microsoft announced recently that this new IE will be deployed via auto-updates:

http://blogs.msdn.com/ie/archive/2006/07/26/678149.aspx


If you need to control user updates for legacy apps/sites not yet tested on IE 7, or for other management reasons, there is a blocker kit for the update as well:


http://www.microsoft.com/downloads/details.aspx?FamilyId=4516A6F7-5D44-482B-9DBD-869B4A90159C&displaylang=en

Enjoy good network health,
Port80

No Comments »

Service Oriented Architecture (SOA) and XML Compression

Posted: July 26th, 2006
Filed under: Uncategorized


Here is a good article that reviews the complexities behind SOA and makes a good argument for the integration of HTTP compression into these systems for performance gains:

http://br.sys-con.com/read/250518.htm


Cheers,
Port80

No Comments »