Next: Respect packet boundaries Up: Presentation Layer Issues Previous: Word size issues


HTTP's Funky Quoting Mechanism

Very similar problems abound with services based on the HTTP protocol. I'm deliberately using the plural here, because today's Web servers are in fact quite complex beasts, and components such as PHP or Zope or the Java servlet engine can well be considered services in their own right, all sharing the same port and presentation protocol, HTTP.

One very important property of HTTP-based services is that they should never allow clients to view data outside a specific part of the file system. For instance, if the Web server is configured to serve documents from the /pub directory, a client should not be able to retrieve /etc/passwd.

By default, HTTP is quite restrictive about the set of characters allowed in a request; basically, these are alphanumerics plus a few punctuation characters such as /. If you want to request a file name that contains characters outside this set, the HTTP client has to protect (quote) them by representing them as a two-digit hexadecimal number preceded by the % character. For instance, ~joe has to be transmitted as %7ejoe, with 7E being the hexadecimal ASCII representation of ~.
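To make the quoting rule concrete, here is a minimal sketch using Python's standard urllib.parse module. The file name /pub/my file.txt is made up for illustration. Note that which characters a particular client chooses to quote varies; the sketch only shows the mechanism.

```python
from urllib.parse import quote, unquote

# A space is outside the allowed set, so it becomes %20 (0x20 is ASCII space).
# By default quote() leaves "/" alone, since it separates path components.
encoded = quote("/pub/my file.txt")
print(encoded)

# Unquoting reverses the process. The hex digits are case-insensitive,
# so %7e and %7E both decode to "~".
lower = unquote("%7ejoe")
upper = unquote("%7Ejoe")
print(lower, upper)
```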

Quite recently a bug was discovered in Tomcat, the Java servlet engine for the Apache web server. The attacker would request e.g. %2e%2e/etc/passwd%00.jsp from the server. Based on the .jsp extension, Apache decided to hand the request to the Tomcat module, which failed to check that the unquoted file name was inside the document area. It then unquoted the file name, which turned it into ../etc/passwd - notice how %00 turns into a NUL character and effectively removes the .jsp extension. It would then go and serve this file via Apache's normal file service mechanism, bypassing the usual restrictions on serving only files from the document area.
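The effect of unquoting that request can be reproduced in a few lines of Python. This is only an illustration of the string transformation, not of Tomcat's actual code; the c_view variable is my own stand-in for how a C-style, NUL-terminated string consumer would see the result.

```python
from urllib.parse import unquote

raw = "%2e%2e/etc/passwd%00.jsp"

# Before unquoting, the request looks like a harmless .jsp page
# and contains no literal ".." component.
looks_like_jsp = raw.endswith(".jsp")
has_dotdot_before = ".." in raw.split("/")

# After unquoting, %2e%2e has become ".." and %00 a NUL character.
decoded = unquote(raw)
has_dotdot_after = ".." in decoded.split("/")

# Anything that treats the name as a NUL-terminated string stops at
# the NUL, so the ".jsp" suffix silently disappears.
c_view = decoded.split("\x00", 1)[0]
print(repr(decoded), repr(c_view))
```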

The moral of the story is: if you have a quoting mechanism such as HTTP's, make sure you first unquote the string and then perform your access checks. Extra care is required, however, that you don't perform any more unquoting operations after your access control logic has decided the request is legit. For instance, if you unquote, check, and unquote again, the following request will pass, because at the time of the access check it has no .. path component.

%252e%252e/etc/passwd       original request
%2e%2e/etc/passwd           unquoted once, passes .. test
../etc/passwd               unquoted once more
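The rule "unquote exactly once, then check, then use the checked value as is" might be sketched as follows in Python. DOC_ROOT and resolve() are hypothetical names of my own, and the check is deliberately simple: it does not deal with symbolic links, for instance, so take it as an illustration rather than a complete access control routine.

```python
import posixpath
from urllib.parse import unquote

DOC_ROOT = "/pub"  # hypothetical document area

# The double-unquoting trap from the table above:
once = unquote("%252e%252e/etc/passwd")   # no ".." component yet
twice = unquote(once)                     # now it has one

def resolve(raw_path):
    """Map a quoted request path to a file below DOC_ROOT, or None."""
    decoded = unquote(raw_path)           # the one and only unquoting step
    if "\x00" in decoded:                 # reject embedded NUL characters
        return None
    # Collapse "." and ".." components, then check the result.
    full = posixpath.normpath(posixpath.join(DOC_ROOT, decoded.lstrip("/")))
    if full != DOC_ROOT and not full.startswith(DOC_ROOT + "/"):
        return None
    return full                           # callers must not unquote this again

print(resolve("index.html"))
print(resolve("%2e%2e/etc/passwd"))
print(resolve("%252e%252e/etc/passwd"))
```

With a single unquoting step, the doubly quoted request simply names a (probably nonexistent) file whose name contains literal % characters inside the document area, which is harmless as long as nothing downstream unquotes it a second time.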


Olaf Kirch 2002-01-16