Very similar problems abound with services based on the HTTP protocol. I'm deliberately using the plural here, because today's Web servers are in fact quite complex beasts, and components such as PHP or Zope or the Java servlet engine can well be considered services in their own right, all sharing the same port and presentation protocol, HTTP.
One very important feature of HTTP-based services is that they should never allow clients to view data outside a specific part of the file system. For instance, if the Web server is configured to serve documents from the /pub directory, a client should not be able to retrieve /etc/passwd.
By default, HTTP is quite restrictive about the set of characters allowed
in a request URL; basically, these are alphanumerics plus a few punctuation
characters such as /. If you want to request a file name that
contains characters outside this set, the HTTP client has to protect
(quote) them by representing them as a two-digit hexadecimal
number preceded by the % character. For instance, ~joe
has to be transmitted as %7ejoe, with 7E being the
hexadecimal ASCII representation of ~.
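As a small illustration of this quoting scheme, the sketch below uses Python's standard urllib.parse module. It is not tied to any particular Web server; the file name "my file.txt" is a made-up example.

```python
from urllib.parse import quote, unquote

# Unquoting: %7e is the hexadecimal ASCII code of '~'.
print(unquote("%7ejoe"))        # ~joe

# Quoting: a space is outside the allowed set, so it becomes %20.
print(quote("my file.txt"))     # my%20file.txt
```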
Quite recently a bug was discovered in Tomcat, the Java servlet
engine for the Apache web server. The attacker would request
e.g. %2e%2e/etc/passwd%00.jsp from the
server. Based on the .jsp extension, Apache decided to hand the request
to the Tomcat module, which failed to
check that the unquoted file name was inside the document area. It
then unquoted the file name, which turned it into ../etc/passwd -
notice how %00 turns into a NUL character and effectively removes
the .jsp extension. It would then go and serve this file via
Apache's normal file service mechanism, bypassing the usual restrictions
on serving only files from the document area.
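The following sketch imitates the order of operations that makes this kind of bug possible. It is not Tomcat's or Apache's actual code; the function name flawed_lookup is hypothetical, and the NUL handling is simulated with a string split to mimic what a C-style file API would do.

```python
from urllib.parse import unquote

def flawed_lookup(raw_path: str) -> str:
    """Hypothetical handler that repeats the mistake described above."""
    # The dispatch decision is made on the still-quoted request.
    if not raw_path.endswith(".jsp"):
        raise ValueError("not handled by the servlet engine")
    # Unquoting happens only afterwards, and nothing re-checks the result.
    decoded = unquote(raw_path)
    # A C-style file API stops reading the name at the first NUL byte.
    return decoded.split("\x00", 1)[0]

print(flawed_lookup("%2e%2e/etc/passwd%00.jsp"))  # ../etc/passwd
```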
The moral of the story is, if you have quoting mechanisms such as HTTP,
make sure you first unquote the string, and then perform your access
checks. Extra care is required, however, to ensure that you don't perform any more
unquoting operations after your access control logic has decided the
request is legit. For instance, if you unquote, check, and unquote again,
the following request will pass because at the time of the access checks,
it has no .. path component.
    %252e%252e/etc/passwd    original request
    %2e%2e/etc/passwd        unquoted once, passes .. test
    ../etc/passwd            unquoted once more
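A minimal sketch of both orderings, assuming the /pub document area from the earlier example. The names DOC_ROOT, has_dotdot, flawed and resolve are illustrative only. The check in resolve works purely on the path string, so it does not account for symbolic links inside the document area.

```python
import posixpath
from urllib.parse import unquote

DOC_ROOT = "/pub"  # assumed document area, as in the /pub example

def has_dotdot(path: str) -> bool:
    return ".." in path.split("/")

def flawed(raw_path: str) -> str:
    """Unquote, check, unquote again: the second pass defeats the check."""
    once = unquote(raw_path)
    if has_dotdot(once):
        raise PermissionError("'..' in path")
    return unquote(once)  # BUG: unquoting after the access check

def resolve(raw_path: str) -> str:
    """Unquote exactly once, then check, then use the result as is."""
    decoded = unquote(raw_path)  # the only unquoting step
    if "\x00" in decoded:
        raise PermissionError("NUL byte in path")
    # Join under the document root and collapse any '..' components.
    full = posixpath.normpath(posixpath.join(DOC_ROOT, decoded.lstrip("/")))
    if full != DOC_ROOT and not full.startswith(DOC_ROOT + "/"):
        raise PermissionError("path escapes document area")
    return full

print(flawed("%252e%252e/etc/passwd"))   # ../etc/passwd
print(resolve("%252e%252e/etc/passwd"))  # /pub/%2e%2e/etc/passwd
```

With resolve, the doubly quoted request simply names a (probably nonexistent) file called %2e%2e inside the document area, while the singly quoted %2e%2e/etc/passwd is rejected with PermissionError.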