XSS Filter Improvements in IE8 RC1

On Monday IE8 RC1 was released. Here are some of the most interesting improvements and bug fixes to the XSS Filter feature:

  • Some byte sequences enabled the filter to be bypassed, depending on system locale

    URLs containing certain byte sequences bypassed the Beta 2 filter implementation in some locales. For example, with a Chinese locale system, URLs of the following format would bypass the filter:


    The filter decodes URL encoding before passing the URL through our regular expression engine. A raw 0xA0 byte followed by a 0x3C byte (“<”) can cause MultiByteToWideChar to fail, because in Chinese and some other locales 0xA0 0x3C is not a valid multi-byte character. When this happened, the failure cascaded so that regular expression matching was no longer case-insensitive. Worse, at a later point in the regular expression code, the 0xA0 0x3C sequence was interpreted as a single multi-byte character. The < character was thus effectively missing from the input, and the appropriate heuristic did not detect the XSS.

    The IE8 RC1 fix enables the regular expression code to treat all input as a stream of individual bytes, not characters in the default codepage (which could be multi-byte).
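
    The difference between character-oriented and byte-oriented matching can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "gbk" codec stands in for a Chinese system codepage, and SCRIPT_TAG is a hypothetical signature, not one of the filter's real heuristics.

```python
import re
from urllib.parse import unquote_to_bytes

# Hypothetical signature for illustration; not one of the filter's real heuristics.
SCRIPT_TAG = re.compile(rb"<script", re.IGNORECASE)


def decodes_cleanly(raw: bytes, codepage: str = "gbk") -> bool:
    """Return True if raw is valid text in the given multi-byte codepage."""
    try:
        raw.decode(codepage)
        return True
    except UnicodeDecodeError:
        return False


def matches_as_bytes(url_fragment: str) -> bool:
    """URL-decode to raw bytes and match byte by byte, never decoding to characters."""
    raw = unquote_to_bytes(url_fragment)
    return SCRIPT_TAG.search(raw) is not None


payload = "%A0%3CScRiPt%3Ealert(1)%3C/script%3E"
raw = unquote_to_bytes(payload)
print("valid in codepage:", decodes_cleanly(raw))
print("byte-level match: ", matches_as_bytes(payload))
```

    Because the pattern is applied to bytes, the 0xA0 byte is just another byte and cannot absorb the 0x3C that follows it.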

    Yosuke Hasegawa and 80sec both discovered this bug in the IE8 Beta 2 release.

  • NULLs in HTTP responses caused filtering to drop chunks of HTTP response data

    The relevant buffer class was revised in the code to fix this.

  • Added protection against attack scenarios involving PHP stripslashes

    The stripslashes function in PHP removes backslashes from input. (It also replaces double backslashes with a single backslash.) It’s common for PHP developers to call stripslashes before outputting a string. In these cases, if the output exposes a server-side XSS bug, that bug could still be exploited prior to RC1 despite the IE8 XSS Filter, because the string the server emits no longer matches the string the filter saw in the request.

    This is an example of a platform-specific artifact as discussed in the XSS Filter Architectural Overview:

    The decoding process briefly mentioned above is flexible and can also account for artifacts of various web platforms. As necessary, the filter generates additional signatures (see below) based on alternate interpretations of the same input data. So for example, because malformed URLEncoded characters may be handled differently for different web platforms, the filter must be capable of building proper signatures regardless.

    This describes the new behavior well – the filter now generates additional signatures as necessary for an alternate interpretation of the input. These new signatures are designed to compensate for the behavior of the PHP stripslashes feature.

    It does appear that the PHP “magic quotes” feature is now deprecated. If stripslashes is used in PHP code mainly because of magic quotes, then stripslashes usage should decline on the web over time. Regardless, we made the call that this issue is still pervasive enough, at least today, to be worth mitigating in IE8 RC1.

    Ronald van den Heetkamp identified this issue.
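
    The idea of an alternate interpretation can be sketched as follows. This is a simplified model, not the filter's code: php_like_stripslashes and interpretations are hypothetical names, the regular expression only approximates PHP's behavior (it ignores special cases such as \0), and the sample input is illustrative.

```python
import re


def php_like_stripslashes(s: str) -> str:
    """Simplified model of PHP stripslashes: drop each backslash and keep the
    character it escaped, so a doubled backslash becomes a single one."""
    return re.sub(r"\\(.?)", lambda m: m.group(1), s, flags=re.DOTALL)


def interpretations(param: str) -> list:
    """Return the raw input plus the alternate form a server might emit."""
    forms = [param]
    stripped = php_like_stripslashes(param)
    if stripped != param:
        forms.append(stripped)
    return forms


# Each interpretation would be checked separately against the response.
for form in interpretations(r"<scr\ipt>alert(1)</scr\ipt>"):
    print(form)
```

    A signature built only from the raw request would look for the backslash that the server has already removed; building a second signature from the stripped form closes that gap.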

  • Added protection against attack scenarios involving servers that still support overlong UTF-8

    Similar to the PHP stripslashes change described above, we now generate and process additional signatures if overlong UTF-8 sequences are identified in the input.

    While overlong UTF-8 appears to be specifically banned in RFC 3629, it still unfortunately seems to be common enough on web server platforms that it makes sense for us to address this attack vector in our code.

    Amit Klein provided feedback which helped identify this issue.
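
    As a rough sketch of what an overlong sequence is: the two bytes 0xC0 0xBC carry the same code point bits as the single byte 0x3C (“<”). A strict decoder rejects them, while a lenient server-side decoder may accept them. The helper names below are hypothetical, and only two-byte overlong forms are handled; longer overlong forms also exist.

```python
import re

# Two-byte sequences with lead byte 0xC0 or 0xC1 can only encode code points
# below 0x80, so they are always overlong.
_OVERLONG_2BYTE = re.compile(rb"[\xc0\xc1][\x80-\xbf]")


def is_strict_utf8(data: bytes) -> bool:
    """Return True if data is valid UTF-8 under a strict decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def normalize_overlong(data: bytes) -> bytes:
    """Rewrite two-byte overlong sequences to the single byte a lenient
    decoder would produce."""
    def shortest(match):
        lead, trail = match.group(0)
        return bytes([((lead & 0x1F) << 6) | (trail & 0x3F)])
    return _OVERLONG_2BYTE.sub(shortest, data)


sample = b"\xc0\xbcscript\xc0\xbe"
print(is_strict_utf8(sample))
print(normalize_overlong(sample))
```

    Checking both the original bytes and the normalized bytes is one way to build the additional interpretation described above.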

  • Added protection against attack scenarios involving injection of FORM and ISINDEX elements

    Though in general we do not block generic HTML injection, we make an exception for these two elements as they enable attack scenarios similar to those involving injection of script.

    Gareth Heyes identified the issue with the ISINDEX element.

  • OBJECT tag’s CODETYPE attribute is now treated equally to the TYPE attribute

    The OBJECT tag’s CODETYPE attribute provides the same functionality as the TYPE attribute. In IE8 RC1 both attributes are considered equal.

    Gareth Heyes identified this issue.

  • General performance improvements

    For example, pre-validation now avoids the performance hit of running a regular expression in some cases.
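
    One common form of pre-validation is a cheap substring check that rules out inputs a pattern could never match. The sketch below assumes a hypothetical signature that always requires a “<” byte; it is not taken from the filter itself.

```python
import re

# Hypothetical signature for illustration only.
SIGNATURE = re.compile(rb"<script[^>]*>", re.IGNORECASE)


def might_match(data: bytes) -> bool:
    """Cheap pre-check: the signature can only match if a '<' byte is present."""
    return b"<" in data


def scan(data: bytes) -> bool:
    """Run the regular expression only when the pre-check passes."""
    if not might_match(data):
        return False  # skip the regular expression entirely
    return SIGNATURE.search(data) is not None


print(scan(b"q=hello"))
print(scan(b"q=<SCRIPT>alert(1)</SCRIPT>"))
```

    The pre-check never changes the result; it only avoids the regular expression for inputs that cannot match.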

I want to especially thank Dany Joly on the IE team for the extraordinary work he’s done perfecting the XSS Filter implementation in IE8.

Onward to RTM!

David Ross, MSRC Engineering – working on a Security Science project

*Posting is provided “AS IS” with no warranties, and confers no rights.*

Update – 2/11/09: Change to blog signature