Wednesday 25 January 2012

O2 changing web page content on the fly?


We recently noticed an oddity in the way some of our web pages were appearing when viewed via some 3G providers, including at least O2. The pages in question include something like this in the head section:

  <!--[if IE 6]>
    <style type="text/css" media="screen">@import
       url(http://www.example.com/ie6.css);</style>
  <![endif]-->

which should have the effect of including the ie6.css stylesheet when the document is viewed from IE6 but not otherwise.

When accessed over 3G on some phones, something expands the @import by replacing it with the content of the ie6.css style sheet before the browser sees it:

  <!--[if IE 6]>
    <style type="text/css" media="screen">
       ...literal CSS Statements...
    </style>
  <![endif]-->

which, while a bit braindead, would be OK if it were not for the fact that the CSS file happens to contain (within an entirely legal CSS comment) an example of an HTML conditional comment, including its closing -->:

  <!--[if IE 6]>
    <style type="text/css" media="screen">
      /* Use like this
         <!--[if IE 6]>
           ...
         <![endif]-->
      */
      ...more literal CSS Statements...
    </style>
  <![endif]-->

When parsed, this causes chaos. The first <!-- starts a comment and so hides the <style> tag. But the --> inside the CSS then closes that comment, leaving WebKit's HTML parser in the middle of a stack of CSS definitions. It does the only thing it can do and renders the CSS as part of the document.
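The failure can be reproduced offline. The following is a minimal Python sketch, not a model of whatever O2 actually runs: inline_imports is a hypothetical stand-in for the rewriting proxy, the body rule in IE6_CSS is a made-up placeholder for the real stylesheet content, and visible_after_first_comment uses the simplified rule that an HTML comment ends at the first --> after it opens.

```python
import re

# Head fragment as the server sends it (taken from the example above).
ORIGINAL_HEAD = """<!--[if IE 6]>
  <style type="text/css" media="screen">@import
     url(http://www.example.com/ie6.css);</style>
<![endif]-->"""

# Stylesheet with a usage example inside a CSS comment.
# The body rule is a placeholder, not the real ie6.css content.
IE6_CSS = """/* Use like this
   <!--[if IE 6]>
     ...
   <![endif]-->
*/
body { background: red; }"""


def inline_imports(html, stylesheets):
    """Hypothetical proxy step: replace each '@import url(...);'
    with the text of the stylesheet it refers to."""
    pattern = re.compile(r"@import\s+url\(([^)]+)\);")
    return pattern.sub(lambda m: "\n" + stylesheets[m.group(1)] + "\n", html)


def visible_after_first_comment(html):
    """Simplified view of a non-IE parser: the first '<!--' opens a
    comment, the first '-->' after it closes the comment, and everything
    that follows is handed to the parser as ordinary markup."""
    start = html.index("<!--")
    end = html.index("-->", start + 4)
    return html[end + 3:]


rewritten = inline_imports(
    ORIGINAL_HEAD, {"http://www.example.com/ie6.css": IE6_CSS}
)

# As served, the comment swallows the whole fragment.
print(repr(visible_after_first_comment(ORIGINAL_HEAD)))

# After inlining, the comment ends inside the CSS and the rest leaks out.
print(visible_after_first_comment(rewritten))
```

For the original fragment nothing is left over after the comment; for the rewritten one, the remainder starts with the tail of the CSS comment and carries on through the style rules, which is the text that ends up rendered on the page.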

I don't know what is messing with the @import statements. I suspect some sort of proxy inside O2's network, perhaps trying to optimise things and save my phone from having to make an extra TCP connection to retrieve the @imported file. If so, it's failing spectacularly, since it's inlining a large pile of CSS that my phone would never actually retrieve.

You can see this effect in action by browsing to http://mnementh.csi.cam.ac.uk/atimport/. You should just see the word 'Test' on a red background, but over O2 I get some extra stuff that is the result of their messing around with my HTML.
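If you would rather check programmatically than by eye, one approach is to save the page body as fetched over a fixed-line connection and again over 3G, then compare the two. The Python sketch below assumes you have already captured both bodies as bytes; describe_difference and IMPORT_RULE are hypothetical names for illustration, and the check only covers the simple @import url(...); form used on the test page.

```python
import hashlib
import re

# Matches the simple '@import url(...);' form only.
IMPORT_RULE = re.compile(r"@import\s+url\([^)]*\)\s*;")


def body_digest(body: bytes) -> str:
    """SHA-256 of a response body, for a quick equality check."""
    return hashlib.sha256(body).hexdigest()


def describe_difference(served: bytes, received: bytes) -> str:
    """Compare the body as served with the body as received."""
    if body_digest(served) == body_digest(received):
        return "unchanged"
    served_imports = len(IMPORT_RULE.findall(served.decode("utf-8", "replace")))
    received_imports = len(IMPORT_RULE.findall(received.decode("utf-8", "replace")))
    if received_imports < served_imports:
        return "imports inlined or removed"
    return "changed in some other way"


served = b'<style>@import url(http://www.example.com/ie6.css);</style>'
received = b'<style>\nbody { background: red; }\n</style>'
print(describe_difference(served, served))
print(describe_difference(served, received))
```

A differing digest on its own only says that something in the path touched the body; the @import count is what points at this particular kind of rewriting.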

[There's also the issue that O2 seem to have recently been silently supplying the mobile phone number of each device in the HTTP headers of each request it makes, but that's a separate issue: http://conversation.which.co.uk/technology/o2-sharing-mobile-phone-number-network-provider/]