About Cuckoo Sandbox

Resources utilized

Link - https://cdimage.debian.org/debian-cd/current/amd64/iso-dvd/

Official cuckoo docs - https://cuckoo.sh/docs/

Cuckoo docs - https://cuckoosandbox.org/blog/207-interim-release

Cuckoo in docker (not verified)- https://hub.docker.com/r/blacktop/cuckoo

Karust Docker Cuckoo (not verified) - https://github.com/karust/cuckoo-docker

Cuckoo setup from hatching - https://hatching.io/blog/cuckoo-sandbox-setup/

Drakvuf sandbox - https://github.com/tklengyel/drakvuf

FlareVM sandbox - https://github.com/mandiant/flare-vm

Nested virtualization of Xen - https://www.linuxquestions.org/questions/linux-virtualization-and-cloud-90/xen-on-ubuntu-16-04-a-4175587197/

Supported architecture - https://discussions.citrix.com/topic/399783-what-ubuntu-1604-lts-kernels-are-approved-by-citrix/

Xen version 4.10 install - https://wiki.xenproject.org/wiki/Xen_Project_4.10_Release_Notes

Xen Project guide - https://wiki.xenproject.org/wiki/Xen_Project_Beginners_Guide

Volatility - https://github.com/volatilityfoundation/volatility/wiki

Guacd - https://guacamole.apache.org/doc/gug/guacamole-architecture.html#guacd

Xen 4.10.0 from source - https://wiki.xenproject.org/wiki/Compiling_Xen_From_Source#Build_Dependencies

Xen pre reqs for 4.10.0 ubuntu 16.04 - https://gist.github.com/cnlelema/ca366be63573dbdaa14938107c611897

Xen 4.10 docs - https://xenbits.xen.org/docs/4.10-testing/man/xl.1.html

Xen cheatsheet - https://lzone.de/cheat-sheet/Xen

Virt unprivileged user - https://unix.stackexchange.com/questions/198768/xen-libvirt-access-for-non-root-user

xen info - https://wiki.xenproject.org/wiki/Xen_Project_Software_Overview

python pip info - https://blog.enterprisedna.co/python-setup-py-egg_info-error-code/#:~:text=While%20working%20with%20Python%20packages,or%20other%20package%20installation%20tools.

Virtual environments - https://engaging-web.mit.edu/eofe-wiki/virtual_envs/scripted/python_venv/#:~:text=To%20re%2Denter%20the%20virtual,inside%20it%20will%20be%20there.

Transferring files - https://www.baeldung.com/linux/transfer-files-ssh

VMCloak docs - https://vmcloak.readthedocs.io/en/latest/config.html

Xen-API docs - https://xapi-project.github.io/xen-api/basics.html


What is a sandbox environment?

A sandbox is a security mechanism for running untrusted programs or code in a segmented, isolated environment.

Some free, already-configured sandbox environments:

Hybrid Analysis

Sandbox link - https://hybrid-analysis.com/

Github - https://github.com/PayloadSecurity

Triage (Personal favorite. Someone from Discord recommended this)

Sandbox link - https://tria.ge/

Github - https://github.com/hatching

Website - https://hatching.io/triage/

Planning phase

When configuring a sandbox environment with a tool such as Cuckoo, it's important to plan what types of files you intend to analyze, on which platforms, and what information you would like to extract about the files themselves.

Your plan should decide on the operating system, language and patch level you will use, along with the installed software and its versions, depending on the exploits you expect to analyze.

Automated malware analysis is never 100 percent accurate or deterministic. Its success depends on many factors: the malware may detect that it is running in a sandbox environment, and the specifications of your environment may not be sufficient for both the dynamic and static analysis in Cuckoo. When using a sandbox, it's always best to use a virtual machine with remnants of daily usage: files, exported documents, network traffic, pictures, cookies and anything else a daily user would have on their computer.

Some resources below

How sandboxing is detected

- https://www.picussecurity.com/resource/virtualization/sandbox-evasion-how-attackers-avoid-malware-analysis

PAFish, a tool for detecting sandbox environments

- https://www.vmray.com/cyber-security-blog/a-pafish-primer/

Bleeping computer evasion encyclopedia

- https://www.bleepingcomputer.com/news/security/new-evasion-encyclopedia-shows-how-malware-detects-virtual-machines

MITRE attack sandbox evasion techniques

- https://attack.mitre.org/techniques/T1497/

Virtualization software

When choosing your virtualization software (whether it be VirtualBox, VMware (any version/type), XCP-NG, Microsoft's Hyper-V or ESXi), understand how it works and what it is compatible with, both upstream and downstream. I will include the basic knowledge base/documentation provided by each of them (XCP-NG is compatible with Xen).

In my case, XCP-NG runs in my environment as a type 1 hypervisor and is compatible upstream and downstream with Xen, BUT NOT with VirtualBox, VMware or Hyper-V according to the documentation. So in this tutorial, we are installing and configuring Xen and XCP-NG for a nested virtualization environment for Cuckoo. I plan on creating tutorials for the other virtualization software in the future and separating them properly with explanations, but I want to at least provide some resources that can assist others.

The baseline requirement is to enable virtualization in the BIOS:

VT-x for Intel chipsets

AMD-V for AMD chipsets

Resource for VMWare nested virtualization - https://kb.vmware.com/s/article/2009916

Resource for VirtualBox nested virtualization - https://docs.oracle.com/en/virtualization/virtualbox/6.0/admin/nested-virt.html#

VirtualBox helpful commands - https://docs.oracle.com/en/virtualization/virtualbox/6.0/user/vboxmanage-modifyvm.html

Resource for Microsoft Hyper-V nested virtualization - https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/user-guide/nested-virtualization

Resource for running Hyper-V in a nested virtualized environment - https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/user-guide/enable-nested-virtualization

Resource for nested virtualization XCP-NG - https://xcp-ng.org/docs/guides.html#xcp-ng-in-a-vm

Resource for nested virtualization with Xen - https://wiki.xenproject.org/wiki/Nested_Virtualization_in_Xen

What is Cuckoo?

Cuckoo is an automated malware analysis system that automatically runs and analyzes files and collects comprehensive analysis results showing what the malware does while running in an isolated environment. For the virtual machines in the sandbox, VMCloak can be used with the virtualization software to configure each VM and create snapshots, and Cuckoo automatically reverts to the configured snapshot after the analysis is done and the report is uploaded to the API. VMCloak is not compatible with Xen, so in this setup the nested VMs are managed through XenAPI (XAPI).

Cuckoo can retrieve the following below:

- Traces of calls performed by all processes from the malware

- Files created, deleted and downloaded by the malware during execution

- Memory dumps of the malware processes

- Network traffic trace in PCAP (Packet Capture) format

- Screenshots taken during the execution of the malware

- Full memory dumps of the machines

So, what is XenAPI (XAPI):

XenAPI docs - https://xapi-project.github.io/xen-api/

XenAPI is an interface for remotely configuring and controlling virtualized guests on Xen-enabled hosts. It is presented as a set of Remote Procedure Calls (RPCs) with a wire format based on XML-RPC, and it is not tied to any specific language bindings.
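To make the wire format concrete, here is a minimal sketch of how a single RPC is serialized as XML-RPC using only Python's standard library. `session.login_with_password` is a XenAPI call; the username and password are placeholder values chosen for illustration, and nothing is sent over the network.

```python
import xmlrpc.client

# Placeholder credentials (assumptions for illustration only).
USERNAME = "root"
PASSWORD = "example-password"

# Serialize the call into the XML-RPC request body that a client would POST to the host.
payload = xmlrpc.client.dumps(
    (USERNAME, PASSWORD),
    methodname="session.login_with_password",
)
print(payload)

# Decoding the body gives back the parameters and the method name.
params, method = xmlrpc.client.loads(payload)
```

The printed XML shows the method name and its string parameters, which is all an XML-RPC request carries.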

The API reference uses the terminology of classes and objects. A class is a hierarchical namespace, and an object is an instance of a class with its fields set to specific values.

Classes python3 - https://docs.python.org/3/tutorial/classes.html

Classes python2.7 - https://python.readthedocs.io/en/v2.7.2/tutorial/classes.html

Objects Python3 - https://docs.python.org/3/reference/datamodel.html

Objects Python2.7 - https://python.readthedocs.io/en/v2.7.2/c-api/typeobj.html

Directly from the docs: each class is specified by a list of fields along with their types and qualifiers.

A qualifier is:

- RO/runtime – The field is Read Only. Furthermore, its value is automatically computed at runtime. For example: Current CPU load and disk IO throughput

- RO/constructor - The field must be manually set when a new object is created, but is then Read Only for the duration of the object's life. For example, the maximum memory addressable by a guest is set before the guest boots.

- RW - The field is Read/Write. For example, the name of a VM.

These are the types that are used to specify methods and fields in the API references

- string – Text strings

- int – 64-bit integers

- float – IEEE double-precision floating-point numbers

- bool – Boolean

Boolean - https://www.w3schools.com/python/python_booleans.asp

- datetime – Date and timestamps

- c ref – Reference to an object of class c

- t set – Arbitrary-length set of values of type t

- (k → v) map – Mapping from values of type k to values of type v

- e enum – Enumeration type with name e. Enums are defined in the API reference together with classes that use them

A note within the XenAPI docs is that refs are doubly linked:
Example – a VM has a field called VIFs of type VIF ref set; this field lists the network interfaces attached to a particular VM. Similarly, the VIF class has a field called VM of type VM ref, which references the VM to which the interface is connected. These fields are bound together, so creating a new VIF causes the VIFs field of the corresponding VM object to be updated automatically.
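The doubly linked relationship can be mimicked with two plain Python classes. This is only a sketch of the idea, not the XenAPI itself: the class and field names mirror the docs, while the VM name is hypothetical.

```python
class VM:
    def __init__(self, name):
        self.name = name
        self.VIFs = set()      # mirrors the "VIF ref set" field


class VIF:
    def __init__(self, vm):
        self.VM = vm           # mirrors the "VM ref" field
        vm.VIFs.add(self)      # creating the VIF updates the VM side as well


vm = VM("analysis-guest")      # hypothetical VM name
vif = VIF(vm)

print(vif in vm.VIFs)          # the VM now lists the new interface
print(vif.VM is vm)            # and the interface points back at the VM
```

In the real API the server keeps both sides in sync; here the VIF constructor does it by hand.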

What is a Call function?

In assembly, the call instruction transfers control to another function, which can later return using the ret instruction. call pushes the return address onto the stack so that ret can pop it and jump back to the caller.
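The push-and-pop behaviour of call and ret can be sketched with a toy interpreter. The instruction set below is made up for illustration and is not real assembly; only the handling of the return address reflects how call and ret work.

```python
def run(program):
    """Run a tiny made-up instruction list supporting call, ret, print and halt."""
    stack, output, pc = [], [], 0
    while True:
        op, arg = program[pc]
        if op == "call":
            stack.append(pc + 1)   # push the return address
            pc = arg               # jump to the function
        elif op == "ret":
            pc = stack.pop()       # pop the return address and jump back
        elif op == "print":
            output.append(arg)
            pc += 1
        elif op == "halt":
            return output


program = [
    ("print", "before call"),      # index 0
    ("call", 4),                   # index 1, return address is 2
    ("print", "after call"),       # index 2
    ("halt", None),                # index 3
    ("print", "inside function"),  # index 4
    ("ret", None),                 # index 5
]

trace = run(program)
print(trace)
```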

Example and in depth explanation here - https://blog.korelogic.com/blog/2014/05/27/malware_callback

What is a Memory dump?

A memory dump is a copy of the contents of memory written to a file for later diagnostics. When a system or application crashes, it can display the error on screen and dump its memory to disk. Examples are the BSOD (Blue Screen of Death) on Windows and a kernel panic on Linux.

Memory dump windows - https://learn.microsoft.com/en-us/troubleshoot/windows-server/performance/memory-dump-file-options

Memory dump linux - https://www.baeldung.com/linux/dump-memory-image

Arch memory dump - https://wiki.archlinux.org/title/Core_dump#Where_do_they_go?

Red Hat Enterprise Linux (RHEL, also applies to CentOS) - https://access.redhat.com/solutions/56021

Ubuntu (Debian based) Memory dump - https://linuxhint.com/setting-linux-core-dump-location/

Gentoo - https://wiki.gentoo.org/wiki/Kernel_Crash_Dumps

Cuckoo's use cases

Cuckoo can be used as a standalone application and can also be integrated into larger frameworks. It can analyze, among others, the following:

- Generic Windows executables

- DLL files (Dynamic Link Library)

- PDF documents

- Microsoft Office documents

- URLs and HTML files

- PHP scripts

- CPL files

- Visual Basic (VB) scripts

- ZIP files

- Java JAR files

- Python files

- And many more

What is a Dynamic Link Library (DLL) file?

A DLL is a file containing code and data that other programs can call upon, so the functionality programmed into it can be shared by several programs from a single file. For example, a program can call into a DLL to print a test page on the default printer.

Resource - https://www.spiceworks.com/tech/tech-general/articles/what-is-dll/

Resource (docs) - https://learn.microsoft.com/en-us/troubleshoot/windows-client/deployment/dynamic-link-library

What are PHP scripts?

PHP stands for "PHP: Hypertext Preprocessor". It is a server-side scripting language used to develop static or dynamic websites and web applications.

Resource - https://www.freecodecamp.org/news/what-is-php-the-php-programming-language-meaning-explained/

Resource (docs) - https://www.php.net/docs.php

What is a Control Panel (CPL) File?

Simply put, each Control Panel function is represented by a file with a .cpl extension within the Windows\System directory/folder.

Resource - https://docs.fileformat.com/system/cpl/#:~:text=It%20is%20used%20by%20the,Displays%2C%20Networking%2C%20amongst%20others.

Resource (docs) - https://support.microsoft.com/en-gb/topic/description-of-control-panel-cpl-files-4dc809cd-5063-6c6d-3bee-d3f18b2e0176

What is a Java ARchive (JAR) file?

A JAR file bundles many files into a single archive with the .jar extension. It is used for storing and distributing Java programs and games.

Resource - https://www.geeksforgeeks.org/jar-files-java/

Resource (docs) - https://docs.oracle.com/javase/8/docs/technotes/tools/windows/jar.html
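Because the JAR format is built on ZIP, the "many files in one archive" idea can be sketched with Python's standard `zipfile` module. The entry names and contents below are placeholders; the class entry is not real bytecode, so this archive is for illustration only and would not run on a JVM.

```python
import io
import zipfile

# Build a JAR-like archive in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as jar:
    jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    jar.writestr("com/example/Hello.class", b"placeholder, not real bytecode")

# Reopen the same bytes and list what the archive contains.
with zipfile.ZipFile(buffer) as jar:
    names = jar.namelist()

print(names)
```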

What is a Python file?

A Python file (.py) contains Python source code that is executed by the Python interpreter. Python scripts can read and write data to files and are widely used for manipulation, analysis and storage of data.

Python docs - https://docs.python.org/3/

Python examples - https://www.programiz.com/python-programming/examples

What is TCPDump?

TCPDump manpage (Manual page) - https://www.tcpdump.org/manpages/tcpdump.1.html

In simple terms, Cuckoo uses TCPDump to capture all the network traffic generated while the potentially malicious software runs in the sandbox environment, so that the traffic can be analyzed afterwards.

When run from the Command Line Interface (CLI), TCPDump prints one line per packet with information such as the timestamp (hours, minutes, seconds and fractions of a second), source and destination IPv4/IPv6 addresses, ports, Time To Live (TTL), window size, three-way handshakes and anything else a regular packet capture would show.

When the tcpdump is finished running it will report the following:

- Packets captured: the number of packets tcpdump received and processed while it ran in your environment, until it was either interrupted or reached a specific time limit/packet count.

- Packets received by filter: the meaning depends on the operating system you are running tcpdump on and the arguments configured. On some operating systems this counts packets regardless of whether they matched the filter expression.

- Packets dropped by kernel: the number of packets dropped due to a lack of buffer space (memory temporarily reserved for data) in the packet capture mechanism of the operating system running tcpdump.
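The three counters above can be pulled out of tcpdump's closing summary with a small regular expression. This is a minimal sketch; the sample text and its numbers are made up for illustration and only mimic the shape of the summary lines.

```python
import re

SUMMARY_RE = re.compile(
    r"(\d+) packets? (captured|received by filter|dropped by kernel)"
)


def parse_tcpdump_summary(text):
    """Return a dict mapping each summary label to its packet count."""
    return {label: int(count) for count, label in SUMMARY_RE.findall(text)}


# Made-up sample summary (assumption for illustration).
sample = """\
120 packets captured
125 packets received by filter
0 packets dropped by kernel
"""

stats = parse_tcpdump_summary(sample)
print(stats)
```

A non-zero "dropped by kernel" count is a hint that the capture is incomplete and the analysis may be missing traffic.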

Live TCPDump traffic output example

Command used - sudo tcpdump -v

Writing TCPDump data to a file output example

Command used - sudo tcpdump -v -w /home/sdick/output2.pcap

Use CTRL + C to stop the tcpdump capture

Output from the tcpdump capture within wireshark

Wireshark download - https://www.wireshark.org/download.html

Output of packet capture when interrupted/canceled

What is Volatility?

Docs - https://github.com/volatilityfoundation/volatility/wiki/Volatility-Documentation-Project

Volatility is a memory forensics tool for analyzing memory samples. It can parse and analyze crash dumps, raw dumps, hibernation files and much more.

What is guacd?

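guacd is the proxy daemon from Apache Guacamole. It is an optional service that provides the translation layer for remote desktop protocols (RDP, VNC and SSH), which enables the remote control functionality in the Cuckoo web interface, the same interface you authenticate into to manage and upload your samples.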

The Architecture

Cuckoo consists of a central management software that handles sample execution and analysis.

Each analysis is launched in a fresh isolated virtual or physical machine within the cuckoo infrastructure.

It consists of a host machine (running the management software) and one or more guest machines (physical or virtual machines for analysis, depending on the build).

Topology from Cuckoo documentation

Source - https://cuckoo.sh/docs/introduction/what.html

Configuration of the Cuckoo sandbox environment

Directory - $CWD/conf/cuckoo.conf

Options of special note in this configuration file:

Option - Machinery in Cuckoo

This option configures how the Cuckoo sandbox works with the hosts in your environment. It specifies the type of virtualization software you use, whether it be Xen, VirtualBox, VMware, etc.

Option - IP and Port in Resultserver

Here you configure the IP address and port of the management interface on which Cuckoo receives results from the analysis VMs (Windows 7, Ubuntu, Windows 10, etc.). Make sure the analysis VMs match your upstream networking configuration; otherwise no data will come back. Depending on the virtualization software you are using, the network interface may have no IP address until the VM is started; if the address is not propagating or working, it's suggested to enter the IP address here manually.
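A quick way to sanity-check the addressing is to confirm that the result server and a guest sit in the same network. The sketch below uses Python's standard `ipaddress` module; the addresses and the /24 network are example values and should be replaced with those of your own setup.

```python
import ipaddress


def same_network(resultserver_ip, guest_ip, network):
    """Return True if both addresses belong to the given network."""
    net = ipaddress.ip_network(network)
    return (ipaddress.ip_address(resultserver_ip) in net
            and ipaddress.ip_address(guest_ip) in net)


# Example values only (assumptions for illustration).
print(same_network("192.168.56.1", "192.168.56.101", "192.168.56.0/24"))
print(same_network("192.168.56.1", "10.0.0.5", "192.168.56.0/24"))
```

If the check fails, the guest will most likely be unable to report its results back to the host.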

Option - Connection in Database

This defines how Cuckoo connects to its internal database. You can use any Database Management System (DBMS) supported by SQLAlchemy, specified with valid Database URL syntax.

SQL Alchemy docs - https://www.sqlalchemy.org/

SQL Alchemy Database URL syntax - https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls
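A Database URL follows the general shape dialect://username:password@host:port/database. As a rough illustration, the sketch below splits such a URL into its parts with Python's standard library rather than SQLAlchemy itself; the URL, credentials and database name are hypothetical.

```python
from urllib.parse import urlsplit


def describe_db_url(url):
    """Split a database URL into its main components."""
    parts = urlsplit(url)
    return {
        "dialect": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


# Hypothetical connection string (assumption for illustration).
info = describe_db_url("postgresql://cuckoo:example-password@localhost:5432/cuckoo")
print(info)
```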

Directory - $CWD/conf/volatility.conf

This is where we can enable the Volatility tool and configure its plugins for analyzing memory dumps from the malware we analyze. For this to work, two settings need to be enabled in other files:

Enable - volatility in $CWD/conf/processing.conf file

Enable - memory_dump in $CWD/conf/cuckoo.conf
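Both settings can be checked programmatically, since the files use an INI-style layout that Python's `configparser` can read. This is a minimal sketch: the section and option names in the sample strings (other than `memory_dump`) are assumptions and should be compared against your own configuration files.

```python
import configparser


def is_enabled(conf_text, section, option):
    """Return True if the option in the given section is set to a true value."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getboolean(section, option, fallback=False)


# Sample file contents; section names are assumptions for illustration.
processing_conf = """
[memory]
enabled = yes
"""

cuckoo_conf = """
[cuckoo]
memory_dump = yes
"""

volatility_ready = (is_enabled(processing_conf, "memory", "enabled")
                    and is_enabled(cuckoo_conf, "cuckoo", "memory_dump"))
print(volatility_ready)
```

In a real check you would read the text from the files in $CWD/conf instead of inline strings.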


Directory - $CWD/conf/memory.conf

Within the memory configuration file, you can configure the volatility profile to your specifications and even choose to either save the memory dumps or have them deleted after each analysis in order to save disk space.


Directory - $CWD/conf/processing.conf

This file lets you enable, disable and configure the processing modules. These modules are located within the cuckoo.processing module and define how to digest the raw data collected during the analysis.


Directory - $CWD/conf/reporting.conf

This file defines how reports are generated on a per-analysis basis. The reports can be sent to databases you host elsewhere for analysis, or analyzed within the Cuckoo Sandbox environment.


Android Analysis

You can configure Cuckoo to perform Android analysis within your environment. This is done by downloading the Android SDK into a folder that Cuckoo has access to and configuring the avd.conf file with the settings of your Cuckoo setup.


Location of Android configuration file - $CWD/conf/avd.conf

Some options in the configuration file to note are the following:

emulator_path

This is the path of the Android emulator located in the Android SDK

adb_path

Path to the Android Debug Bridge utility within the Android SDK

avd_path

The path where the AVD images are located
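Since all three options are filesystem paths, a small check can report which ones do not exist on disk. This is a sketch under assumptions: the `[avd]` section name and the sample paths are hypothetical, and only the three option names come from the list above.

```python
import configparser
from pathlib import Path

PATH_OPTIONS = ("emulator_path", "adb_path", "avd_path")


def missing_avd_paths(conf_text, section="avd"):
    """Return the path options that are unset or point to a non-existent location."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    missing = []
    for option in PATH_OPTIONS:
        value = parser.get(section, option, fallback="")
        if not value or not Path(value).expanduser().exists():
            missing.append(option)
    return missing


# Hypothetical file contents (assumption for illustration).
sample_conf = """
[avd]
emulator_path = /nonexistent-sdk/tools/emulator
adb_path = /nonexistent-sdk/platform-tools/adb
avd_path = /nonexistent-sdk/avd
"""

print(missing_avd_paths(sample_conf))
```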

Network routing

Per Analysis Network routing

Per-analysis network routing means each analysis can use a different route: one sample can run without internet access, another analysis can go through a VPN, and yet another can go through the Tor network.

iptables can be configured on your interface to allow inbound and outbound traffic per IP address or per whole subnet, as with any other simple firewall configuration for the kernel.

iptables rules are not saved across a restart of the Cuckoo sandbox instance. You will need to make the rules persistent across reboots, or write a bash script that adds them again after rebooting.

Docs for iptables-persistent - https://help.ubuntu.com/community/IptablesHowTo#Solution_.233_iptables-persistent

To use per-analysis network routing, you must allow both global routing and per-analysis routing by enabling IP forwarding:

Commands to allow IP forwarding:

$ echo 1 | sudo tee -a /proc/sys/net/ipv4/ip_forward
$ sudo sysctl -w net.ipv4.ip_forward=1
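Whether forwarding is active can be verified by reading the same kernel flag file the first command writes to. A minimal sketch, assuming a Linux host; the function is defined here but not called, since the file only exists on Linux.

```python
from pathlib import Path


def ip_forwarding_enabled(path="/proc/sys/net/ipv4/ip_forward"):
    """Return True if the kernel flag file contains 1."""
    return Path(path).read_text().strip() == "1"
```

On the Cuckoo host, calling `ip_forwarding_enabled()` should return True once forwarding has been enabled.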

When configuring internet routing, you must register the network interface with iproute2 by adding an identifier number and the interface name (the interface used by your Ubuntu VM on your virtualization software) to the routing table config file. You will need to do this for every network interface on the virtual machine.

Types of network routing options

None Routing

This is the default routing Cuckoo applies when an analysis runs. It makes no routing changes, so the VM does not contact the internet whatsoever.

Drop Routing

This allows only Cuckoo's internal traffic and blocks everything else, including DNS requests and outbound connections.

Internet Routing

This allows internet routing to the VMs for the samples to execute through the network interfaces. This allows all network traffic and any potential malicious samples to connect through the internet upstream. This requires the configuration of iproute2 within Cuckoo for it to work

IProute2 configuration - https://cuckoo.sh/docs/installation/host/routing.html#routing-iproute2


InetSim Routing

This provides fake services for malware to talk to. InetSim can be set up on the host machine or on a separate VM, and Cuckoo's virtual machines send their network traffic to it and receive responses from it. This gives a more "realistic" sense of network traffic and may reduce detection when malware tries to determine whether it is in a sandbox environment. InetSim comes with Remnux (a Linux VM for reverse engineering malware); to use it, point the Cuckoo routing configuration file at $CWD/conf/routing.conf to the statically assigned IP address of the Remnux machine. Otherwise, you can use the InetSim docs to configure a separate VM for this task.

InetSim Routing docs - https://www.inetsim.org/documentation.html


Tor Routing

You will need to install and configure the latest stable version of Tor, then modify the Tor configuration file to set the listening address and port for TCP/IP connections and UDP requests.

You can configure the tor file on the Ubuntu machine at - /etc/tor/torrc

Configuration will need to be added as shown

Example from docs with IP address of 192.168.56.1 to listen on port 5353 and 9040 outbound


TransPort 192.168.56.1:9040

DNSPort 192.168.56.1:5353


After configuring the file, you will need to restart the Tor daemon with the command /etc/init.d/tor restart

You will need to configure Tor routing within Cuckoo's working directory at $CWD/conf/routing.conf

When configuring Tor routing, the settings must match in both the Tor configuration file and the routing configuration file in the Cuckoo working directory, located at:

VM hosting Cuckoo - /etc/tor/torrc

Within Cuckoo itself - $CWD/conf/routing.conf

In the routing configuration file, ensure the following is configured if you would like this setup for Tor routing:


[tor]

enabled = yes

dnsport = 5353

proxyport = 9040
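Because the ports have to agree between the two files, a small check can compare them. This is a minimal sketch that parses the example values shown above from inline strings; the helper names are hypothetical, and a real check would read /etc/tor/torrc and $CWD/conf/routing.conf instead.

```python
import configparser


def torrc_ports(torrc_text):
    """Return the TransPort and DNSPort port numbers found in torrc text."""
    ports = {}
    for line in torrc_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] in ("TransPort", "DNSPort"):
            # The value may be "address:port" or just "port".
            ports[fields[0]] = int(fields[1].rsplit(":", 1)[-1])
    return ports


def tor_settings_match(torrc_text, routing_text):
    """Return True if torrc and the [tor] routing section use the same ports."""
    parser = configparser.ConfigParser()
    parser.read_string(routing_text)
    ports = torrc_ports(torrc_text)
    return (ports.get("TransPort") == parser.getint("tor", "proxyport")
            and ports.get("DNSPort") == parser.getint("tor", "dnsport"))


torrc = "TransPort 192.168.56.1:9040\nDNSPort 192.168.56.1:5353\n"
routing = "[tor]\nenabled = yes\ndnsport = 5353\nproxyport = 9040\n"

print(tor_settings_match(torrc, routing))
```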


VPN Routing

If you would like to route all traffic from the analysis virtual machine through a VPN, for example so that it appears to connect from different places in the world, you can configure this within the routing configuration file in Cuckoo's working directory. You will need to add each VPN interface to that file as well.

Location of routing configuration - $CWD/conf/routing.conf