In part one of this series, we examined some techniques used by malware, specifically ransomware. As we saw, individual techniques such as downloaders, droppers, and loaders, as well as encoding and encryption, are all legitimate, programmable capabilities offered by the .Net (dot net) software framework and by many other programming frameworks and languages. Below is a collage of some of the techniques discussed in the previous article.
In this second article, we examine the fundamentals of assemblies in the Microsoft .Net framework. We will delve deeper into the differences between the two kinds of assembly (EXE and DLL) and into the relationships between them, which determine how these capabilities are eventually executed from initial high-level code such as C#. We will use the code introduced in the previous article to explore these differences and relationships.
What is Microsoft .Net?
Microsoft .Net is a software development framework designed to support several programming languages and target different operating systems. Supported programming languages like C# (pronounced C sharp) are compiled and run as what is known as managed code (as opposed to unmanaged or native code). To achieve this, .Net runs its code in a dedicated virtual machine rather than directly on the target platform. This virtual machine is known as the .Net common language runtime (CLR). It can be thought of as the common intermediary that eventually runs the compiled code from all the programming languages that .Net supports, such as C#, VB.Net, and F#. The example below shows the C# code from the previous article.
Managed code means that high-level code written in C#, or in other .Net languages such as F# and VB.Net, is first compiled to an intermediate language (IL). The C# code shown above compiles to the intermediate language instructions shown in the image below. This code resembles low-level assembly language syntax.
This intermediate language (IL) is then further compiled into native or machine code targeting the relevant machine platform. This compilation is done by another .Net component called the Just-in-Time (JIT) compiler.
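As a rough illustration of the IL step described above, the sketch below uses reflection to print the raw IL bytes that the C# compiler stored for a small method. The class `IlSketch` and its `Add` and `GetIl` methods are hypothetical names chosen for this example, and the exact bytes printed depend on compiler settings such as debug versus release.

```csharp
using System;
using System.Reflection;

public static class IlSketch
{
    // A trivial method whose compiled body we want to inspect.
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Returns the IL bytes stored in the assembly for the given method.
    // These bytes are what the JIT compiler later turns into native code.
    public static byte[] GetIl(MethodInfo method)
    {
        return method.GetMethodBody().GetILAsByteArray();
    }

    public static void Main()
    {
        MethodInfo add = typeof(IlSketch).GetMethod("Add");
        Console.WriteLine("IL bytes for Add: " + BitConverter.ToString(GetIl(add)));
    }
}
```

The bytes are IL opcodes, not CPU instructions. The JIT compiler translates them to native code the first time the method is called.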
Native or machine code is the set of instructions (zeros and ones) that a particular computer’s processor (CPU) understands. This last step is managed by the common language runtime (CLR), which also contains the JIT compiler. The CLR is the .Net runtime environment or virtual machine. Java is another software framework that uses the concept of an intermediary runtime, and like the Java Virtual Machine, the CLR is a main part of what makes .Net platform independent. .Net code is called managed code because its execution is managed by the intermediary CLR: the CPU runs only the native code that the JIT compiler produces, never the IL itself.
An advantage of managed code in .Net is automatic memory management and garbage collection. This means that the developer does not need to worry about allocating and deallocating computer memory in their code to save system resources, as they would in, say, C or C++. In .Net, a garbage collector runs periodically to reclaim memory that the program no longer uses. It can also be invoked by the programmer when needed. The diagram below shows the architecture of a .Net application.
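To make the garbage collector a little more concrete, here is a minimal sketch that allocates a buffer, lets it go out of scope, and then requests a collection explicitly. The class and method names are hypothetical, and the reported byte counts vary from run to run, so no particular output should be expected.

```csharp
using System;

public static class GcSketch
{
    // Allocates a 10 MB buffer that becomes unreachable when the method returns.
    // Note that there is no matching "free" call anywhere in this program.
    private static void AllocateGarbage()
    {
        byte[] buffer = new byte[10 * 1024 * 1024];
        buffer[0] = 1; // touch the buffer so it is actually used
    }

    public static void Main()
    {
        AllocateGarbage();
        Console.WriteLine("Before collection: {0} bytes", GC.GetTotalMemory(false));

        // An explicit collection is optional; the CLR normally decides when to run one.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine("After collection:  {0} bytes", GC.GetTotalMemory(true));
    }
}
```

In ordinary application code an explicit `GC.Collect()` call is rarely needed. It appears here only to show that the option exists.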
In contrast, compilers for non-.Net languages like VB6, C, and C++ compile high-level code directly to the machine code of the target platform (OS and CPU). The resulting executable is therefore tied to the compiler’s target platform. This is known as unmanaged or native code. Although the two are architecturally different, a managed .Net application can use code from native binaries, especially DLLs, by means of interop marshalling and Platform Invoke (P/Invoke). Examples include a managed .Net application referencing native Windows operating system DLLs, or external libraries written in C++, to enable some low-level operating system functionality. In this case, .Net itself can be thought of as a safe wrapper around the native DLLs that the Windows operating system relies on, much of which is written in C++.
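A minimal Platform Invoke sketch might look like the following. It assumes a Windows host, because it calls the native `GetCurrentProcessId` function exported by kernel32.dll. The class `PInvokeSketch` and the wrapper `NativeProcessId` are hypothetical names used only for this illustration.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public static class PInvokeSketch
{
    // Declares a native function that lives in kernel32.dll (Windows only).
    // The CLR marshals the call and its return value across the
    // managed/unmanaged boundary.
    [DllImport("kernel32.dll")]
    private static extern uint GetCurrentProcessId();

    // Managed wrapper around the native call.
    public static uint NativeProcessId()
    {
        return GetCurrentProcessId();
    }

    public static void Main()
    {
        Console.WriteLine("PID via native kernel32 call: {0}", NativeProcessId());
        Console.WriteLine("PID via managed .Net API:     {0}", Process.GetCurrentProcess().Id);
    }
}
```

On Windows both lines should report the same process ID, since the managed API ultimately asks the operating system the same question.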
What is a .Net assembly?
Microsoft describes a .Net assembly as a single unit of deployment. This means that an assembly is a collection of code and associated files that has been compiled (assembled) into a form that can be executed on any compatible .Net target platform. The execution is done by .Net’s common language runtime. Examples of assemblies in the Windows operating system are executable files (.exe) and class library or dynamic link library (.dll) files.
The example code image below shows the C# code of an executable assembly on the left and the C# code of a DLL (also known as class library) assembly on the right. The executable code references the DLL file and then calls a specific method (function) from the DLL code during execution. These references and calls are highlighted in the image. We will explain the details of both pieces of code later in this article, and later in this series we will show how this combination can be used for malicious aims.
In the subsequent example, the DLL file is manually referenced in the executable code. This means that the DLL, its metadata, and its code (made up of modules, classes, and methods) are referenced at the compilation time of the executable code.
As a shared library, DLL code cannot be run directly on its own. From a code point of view, this is because a DLL does not have a main entry point function to execute from and therefore cannot be run as standalone code in the way that an executable (.exe) is set up to do. As an example, the error message below shows the consequence of trying to run a class library or DLL project directly from the development environment.
Executable code, on the other hand, has a main entry point function or method where execution begins. A DLL does not need one, as it is primarily a library of code blocks referenced by other assemblies.
Once referenced, the specific code of interest in the DLL file can be called for execution. The code examples (EXE and DLL) below, first shown in the previous article, reiterate this point.
The executable application runs and calls code from the DLL it referenced to produce the output shown in the following image.
This simple program shows how .Net assemblies like EXEs and DLLs can be used together.
The DLL code referenced above has a method (function) that takes two parameters as input, a first name and an age, and then displays a greeting message using this information. The executable code, on the other hand, accepts the user's first name and age from the command line and then passes that information to the DLL method as arguments. The message from the DLL code is then displayed on the console screen using the information that the EXE application collected from the user.
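The behaviour described above can be sketched roughly as follows. This is a reconstruction under stated assumptions, not the original code: only the names `DLL_dontNet_Assembly` and `DisplayMsgMethod` come from this article, while the class `Greeter`, the namespace `ExeSketch`, the prompts, and the message wording are hypothetical. In a real solution the two parts would be separate projects, one built as a class library and one as a console application that references it.

```csharp
// ----- Part 1: class library project, built as DLL_dontNet_Assembly.dll -----
using System;

namespace DLL_dontNet_Assembly
{
    // "Greeter" is a hypothetical class name. There is no Main method here,
    // which is why the DLL cannot be started on its own.
    public class Greeter
    {
        public static void DisplayMsgMethod(string firstName, int age)
        {
            Console.WriteLine("Hello {0}, you are {1} years old.", firstName, age);
        }
    }
}

// ----- Part 2: console application project that references the DLL -----
namespace ExeSketch
{
    using System;
    using DLL_dontNet_Assembly;

    public static class Program
    {
        // Main is the entry point where execution of the EXE begins.
        public static void Main()
        {
            Console.Write("First name: ");
            string firstName = Console.ReadLine();

            Console.Write("Age: ");
            int age;
            if (!int.TryParse(Console.ReadLine(), out age))
            {
                Console.WriteLine("Age must be a whole number.");
                return;
            }

            // Pass the collected input to the method defined in the DLL.
            Greeter.DisplayMsgMethod(firstName, age);
        }
    }
}
```

Keeping the greeting logic in the library means any other executable can reuse it simply by referencing the same DLL.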
Analyzing .Net assemblies
Performing a static analysis on the executable shows the various DLLs and other components that it references and imports for execution. In addition to our own custom DLL, the executable assembly imports DLLs associated with .Net itself, such as mscorlib, which contains base code (classes, types, etc.) that our program needs in order to run.
In our code development environment, Visual Studio, we can confirm the use of mscorlib by tracing one of the data types back to its origin (in this case, string, which is System.String in .Net). This reveals the built-in .Net assembly where that type is defined, which is mscorlib, as shown below.
String is the data type that stores the text the user inputs and that is then displayed back. We can also see from our static analysis the DLL named “DLL_dontNet_Assembly.” This is our custom DLL that contains the “DisplayMsgMethod” method which shows the user a message after they have entered their details.
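The same check can be made in code. The sketch below, which uses hypothetical class and method names, asks the runtime which assembly defines `System.String` and lists the assemblies that the running program references. On .Net Framework the defining assembly is reported as mscorlib; on .Net Core and later versions the same type lives in System.Private.CoreLib, so the output depends on the runtime in use.

```csharp
using System;
using System.Reflection;

public static class TypeOriginSketch
{
    // Returns the simple name of the assembly in which a type is defined.
    public static string DefiningAssemblyName(Type type)
    {
        return type.Assembly.GetName().Name;
    }

    public static void Main()
    {
        Console.WriteLine("System.String is defined in: " + DefiningAssemblyName(typeof(string)));

        // List the assemblies that this executable references in its metadata.
        foreach (AssemblyName reference in Assembly.GetExecutingAssembly().GetReferencedAssemblies())
        {
            Console.WriteLine("References: " + reference.FullName);
        }
    }
}
```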
In our example, we referenced and loaded our custom DLL manually during the compilation of our code, before the program started executing. It is also possible to load a DLL while an executable is running. This can be especially useful in cases where we may not have access to the desired DLL when our code is compiled. This is made possible by reflection, which lets us examine a .Net assembly (metadata and attributes) and use the code (modules, classes, methods, and properties) contained within it at run time. This technique can also be abused for malicious purposes in what are known as reflective DLL injection attacks.
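A minimal sketch of run-time loading through reflection is shown below. It assumes that a file named DLL_dontNet_Assembly.dll sits next to the executable and exposes a public static method `DisplayMsgMethod(string, int)` on a type called `DLL_dontNet_Assembly.Greeter`. The type name, the file location, and the sample arguments are all hypothetical, as is the helper `InvokeStatic`.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class ReflectionSketch
{
    // Looks up a public static method by name and argument types, then invokes it.
    public static object InvokeStatic(Assembly assembly, string typeName,
                                      string methodName, params object[] args)
    {
        Type type = assembly.GetType(typeName, true); // throws if the type is missing
        MethodInfo method = type.GetMethod(methodName, Type.GetTypeArray(args));
        if (method == null)
        {
            throw new MissingMethodException(typeName, methodName);
        }
        return method.Invoke(null, args); // null target because the method is static
    }

    public static void Main()
    {
        const string dllPath = "DLL_dontNet_Assembly.dll"; // assumed location
        if (!File.Exists(dllPath))
        {
            Console.WriteLine("Could not find " + dllPath);
            return;
        }

        // The DLL is located and loaded only now, while the program is running.
        Assembly assembly = Assembly.LoadFrom(dllPath);

        // "Greeter" is a hypothetical type name; the arguments are sample values.
        InvokeStatic(assembly, "DLL_dontNet_Assembly.Greeter", "DisplayMsgMethod", "Alice", 30);
    }
}
```

Because nothing about the DLL is known at compile time, a mistake in the type or method name surfaces only as a run-time exception, which is the usual trade-off of reflection.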
.Net assemblies (executables and class libraries) also contain a manifest that holds metadata about the assembly, alongside the intermediate language (IL) code. Together these enable the common language runtime to run the assembly on any compatible platform that can run .Net. The image below shows the IL assembly instructions and manifest structure of the two assemblies, EXE and DLL. The manifest contains metadata about the .Net assembly such as its version number and description.
We should now have a fundamental understanding of the .Net software framework, its associated assemblies, and how they can interact with each other.
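Some of this metadata can also be read programmatically. The following sketch, with hypothetical class and method names, prints the name, version, and description recorded for an assembly. The description is read from the standard `AssemblyDescriptionAttribute` and is reported as absent when the assembly was built without one.

```csharp
using System;
using System.Reflection;

public static class ManifestSketch
{
    // Builds a short summary of identity metadata recorded for an assembly.
    public static string Describe(Assembly assembly)
    {
        AssemblyName name = assembly.GetName();

        // The description is optional metadata set by the assembly's author.
        var description = (AssemblyDescriptionAttribute)Attribute.GetCustomAttribute(
            assembly, typeof(AssemblyDescriptionAttribute));

        return string.Format("Name: {0}, Version: {1}, Description: {2}",
            name.Name,
            name.Version,
            description != null ? description.Description : "(none)");
    }

    public static void Main()
    {
        // Describe the assembly that contains this program.
        Console.WriteLine(Describe(Assembly.GetExecutingAssembly()));
    }
}
```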
In the next article, we will put the techniques and capabilities we have discussed and learned so far into a single malicious ransomware executable.