Obfuscation


Obfuscation is a code transformation technique used to make source code difficult to read and understand while preserving its functionality. This process typically involves renaming identifiers, restructuring code, or inserting misleading elements to obscure the original logic. Obfuscation is often used to protect intellectual property, prevent reverse engineering, or secure sensitive logic in software. While the transformed code remains executable, it becomes significantly harder for humans to analyze, ensuring that key implementation details are concealed without altering program behavior.

Obfuscation Transformation in emmtrix Studio

emmtrix Studio applies obfuscation via #pragma directives or the GUI. The transformation renames identifiers so that they become obscure, which makes the code unrecognizable while preserving its functionality and original structure. The hash algorithms used are the Secure Hash Algorithms SHA-1 and SHA-256.

Typical Usage and Benefits

The transformation is used to make the source code hard for humans to understand while still preserving its original structure. It is meant for cases where the source code needs to be shared (e.g. for debugging) but no intellectual property should be revealed. The transformation can be applied only to functions.

Even though the transformation can be selected only for functions, it can also affect other identifiers, depending on the user settings. In its full scope, obfuscation affects all identifiers, including:

  • functions and their parameters
  • global and local variables
  • user-defined types
  • structs and unions
  • enumerators
  • file names

Example

/* The following code tests the obfuscation transformation applied to the main function with default parameters.
 * According to the default settings, all identifiers except for the external definitions and the main
 * function shall be renamed and become obscure.
 * In the current code example the identifiers with the suffix _obfu are renamed.
 * Other identifiers include the printf and main functions; they remain unmodified.
 */

const int g_obfu = 10;
void print_obfu(int a_obfu) {
    printf("%d\n", a_obfu);
}
#pragma EMX_TRANSFORMATION ObfuscateIdentifiers
int main() {
    int l_obfu = 1;
    print_obfu(l_obfu);
    print_obfu(g_obfu);
    return 0;
}
/* The following code is the generated code after the transformation has been applied.
 */

const int var_ee832f = 10;
void func_3db980(int var_e86e08) {
    printf("%d\n", var_e86e08);
}
void func_baf390(int var_e86e08) {
    printf("%d\n", var_e86e08);
}
int main() {
    int var_32bc72 = 1;
    func_baf390(var_32bc72);
    func_3db980(var_ee832f);
    return 0;
}

Along with the transformed code, a mapping between the old and the new identifiers is created. It is stored in the CSV file obfuscation_mapping.csv, with one semicolon-separated pair per line (new name first, then the original name).

file_b93f1b;test_obfuscation00
func_3db980;print_obfu
func_baf390;print_obfu_duplicate2
var_e86e08;a_obfu
var_ee832f;g_obfu
var_32bc72;l_obfu
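
Because each line pairs a generated name with its original, the mapping file can be used to translate obfuscated names back, for example when reading a bug report written against the shared code. The following is a minimal sketch in C, assuming the format shown above (new name, semicolon, original name, one pair per line). The function lookup_original is hypothetical and not part of emmtrix Studio.

```c
#include <string.h>

/* Looks up `obfuscated` in CSV text holding one "new;old" pair per line.
 * Copies the original name into `out` and returns 1 if found.
 * Returns 0 if the name is not listed or `out` is too small. */
int lookup_original(const char *csv, const char *obfuscated,
                    char *out, size_t out_size) {
    size_t key_len = strlen(obfuscated);
    const char *line = csv;

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t line_len = end ? (size_t)(end - line) : strlen(line);

        /* A match is the key followed directly by the ';' separator. */
        if (line_len > key_len &&
            strncmp(line, obfuscated, key_len) == 0 &&
            line[key_len] == ';') {
            size_t value_len = line_len - key_len - 1;

            /* Ignore a trailing carriage return from CRLF files. */
            if (value_len > 0 && line[key_len + value_len] == '\r')
                value_len--;
            if (value_len + 1 > out_size)
                return 0;
            memcpy(out, line + key_len + 1, value_len);
            out[value_len] = '\0';
            return 1;
        }
        line = end ? end + 1 : line + line_len;
    }
    return 0;
}
```

With the mapping above loaded into a string, looking up func_baf390 would return print_obfu_duplicate2, and a name that is not listed would return 0.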

Parameters

The following parameters can be set. Each entry lists the parameter's keyword in pragma syntax (Id), its default value, and a description:

Id        Default Value  Description
all       true           Whole Project - apply obfuscation to all translation units across the project
external  false          External definitions - apply obfuscation to identifiers in header files; use with caution, since it affects used system identifiers and produces uncompilable code
output    true           Output identifiers - generate a CSV file containing the mapping of old and new names
seed      enc1           Seed for hash function - arbitrary string used as input for the hashing algorithms
n         6              Hash length - length of the hash code in obfuscated identifiers

Note

  • For the same seed and hash length, obfuscation is deterministic.
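
The role of the seed and n parameters can be illustrated with a small sketch of hash-based renaming. It is not emmtrix Studio's implementation. To stay short, it swaps in the non-cryptographic FNV-1a hash for SHA-1/SHA-256, so the names it produces will not match those in the example above. It also assumes that the seed is hashed before the identifier and that the first n hexadecimal digits of the hash form the suffix. The function obfuscate_name and its prefix argument are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* FNV-1a, 64-bit: a simple stand-in for SHA-1/SHA-256 (assumption). */
static uint64_t fnv1a_update(uint64_t hash, const char *text) {
    for (; *text != '\0'; ++text) {
        hash ^= (unsigned char)*text;
        hash *= 1099511628211ULL;               /* FNV prime */
    }
    return hash;
}

/* Writes "<prefix>_<first n hex digits of hash(seed, name)>" into `out`.
 * Returns 0 on success, -1 if n is out of range or `out` is too small. */
int obfuscate_name(const char *prefix, const char *seed, const char *name,
                   int n, char *out, size_t out_size) {
    char hex[17];
    uint64_t hash = 14695981039346656037ULL;    /* FNV offset basis */

    if (n < 1 || n > 16 || strlen(prefix) + 1 + (size_t)n + 1 > out_size)
        return -1;

    hash = fnv1a_update(hash, seed);   /* assumed order: seed first */
    hash = fnv1a_update(hash, name);   /* then the identifier */

    /* Render the hash as 16 hex digits and keep only the first n. */
    snprintf(hex, sizeof hex, "%016llx", (unsigned long long)hash);
    snprintf(out, out_size, "%s_%.*s", prefix, n, hex);
    return 0;
}
```

Calling obfuscate_name("var", "enc1", "g_obfu", 6, buf, sizeof buf) twice fills buf with the same ten-character name both times. Since nothing but the seed, the identifier, and n enters the computation, the result is reproducible across runs.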