Sunday, April 14, 2013

Binary Interleaving: A technique to obfuscate data structures

Binary Interleaving (or Data Interleaving) is a term I made up to reflect an idea I've been contemplating lately: interleaving the bits of a set of variables into a single binary blob. Any number and type of variables could be interleaved together. Access to variables in the interleaved blob can even be on-demand, with a controller class encoding or decoding variables on the fly.

What in the World?

Data Interleaving is the process of translating any number of variables to a single binary blob by interleaving the bits of the variables. This obfuscates the variables in memory or external storage. The entire blob need not be decoded to access member variables, though it can be for improved performance.


This complicates reverse engineering of the code. In particular, it makes data types and individual variables harder to identify. Plaintext is also obscured by the interleave, since the bits of a string are no longer stored contiguously.

Interleave Map

The variables to be encoded could be defined by an array of byte sizes of those variables and, optionally, pointers to a location in memory to retrieve or store their reconstituted form. In the case of on-demand access to an interleaved blob, individual variables can be decoded and re-encoded on the fly, so buffers for reconstituted storage are optional (though they may be temporarily reconstituted by the controller class as members are modified).

The members of the bitwise interleave can be referenced in the source code via their indices. For instance, index 0 may be MY_VARIABLE_INSTANCE. By passing the variable index to an interleave blob controller class, it knows the size and, optionally, a pointer for constituted storage.

Member data types can be anything, and they need not be similar. When a member runs out of bits, it simply drops out of the interleave. The section "When a Member is Longer than the Others" below covers what happens when a single variable is longer than the rest.

    /* member information */
    /* optional pointer to its normal, constituted storage location */
    /*  (for use in encoding and decoding the member) */
    /* and the size of the member */
    class CInterleaveMember
    {
    public:
      void *pvConstitutedStore;
      unsigned long nMemberByteSize;
    };

    CInterleaveMember aInterleaveMap[] =
    {
      { szSomeString, sizeof(szSomeString) },
      { &nIntegerMan, sizeof(nIntegerMan) },
      { &cMyClass, sizeof(cMyClass) }
    };

    void *pBLOB;  /* interleaved data stored in an allocated blob */

The total size of the blob need not be stored, as it is the sum of all member sizes in the interleave map. The interleave map provides everything we need to know.
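
As a minimal sketch of that sum, assuming a simplified map entry (`Member` and `BlobBytes` are names I am making up here for illustration; they are not part of any real library):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical map entry, mirroring CInterleaveMember above.
struct Member {
    void *pvConstitutedStore;     // optional constituted storage (may be null)
    std::size_t nMemberByteSize;  // size of the member in bytes
};

// Total blob size in bytes: the sum of all member sizes in the map.
std::size_t BlobBytes(const std::vector<Member> &map) {
    std::size_t total = 0;
    for (const Member &m : map) total += m.nMemberByteSize;
    return total;
}
```

A controller class could call something like this once in its constructor and allocate the blob from the result.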

The Process

In case it is not clear, the process for the interleave would go something like this: The array of members is 'walked', putting or getting the current bit index from each member variable, advancing to the next bit index after the entire array has been walked. When a member variable is full of bits (exhausted), it is skipped in subsequent interleave iterations (more on long vars later).

For simplicity, let me define a few variables in bits only (not matching above):
    szSomeString 0 1 1 1 0 0 1 0
    nIntegerMan  1 1 1 0 0 0 1 1 1 0 0 1 0 0 0 1
    cMyClass     0 0 0 1

For the interleave, a bit is taken from each variable in succession.
    First iteration of the interleave, get first bit from each ...
     0 1 0
    Next iteration(s), get the next bit from each ...
     0 1 0 1 1 0
     0 1 0 1 1 0 1 1 0
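
The walk above can be sketched roughly as follows. This is only an illustration under a few assumptions of mine: member lengths are given in bits (so the 4-bit example member fits), bits are numbered from the most significant bit of the first byte, and `BitMember`, `Walk`, `Interleave` and `Deinterleave` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical member description: constituted storage plus a length in bits.
struct BitMember {
    std::uint8_t *data;
    std::size_t nBits;
};

// Bit i of a buffer, counting from the most significant bit of byte 0.
inline bool GetBit(const std::uint8_t *p, std::size_t i) {
    return ((p[i / 8] >> (7 - i % 8)) & 1u) != 0;
}

inline void SetBit(std::uint8_t *p, std::size_t i, bool v) {
    const std::uint8_t mask = static_cast<std::uint8_t>(1u << (7 - i % 8));
    if (v) p[i / 8] |= mask;
    else   p[i / 8] &= static_cast<std::uint8_t>(~mask);
}

inline std::size_t TotalBits(const std::vector<BitMember> &map) {
    std::size_t total = 0;
    for (const BitMember &m : map) total += m.nBits;
    return total;
}

// Walk the map round-robin, one bit index at a time. With encode == true the
// bits flow from the members into the blob, otherwise from the blob back out.
inline void Walk(const std::vector<BitMember> &map, std::uint8_t *blob, bool encode) {
    const std::size_t total = TotalBits(map);
    std::size_t pos = 0;  // next bit position in the blob
    for (std::size_t bit = 0; pos < total; ++bit) {
        for (const BitMember &m : map) {
            if (bit >= m.nBits) continue;  // exhausted member: skip it
            if (encode) SetBit(blob, pos, GetBit(m.data, bit));
            else        SetBit(m.data, bit, GetBit(blob, pos));
            ++pos;
        }
    }
}

// Encode every member into a freshly allocated, zero-padded blob.
inline std::vector<std::uint8_t> Interleave(const std::vector<BitMember> &map) {
    std::vector<std::uint8_t> blob((TotalBits(map) + 7) / 8, 0);
    Walk(map, blob.data(), true);
    return blob;
}

// Decode every member from the blob back into its constituted storage.
inline void Deinterleave(const std::vector<BitMember> &map, std::vector<std::uint8_t> &blob) {
    assert(blob.size() * 8 >= TotalBits(map));
    Walk(map, blob.data(), false);
}
```

Fed the three example members (0x72 as 8 bits, 0xE3 0x91 as 16 bits, and the top nibble of 0x10 as 4 bits), the first blob byte comes out as 01011011, which matches the first eight bits of the hand-worked interleave above.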

When a Member is Longer than the Others

In the case where one variable is much longer than the others, thus having no pair to encode with, one could use a simple XOR, and/or toss in redundant, unused data from the prior members. Any number of strategies are possible to prevent plaintext storage in the case of an abnormally long variable not having an interleave partner for its ending bits.
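
One way the "simple XOR" option might look, purely as a sketch: find the blob bit at which only the longest member is still contributing, then XOR everything from there on with a repeating key byte. The function names, the single-byte key, and the MSB-first bit numbering are all assumptions of mine, and a fixed key like this only hides the tail from casual inspection.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// First blob bit at which only the longest member still contributes bits.
// bitSizes holds each member's length in bits, in map order.
std::size_t TailStartBit(const std::vector<std::size_t> &bitSizes) {
    assert(!bitSizes.empty());
    std::vector<std::size_t> sorted(bitSizes);
    std::sort(sorted.begin(), sorted.end());
    // Length of the second-longest member (0 if there is only one member).
    const std::size_t second = sorted.size() > 1 ? sorted[sorted.size() - 2] : 0;
    std::size_t start = 0;
    for (std::size_t n : bitSizes) start += std::min(n, second);
    return start;
}

// XOR every blob bit in [tailStart, totalBits) with a repeating key byte.
// Applying it a second time with the same key restores the original bits.
void XorTail(std::vector<std::uint8_t> &blob, std::size_t tailStart,
             std::size_t totalBits, std::uint8_t key) {
    assert(tailStart <= totalBits && totalBits <= blob.size() * 8);
    for (std::size_t i = tailStart; i < totalBits; ++i) {
        const bool keyBit = ((key >> (7 - (i - tailStart) % 8)) & 1u) != 0;
        if (keyBit) blob[i / 8] ^= static_cast<std::uint8_t>(1u << (7 - i % 8));
    }
}
```

If two members tie for the longest length there is no unpaired tail, and `TailStartBit` returns the total bit count so `XorTail` does nothing.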

Sample Code

For example, the following represents a high-level view of the calls to a fictional class facilitating Binary Interleaving:

    /* These get stored in a bitwise interleave in the binary blob */
    char szSomeString[] = "Is there anybody out there?";
    unsigned long nIntegerMan = 0x9090;
    MyClass cMyClass("whoopie");

    class CInterleaveMember
    {
    public:
      void *pvConstitutedStore;
      unsigned long nMemberByteSize;
    };

    CInterleaveMember aInterleaveMap[] =
    {
      { szSomeString, sizeof(szSomeString) },
      { &nIntegerMan, sizeof(nIntegerMan) },
      { &cMyClass, sizeof(cMyClass) }
    };
    /* NOTE: Total blob size is the sum of the member sizes in the Interleave Map */

    /* Indices into the interleave map, in map order */
    typedef enum
    {
      _szSomeString,
      _nIntegerMan,
      _cMyClass
    } InterleavedVariables;
    void *pBinaryBlob;  /* dynamically allocated blob storage */
    /* Fictional class constructor, passing the interleave map to it */
    /* From the interleave map, it can calculate the total blob size */
    /* then dynamically allocate storage for the blob. */
    CBitInterleaver cBitInterleave(aInterleaveMap);
    /* If the blob is externally loaded, or needs to be stored externally, */
    /* we may need to get access to the blob buffer. Fictional example: */
    /* We know the blob size from the map! The input size is for safety. */
    cBitInterleave.SetBlob(pIncomingBlob, nSrcBufferSize);
    /* Or we can get the blob */
    /* Example to encode or decode the entire blob to constituted */
    /* storage. We already provided the map, and it decodes or encodes */
    /* to the listed pointers. */
    /* Example call to decode a member of the array */
    /* We pass it the INDEX into the MAP, and dest buffer */
    /* From the Index of _nIntegerMan, we ALREADY know the size */
    /* The out size is for safety. */
    cBitInterleave.GetVariable(_nIntegerMan, &nIntegerMan, sizeof(nIntegerMan));
    /* OR we can use the default storage address in interleave map */    
    /* Example call to encode a member of the array */
    /* We pass it the INDEX into the MAP, and input reference */
    cBitInterleave.SetVariable(_szSomeString, &szSomeString, sizeof(szSomeString));
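
To make the on-demand idea a bit more concrete, here is a rough sketch of what a call like GetVariable might do internally: compute where each bit of one member lives in the blob and read only those bits, without decoding anything else. It assumes the plain round-robin layout described under "The Process", member lengths in bits, and MSB-first bit numbering; `BlobBitIndex` and `GetMember` are hypothetical names, not the fictional class's real interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Blob position of bit k of member m, given every member's length in bits.
std::size_t BlobBitIndex(const std::vector<std::size_t> &bitSizes,
                         std::size_t m, std::size_t k) {
    assert(m < bitSizes.size() && k < bitSizes[m]);
    std::size_t pos = 0;
    for (std::size_t j = 0; j < bitSizes.size(); ++j) {
        pos += std::min(bitSizes[j], k);      // bits written in rounds 0..k-1
        if (j < m && bitSizes[j] > k) ++pos;  // earlier members in round k
    }
    return pos;
}

// Decode only member m from the blob into out, leaving the blob untouched.
void GetMember(const std::vector<std::uint8_t> &blob,
               const std::vector<std::size_t> &bitSizes,
               std::size_t m, std::uint8_t *out) {
    assert(m < bitSizes.size());
    for (std::size_t k = 0; k < bitSizes[m]; ++k) {
        const std::size_t pos = BlobBitIndex(bitSizes, m, k);
        assert(pos < blob.size() * 8);
        const bool bit = ((blob[pos / 8] >> (7 - pos % 8)) & 1u) != 0;
        const std::uint8_t mask = static_cast<std::uint8_t>(1u << (7 - k % 8));
        if (bit) out[k / 8] |= mask;
        else     out[k / 8] &= static_cast<std::uint8_t>(~mask);
    }
}
```

Writing a single member back would be the mirror image: the same index calculation, with the bit copied into the blob instead of out of it. Recomputing the index for every bit is slow, so a real controller would probably track the running position instead.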
