Reading structs with FRIDA

We can read function arguments with FRIDA through the args:NativePointer[] array. However, when an argument is not a simple type, such as a struct, the array entry only gives us a pointer, and we need to know the layout of the struct to read its members.

Where can we find structs? In the Unix time libraries, for example, or more importantly in Windows API calls such as the ones in NTDLL.

Stages:

  • Understanding and reading a user-controlled struct.
  • Reading a UNIX syscall structure.
  • Reading a Windows NTDLL structure.

Reading from a user-controlled struct.

Given this declaration:

void print_struct(myStruct s)

We want to log each member of s. All we have is s itself, which we treat as a pointer to the start of the struct, so we can't directly apply a FRIDA API method such as .readInt() or .readCString() to reach a given member. We first need to gather the offsets of the struct members to be sure what we are trying to read.

myStruct corresponds to the following:

struct myStruct
{
  short member_1;
  int member_2;
  int member_3;
  char *member_4;
} sample_struct;

In order to gather the offsets we need to know the size of each type. A short list for common platforms:

{
  "short": 2,
  "int": 4,
  "pointer": Process.pointerSize,
  "char": 1,
  "long": Process.pointerSize, // 4 on 64-bit Windows
  "longlong": 8,
  "ansi": Process.pointerSize,
  "utf8": Process.pointerSize,
  "utf16": Process.pointerSize,
  "string": Process.pointerSize,
  "float": 4,
};

So short has a size of 2, longlong a size of 8 and char a size of 1, but the ansi, string and pointer entries use Process.pointerSize. The reason is that the size of these types depends on the architecture of the process (4 bytes on 32-bit, 8 bytes on 64-bit), so we need to take this into account.

Sizes alone are not enough: compilers normally align each member to a multiple of its own size and insert padding where needed. That is why member_2 below starts at offset 4 even though the short before it only takes 2 bytes.

It's important to note that we can always read the first member without any major issues, because its offset is 0.

So, what are the offsets of the previous structure? For a 64-bit process:

struct myStruct
{
  short member_1; // 0x0 (2 bytes + 2 bytes of padding)
  int member_2; // 0x4 (4 bytes)
  int member_3; // 0x8 (4 bytes)
  char *member_4; // 0x10 (8 bytes, after 4 bytes of padding)
} sample_struct;

In a 32-bit process member_4 takes 4 bytes and sits at offset 0xC, with no padding before it.

How can we check that this is true for each type? We can compile a test program and get these values from sizeof() and offsetof().
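The layout rule can also be sketched in plain JavaScript. This is only an illustration: computeLayout and typeSizes are hypothetical helpers, not part of FRIDA, and they assume natural alignment (each member aligned to its own size). That assumption holds for the types in myStruct on common ABIs, but not for every type on every platform, so a compiled test program remains the reference.

```javascript
// Hypothetical helper: sizes of a few C types; pointerSize is 4 or 8.
function typeSizes(pointerSize) {
  return { char: 1, short: 2, int: 4, float: 4, longlong: 8, pointer: pointerSize };
}

// Computes member offsets assuming natural alignment: each member is aligned
// to its own size and the struct size is rounded up to the largest alignment.
function computeLayout(members, pointerSize) {
  const sizes = typeSizes(pointerSize);
  const offsets = {};
  let cursor = 0;
  let maxAlign = 1;
  for (const [name, type] of members) {
    const memberSize = sizes[type];
    if (memberSize === undefined) throw new Error("unknown type: " + type);
    cursor = Math.ceil(cursor / memberSize) * memberSize; // padding, if any
    offsets[name] = cursor;
    cursor += memberSize;
    maxAlign = Math.max(maxAlign, memberSize);
  }
  const totalSize = Math.ceil(cursor / maxAlign) * maxAlign;
  return { offsets, size: totalSize };
}

const myStruct = [
  ["member_1", "short"],
  ["member_2", "int"],
  ["member_3", "int"],
  ["member_4", "pointer"],
];

// Inside FRIDA, Process.pointerSize is available; elsewhere fall back to 8.
const pointerSize = typeof Process !== "undefined" ? Process.pointerSize : 8;
console.log(JSON.stringify(computeLayout(myStruct, pointerSize)));
```

With a pointer size of 8 this gives offsets 0, 4, 8 and 16 and a total size of 24; with a pointer size of 4 it gives 0, 4, 8 and 12 and a total size of 16.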

So, now we have the offsets of the structure and we want to read each value. For this we will use the .add() method.

.add(), as the name says, adds an offset to a given NativePointer and returns the resulting NativePointer.

Therefore, we can place our pointer at the desired offset to read each value:

// Given s = args[0]:NativePointer, pointing at the struct (64-bit process)

s.readShort() // 1st member.
s.add(4).readInt() // 2nd member.
s.add(8).readInt() // 3rd member.
s.add(16).readPointer().readCString(); // 4th member (offset 12 on 32-bit).

This way we obtain the value stored at each offset of the structure.
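The reads can be gathered in one function that picks the offset of the last member from the pointer size. This is a sketch: readMyStruct is a hypothetical helper, and it assumes that s behaves like a FRIDA NativePointer pointing at the start of the struct and that the struct uses the default alignment described earlier.

```javascript
// Hypothetical helper; `s` is expected to behave like a FRIDA NativePointer
// that points at the first byte of a myStruct instance.
function readMyStruct(s, pointerSize) {
  // member_4 is pointer-aligned: offset 12 on 32-bit, 16 on 64-bit.
  const member4Offset = pointerSize === 8 ? 16 : 12;
  const stringPtr = s.add(member4Offset).readPointer();
  return {
    member_1: s.readShort(),
    member_2: s.add(4).readInt(),
    member_3: s.add(8).readInt(),
    // Avoid dereferencing a NULL char pointer.
    member_4: stringPtr.isNull() ? null : stringPtr.readCString(),
  };
}
```

Inside an onEnter(args) callback this could be called as readMyStruct(args[0], Process.pointerSize), provided args[0] really points to the struct.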

Next, we will parse a structure used by a Linux system call.

SYSCALL structure

For this example we will use a well-known Linux system call named gettimeofday.

MAN page for gettimeofday: man7.org/linux/man-pages/man2/gettimeofday...

We have the following declaration:

int gettimeofday(struct timeval *tv, struct timezone *tz);

From this we can quickly figure out that timeval and timezone are two structs, and that both are passed as pointers. We cannot see their contents by simply printing the arguments with FRIDA's API.

The timeval struct is:

  struct timeval {
    time_t      tv_sec;     /* seconds */
    suseconds_t tv_usec;    /* microseconds */
  };

Note: The size of time_t can even depend on the API level you are targeting on Android systems. Do not forget to check its size; Process.pointerSize (a property, not a function) is a good first guess.

And the timezone struct is:

 struct timezone {
    int tz_minuteswest;     /* minutes west of Greenwich */
    int tz_dsttime;         /* type of DST correction */
 };

For this example we will write a simple program and compile it with clang:

#include <sys/time.h>
#include <stdio.h>

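int
main(void)
{
  struct timeval current_time;
  gettimeofday(&current_time, NULL);
  printf("seconds : %ld\nmicro seconds : %ld\n",
    (long)current_time.tv_sec, (long)current_time.tv_usec);

  /* Print the struct address so we can read it from FRIDA. */
  printf("%p\n", (void *)&current_time);
  getchar(); /* keep the process alive while we attach */
  return 0;
}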

Compile it with clang -Wall program.c and run the resulting a.out. The output should look like this:

pala@jkded:~/code$ ./a.out 
seconds : 1601394944
micro seconds : 402896
0x7fff4a1f8d48

So, given this, we will read the timeval structure through the pointer printed by the program. The session below comes from a separate run, so the address and values differ from the output above:

[Local::a.out]-> structPtr = ptr("0x7fff0b9a3118")
"0x7fff0b9a3118"
[Local::a.out]-> structPtr.readLong()
"1601395177"
[Local::a.out]-> structPtr.add(8).readLong()
"439353"

As we can see, the first member is at offset 0, so it can be read directly. To find the offset of the second member we need the size of the first one, which here matches the pointer size of the process:

[Local::a.out]-> Process.pointerSize
8

Since pointerSize is 8, this is a 64-bit process in which long takes 8 bytes, so tv_sec occupies the first 8 bytes and tv_usec sits at offset 8.
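When hooking the call instead of reading from the REPL, keep in mind that gettimeofday fills the struct, so it only holds useful data after the function returns. A minimal sketch of callbacks in the shape used by Interceptor.attach is shown here; readTimeval and gettimeofdayCallbacks are hypothetical names, and the code assumes both members are pointer-sized, as in the 64-bit Linux process above.

```javascript
// Reads a struct timeval, assuming time_t and suseconds_t are both
// pointer-sized (true for the 64-bit Linux process in this example).
function readTimeval(tv, pointerSize) {
  return {
    tv_sec: tv.readLong(),
    tv_usec: tv.add(pointerSize).readLong(),
  };
}

// Callbacks in the shape expected by Interceptor.attach().
const gettimeofdayCallbacks = {
  onEnter(args) {
    // Only remember the pointer here; the struct has not been filled yet.
    this.tv = args[0];
  },
  onLeave(retval) {
    if (this.tv.isNull()) return; // the caller may pass NULL for tv
    // Inside FRIDA, Process.pointerSize is available; elsewhere assume 8.
    const size = typeof Process !== "undefined" ? Process.pointerSize : 8;
    const tv = readTimeval(this.tv, size);
    console.log("tv_sec=" + tv.tv_sec + " tv_usec=" + tv.tv_usec);
  },
};
```

These callbacks would be passed to Interceptor.attach together with the address of gettimeofday in the target process.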

WINAPI structure.

There are a lot of structures in the Windows API, so we need to be confident in our structure parsing skills. NTDLL calls use structures such as UNICODE_STRING to represent strings, and other calls use structs such as SYSTEM_INFO.

For this example we will take a look at the WINAPI call GetSystemInfo, which takes an LPSYSTEM_INFO (a pointer to a SYSTEM_INFO structure) as its argument. This is what a SYSTEM_INFO struct looks like:

typedef struct _SYSTEM_INFO {
  union {
    DWORD dwOemId;
    struct {
      WORD wProcessorArchitecture;
      WORD wReserved;
    } DUMMYSTRUCTNAME;
  } DUMMYUNIONNAME;
  DWORD     dwPageSize;
  LPVOID    lpMinimumApplicationAddress;
  LPVOID    lpMaximumApplicationAddress;
  DWORD_PTR dwActiveProcessorMask;
  DWORD     dwNumberOfProcessors;
  DWORD     dwProcessorType;
  DWORD     dwAllocationGranularity;
  WORD      wProcessorLevel;
  WORD      wProcessorRevision;
} SYSTEM_INFO, *LPSYSTEM_INFO;

Wow! Quite a complicated struct we have here, right? Let's first find the size of each type, especially the ones that can be troublesome such as LPVOID.

For a 32-bit build made with Visual C++ on a Windows 10 64-bit system we get the following values:

Type        Size
WORD        2
DWORD       4
DWORD_PTR   4
LPVOID      4

We can check the pointer size by reading Process.pointerSize in an attached process:

[Local::ConsoleApplication2.exe]-> Process.pointerSize
4

Beware that these numbers change when the program is compiled for 64-bit:

Type        Size
WORD        2
DWORD       4
DWORD_PTR   8
LPVOID      8

Once we have these values, we can infer the offset of each member. Don't be afraid of the union keyword: its members overlap, so the whole union takes only 4 bytes (one DWORD) and dwPageSize starts at offset 4.

Getting all the values is out of the scope of this part, so we will get just some of them as an example:

  • dwPageSize
  • lpMinimumApplicationAddress
  • dwNumberOfProcessors

Complete offset list (32-bit build):

typedef struct _SYSTEM_INFO {
  union {
    DWORD dwOemId; // offset: 0
    struct {
      WORD wProcessorArchitecture;
      WORD wReserved;
    } DUMMYSTRUCTNAME;
  } DUMMYUNIONNAME;
  DWORD     dwPageSize; // offset: 4
  LPVOID    lpMinimumApplicationAddress; // offset: 8
  LPVOID    lpMaximumApplicationAddress; // offset: 12
  DWORD_PTR dwActiveProcessorMask; // offset: 16
  DWORD     dwNumberOfProcessors; // offset: 20
  DWORD     dwProcessorType; // offset: 24
  DWORD     dwAllocationGranularity; // offset: 28
  WORD      wProcessorLevel; // offset: 32
  WORD      wProcessorRevision; // offset: 34
} SYSTEM_INFO, *LPSYSTEM_INFO;
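Since only the pointer-sized members change between builds, the offsets for both cases can be derived from the pointer size. The function below is a sketch under that assumption; systemInfoOffsets is a hypothetical helper, and it relies on the default alignment used by Visual C++.

```javascript
// Hypothetical helper: SYSTEM_INFO member offsets for a given pointer size.
function systemInfoOffsets(pointerSize) {
  const p = pointerSize;
  // The union (4 bytes) and dwPageSize (4 bytes) come first, followed by
  // three pointer-sized members.
  const afterPointers = 8 + 3 * p;
  return {
    dwOemId: 0,
    wProcessorArchitecture: 0, // overlaps dwOemId inside the union
    wReserved: 2,
    dwPageSize: 4,
    lpMinimumApplicationAddress: 8,
    lpMaximumApplicationAddress: 8 + p,
    dwActiveProcessorMask: 8 + 2 * p,
    dwNumberOfProcessors: afterPointers,
    dwProcessorType: afterPointers + 4,
    dwAllocationGranularity: afterPointers + 8,
    wProcessorLevel: afterPointers + 12,
    wProcessorRevision: afterPointers + 14,
  };
}

console.log(JSON.stringify(systemInfoOffsets(4))); // 32-bit build
console.log(JSON.stringify(systemInfoOffsets(8))); // 64-bit build
```

For a pointer size of 4 this reproduces the list above; for a pointer size of 8, dwNumberOfProcessors moves to offset 32 and wProcessorRevision to offset 46.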

And this is the example program that we will be using to test our guesses:

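#include <cstdio>
#include <Windows.h>

int main()
{
    SYSTEM_INFO sysInfo;
    GetSystemInfo(&sysInfo);
    // Print the struct address so we can use it from the FRIDA REPL.
    printf("%p\n", static_cast<void *>(&sysInfo));
    // Keep the process alive while we attach.
    getchar();
    return 0;
}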

Now that we have the complete offset list, we can get the values of dwPageSize, lpMinimumApplicationAddress, and dwNumberOfProcessors respectively (sysInfoPtr is the address printed by the program, wrapped with ptr()):

[Local::ConsoleApplication2.exe]-> sysInfoPtr.add(4).readInt()
4096
[Local::ConsoleApplication2.exe]-> sysInfoPtr.add(8).readInt()
65536
[Local::ConsoleApplication2.exe]-> sysInfoPtr.add(20).readInt()
8
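readInt() works in the 32-bit session above, but DWORD is unsigned and LPVOID is pointer-sized, so a reader that should also work on 64-bit builds would probably use readU32() and readPointer() instead. The following is a sketch with a hypothetical helper name, assuming sysInfoPtr behaves like a FRIDA NativePointer to a filled SYSTEM_INFO.

```javascript
// Hypothetical helper: reads a few SYSTEM_INFO members for either pointer size.
function readSystemInfoSample(sysInfoPtr, pointerSize) {
  return {
    // First WORD of the union at offset 0.
    wProcessorArchitecture: sysInfoPtr.readU16(),
    dwPageSize: sysInfoPtr.add(4).readU32(),
    // LPVOID: read as a pointer so that it is not truncated on 64-bit.
    lpMinimumApplicationAddress: sysInfoPtr.add(8).readPointer(),
    // Follows the union, dwPageSize and three pointer-sized members.
    dwNumberOfProcessors: sysInfoPtr.add(8 + 3 * pointerSize).readU32(),
  };
}
```

From the REPL this could be called as readSystemInfoSample(sysInfoPtr, Process.pointerSize).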