Archive for January, 2009

Determining memory usage in process on OSX

Friday, January 30th, 2009

My strenuous journey started with a seemingly simple task: I wanted a rough-and-ready estimate of how much memory instantiating a rather opaque data structure from a third-party library would consume.

Naively, I started writing a loop that created a new instance, sleeping and then … I poked around the Single UNIX Specification for a system call that could help and came up with `getrusage`. The OSX man page states that it fills in this structure:

struct rusage {
  struct timeval ru_utime; /* user time used */
  struct timeval ru_stime; /* system time used */
  long ru_maxrss;          /* integral max resident set size */
  long ru_ixrss;           /* integral shared text memory size */
  long ru_idrss;           /* integral unshared data size */
  long ru_isrss;           /* integral unshared stack size */
  long ru_minflt;          /* page reclaims */
  long ru_majflt;          /* page faults */
  long ru_nswap;           /* swaps */
  long ru_inblock;         /* block input operations */
  long ru_oublock;         /* block output operations */
  long ru_msgsnd;          /* messages sent */
  long ru_msgrcv;          /* messages received */
  long ru_nsignals;        /* signals received */
  long ru_nvcsw;           /* voluntary context switches */
  long ru_nivcsw;          /* involuntary context switches */
};
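A call could look roughly like the following minimal sketch. The helper name `peak_rss` is hypothetical, and the unit of `ru_maxrss` is an assumption that varies by platform (Linux reports kilobytes).

```c
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper: fetch the peak resident set size of the calling
 * process as reported by getrusage(). The unit is platform-dependent
 * (kilobytes on Linux), and a kernel may leave the field at zero. */
int peak_rss(long *maxrss)
{
    struct rusage usage;

    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return -1;  /* getrusage() sets errno on failure */
    }
    *maxrss = usage.ru_maxrss;
    return 0;
}
```

Calling `peak_rss` before and after creating an instance and comparing the two values would give the estimate, provided the kernel actually fills in the field.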

Unfortunately, all the fields apart from user and system time remained zeroed. Grepping through the xnu sources (xnu is the Darwin kernel, and its source is apparently the only available information about what does and doesn’t work under OSX), I found this comment in the declaration of `struct rusage`, which finally convinced me that I’m not too stupid to make a simple call:

struct    rusage {
    struct timeval ru_utime;    /* user time used (PL) */
    struct timeval ru_stime;    /* system time used (PL) */
#if defined(_POSIX_C_SOURCE) && !defined(_DARWIN_C_SOURCE)
    long    ru_opaque[14];        /* implementation defined */
#else    /* (!_POSIX_C_SOURCE || _DARWIN_C_SOURCE) */
    /*
     * Informational aliases for source compatibility with programs
     * that need more information than that provided by standards,
     * and which do not mind being OS-dependent.
     */
    long    ru_maxrss;        /* max resident set size (PL) */
#define    ru_first    ru_ixrss    /* internal: ruadd() range start */
    long    ru_ixrss;        /* integral shared memory size (NU) */
    (...)
};

At least the good folks over at Sun are kind enough to mention in their man pages that all the fields are dummies.

Then I found Michael Knight’s (not that one) blog, which used the underlying Mach function `task_info`. Unfortunately, Apple doesn’t document the Mach API at all, and the sole reference they supply points directly to nowhere.

Well, if `ps` can determine memory usage, surely its source should be able to tell me how. Finding the source of OSX `ps` was another story. Hint: it’s not located in the `basic_cmds`, `misc_cmds`, `shell_cmds`, or `system_cmds` package. It’s in the `adv_cmds` package (advice? advanced? adventure?).

`ps` turned out to use a bunch of equally undocumented (non-Mach) kernel functions. At this point I remembered that `macfuse` contains a `procfs` for OSX. To me, using `proc` seems to be the obvious way to get memory usage under Linux, so I dug through that and saw that macfuse uses `task_info` as well.
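For comparison, the Linux approach can be sketched as below. This assumes a Linux `/proc/self/statm` file, whose first two fields are the total program size and the resident set size, both counted in pages. The function name `linux_mem_usage` is hypothetical, and none of this works on OSX without a `procfs`.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical Linux-only helper: read virtual and resident size of the
 * calling process from /proc/self/statm and convert pages to bytes.
 * Returns 0 on success, -1 on any failure. */
int linux_mem_usage(unsigned long *rss, unsigned long *vs)
{
    unsigned long vm_pages, rss_pages;
    long page_size = sysconf(_SC_PAGESIZE);
    FILE *f = fopen("/proc/self/statm", "r");

    if (f == NULL) {
        return -1;
    }
    if (page_size <= 0 || fscanf(f, "%lu %lu", &vm_pages, &rss_pages) != 2) {
        fclose(f);
        return -1;
    }
    fclose(f);

    *vs  = vm_pages * (unsigned long)page_size;
    *rss = rss_pages * (unsigned long)page_size;
    return 0;
}
```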

I finally found documentation for the Mach API within the xnu sources under the `osfmk/man` directory or online here and was able to write a simplified version of Michael’s original.

Voilà:

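#include <mach/mach.h>  /* task_info(), mach_task_self(), struct task_basic_info */

/* Store the resident and virtual size of the calling process, in bytes.
 * Returns 0 on success, -1 if task_info() fails. */
int getmem (unsigned long long *rss, unsigned long long *vs)
{
    struct task_basic_info t_info;
    mach_msg_type_number_t t_info_count = TASK_BASIC_INFO_COUNT;

    if (KERN_SUCCESS != task_info(mach_task_self(),
       TASK_BASIC_INFO, (task_info_t)&t_info, &t_info_count))
    {
        return -1;
    }
    /* The size fields can be wider than unsigned int on 64-bit builds. */
    *rss = t_info.resident_size;
    *vs  = t_info.virtual_size;
    return 0;
}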

If anyone knows of an even remotely portable way to obtain similar information, please let me know.