Thursday, June 15, 2006

Use of vm_stat(1)

[N.B. I'll make heavy use of links to DarwinGrok in this post. However, I make no comment on the availability of darwingrok; it's set up on a test server which is unsupported, to test how well said unsupported server handles a Java app deployment mechanism which is also currently unsupported. It's been known to go down for days before I realise anything's wrong. However, if it turns out to be unavailable, just look at the end of the URL to find out which file I've linked to, and look for that on darwinsource. For sake of comparison between the two sites, darwingrok's source cache is based on 10.4.5.ppc.]

So, it seems a bit strange to be talking about vm_stat(1) when I recently posted that I'd been working on an alternative to it. But in doing so, as was rightly observed in the comments, I did learn a thing or two about the Mach VM manager and vm_stat is about the most verbose place in which it's encountered. And besides, I had a request :-)

Let's start by looking at some sample output:
Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free: 18150.
Pages active: 129075.
Pages inactive: 79964.
Pages wired down: 34856.
"Translation faults": 216587934.
Pages copy-on-write: 3932355.
Pages zero filled: 85582532.
Pages reactivated: 1712668.
Pageins: 474657.
Pageouts: 77330.
Object cache: 478244 hits of 1592868 lookups (30% hit rate)


Just for comparison, here's the output of "free -p" shortly after that (those of you who haven't investigated free since my last blog post are in for a surprise):
          total     used     free
Mem:     262045   244025    18020
Swap:    262144   171028    91116


I'll start by looking at physical memory, because it's a topic I've spoken about so many times that I can cover it in one self-contained glob. As an aside though, the page size is a compile-time setting, and is 4096 bytes on both architectures unless you go compiling your own kernel. If you do, remember that it has to be a power-of-2 multiple of the hardware page size.

So, physical memory then. Which of the various pages that we're being told about are actually in physical memory? The active ones must be, and so must the wired ones (a page marked as wired may not be swapped out, so by definition it's stored in physical RAM). Actually it turns out that the free and inactive pages are both in RAM too; anything that's been handed out to a pager isn't counted as an available page (but the kernel knows where to ask for that page, if it needs it back). There's no information on page counts outside physical RAM, which was why earlier versions of free(1) couldn't report on the available swap space. It also happens not to make much sense to ask how much swap space is available, because it can change without user interaction. Compare this with Linux: if you swapon(2) a 2GB partition and nothing else, then no matter how much swap space is currently in use, the kernel knows that there's 2GB of swap available. Anyway, that's an aside. Add up the number of free, active, inactive and wired pages, multiply by the page size and you should have a number familiar to you as the total amount of RAM installed in your Mac (or arbitrary other Mach OS box).
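As a rough check of that sum, here's a minimal Python sketch which parses the sample output above and adds up the four resident page counts. The helper names (parse_vm_stat, resident_bytes) are my own and not part of any Apple tool, and the regular expression assumes the output looks exactly like the sample in this post.

```python
import re

# The relevant part of the sample vm_stat output from this post.
SAMPLE = """\
Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free: 18150.
Pages active: 129075.
Pages inactive: 79964.
Pages wired down: 34856.
"Translation faults": 216587934.
"""

def parse_vm_stat(text):
    """Return (page_size, {label: count}) from vm_stat-style text."""
    page_size = int(re.search(r"page size of (\d+) bytes", text).group(1))
    counts = {}
    for line in text.splitlines():
        # Lines look like: Pages free: 18150.   (label may be quoted)
        m = re.match(r'"?([^":]+)"?:\s+(\d+)\.$', line.strip())
        if m:
            counts[m.group(1)] = int(m.group(2))
    return page_size, counts

def resident_bytes(page_size, counts):
    """Bytes covered by the free, active, inactive and wired pages."""
    labels = ("Pages free", "Pages active", "Pages inactive", "Pages wired down")
    return page_size * sum(counts[label] for label in labels)

page_size, counts = parse_vm_stat(SAMPLE)
print(resident_bytes(page_size, counts))
```

For the sample that's 262045 pages, the same figure free -p gave as the Mem total, or 1073336320 bytes: a shade under 1GB.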

Next question, what is the amount of unused RAM? I get asked this frequently and my usual answer is: Heh :-). The free tool just looks at free pages, and top(1) goes for this approach too. Other people will say that it's the number of free and inactive pages. And they're right too, I think. Free pages are absolutely free, and have not been claimed by anything. Inactive pages are candidates for reuse, but do actually contain data. An example of when pages might become inactive is that a task is launched and loads a few dynamic libraries in, then quits. It'd be handy to keep those dylibs around just in case (try launching an app like OmniOutliner, quitting, and relaunching. Even on a poor man's metric like # of dock bounces, it was faster the second time, right?) but if the memory needs to be claimed elsewhere, it can be. So the amount of unused RAM is a nontrivial quantity, but if you mean "how much RAM is free" the answer is the free pages. If you mean "how much more RAM could I activate" then it's the total of the free and inactive pages. You'll have real performance problems if you've got few free pages and lots of wired pages, because the wired pages can't go anywhere, so you've got that much less memory for user-space processes to swap in.
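To put numbers on those two answers, here's a small Python sketch using the free and inactive counts from the sample output. The function names are made up for illustration, and the page size is taken from the vm_stat header line rather than queried from the system.

```python
PAGE_SIZE = 4096  # bytes, as reported in the sample vm_stat header

def unused_pages(free, inactive):
    """Return (strictly_free, reclaimable) page counts.

    strictly_free: pages claimed by nothing at all.
    reclaimable:   free plus inactive pages, i.e. what could be activated
                   without taking anything from active or wired memory.
    """
    return free, free + inactive

def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to mebibytes."""
    return pages * page_size / (1024 * 1024)

strictly_free, reclaimable = unused_pages(free=18150, inactive=79964)
print(f"free:            {pages_to_mib(strictly_free):.1f} MiB")
print(f"free + inactive: {pages_to_mib(reclaimable):.1f} MiB")
```

On the sample machine the two definitions differ by a factor of more than five (roughly 71MB against 383MB), which is why the question deserves a "Heh".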

Okey dokey, let's have a look at the paging statistics next then. All of the following values are cumulative; they start at 0 at boot time and monotonically increase with (up)time.

The state of a page is known to the kernel in terms of a few variables:
  • Empty

  • Wired

  • Dirty (i.e. has the page been modified)

  • Current access privileges (none, read, or read/write)

  • Desired access privileges

There are lots of different mechanisms by which the state as described above can be modified, and each change of state triggers at least one "translation fault". These faults are handled by vm_fault(), so every time that's called the translation fault count goes up.

When a new Mach task is generated, it has an inheritance property which describes whether it starts with its own memory objects or whether it receives copies from its parent (as would happen with a fork(2) call). But rather than generating all those copies when the task is launched, the memory manager gives it shadow objects which refer to the original memory objects. The task doesn't receive its own copy until it tries to write to the shadow object, at which point that generates a copy-on-write fault and the local version of the memory is finally generated.
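The idea can be shown with a toy model in Python. This is only an illustration of the copy-on-write principle, not how the Mach VM manager is implemented; the class names and the cow_faults counter are invented for the example, and pages are just small byte arrays.

```python
class MemoryObject:
    """Toy stand-in for a memory object: a list of page contents."""
    def __init__(self, pages):
        self.pages = [bytearray(p) for p in pages]

class ShadowObject:
    """Toy shadow: reads fall through to the original until a page is written."""
    def __init__(self, original):
        self.original = original
        self.private = {}    # page index -> this task's own copy of the page
        self.cow_faults = 0  # how many pages had to be copied

    def read(self, index):
        page = self.private.get(index)
        if page is None:
            page = self.original.pages[index]  # still sharing the original
        return bytes(page)

    def write(self, index, offset, byte):
        if index not in self.private:
            # First write to this page: take the fault and make the copy now.
            self.cow_faults += 1
            self.private[index] = bytearray(self.original.pages[index])
        self.private[index][offset] = byte

parent = MemoryObject([b"aaaa", b"bbbb"])
child = ShadowObject(parent)
child.write(1, 0, ord("X"))  # copies page 1, then modifies the copy
child.write(1, 1, ord("Y"))  # page 1 is already private, no new fault
print(child.read(1), bytes(parent.pages[1]), child.cow_faults)
```

Only the page that was written to gets copied, and only once; page 0 is still shared with the parent, whose data is untouched.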

"Pages zero-filled" describes exactly its own purpose, I hope.

A reactivated page is one that shifts from being inactive to active, as in the example of re-launching an application given above. Reactivating pages means less going out to disk to recover the data, which means faster performance. Finally, a pagein occurs whenever a new page is requested from the pager (basically, whenever there's a peek or poke into memory that hasn't yet been dealt with) and a pageout occurs whenever a page has to be handed off to a pager to allow another page to be created or paged back in.

The problem with all of the above information is that it's not presented in a way which I'd find useful :-). I'd prefer my RAM statistics to be presented in more comfortable memory units such as megabytes; reporting raw page counts is analogous to df(1) defaulting to 512-byte blocks. I don't care about most of the other VM statistics at all, because they're only important if you're debugging the memory manager or a pager. Rather than the number of pageouts since I switched the computer on, I'd like to know the number of pages currently out (i.e. the amount of swap use). There's a sysctl, "vm.swapusage", which reports on swap usage, and which I've used in free. The best way to get at swap information would be to interrogate the pager(s), not the kernel, and looking at Apple's dynamic_pager I don't think there's currently a way to achieve that. But then there are a few things the dynamic_pager would do if I were world emperor that it currently can't: reporting how far off a high-water alert is, cleanly turning off paging to a particular directory, and maintaining a single table of all the files currently in use by all the dynamic_pager instances. Anyway, remember what I said earlier about the amount of swap being changeable; all we get from vm.swapusage is the amount of filesystem which is currently dedicated to swap.
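As a sketch of the friendlier presentation I mean, here's some Python which converts the "free -p" figures quoted earlier into binary units. It assumes those figures are page counts (the Mem total matching the vm_stat page sum suggests so) and a 4096-byte page; human_size is a made-up helper, not the code inside free.

```python
PAGE_SIZE = 4096  # bytes, assumed from the vm_stat header

# (total, used, free) in pages, copied from the "free -p" output above.
FREE_P = {
    "Mem": (262045, 244025, 18020),
    "Swap": (262144, 171028, 91116),
}

def human_size(num_bytes):
    """Format a byte count with a binary unit suffix, roughly like df -h."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G"):
        # Stop at the first unit that keeps the number below 1024,
        # or at G, the largest unit this sketch knows about.
        if size < 1024 or unit == "G":
            return f"{size:.1f}{unit}"
        size /= 1024

for label, row in FREE_P.items():
    print(label, *(human_size(pages * PAGE_SIZE) for pages in row))
```

With those inputs the swap line reads as 1.0G total with 668.1M used, which I find easier to take in than 262144 and 171028.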

One thing which can be achieved with the available information, and which vm_stat doesn't report (neither does free) but top(1) does, and which is incredibly useful, is the differential with respect to time. Go on, fire up top. Where there are the pagein and pageout figures, the little number in brackets tells you the increase since the last report. Now a graph of that over time would tell you what effect the system utilisation was having on the memory utilisation. That'd be sweet.
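The differencing itself is trivial, because the counters are cumulative: subtract one sample from the next. A minimal Python sketch follows; the first sample uses the pagein and pageout figures from the output in this post, while the second sample is hypothetical and only there to make the example run.

```python
CUMULATIVE = ("Pageins", "Pageouts")

def deltas(previous, current, keys=CUMULATIVE):
    """Change in each cumulative counter between two samples."""
    return {key: current[key] - previous[key] for key in keys}

earlier = {"Pageins": 474657, "Pageouts": 77330}  # from the sample output
later = {"Pageins": 474702, "Pageouts": 77341}    # hypothetical later sample

print(deltas(earlier, later))
```

Collect one of those dictionaries per interval and you've got the data for the graph.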

The final line of vm_stat's output is the object cache hit rate, which is the proportion of object lookups which found the object already cached in memory at the time of the request. Obviously, the higher the ratio, the fewer reactivations and pageins. On a bunch of different Macs I have access to, the hit rate is between 2% and 30%.
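The percentage on that line is just hits over lookups. Here's a short Python sketch of the arithmetic using the figures from the sample output; rounding down to a whole number is my assumption about how the displayed figure is produced, and the zero-lookups guard is an addition of mine.

```python
def hit_rate_percent(hits, lookups):
    """Whole-number hit rate, rounded down; 0 when there were no lookups."""
    if lookups == 0:
        return 0
    return 100 * hits // lookups

print(hit_rate_percent(478244, 1592868))
```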

BTW, would you like some references? Here's the OSF Mach documentation (the Principles manual contains a good chapter on virtual memory management), Apple's Kernel Programming guide, and the paper describing Mach's virtual memory management by the guys at CMU.

Edit 2006-06-18 09:07 GMT: stopped claiming that free doesn't know about swap space. It does.

2 comments:

Nigel Kersten said...

Awesome. Now I've got somewhere to point people rather than going "Heh :)" when they ask how much free RAM they have... :)

Charles Wiles said...

hi Graham, nice little article. About differential reporting, though, you can't have read the vm_stat(1) man page carefully enough! Right at the top you'll find this:

vm_stat displays Mach virtual memory statistics. If the optional interval is specified, then vm_stat will display the statistics every interval seconds. In this case, each line of output displays the change in each statistic [...]

eg. try: vm_stat 2

Nigel: when people ask about free RAM, why not suggest to non-techies that they look at /Applications/Utilities/Activity Monitor.app - System Memory ? As for techies, they should know about top... :-)