Writings on various topics (mostly technical) from Oliver Hookins and Angela Collins. We have lived in Berlin since 2009, have two kids, and have far too little time to really justify having a blog.
While I was still working for Anchor Systems, we had a client who was launching a fairly large website, and as part of the gradual ramp-up to delivery we needed to do some capacity tuning of the web/application servers. The application stack was basically Perl via mod_perl on Apache (not threaded), so we had to measure the memory footprint of the application and work out how many client processes we could support on each server (divide your available physical RAM by the size of one process).
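To make the arithmetic concrete, here is a minimal sketch of that division. The function name, the idea of holding back some RAM for the OS and other daemons (`reserved_mb`), and the figures in the usage note are all assumptions for illustration, not numbers from the actual client setup.

```python
def max_workers(total_ram_mb, reserved_mb, per_process_mb):
    """Estimate how many worker processes fit in RAM.

    total_ram_mb:   physical RAM on the server
    reserved_mb:    RAM held back for the OS and other daemons (an assumption,
                    the simple rule of thumb just uses all physical RAM)
    per_process_mb: measured footprint of a single worker process
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    # Integer division: a fraction of a process is no use to anyone.
    return max(usable_mb // per_process_mb, 0)
```

For example, `max_workers(8192, 1024, 60)` gives 119 with these made-up figures. The result is only as good as the per-process figure you feed in, which is where things get awkward.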
Unfortunately for the system administrators in question, this is a little more difficult than expected due to Linux's memory sharing smarts. There used to be no easy way to determine the split between shared and private RSS (Resident Set Size) of a process, making it virtually impossible to say how much of the memory allocated to a process was really unique to it and therefore needed to be counted in the calculation. A similar issue existed for determining the number of mapped pages. At the time, we chose the safest option and considered the entire allocation to be private, thus using slightly more hardware resources but guaranteeing never to cause performance degradation by overcommitting memory.
Kernel versions >= 2.6.25 provide the /proc/$PID/pagemap interface, which allows you to examine the page tables of a process. The format of the data is documented in the Linux Cross Reference; if you don't already have it bookmarked, do it now! There is also a writeup of the interface and how it can be used on LWN.net, which is another bookmark-worthy resource with many very technical articles.
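As a rough illustration of what reading that interface looks like, here is a minimal sketch in Python. It assumes the layout described in the kernel documentation: one 64-bit entry per virtual page, with bit 63 meaning "page present", bit 62 meaning "page swapped" and bits 0-54 holding the page frame number when the page is present. The function names are my own, and newer kernels zero out the frame number unless you read the file as root, so treat this as a sketch rather than a finished tool.

```python
import os
import struct

PAGEMAP_ENTRY_BYTES = 8          # one 64-bit entry per virtual page
PFN_MASK = (1 << 55) - 1         # bits 0-54: page frame number (if present)


def decode_entry(raw):
    """Decode one raw 8-byte pagemap entry into a small dict."""
    (value,) = struct.unpack("=Q", raw)      # native byte order, 64 bits
    present = bool((value >> 63) & 1)        # bit 63: page is in RAM
    swapped = bool((value >> 62) & 1)        # bit 62: page is in swap
    pfn = (value & PFN_MASK) if present else None
    return {"present": present, "swapped": swapped, "pfn": pfn}


def read_entry(pid, vaddr):
    """Look up the pagemap entry covering virtual address vaddr of pid."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    offset = (vaddr // page_size) * PAGEMAP_ENTRY_BYTES
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek(offset)
        raw = f.read(PAGEMAP_ENTRY_BYTES)
    if len(raw) != PAGEMAP_ENTRY_BYTES:
        raise ValueError("address is outside the readable range of pagemap")
    return decode_entry(raw)
```

You would normally take the address ranges from /proc/$PID/maps and call `read_entry` once per page in each range.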
It appears someone has also written a userspace tool to pull information out of this interface, at http://selenic.com/repo/pagemap/
It is also possible to get human-readable information directly from /proc/$PID/smaps, which breaks memory usage down per mapping (loaded libraries, the stack and so on). It is quite verbose, but certainly useful in some situations.