This is a fairly complete rewrite of mergemem by Marnix Coppens
<maco@telindus.be>. 
To avoid confusion I changed the name of the executable to mergemem2. 
Here is some text from emails describing mergemem2:

********************************

I've just gone through release 0.06 and it's looking good.
If you don't mind, I'd like to make two remarks/requests:

(1) Additional user-space feature
---------------------------------
Right now, mergemem tries to find common pages between processes with
the same name or some user-supplied pids. A nice feature would be to
share pages based on the common use of shared libraries. For instance,
on my (very lightly loaded) system, I count at least 10 processes that have
libc.so mapped in the same virtual location. Even though these processes are
totally unrelated, some of their memory can still be shared.

Generally speaking, right after the .text and .rodata section of the shared
lib, the bss section follows. Even though the bss is not associated with an
inode (it's anonymous), you can tell what inode it really "belongs" to by
looking at its VMA, which is contiguous with the .text and .rodata sections.
(Note: this is probably not true for every shared lib...).

An extension to the --all option would be some kind of super --all option,
that will analyse the memmap of all processes, make a list of all shared
libraries and their bss section (which may not always be at the same virtual
location, another option to restrict this ?), and build a PidList for
every shared library and proceed as usual. This is certainly non-trivial
and I'm willing to work on it myself (don't you guys have exams or so ?).
Right now, I do this manually and this gives quite a reduction.

There is at least one process that will need extra attention or things
could go horribly wrong, namely the X server. This is quite a little bastard
because it mmap()s several parts of the display memory and the anonymous
mappings that follow those of /dev/mem have nothing to do with bss or such.

=-> A simpler (safer and more feasible) way to go about this would be to let
the user provide a list of library names to let mergemem do its work for,
something like:

mergemem -s /lib/libc.so -s /usr/lib/libm.so

With this -s or --shared option, mergemem would then read and follow
the symlinks to come up with the final device:inode numbers (or just the
names themselves for 2.1.xx) and start processing /proc/xxx/maps accordingly
for all processes using one of those libs.
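
As a rough illustration of the symlink-following step, here is a minimal
sketch in C. The names lib_id and resolve_lib are made up for this example
and are not taken from the mergemem sources. It relies on stat(), which
follows symlinks, to obtain the device and inode of the final file. Matching
the result against /proc/xxx/maps would additionally require splitting the
device into its major:minor parts, which is not shown here.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical type: identifies a library by device and inode. */
struct lib_id {
    dev_t dev;
    ino_t ino;
};

/*
 * Hypothetical helper: resolve a library path such as /lib/libc.so
 * to its device:inode pair. stat() follows symlinks, so the pair
 * describes the real file at the end of the symlink chain.
 * Returns 0 on success, -1 if the path cannot be resolved.
 */
int resolve_lib(const char *path, struct lib_id *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    out->dev = st.st_dev;
    out->ino = st.st_ino;
    return 0;
}
```

Each path given with -s could be passed through such a helper once at
startup, so that the per-process work only compares numbers.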

Also keep processes like xinit in mind: it's got most of its libraries mapped
twice! (try "ldd /usr/bin/X11/xinit").

What do you think about this idea?

********************************

Thirdly, for every pid in the merge list, /proc/pid/maps is read and
parsed. This is done rather differently from the old way. There was
actually a bug (or a shortcoming) in the way it used to be done,
because you only merged memory regions that were overlapping, even if
they had nothing to do with each other. For instance, the anon map
created by libc.so can sometimes lie in totally different regions
and mergemem would not merge them; actually it could even try to merge
an anon map belonging to libc.so with one belonging to libXt.so
for instance. Not that this is wrong, it just won't do much.

So what I've done is to remember the inode that precedes every anon map.
In most (all?) cases, there is at most one anon map created for every
exe or shlib (stack pages are also recognised). This also allows me to
filter the anon maps by shared library if desired (this is for the -s
or --shlib option I've added).
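
The idea of remembering the inode that precedes every anon map can be
sketched as follows. This is only an illustration, not the actual mergemem2
code: the struct map_entry, the function parse_maps_line and the field
owner_inode are names invented here. The line format is assumed to be
"start-end perms offset major:minor inode", with inode 0 marking an
anonymous map. Stack detection is left out.

```c
#include <stdio.h>

/* Hypothetical record for one line of /proc/pid/maps. */
struct map_entry {
    unsigned long start, end;       /* virtual address range */
    unsigned int dev_major, dev_minor;
    unsigned long inode;            /* 0 for an anonymous map */
    unsigned long owner_inode;      /* inode of the last file mapping seen */
};

/*
 * Parse one maps line. *last_inode carries the most recent non-zero
 * inode from line to line, so an anon map inherits the inode of the
 * file mapping that precedes it.
 * Returns 1 on success, 0 if the line does not match the assumed format.
 */
int parse_maps_line(const char *line, unsigned long *last_inode,
                    struct map_entry *e)
{
    char perms[5];
    unsigned long offset;

    if (sscanf(line, "%lx-%lx %4s %lx %x:%x %lu",
               &e->start, &e->end, perms, &offset,
               &e->dev_major, &e->dev_minor, &e->inode) != 7)
        return 0;

    if (e->inode != 0)
        *last_inode = e->inode;     /* a file mapping: remember it */
    e->owner_inode = *last_inode;   /* anon maps inherit the owner */
    return 1;
}
```

Filtering by shared library then amounts to keeping only the anon maps whose
owner_inode matches one of the requested libraries.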

Fourth, once the map has been read for every pid, they are then merged
pairwise. The decision of when to merge anon maps of two processes is
now very simple: their inodes and their size have to match, but their
offset in virtual memory doesn't even matter.
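
The merge decision described above fits in a few lines of C. This is a
sketch under assumptions: struct anon_map and should_merge are hypothetical
names, and owner_inode stands for the inode of the file mapping that
precedes the anon map.

```c
/* Hypothetical description of one anon map of a process. */
struct anon_map {
    unsigned long start, end;       /* virtual address range */
    unsigned long owner_inode;      /* inode of the preceding file mapping */
};

/*
 * Two anon maps are merge candidates when they belong to the same
 * file mapping and have the same size. Their position in virtual
 * memory is deliberately ignored.
 */
int should_merge(const struct anon_map *a, const struct anon_map *b)
{
    return a->owner_inode == b->owner_inode &&
           (a->end - a->start) == (b->end - b->start);
}
```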

To reiterate: only anon maps that belong to the same file mapping are
candidates for being merged, because the vmstart offset is certainly not
a good indicator for deciding which maps to merge. The number of attempted
merges is still kept minimal.
Stack pages are now merged as well, where that is meaningful.
Come to think of it, I should add a new option not to merge them.

Fifth, statistics are gathered along the way, just like before (they're
even the same).

********************************

As promised, I've done some work:

1) drastic reduction of the number of ioctl()s. Every page is now
checksummed at most once. I should have done this right from the
beginning (this is what you get from late-evening work :-|).

2) smeared it out over three (3!) files:
  mergemem.c: contains main(), the daemon stuff (nearly there!) and
              the actual mergemem facilities.
  merge_parse.c: everything related to parsing command line options and
                 configuration files (not yet, very very soon)
  merge_utils.c: various stuff used by everything. I've also added some
                 linked list helper routines without pushing it too far.
Everything's cleaner now. You will have to change your Makefile for this
(not included). Another nicety is when you compile with DEBUG, you'll
get some malloc/free statistics (no memory leaks so far).

3) daemon mode now works with all the different time intervals. The
configuration file is *still* not being read, because that will require
some further changes to merge_parse.c but everything's set now.

4) Ah yes: the loglevel is now a bitmask! To be pedantic: if you specify
--loglevel 13, you will get all messages pertaining to levels 1, 4 and 8.
1 -> user messages + final statistics
2 -> walk_proc_dir (removing/skipping pid)
4 -> read_anon_maps (reading map file, more statistics)
8 -> gory details (real lowlife^H^H^Hevel stuff)
or something like this.

5) some cleanups in creating the mergelist and other places.
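
To make the loglevel bitmask from item 4 concrete, here is a small sketch in
C. The constant names, the variable mm_loglevel and the functions
log_enabled and log_msg are invented for this example. Only the bit values
1, 2, 4 and 8 come from the description above.

```c
#include <stdarg.h>
#include <stdio.h>

/* Bit values as listed in item 4; the names are hypothetical. */
enum {
    MM_LOG_USER = 1,    /* user messages + final statistics */
    MM_LOG_PROC = 2,    /* walk_proc_dir */
    MM_LOG_MAPS = 4,    /* read_anon_maps */
    MM_LOG_GORY = 8     /* gory details */
};

/* Would be set from --loglevel; defaults to user messages only. */
static unsigned int mm_loglevel = MM_LOG_USER;

/* Non-zero if any of the bits in mask is enabled. */
int log_enabled(unsigned int mask)
{
    return (mm_loglevel & mask) != 0;
}

/* Print to stderr if the category is enabled; returns 1 if printed. */
int log_msg(unsigned int mask, const char *fmt, ...)
{
    va_list ap;

    if (!log_enabled(mask))
        return 0;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    return 1;
}
```

With mm_loglevel set to 13 (1 + 4 + 8), messages for levels 1, 4 and 8 pass
the check while those for level 2 are suppressed.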

There is still a problem for 2.1.xx kernels because they show the full
library's name instead of the device:inode numbers in the /proc/xxx/maps
output. This still requires an lstat() to produce the dev:ino pair.
That's four system calls per process!
(Perhaps we should add another ioctl type to return a list of
memory ranges for a given pid, you already suggested something like this)

Read and enjoy.

