MMLIB(3) Linux Programmer's Manual MMLIB(3)
NAME
m_getpageinfos, m_hashpages, m_merge, m_set_hash_default -
identify and merge virtual memory pages with identical contents
SYNOPSIS
#include <mmlib.h>
ssize_t m_getpageinfos( pid_t pid, const void * start,
const void * end, MPAGEINFO * pageinfos, size_t nel, const
void * * contp );
ssize_t m_hashpages( pid_t pid, MPAGEINFO * pageinfos,
size_t nel, unsigned long (*hashfunction) (const void * addr,
size_t size) );
ssize_t m_merge( MEQUALPAIR * pagepairs, size_t nel );
For fine tuning and testing:
ssize_t m_set_hash_default( unsigned long (*hashfunction)
(const void * addr, size_t size) );
unsigned long m_hash_addrothalf(const void *, size_t);
unsigned long m_hash_const(const void *, size_t);
DESCRIPTION
mmlib lets a program tell the virtual memory system which
pages have identical contents, thereby reducing physical memory
usage. Various strategies for detecting and sharing pages
can be implemented safely with mmlib in user space. Using
mmlib cannot affect the integrity of the memory system.
For example, attempts to merge pages of different contents
will fail.
Programs using mmlib will typically proceed as follows.
Information about each page (virtual address, physical
address, and reference count) is retrieved with
m_getpageinfos. To find pages with the same content,
m_hashpages is provided. Finally, potential candidates are
merged with m_merge.
You can fine-tune m_hashpages with your own hash function,
either by passing it to each call or globally with
m_set_hash_default. Please contribute any function that is
faster and more discriminating than the current one!
Information about pages is represented with the structure
MPAGEINFO.
Linux 2.0 12 January 1999 1
typedef struct
{
const void * v_addr; /* virtual address */
const void * p_addr; /* physical address */
size_t count; /* reference count */
unsigned long hash; /* hash */
} MPAGEINFO;
m_getpageinfos writes into the array pageinfos information
about the pages of process pid that make up the region
from start to end. The array pageinfos must provide room
for at least nel elements. At most nel entries are written
into pageinfos, in ascending order of v_addr. Swapped
pages and those that cannot be
merged will not be considered. For every page the fields
v_addr (virtual address), p_addr (physical address), and
count (reference count) are set. The field hash is
ignored and left unchanged.
The argument contp is either NULL or the address of a
valid location.
On success, m_getpageinfos returns the number of entries
written, which may be smaller than nel if the region holds
fewer entries. In this case *contp is set to NULL. If more
than nel entries are found, *contp is set to the address of
the first page that has not yet been examined. On
error, -1 is returned.
EXAMPLE m_getpageinfos(pid, (void *)0, (void *)-1, pageinfos,
nel, &cont) retrieves information about the first nel pages
in process pid.
m_getpageinfos(pid, (void *)-1, (void *)-1, pageinfos, 1,
&cont) gets information about the last page in memory (if
present) and sets cont to NULL.
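The continuation pointer can be used to walk a large region in fixed-size batches. The following is a minimal sketch, not part of mmlib: scan_region is a hypothetical helper, the MPAGEINFO typedef is redeclared only so the fragment compiles without <mmlib.h>, and fake_getpageinfos is a stand-in that imitates the documented behaviour for five pretend pages. With the library installed, m_getpageinfos itself would be passed as the first argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Mirrors the MPAGEINFO layout documented above (assumption: with mmlib
 * installed you would include <mmlib.h> instead). */
typedef struct {
    const void *v_addr;
    const void *p_addr;
    size_t count;
    unsigned long hash;
} MPAGEINFO;

/* Same signature as m_getpageinfos, so the real call or a test double
 * can be supplied. */
typedef ssize_t (*getpageinfos_fn)(pid_t, const void *, const void *,
                                   MPAGEINFO *, size_t, const void **);

/* Hypothetical helper: visit [start, end] in batches of at most nel
 * entries. Returns the total number of entries seen, or -1 on error. */
static ssize_t scan_region(getpageinfos_fn get, pid_t pid,
                           const void *start, const void *end,
                           MPAGEINFO *buf, size_t nel)
{
    ssize_t total = 0;
    const void *cursor = start;

    assert(get != NULL && buf != NULL && nel > 0);
    for (;;) {
        const void *next = NULL;
        ssize_t n = get(pid, cursor, end, buf, nel, &next);
        if (n < 0)
            return -1;
        /* buf[0] .. buf[n-1] are valid here; use them before the next
         * batch overwrites the buffer. */
        total += n;
        if (next == NULL || n == 0)   /* region exhausted or no progress */
            break;
        cursor = next;
    }
    return total;
}

/* Stand-in for m_getpageinfos, for demonstration only: pretends the
 * process has FAKE_PAGES pages at 0x1000, 0x2000, ... */
enum { FAKE_PAGES = 5 };

static ssize_t fake_getpageinfos(pid_t pid, const void *start,
                                 const void *end, MPAGEINFO *out,
                                 size_t nel, const void **contp)
{
    size_t written = 0;

    (void)pid;
    if (contp != NULL)
        *contp = NULL;
    for (size_t i = 1; i <= FAKE_PAGES; i++) {
        uintptr_t addr = (uintptr_t)i * 0x1000u;
        if (addr < (uintptr_t)start || addr > (uintptr_t)end)
            continue;
        if (written == nel) {          /* buffer full: report where to resume */
            if (contp != NULL)
                *contp = (const void *)addr;
            break;
        }
        out[written].v_addr = (const void *)addr;
        out[written].p_addr = (const void *)addr;
        out[written].count = 1;
        written++;
    }
    return (ssize_t)written;
}
```

Starting the loop unconditionally matters because (void *)0 is a legitimate start address, so the cursor cannot double as the termination flag.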
m_hashpages determines the hash values of pages in process
pid. For each of the nel elements in array pageinfos the
page containing the virtual address found in field v_addr
is examined and the fields hash, p_addr, and count are set
to their current values. If the page is not available,
count is set to zero and p_addr is set to NULL.
On success, m_hashpages returns the number of successfully
hashed pages, which may be smaller than nel, if some of
the pages are not available. On error, -1 is returned,
and errno is set appropriately.
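Because unavailable pages are marked with a zero count and a NULL p_addr, a caller may want to discard them before looking for merge candidates. The helper below is a hypothetical sketch of that step; drop_unavailable is not an mmlib function, and the typedef is repeated only so the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the MPAGEINFO layout documented above (assumption: normally
 * provided by <mmlib.h>). */
typedef struct {
    const void *v_addr;
    const void *p_addr;
    size_t count;
    unsigned long hash;
} MPAGEINFO;

/* Hypothetical helper: after m_hashpages has filled in the array, move
 * the entries whose page was available to the front, preserving their
 * order. Returns the number of entries kept. */
static size_t drop_unavailable(MPAGEINFO *infos, size_t nel)
{
    size_t kept = 0;

    assert(infos != NULL || nel == 0);
    for (size_t i = 0; i < nel; i++) {
        if (infos[i].count == 0 || infos[i].p_addr == NULL)
            continue;                 /* page was not available */
        infos[kept++] = infos[i];
    }
    return kept;
}
```

If the description above holds, the value returned here should equal the return value of the preceding m_hashpages call.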
The hash method is specified with hashfunction. If NULL,
the current system default is taken. Otherwise,
hashfunction must be a pointer to a function that returns
a hash value for the object of size bytes starting at
addr. The value that m_hashpages writes into the field
hash may differ from the corresponding return value of
hashfunction.
With m_set_hash_default the default hash function is
changed.
Predefined hash functions start with the prefix m_hash_.
Alternate versions with the prefix m_libhash_ allow performance
comparisons with user-defined functions, since they
are executed the same way as user-defined functions.
m_hash_addrothalf and m_libhash_addrothalf compute
a simple hash function based on the first half of a
page.
m_hash_const and m_libhash_const are constant functions,
useful for measuring call overhead.
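A user-defined hash function only has to match the signature shown in the SYNOPSIS. As an illustration, the sketch below implements 32-bit FNV-1a over the whole object; the name my_hash_fnv1a is hypothetical and this is not one of the predefined mmlib functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-defined hash: 32-bit FNV-1a over all size bytes.
 * Reads the object bytewise, so it needs no particular alignment. */
static unsigned long my_hash_fnv1a(const void *addr, size_t size)
{
    const unsigned char *p = addr;
    uint32_t h = 2166136261u;         /* FNV-1a 32-bit offset basis */

    assert(addr != NULL || size == 0);
    for (size_t i = 0; i < size; i++) {
        h ^= p[i];
        h *= 16777619u;               /* FNV 32-bit prime */
    }
    return (unsigned long)h;
}
```

Such a function would be passed as the hashfunction argument of m_hashpages, or installed once with m_set_hash_default. Hashing every byte is thorough but slow; a faster function may examine less of the page.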
m_merge indicates to the virtual memory system that the
first nel entries in pagepairs should be merged by replacing
each page (spid, sv_addr) with the page (dpid, dv_addr). If
possible, the physical page of (spid, sv_addr) will be
freed. The result of each attempted merge is returned in
that entry's status field, which is zero if the pages have
been merged. On success, m_merge returns the number of merged
pages. On error, -1 is returned and errno is set appropriately.
typedef struct
{
pid_t spid; /* source pid */
const void * sv_addr; /* source virtual address */
pid_t dpid; /* destination pid */
const void * dv_addr; /* destination virtual address */
int status; /* return code */
} MEQUALPAIR;
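One simple way to produce the pagepairs array is to sort the hashed MPAGEINFO entries and pair pages whose hashes collide. The sketch below shows that idea for pages of a single process; build_pairs and count_merged are hypothetical helpers, the typedefs are repeated so the fragment compiles without <mmlib.h>, and the initial status value of -1 is an arbitrary placeholder chosen here, not something mmlib defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>

/* Mirror the structures documented in this page (assumption: normally
 * provided by <mmlib.h>). */
typedef struct {
    const void *v_addr;
    const void *p_addr;
    size_t count;
    unsigned long hash;
} MPAGEINFO;

typedef struct {
    pid_t spid;
    const void *sv_addr;
    pid_t dpid;
    const void *dv_addr;
    int status;
} MEQUALPAIR;

static int by_hash(const void *a, const void *b)
{
    unsigned long ha = ((const MPAGEINFO *)a)->hash;
    unsigned long hb = ((const MPAGEINFO *)b)->hash;
    return (ha > hb) - (ha < hb);
}

/* Hypothetical helper: sort infos by hash, then pair every entry with
 * the first entry of its run of equal hashes. Entries that already
 * share a physical page are skipped. Returns the number of pairs
 * written, at most max_pairs. */
static size_t build_pairs(MPAGEINFO *infos, size_t nel, pid_t pid,
                          MEQUALPAIR *pairs, size_t max_pairs)
{
    size_t npairs = 0, head = 0;

    assert(infos != NULL && pairs != NULL);
    qsort(infos, nel, sizeof infos[0], by_hash);
    for (size_t i = 1; i < nel && npairs < max_pairs; i++) {
        if (infos[i].hash != infos[head].hash) {
            head = i;                 /* a new run of equal hashes starts */
            continue;
        }
        if (infos[i].p_addr == infos[head].p_addr)
            continue;                 /* already the same physical page */
        pairs[npairs].spid = pid;
        pairs[npairs].sv_addr = infos[i].v_addr;
        pairs[npairs].dpid = pid;
        pairs[npairs].dv_addr = infos[head].v_addr;
        pairs[npairs].status = -1;    /* placeholder until m_merge runs */
        npairs++;
    }
    return npairs;
}

/* Hypothetical helper: count the entries m_merge reported as merged. */
static size_t count_merged(const MEQUALPAIR *pairs, size_t nel)
{
    size_t merged = 0;

    for (size_t i = 0; i < nel; i++)
        if (pairs[i].status == 0)
            merged++;
    return merged;
}
```

An equal hash is only a hint. As stated in the DESCRIPTION, an attempt to merge pages of different contents fails, so false candidates cost time but not correctness.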
ERRORS
ESRCH The process whose ID is pid could not be found.
EPERM The calling process does not have appropriate
privileges. The effective userid of the calling
process must equal the effective userid of
pid, or the caller must be the superuser.
EFAULT The pointer pageinfos, hashfunction, or contp points
outside your accessible address space.
EINVAL One or more arguments are invalid.
ENOPKG The kernel module mergemod has not been installed.
EPROTO The version of the current mmlib uses a different
protocol than the kernel module mergemod. A dif-
ferent library or kernel module must be installed.
EOVERFLOW
Internal error, library/kernel probably corrupt.
BUGS
Currently, only root can use this library. Other users
will receive an EPERM error. Not all cases of EFAULT are
detected. All calls could be faster. The operating system
may currently overcommit memory due to m_merge.
CAVEATS
The information provided by m_getpageinfos and m_hashpages
may not be accurate due to system operation during or
after the calls are performed.
HOW TO CONTRIBUTE
mmlib has been designed to enable people on many levels of
expertise to contribute. You do not need to be a kernel
hacker to contribute.
1. Better hash functions. The ideal hash function
should consider only a few lines of a memory page
and still be more discriminating than the current one.
2. Better merging strategies. Currently only two are
available: mergemem(8) and mergeall(8).
3. Improved security schemes. See section SECURITY
for potential current problems.
4. Make memory redundancy in processes sharable.
Quite often, pages cannot be merged just because of
a small memory offset caused by superfluous details
like command lines of different length. You may
improve sharing large areas using valloc(3) rather
than malloc(3). Maybe malloc's library can be
adapted for larger areas.
5. Improve the kernel module mergemod, if you are a
devoted kernel hacker. In particular, more efficient
support for SMP is needed.
PORTABILITY GUIDE
Many system dependent aspects are hidden by the mmlib
library. Please consider the following to avoid unneces-
sary system dependence in your code.
1. Page size. The actual system's page size is not
needed. Do not assume a particular size and do not
call getpagesize(2). The hashfunction passed to
m_hashpages receives the size of each page as a
separate argument at call time.
2. Alignment. Do not assume that pages are page
aligned. For example, the address handed over to
hashfunction is only word aligned.
3. Do not assume that all pages are of the same size.
Some memory systems allow different page sizes.
Therefore, getpagesize(2) should be avoided. mmlib
is link-compatible even in such situations.
4. Cache architecture. The provided interface should
perform well on all architectures (virtually and
physically mapped caches). For example on virtu-
ally mapped caches, pages with same content but
incompatible virtual memory location receive dif-
ferent hash values.
5. Do not assume that the field hash carries the same
value as returned by hashfunction. For the reasons
mentioned in item 4, a different value may be present.
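A hash function that follows items 1 to 3 takes its extent from the size argument and makes no alignment assumption of its own. The sketch below is one hypothetical way to do that while reading only a sample of each page; my_hash_sampled and the STRIDE value of 256 bytes are illustrative choices, not part of mmlib.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Distance in bytes between sampled words (arbitrary illustrative value). */
enum { STRIDE = 256 };

/* Hypothetical hash: mixes one word out of every STRIDE bytes. The
 * extent comes only from size, and memcpy is used for the loads so the
 * code does not depend on how addr is aligned. */
static unsigned long my_hash_sampled(const void *addr, size_t size)
{
    const unsigned char *p = addr;
    const unsigned bits = (unsigned)(sizeof(unsigned long) * CHAR_BIT);
    unsigned long h = (unsigned long)size;   /* fold the size in as well */

    assert(addr != NULL || size == 0);
    for (size_t off = 0; off + sizeof(unsigned long) <= size; off += STRIDE) {
        unsigned long w;
        memcpy(&w, p + off, sizeof w);
        h = ((h << 5) | (h >> (bits - 5))) ^ w;   /* rotate left 5, mix */
    }
    return h;
}
```

Bytes between the sampled words do not influence the result, so this trades more false candidates for less memory traffic. Because of item 5, the value later found in the field hash should still not be compared against a direct call to this function.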
SECURITY
Calls that refer to processes of different effective
userids can only be performed by the superuser for secu-
rity reasons. Otherwise the content of a page might be
revealed to unauthorized processes.
In a hostile environment, only selected processes whose
purpose and operation are well known should be merged.
Arbitrary processes of untrusted users should not be merged.
Sharing pages of unrelated user processes might provide an
indirect hint about the existence of other users' pages with
the same content. Statistical information about sharing and
memory usage might be exploited by unauthorized processes
to this end. Even without such information the following
scenario is possible.
A hostile process tries to guess the content of a
confidential page by creating a set of arbitrary
pages containing some guesses. An authorized pro-
cess merges one of those pages with the confiden-
tial page. The sharing of the two pages is not
directly visible to the hostile process. But modi-
fying a shared page takes much longer, because it
causes a copy-on-write page fault.
VERSION
Last modified 1999-01-18.
SEE ALSO
mergemem(1), mergeall(1)