

Improvements

The following sections describe improvements that will be made to the News Cache to increase its performance. These changes add no functionality that is visible to the users of the News Cache.

News Database

  In future releases, we will provide the full class hierarchy for the active and overview databases. The current release provides only classes that store the databases in non-volatile storage (usually a file on the hard disk). However, since the RServer class does not need a persistent news database, it makes sense to keep its databases in volatile memory. This will improve the performance of the RServer class.
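The split between volatile and non-volatile database classes can be illustrated with a minimal sketch. It is written in Python purely for illustration and does not reflect the actual News Cache classes: the names ActiveDB, VolatileActiveDB and NonVolatileActiveDB, the (first, last) record layout, and the JSON file format are all assumptions made for this example.

```python
from abc import ABC, abstractmethod
import json
import os


class ActiveDB(ABC):
    """Hypothetical interface: newsgroup name -> (first, last) article numbers."""

    @abstractmethod
    def set(self, group, first, last):
        ...

    @abstractmethod
    def get(self, group):
        ...


class VolatileActiveDB(ActiveDB):
    """Keeps all entries in a dict; the contents are lost when the process exits."""

    def __init__(self):
        self._groups = {}

    def set(self, group, first, last):
        self._groups[group] = (first, last)

    def get(self, group):
        # Returns None for an unknown newsgroup.
        return self._groups.get(group)


class NonVolatileActiveDB(VolatileActiveDB):
    """Same interface, but every update is written through to a JSON file."""

    def __init__(self, path):
        super().__init__()
        self._path = path
        if os.path.exists(path):
            with open(path) as fh:
                # JSON stores the pairs as lists; convert them back to tuples.
                self._groups = {g: tuple(v) for g, v in json.load(fh).items()}

    def set(self, group, first, last):
        super().set(group, first, last)
        with open(self._path, "w") as fh:
            json.dump(self._groups, fh)
```

In this sketch a caller that does not need persistence would simply instantiate the volatile class and avoid a file write on every update, which is the performance gain the paragraph above refers to.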

To reduce the requirements placed on the filesystem, the format of the article database will change in a future version. We are currently considering extending the Non Volatile Classes Library to manage articles efficiently. However, this part of the database has not yet been designed.

Caching Granularity

The caching granularity of the News Cache may be based either on articles or on newsgroups. Article-based caching reduces the required disk space and network bandwidth, because only articles that are requested by a client are cached. Newsgroup-based caching, on the other hand, reduces the number of network connections made to the news server and the load placed on it. Additionally, newsgroup-based caching reduces the total transmission time, because the articles are requested in fewer, but bigger chunks. This improves the response time for successive requests to such newsgroups.

For small newsgroups, caching at the granularity of whole newsgroups seems to be better, because they require very little disk space and all data are requested in one chunk. However, for large newsgroups, especially newsgroups with large articles such as those carrying pictures or programs, an article-based caching mechanism is better.

In one of the next releases we want to make the granularity an option in the user configuration. By default, newsgroups will be cached article by article. Newsgroups specified by the user will be cached as whole newsgroups.
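The planned option could be resolved per newsgroup roughly as follows. This is only a sketch under assumptions: the list WHOLE_GROUP_PATTERNS, the function caching_granularity and the use of shell-style wildcards are hypothetical and say nothing about the real configuration syntax.

```python
from fnmatch import fnmatchcase

# Hypothetical user configuration: newsgroups matching one of these
# patterns are cached as a whole; everything else is cached per article.
WHOLE_GROUP_PATTERNS = ["comp.lang.*", "news.announce.newusers"]


def caching_granularity(newsgroup, whole_group_patterns=WHOLE_GROUP_PATTERNS):
    """Return "newsgroup" for groups the user listed, otherwise "article"."""
    for pattern in whole_group_patterns:
        if fnmatchcase(newsgroup, pattern):
            return "newsgroup"
    return "article"  # default strategy
```

With this configuration a small discussion group such as comp.lang.c++ would be fetched as a whole, while a binary group that the user did not list would fall back to article-based caching.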

Expiration of News Articles

  The News Cache is allowed to allocate a fixed amount of disk space for its news database. Once the News Cache has allocated the maximum available space, older or less frequently requested articles have to be discarded. The same problem exists for other caches as well, and many publications deal with this topic ([BDH+94], [Sta92], etc.).

We have not yet analyzed this problem in detail. The current release uses an external program that removes the least recently used articles first. The advantages and disadvantages of different expiration strategies are:

Least Recently Used
Expires the articles that have not been accessed for the longest period first.
Least Frequently Used
Removes the least frequently used articles first. However, this strategy may discard newer articles that have been requested only a few times so far in favor of older articles that were used frequently in the past.
Oldest Article First
Removes the oldest articles first, because they will expire before the other articles expire.
Biggest Article First
Expires the largest articles first. This eliminates the articles usually found in binary groups. Eliminating these articles costs the least time and frees the most disk space.
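The four strategies above differ only in the order in which articles are chosen for removal, which the following sketch makes explicit. It is not the external program used in the current release; the Article record, its field names and the expire function are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Article:
    msg_id: str
    size: int           # bytes on disk
    posted: float       # posting time, seconds since the epoch
    last_access: float  # last time a client requested the article
    hits: int           # number of requests so far


# Each strategy is a sort key: articles with the smallest key expire first.
STRATEGIES = {
    "lru": lambda a: a.last_access,   # Least Recently Used
    "lfu": lambda a: a.hits,          # Least Frequently Used
    "oldest": lambda a: a.posted,     # Oldest Article First
    "biggest": lambda a: -a.size,     # Biggest Article First
}


def expire(articles, limit, strategy):
    """Return the message IDs to discard so that the total size fits in limit."""
    total = sum(a.size for a in articles)
    victims = []
    for art in sorted(articles, key=STRATEGIES[strategy]):
        if total <= limit:
            break
        victims.append(art.msg_id)
        total -= art.size
    return victims
```

Because every strategy is just a key function, a combined strategy can be tried by adding another entry to the table, for example a key that weighs size against the number of hits.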

To solve this problem, statistics have to be collected for all of these strategies (including combinations of them). Based on these statistics we will decide on the final expiration behavior.


gschwind@infosys.tuwien.ac.at