
The Usenet News System

The ``Network News Transfer Protocol'' (NNTP) is standardized in RFC 977 [KL86]. Extensions to this protocol are described in [Bar97]. Appendix B gives a short overview of NNTP and of NNRP, an extension for news readers. The format of news messages is specified in RFC 1036 [HA87].

The Usenet News System currently consists of over 30000 newsgroups organized in a hierarchy, which is represented by a dot notation. For example, newsgroups starting with news. deal with the News System itself and newsgroups starting with comp.lang. deal with computer languages. Table 2.1 shows a small part of this hierarchy.


 
Table 2.1: Newsgroups and their hierarchy
Name                   Purpose
alt.                   The alternative hierarchy. Anybody may create a newsgroup within this hierarchy.
biz.                   Covers business-related topics.
comp.                  Deals with all aspects of computers.
news.                  Deals with the news system itself.
rec.                   Recreational hierarchy.
sci.                   Hierarchy for scientific newsgroups.
soc.                   Deals with society in general.
soc.culture.           Deals with the cultures of individual societies.
soc.culture.austrian   Deals with the culture of Austrian society.
talk.                  General talk.


This arrangement allows users to quickly identify and subscribe to the newsgroups they are interested in.
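The dot notation makes hierarchy membership a simple component-wise prefix test. The following minimal sketch illustrates this; the function name `in_hierarchy` and the short group list are chosen for illustration only and are not part of any news software.

```python
def in_hierarchy(group: str, prefix: str) -> bool:
    """Return True if `group` lies in the hierarchy named by `prefix`.

    A trailing dot on the prefix (as in "comp.lang.") is optional.
    Names are compared component by component, so "comp.language.x"
    is not mistaken for a member of "comp.lang.".
    """
    prefix_parts = prefix.rstrip(".").split(".")
    group_parts = group.split(".")
    return group_parts[:len(prefix_parts)] == prefix_parts


# Illustrative group list (an assumption, not a complete listing).
groups = [
    "comp.lang.c",
    "comp.lang.python",
    "comp.os.linux.misc",
    "news.announce.newusers",
    "soc.culture.austrian",
]

comp_lang = [g for g in groups if in_hierarchy(g, "comp.lang.")]
print(comp_lang)
```

Comparing whole name components rather than raw string prefixes is a deliberate choice here, since a plain string comparison would accept names that merely start with the same letters.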

How Newsgroups and Articles are Related

Each newsgroup holds news articles. A news article may be stored in one or more newsgroups. Figure 2.1(a) shows an example of the relationship between newsgroups and articles, while Figure 2.1(b) gives the corresponding enhanced entity-relationship (EER) diagram [EN94].


    
Figure 2.1: Relationship between newsgroups and articles

The following types of newsgroups exist.

Reading and Posting allowed
Everybody is allowed to read articles from or post articles to these newsgroups.
Moderated
These groups can be read by everybody. However, articles posted to such a group are first sent to the group's moderator, who decides whether the article's content suits the newsgroup's topic and should be posted.
Read-Only
Ordinary users can only read these groups. Posting articles to them requires some kind of authorization.

Articles

  An article is a piece of information submitted by a user to one or more newsgroups; an article submitted to several newsgroups is called a crossposting. Articles are also called postings, and submitting an article is frequently called ``posting an article''. Each article carries several identifiers.

Articles start with header lines, followed by a blank line and by the message body. Each header line consists of a keyword, a colon, a blank, and some additional information. The exact description of the format of an article can be found in RFC 1036 [HA87]. The header contains at least the following header lines.

From
The ``From'' line contains the electronic mail address of the person who sent the article.
Date
The ``Date'' line gives the date and time at which the article was originally posted to the network.
Newsgroups
The ``Newsgroups'' line specifies the newsgroup or newsgroups to which the article belongs.
Subject
The ``Subject'' line gives a short summary of the article's contents, so that a reader can decide from the subject alone whether to read the article.
Message-ID
The ``Message-ID'' line gives the article's unique identifier. To ensure uniqueness, a Message-ID may not be reused during the lifetime of the article.
Path
This line shows the network path the article took to reach the current system. When a system forwards the article, it should add its own name to the list of systems in the ``Path'' line.
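The header layout described above (keyword, colon, blank, value, then a blank line before the body) can be sketched as a small parser. This is a simplified illustration, not a complete RFC 1036 implementation: it ignores continuation lines, and the sample article with its `example.org` addresses is entirely hypothetical.

```python
REQUIRED = ("From", "Date", "Newsgroups", "Subject", "Message-ID", "Path")


def parse_article(text: str) -> tuple[dict[str, str], str]:
    """Split an article into a header dictionary and a body string."""
    # The first blank line separates the header from the body.
    head, _, body = text.partition("\n\n")
    headers: dict[str, str] = {}
    for line in head.split("\n"):
        # Each header line is "keyword: value".
        keyword, sep, value = line.partition(": ")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        headers[keyword] = value
    return headers, body


def missing_headers(headers: dict[str, str]) -> list[str]:
    """List the required header lines that are absent."""
    return [name for name in REQUIRED if name not in headers]


# Hypothetical sample article (all values are made up for illustration).
SAMPLE = (
    "Path: news.example.org!reader.example.org!alice\n"
    "From: alice@example.org\n"
    "Newsgroups: comp.lang.c,comp.lang.python\n"
    "Subject: A hypothetical test posting\n"
    "Message-ID: <1234@reader.example.org>\n"
    "Date: Fri, 19 Nov 1982 16:14:55 GMT\n"
    "\n"
    "This is the message body.\n"
)

headers, body = parse_article(SAMPLE)
newsgroups = headers["Newsgroups"].split(",")  # more than one entry: a crossposting
print(headers["Subject"], newsgroups, missing_headers(headers))
```

Splitting the ``Newsgroups'' value on commas yields the list of groups the article belongs to, which is how a crossposting can be recognized in this sketch.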


  
Figure 2.2: An example news article (header and body)

The sample article shown in Figure 2.2 has the subject News Cache Released and has been posted by the user gschwind from host w5.infosys.tuwien.ac.at.

News Servers

A client can retrieve only those newsgroups and articles that are stored on its news server. This implies that each news server has to maintain a database of its available newsgroups and articles. News servers are responsible for indexing and expiring their articles. The quality of this service depends on the news server's software and its configuration.

Newsgroups may be available on one or more news servers. If newsgroups are to be available on other news servers too, the newsgroups and their articles must be propagated to those news servers. On the Internet, newsgroups and articles are generally distributed using the ``Network News Transfer Protocol'' (NNTP).

The administrator of a news server decides which newsgroups can be retrieved by other news servers and which newsgroups will be requested from other news servers. By default, most newsgroups and articles are distributed globally. Only newsgroups of questionable value or of purely local interest are not distributed to all news servers.

Clients may read news using either NNTP or the ``Network News Reader Protocol'' (NNRP). The latter is selected with NNTP's mode reader command. NNRP provides only those NNTP commands that a client needs to retrieve and submit news articles. In addition, it provides commands to retrieve a summary of the available articles or to filter them by specific information. See appendix B, [KL86], and [Bar97] for a detailed discussion.
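As a rough illustration of how a client might switch to NNRP, the sketch below interprets the status lines of a scripted dialogue without opening a network connection. NNTP responses begin with a three-digit status code; in RFC 977 the greeting code 200 means posting is allowed and 201 means it is not, and the sketch assumes the reply to mode reader uses the same two codes. The host name and the response texts are hypothetical.

```python
def parse_status(line: str) -> tuple[int, str]:
    """Split a response line such as '200 ready' into (code, text)."""
    code, _, text = line.rstrip("\r\n").partition(" ")
    return int(code), text


def posting_allowed(line: str) -> bool:
    """Interpret a greeting or mode reader reply: 200 = may post, 201 = read-only."""
    code, _ = parse_status(line)
    if code == 200:
        return True
    if code == 201:
        return False
    raise ValueError(f"unexpected status code {code}")


# Hypothetical dialogue; "C" marks client lines, "S" marks server lines.
dialogue = [
    ("S", "200 news.example.org ready\r\n"),
    ("C", "mode reader\r\n"),
    ("S", "201 reader mode, posting not allowed\r\n"),
]

replies = [line for who, line in dialogue if who == "S"]
print([posting_allowed(reply) for reply in replies])
```

In this made-up exchange the server greets the client with posting permitted, but withdraws that permission once the client has switched to reader mode.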

Distribution Strategy for Articles and Newsgroups

Figure 2.3 shows a small sample news network and possible NNTP and NNRP connections.


  
Figure 2.3: A sample news network

If an article is posted to a news server, it has to be propagated to all other news servers that hold the corresponding newsgroup. Assume that Client 4.1 posts an article to News Server 4 in a newsgroup that is stored on all news servers.

Initially, the article arrives at News Server 4, which propagates it to News Servers 1 and 2. News Server 1 then propagates it to News Servers 3 and 2. The propagation to News Server 3 succeeds. News Server 2, however, refuses the article because it has already stored it. News Server 2 can detect this situation because each article contains a globally unique identifier (as shown in section 2.3.2).
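The walkthrough above can be replayed with a small simulation in which every server remembers the Message-IDs it has stored and refuses duplicates. This is only a sketch: the feed links are limited to those mentioned in the text (the complete topology of Figure 2.3 is an assumption), and the server names and the Message-ID are hypothetical.

```python
from collections import deque

# Feed links taken from the walkthrough; servers 2 and 3 are assumed
# to have no further outgoing feeds in this sketch.
PEERS = {
    "server4": ["server1", "server2"],
    "server1": ["server3", "server2"],
    "server2": [],
    "server3": [],
}


def flood(origin: str, message_id: str) -> list[tuple[str, str, str]]:
    """Propagate one article and log every offer as accepted or refused."""
    stored = {name: set() for name in PEERS}  # Message-IDs known per server
    log = []
    stored[origin].add(message_id)
    queue = deque([origin])
    while queue:
        sender = queue.popleft()
        for receiver in PEERS[sender]:
            if message_id in stored[receiver]:
                # The receiver already holds this Message-ID: refuse it.
                log.append((sender, receiver, "refused"))
            else:
                stored[receiver].add(message_id)
                log.append((sender, receiver, "accepted"))
                queue.append(receiver)  # the receiver forwards it in turn
    return log


for entry in flood("server4", "<1234@client41.example.org>"):
    print(entry)
```

Because a refused article is never forwarded again, the propagation terminates even if the feed links contain cycles.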

A newsgroup is created by sending a specially formatted article to a special newsgroup, which is usually called control. Such newsgroup creation requests are distributed in exactly the same way as ordinary articles.

News Readers

To access articles stored on a news server, the user needs a news reader program. The news reader retrieves the articles from the news server using NNRP. It acts as a front end to the news server's database and stays connected to the same news server during the user's session.

The news reader is responsible for displaying a list of available newsgroups, from which the user selects the newsgroup to read. Once a newsgroup has been selected, the news reader shows a list of the articles available in it, and the user selects the articles to read.

To recognize which articles the user has already read, most news readers must always connect to the same news server. This is necessary because many news readers rely on information that depends on the order in which articles arrive at the news server, and this order differs from one news server to another.
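The following sketch shows why read-state recorded against one server does not carry over to another. It assumes that the order-dependent information is a per-newsgroup article number assigned on arrival; the Message-IDs and the two arrival orders are hypothetical.

```python
def number_articles(arrival_order: list[str]) -> dict[int, str]:
    """Map local article numbers (1, 2, ...) to Message-IDs in arrival order."""
    return {n: mid for n, mid in enumerate(arrival_order, start=1)}


# The same three articles reach two servers in different orders.
server_a = number_articles(["<a@example.org>", "<b@example.org>", "<c@example.org>"])
server_b = number_articles(["<b@example.org>", "<c@example.org>", "<a@example.org>"])

# Article numbers the reader marked as read while connected to server A.
read_numbers = {1, 2}

read_on_a = {server_a[n] for n in read_numbers}
assumed_read_on_b = {server_b[n] for n in read_numbers}

print(sorted(read_on_a))          # what the user actually read
print(sorted(assumed_read_on_b))  # what the reader would assume on server B
```

On server B the same numbers refer to a different set of articles, so the reader would hide one article the user has never seen and present one that was already read. Tracking read articles by Message-ID instead would avoid this, at the cost of storing more state.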


gschwind@infosys.tuwien.ac.at