.\"	$NetBSD: net_design.nr,v 1.2 1998/01/09 06:34:48 perry Exp $
.\"
.NC "The Design of the ARGO Network Layer"
.sh 1 "Connectionless Network Layer"
.pp
The following sections describe the design of the ARGO
connectionless network layer (CLNL).
The connectionless network service is provided by several
network-layer protocols: ES-IS (ISO 9542),
CLNP (ISO 8473), and X.25 (ISO 8208).
The protocol CLNP is the primary connectionless network layer
protocol.
It is supported by X.25 when X.25 is used as a subnetwork layer.
X.25 can also be viewed as a link layer protocol in this context.
The ES-IS protocol supports CLNP by providing the following functions:
.ip \(bu 5
automatic mapping of NSAP addresses to SNPA addresses,
.ip \(bu 5
automatic configuration of networks of end systems and intermediate
systems,  and
.ip \(bu 5
redirection of network-layer traffic in response to 
configuration changes.
.pp
The rest of this chapter describes the design of 
CLNP, the design of  ES-IS,
and the design of the connection-oriented
network layer, including the connection-oriented subnetwork service (X.25).
.pp
CLNP has two subsets defined: the Inactive Network Layer 
protocol subset and the Non-Segmenting protocol subset. 
The Inactive Network Layer subset is a null-function subset 
in which the CLNP is not needed, and the 
protocol consists of sending
a 1-byte header containing the value zero. 
This "subset" is not supported in ARGO.
.pp
The Non-Segmenting protocol subset permits simplification of the DT NPDU
header when it is known that segmentation of the DT NPDU is not required.
ARGO supports this subset.
When this subset is used, 
the segmentation part of the DT NPDU (data packet) header is not present, 
and the \fIdon't segment\fR bit is set in the 
fixed part of the header. 
This subset is chosen by setting the bit
\fICLNP_NO_SEG\fR in the \fIflags\fR argument to \fIclnp_output()\fR.
.pp
Throughout the remainder of this
document,
the following definitions apply:
.(b
\(bu DT NPDU: data transfer NPDU.
\(bu ER NPDU: error report NPDU.
\(bu NPDU: either an ER or DT NPDU.
.)b
.sh 2 "DT NPDU Output"
.pp
A CLNP DT NPDU is transmitted by calling \fIclnp_output()\fR. 
.so figs/clnp_output.nr
.\" FIGURE
.CF
outlines the sequence of steps taken by \fIclnp_output()\fR when
transmitting an NPDU. 
The solid lines indicate normal flow of control. The
dashed lines indicate possible error returns (with associated
error code).
.pp
\fIClnp_output()\fR automatically caches (in the \fIisopcb\fR)
the header of each packet it sends. This cached copy of the header
is used on subsequent sends, reducing the time spent generating
the header. Therefore, the first action \fIclnp_output()\fR takes is to
examine the cached header (if any). If the header is still valid (see below),
then it is used. Otherwise, a new header is built.
.sh 3 "When The Cached Header Is Invalid"
.pp
Before any resources are allocated, the options to be sent with the packet
are examined. If any unsupported options are present, the error \fIEINVAL\fR
is returned.
Next, the lengths of the source and destination 
NSAP addresses (taken from the \fIisopcb\fR)
are checked. The source address length may be zero. This
indicates that \fIclnp_output()\fR should compute the source address based upon
the route taken, in which case CLNP calls
the function \fIclnp_srcroute()\fR.
Source routing 
will be discussed in detail later in this section.
If, in the process of checking
the address lengths, an invalid length is detected, the error
\fIENAMETOOLONG\fR is returned.
.pp
After checking the lengths of the addresses, 
CLNP allocates an \fImbuf\fR in which the DT NPDU header will be constructed. 
If an \fImbuf\fR cannot be found, the error
\fIENOBUFS\fR is returned. Once the \fImbuf\fR is allocated, 
the fixed part of the DT NPDU header is copied into the \fImbuf\fR.
.pp
The next step is to route the DT NPDU. This is accomplished by the
\fIclnp_route()\fR function. 
It is necessary to route the datagram early in the output process because
in many cases, the source address will not be known until the route
has been created.
When a system is multi-homed it has several source addresses. 
The source address to choose depends on the
network interface (thus, the route) used.
.pp
The address part of the DT NPDU follows the fixed part. 
Since appending the address part is the next task, 
the source address must be determined.
Therefore the route must be determined.
.pp
After appending the address part to the fixed part of the
NPDU header, CLNP
appends any options given in the arguments to 
\fIclnp_output()\fR.
The options are specified in a
separate \fImbuf\fR stored in the \fIisopcb\fR.
If this \fImbuf\fR
pointer is not null, a copy of the \fImbuf\fR is made, and this copy is 
chained (appended) to the
\fImbuf\fR in which the
NPDU header resides. The options \fImbuf\fR linked in with the DT packet
must be a copy of the options \fImbuf\fR passed to \fIclnp_output()\fR. If
this were not done,
the options \fImbuf\fR passed would be freed by the interface
driver after the NPDU had been transmitted.
Since a copy must be made, it is possible for \fIclnp_output()\fR to
return \fIENOBUFS\fR at this time.
A later section of this chapter describes
the handling of options in greater detail.
.pp
User data for the packet are passed to 
\fIclnp_output()\fR as an \fImbuf\fR chain.
This \fImbuf\fR chain is appended to the DT NPDU header chain. 
At this point, the DT NPDU is ready for transmission. 
If header caching has not been disabled, a cache entry is made in the 
\fIisopcb\fR.
If the size of the entire packet 
is less than the maximum transmission unit (MTU) of the 
network interface to be used,
the packet is placed on the queue for that network interface.
Otherwise, \fIclnp_fragment()\fR is invoked to
break up the packet into smaller packets, called
"derived NPDUs", and transmit the derived NPDUs.
.sh 3 "When A Cached Header Exists"
.pp
In this case, \fIclnp_output()\fR updates the segmentation part of the
header (if segmenting is permitted), computes the checksum, and transmits
(or fragments) the packet.
.pp
The cached CLNP header is stored in the \fIstruct isopcb\fR. The field
\fIisop_clnpcache\fR within the \fIisopcb\fR points to an \fImbuf\fR
which contains a \fIstruct clnp_cache\fR:
.(b
\fC
.TS
tab(+);
l s s.
struct clnp_cache {
.T&
l l l l.
+struct iso_addr+clc_dst;+/* destination of packet */
+struct mbuf+*clc_options;+/* ptr to options mbuf */
+int+clc_flags;+/* flags passed to clnp_output */
+int+clc_segoff;+/* offset of seg part of header */
+struct sockaddr+*clc_firsthop;+/* first hop of packet */
+struct ifnet+*clc_ifp;+/* ptr to interface */
+struct mbuf+*clc_hdr;+/* cached pkt hdr (finally)! */
};
.TE
\fR
.)b
The first three fields \fIclc_dst, clc_options\fR and \fIclc_flags\fR
are used to check the validity of the cache entry. The cache is considered
valid if:
.ip \(bu 5
The options mbuf has not changed.
.ip \(bu 5
The destination of the packet has not changed.
.ip \(bu 5
The route still exists and is up.
.ip \(bu 5
The flags have not changed.
.pp
If all these conditions are met, then the bulk of the \fIclnp_output()\fR
processing is avoided. The fields \fIclc_segoff\fR, \fIclc_firsthop\fR,
and \fIclc_ifp\fR are used by \fIclnp_output()\fR to transmit the packet.
The field \fIclc_hdr\fR contains the actual cached header, which is copied
and then enqueued on the outgoing interface.
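.pp
The validity test described above can be pictured with the following
minimal sketch. It is not the ARGO code: the structure and function names
are hypothetical, the kernel types are replaced by simplified stand-ins,
and the state of the route is assumed to be available as a single flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel types; hypothetical, for illustration. */
struct addr_sketch {
    unsigned char len;        /* number of valid octets in val[] */
    unsigned char val[20];    /* NSAP address octets */
};

struct cache_sketch {
    struct addr_sketch dst;   /* plays the role of clc_dst */
    const void *options;      /* plays the role of clc_options */
    int flags;                /* plays the role of clc_flags */
    bool route_up;            /* assumed summary of "route exists and is up" */
};

/* Two addresses match when their lengths and octets are equal. */
static bool addr_equal(const struct addr_sketch *a, const struct addr_sketch *b)
{
    return a->len == b->len && memcmp(a->val, b->val, a->len) == 0;
}

/* Return true when all four validity conditions listed above hold. */
bool cache_is_valid(const struct cache_sketch *c,
                    const struct addr_sketch *dst,
                    const void *options, int flags)
{
    if (c == NULL)
        return false;               /* nothing cached yet */
    return c->options == options    /* options mbuf has not changed */
        && addr_equal(&c->dst, dst) /* destination has not changed */
        && c->route_up              /* route still exists and is up */
        && c->flags == flags;       /* flags have not changed */
}
```

.pp
Comparing the options by pointer rather than by content keeps the test
cheap, which matters because it runs on every send.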
.sh 2 "NPDU Input"
.pp
.\" FIGURE
.so figs/clnp_input.nr
All CLNP NPDUs are processed by \fIclnp_input()\fR. 
.CF
outlines
the flow of control within \fIclnlintr()\fR and \fIclnp_input()\fR. 
The solid lines
indicate normal flow of control. The dashed lines indicate 
possible error returns.
.pp
\fIClnlintr()\fR is invoked by a software interrupt. 
This interrupt is posted by a device driver whenever a 
packet is placed in CLNL's input queue,
\fIclnlintrq\fR, and the queue was empty. 
It is the responsibility of \fIclnlintr()\fR, when invoked, 
to process all packets present on the input queue. 
Thus, to begin the task of processing a packet, \fIclnlintr()\fR
removes the next packet from the queue. 
When an error is discovered during processing, the packet is discarded and
\fIclnlintr()\fR begins afresh.
.pp
Once removed, the type of the NPDU is checked. If the NPDU is an
ES-IS packet, then \fIesis_input()\fR is called. If the NPDU is a CLNP
packet, then \fIclnp_input()\fR is called. Other packets are silently
discarded.
The function \fIclnp_hdr_ck()\fR checks the NPDU for consistency. 
Before checking consistency, \fIclnp_hdr_ck()\fR ensures
that the entire NPDU header is located
contiguously in a single \fImbuf\fR (\fIm_pullup()\fR\** performs this task).
.(f
\** If the NPDU header is larger than \fIMLEN\fR (currently 256), then
\fIm_pullup()\fR will allocate a cluster \fImbuf\fR.
.)f
After "pulling" the header into a single \fImbuf\fR, \fIclnp_hdr_ck()\fR
checks for the proper CLNP version and protocol identification. 
It also checks that the lifetime field is greater than zero.
After checking header consistency, the NPDU checksum is computed.\**
.(f
\** If the checksum value is zero, the checksum is not computed. 
The value zero is reserved to mean \*(lqdo not use checksum\*(rq.
.)f
If the checksum is valid, \fIclnp_data_ck()\fR is called to ensure
that the amount of data in the \fImbuf\fR chain corresponds to the
amount indicated in the NPDU header.
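.pp
As an illustration of the checksum step, the sketch below generates and
verifies a two-octet checksum over an NPDU header. It assumes the
modulo-255 Fletcher arithmetic and the checksum position (the eighth and
ninth header octets) defined by ISO 8473, neither of which is spelled out
in this document. The function names are hypothetical and this is not the
ARGO implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Offset (zero-based) of the two checksum octets in the fixed part of the
 * header. Assumed from ISO 8473; not stated in the text above.
 */
#define CKSUM_OFF 7

/* Reduce a possibly negative value into the range 0..254. */
static int mod255(long v)
{
    return (int)(((v % 255) + 255) % 255);
}

/* Compute the two running sums C0 and C1 over the header, modulo 255. */
static void fletcher_sums(const unsigned char *hdr, size_t len,
                          int *c0, int *c1)
{
    size_t i;

    *c0 = 0;
    *c1 = 0;
    for (i = 0; i < len; i++) {
        *c0 = (*c0 + hdr[i]) % 255;
        *c1 = (*c1 + *c0) % 255;
    }
}

/* Fill in the checksum octets so that both sums come out to zero. */
void cksum_generate(unsigned char *hdr, size_t len)
{
    int c0, c1, x, y;
    long k;

    if (len < (size_t)(CKSUM_OFF + 2))
        return;                         /* no room for a checksum field */
    k = (long)len - (CKSUM_OFF + 1);    /* L - n, with n counted from one */
    hdr[CKSUM_OFF] = 0;
    hdr[CKSUM_OFF + 1] = 0;
    fletcher_sums(hdr, len, &c0, &c1);
    x = mod255(k * c0 - c1);
    y = mod255(c1 - (k + 1) * c0);
    /* Zero is reserved for "checksum not used", so send 255 instead. */
    hdr[CKSUM_OFF] = (unsigned char)(x == 0 ? 255 : x);
    hdr[CKSUM_OFF + 1] = (unsigned char)(y == 0 ? 255 : y);
}

/* Return true if the header passes, or if the checksum is not in use. */
bool cksum_ok(const unsigned char *hdr, size_t len)
{
    int c0, c1;

    if (len < (size_t)(CKSUM_OFF + 2))
        return false;   /* too short to hold a checksum field */
    if (hdr[CKSUM_OFF] == 0 && hdr[CKSUM_OFF + 1] == 0)
        return true;    /* zero means "do not use checksum" */
    fletcher_sums(hdr, len, &c0, &c1);
    return c0 == 0 && c1 == 0;
}
```

.pp
Because 255 is congruent to zero modulo 255, substituting 255 for a
computed zero leaves both sums unchanged while keeping the all-zero value
free to mean that no checksum is present.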
.pp
Once the consistency of the NPDU has been assured, the various parts of the
packet are extracted. 
Care is taken with each extraction to ensure that an attempt is not made
to address data that does not really exist. (Such an attempt could
result in a kernel trap.)
.pp
Next, the options part of the NPDU, if present, is checked for validity.
If unsupported options are found, the packet is discarded. 
See the section \*(lqNPDU options\*(rq for details of options processing.
.pp
Finally, after the preceding checks and extractions have been made, the
destination address is examined. 
If the address indicates that the packet's destination is not this
system, the packet is forwarded by calling \fIclnp_forward()\fR. 
See the section \*(lqDT NPDU Forwarding\*(rq for details of packet forwarding.
If this end system is the
packet's destination, processing continues.
.pp
If the packet is not complete, it is passed to \fIclnp_reass()\fR for
reassembly. 
See the section \*(lqDT NPDU Reassembly\*(rq
for details of packet reassembly.
.pp
At this point, a complete NPDU is in hand. 
If the NPDU is a DT NPDU, it is given to the transport layer
by calling the TP input routine. 
Otherwise, it is given to the ER NPDU processing function, 
\fIclnp_er_input()\fR.
.sh 3 "DT NPDU Forwarding"
.pp
Packet forwarding is accomplished by \fIclnp_forward()\fR. 
This is performed regardless of the system's type (end or intermediate).
The task of
forwarding a packet is fairly straightforward. First, the lifetime
field of the datagram is decremented. 
If this operation changes the value to zero, the packet is discarded.
.pp
If the source route option is present, and the address at the top of the list
matches an address of one of the system's network interfaces, then
the next-source-route-to-be-used offset is adjusted in the option.
Next, the packet is routed by \fIclnp_route()\fR
or \fIclnp_srcroute()\fR. 
If the record route option is present, the address of the outgoing 
network interface is recorded by \fIclnp_dooptions()\fR.
.pp
Finally, the packet is dispatched. 
If the size of the entire packet is less than the MTU of the output 
network interface, the packet is enqueued for that interface.
Otherwise, \fIclnp_fragment()\fR is invoked to
fragment the packet and enqueue the derived NPDUs.
.sh 2 "NPDU Options"
.pp
The options section of an NPDU consists of a series of triplets:
\fIoption identification\fR, \fIoption length\fR, 
and \fIoption value\fR. 
These triplets are checked each time the options are examined or changed. 
To avoid repeated parsing of the options, the ARGO CLNP
maintains an index. 
This index is organized as a \fIclnp_optidx\fR structure. 
This structure is shown below.
.(b
\fC
.TS
tab(+);
l s s.
struct clnp_optidx {
.T&
l l l l.
+u_short+cni_securep;+/* ptr to security option */
+char+cni_secure_len;+/* length of security option */
+u_short+cni_srcrt_s;+/* offset of src rt option */
+u_short+cni_srcrt_len;+/* length of src rt option */
+u_short+cni_recrtp;+/* ptr to head of recrt option */
+char+cni_recrt_len;+/* length of recrt option */
+char+cni_priorp;+/* ptr to priority option */
+u_short+cni_qos_formatp;+/* ptr to format of qos option */
+char+cni_qos_len;+/* length of qos option */
+char+cni_er_reason;+/* reason from ER pdu option */
};
.TE
\fR
.)b
This index allows CLNP to discover the existence 
and value of an option quickly. 
For example, if a security option is present, the \fIcni_securep\fR
field of the option index is non-zero and the value of
\fIcni_securep\fR is an offset to the beginning of the 
security option. 
The function \fIclnp_opt_sanity()\fR 
parses the options and computes the index.
While parsing, it also verifies that the 
options are valid and correctly structured.
If an error occurs while parsing an option, 
\fIclnp_opt_sanity()\fR returns an error code. 
The following sections describe how options are processed
during the send, forward and receive operations.
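.pp
The following sketch shows one way such an index could be computed from
the option triplets. It is a simplified illustration, not
\fIclnp_opt_sanity()\fR itself: the structure is a cut-down, hypothetical
version of \fIclnp_optidx\fR, the option codes are assumed from ISO 8473,
and treating unknown or repeated options as errors is an assumption.

```c
#include <stddef.h>

/* Option codes assumed from ISO 8473; they are not given in the text above. */
#define OPT_PAD    0xcc
#define OPT_SECURE 0xc5
#define OPT_SRCRT  0xc8
#define OPT_RECRT  0xcb
#define OPT_QOS    0xc3
#define OPT_PRIOR  0xcd

/* A cut-down, hypothetical index: offset of each option value, 0 if absent. */
struct optidx_sketch {
    unsigned short securep, srcrtp, recrtp, qosp, priorp;
    unsigned char secure_len, srcrt_len, recrt_len, qos_len;
};

/*
 * Walk the (id, length, value) triplets in opts[0..len-1] and fill in *idx.
 * Returns 0 on success, -1 if the options are malformed, unknown or repeated.
 */
int opt_index_build(const unsigned char *opts, size_t len,
                    struct optidx_sketch *idx)
{
    static const struct optidx_sketch zero;
    size_t off = 0;

    *idx = zero;
    while (off < len) {
        unsigned char id, vlen;
        unsigned short valp;

        if (len - off < 2)
            return -1;                  /* truncated id/length pair */
        id = opts[off];
        vlen = opts[off + 1];
        if (vlen > len - off - 2)
            return -1;                  /* value runs past the end */
        valp = (unsigned short)(off + 2);

        switch (id) {
        case OPT_PAD:
            break;                      /* padding carries no information */
        case OPT_SECURE:
            if (idx->securep != 0)
                return -1;
            idx->securep = valp;
            idx->secure_len = vlen;
            break;
        case OPT_SRCRT:
            if (idx->srcrtp != 0)
                return -1;
            idx->srcrtp = valp;
            idx->srcrt_len = vlen;
            break;
        case OPT_RECRT:
            if (idx->recrtp != 0)
                return -1;
            idx->recrtp = valp;
            idx->recrt_len = vlen;
            break;
        case OPT_QOS:
            if (idx->qosp != 0)
                return -1;
            idx->qosp = valp;
            idx->qos_len = vlen;
            break;
        case OPT_PRIOR:
            if (idx->priorp != 0)
                return -1;
            idx->priorp = valp;
            break;
        default:
            return -1;                  /* unknown option (assumed an error) */
        }
        off += 2 + (size_t)vlen;
    }
    return 0;
}
```

.pp
An offset of zero can safely mean "absent" because every option value is
preceded by at least its identification and length octets.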
.sh 3 "Sending Options"
.pp
Options to be sent with a datagram are passed to \fIclnp_output()\fR as
two arguments. An option index is passed along with an \fImbuf\fR
containing the options. 
The options in the \fImbuf\fR must be formatted
exactly as specified by CLNP. 
If the security, quality of service, or
priority options are specified, \fIclnp_output()\fR will not transmit the
datagram and \fIEINVAL\fR is returned.
The system call \fIsetsockopt()\fR is used to set the CLNP options 
to be sent on a datagram. 
See \fIclnp(4)\fR for more information about setting CLNP options.
.pp
If a source route is specified, 
the normal CLNP routing function \fIclnp_route()\fR is not used, and 
\fIclnp_srcroute()\fR is invoked.
.pp
When the DECBIT config option is specified, \fIclnp_output()\fR will
automatically add the globally unique quality of service option to the packet.
The sequencing preferred and low delay bits in this option are set.
.sh 3 "Forwarding Options"
.pp
During packet forwarding, the padding, security,
and priority options are ignored. If record route is selected, the
function \fIclnp_dooptions()\fR logs the current network
interface address in the record route list.
.pp
If a source route is specified, 
the normal CLNP routing function \fIclnp_route()\fR is not used, and 
\fIclnp_srcroute()\fR is invoked.
.sh 4 "The Congestion Experienced Bit"
.pp
If a packet is forwarded containing the globally unique quality of
service option, and the interface through which the packet will be 
transmitted has a queue length greater than \fIcongest_threshold\fR,
then the congestion experienced bit is set in the quality of service option.
.pp
The threshold value stored in \fIcongest_threshold\fR may be changed
with the \fIclnlutil\fR utility.
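.pp
The congestion test amounts to a comparison and a bit operation, as in
the sketch below. The function name is hypothetical, and the bit values
are assumed from ISO 8473 rather than taken from this document.

```c
#include <stddef.h>

/* Bit values assumed from ISO 8473; not given in the text above. */
#define QOS_GLOBAL    0xc0   /* top two bits set: globally unique format */
#define QOS_CONGESTED 0x08   /* congestion experienced */

/*
 * Mark congestion in a QOS option value when the output queue is too long.
 * qos points at the first octet of the option value, or is NULL when the
 * packet has no QOS option. Returns 1 if the bit was set by this call.
 */
int mark_congestion(unsigned char *qos, int queue_len, int congest_threshold)
{
    if (qos == NULL)
        return 0;                         /* no QOS option in the packet */
    if ((*qos & QOS_GLOBAL) != QOS_GLOBAL)
        return 0;                         /* not the globally unique format */
    if (queue_len <= congest_threshold)
        return 0;                         /* queue is short enough */
    *qos |= QOS_CONGESTED;
    return 1;
}
```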
.sh 3 "Receiving Options"
.pp
On receipt, all CLNP options are ignored except the security 
and globally unique quality of service options.
If the security option is found, the packet is discarded.
If the globally unique quality of service option is present, and the
congestion experienced bit is set, then the transport congestion
control function \fItpclnp_ctlinput(PRC_QUENCH2, addr)\fR is called.
The following table summarizes the CLNP option processing.
.(b
.TS
allbox, tab(+);
l l l l.
Option+Send+Forward+Receive
=
Padding+may be set+-+-
Security+reject+ignore+discard
Source Route+\fIclnp_srcroute()\fR+\fIclnp_srcroute()\fR+-
Record Route+-+\fIclnp_dooptions()\fR+-
QOS+added+congestion bit set+tpclnp_ctlinput()
Priority+reject+ignore+-
.TE
.)b
.sh 2 "DT NPDU Segmentation"
.pp
Segmentation is the process by which initial NPDUs are segmented into 
smaller derived NPDUs when the initial NPDU is too large for transmission
on a network interface.
Segmentation is accomplished by \fIclnp_fragment()\fR. 
This function chops the NPDU into pieces and individually places the pieces
in the appropriate network interface's output queue. 
Each piece is made as large as possible. 
Note: The term "fragmentation" is used synonymously with "segmentation"
throughout this prose and the CLNP fragmentation code. This is due to 
this author's familiarity with the DoD Internet Protocol which uses
the term "fragment."
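.pp
The rule that each piece is made as large as possible can be sketched as
follows. This is an illustration with hypothetical names, not
\fIclnp_fragment()\fR. It assumes that every piece except the last carries
a multiple of 8 octets of data (segment offsets in ISO 8473 are multiples
of 8) and that a packet exactly as large as the MTU still fits.

```c
#include <stddef.h>

/*
 * Plan how to cut data_len octets of user data into derived NPDUs.
 * hdr_len is the header length repeated in each derived NPDU and mtu is
 * the interface limit. Writes up to max_segs piece lengths into seg_len[]
 * and returns the number of pieces, or -1 if the packet cannot be cut.
 */
int plan_segments(size_t data_len, size_t hdr_len, size_t mtu,
                  size_t seg_len[], int max_segs)
{
    size_t room, remaining = data_len;
    int n = 0;

    if (mtu <= hdr_len)
        return -1;                        /* no room for any data */
    room = (mtu - hdr_len) & ~(size_t)7;  /* largest multiple of 8 that fits */
    if (room == 0)
        return -1;

    while (remaining > 0) {
        size_t piece;

        if (n == max_segs)
            return -1;                    /* caller's array is too small */
        /* The last piece may be any size that fits under the MTU. */
        piece = (remaining + hdr_len <= mtu) ? remaining : room;
        seg_len[n++] = piece;
        remaining -= piece;
    }
    return n;
}
```

.pp
For example, 1000 octets of data with a 40-octet header and an MTU of 512
yield pieces of 472, 472 and 56 octets.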
.sh 2 "DT NPDU Reassembly"
.pp
Derived NPDUs are put back together by the process called 
reassembly. 
Reassembly is performed only at the destination end system.
When a derived NPDU arrives, it is passed to \fIclnp_reass()\fR. 
This function scans a linked list of NPDUs awaiting reassembly. 
Each packet in the list is represented by a fragment list
descriptor, which is stored in an \fImbuf\fR:
.(b
\fC
.TS
tab(+);
l s s s.
struct clnp_fragl {
.T&
l l l l.
+struct iso_addr+cfl_src;+/* source */
+struct iso_addr+cfl_dst;+/* destination */
+u_short+cfl_id;+/* id of the pkt */
+u_char+cfl_ttl;+/* time to live */
+u_short+cfl_last;+/* offset of last 
+++byte of packet */
+struct mbuf +*cfl_orighdr;+/* ptr to 
+++original header */
+struct clnp_frag+*cfl_frags;+/* linked list 
+++of fragments */
+struct clnp_fragl+*cfl_next;+/* next pkt being
+++reassembled */
};
.TE
\fR
.)b
The fields \fIcfl_src\fR, \fIcfl_dst\fR, and \fIcfl_id\fR are used to
match an incoming derived NPDU with a fragment list. 
\fICfl_orighdr\fR contains a copy of the NPDU header of the first fragment received. 
The linked list of fragments pertaining to the packet is stored in the
\fIcfl_frags\fR field. 
Each NPDU fragment is represented by a \fIclnp_frag\fR structure, 
stored in an \fImbuf\fR:
.(b
\fC
.TS
tab(+);
l s s s.
struct clnp_frag {
.T&
l l l l.
+u_int+cfr_first;+/* offset of 
+++first byte of this frag */
+u_int+cfr_last;+/* offset of last 
+++byte of this frag */
+u_int+cfr_bytes;+/* bytes to shave */
+struct mbuf+*cfr_data;+/* ptr to data */
+struct clnp_frag+*cfr_next;+/* next frag */
};
.TE
\fR
.)b
The fields \fIcfr_first\fR and \fIcfr_last\fR indicate the first and
last octet of the fragment. 
\fICfr_data\fR points to an mbuf chain
which contains the data for the fragment.
.pp
If \fIclnp_reass()\fR finds a \fIclnp_fragl\fR structure matching the
incoming derived NPDU, \fIclnp_insert_frag()\fR is called to create
a \fIclnp_frag\fR structure and insert it in the linked list of
packet fragments. 
If no \fIclnp_fragl\fR structure is found, 
\fIclnp_newpkt()\fR is invoked to create a new fragment list structure.
.pp
The last task \fIclnp_reass()\fR performs is to check if the fragment
that just arrived completes the reassembly of the initial NPDU. 
If it does, the reassembled NPDU is rearranged to 
look like it just arrived intact.
It accomplishes this by linking the \fImbuf\fRs holding
the fragments into one \fImbuf\fR chain that represents the initial
NPDU.
A pointer to this \fImbuf\fR chain is returned by \fIclnp_reass()\fR.
.pp
If the newly arrived fragment does not complete an initial NPDU, 
\fIclnp_reass()\fR returns NULL.
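.pp
The insertion and completion checks can be sketched as below. The
structures are hypothetical, pared-down versions of \fIclnp_frag\fR and
\fIclnp_fragl\fR, the function names are invented for illustration, and
the trimming of overlapping fragments is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down versions of clnp_frag and clnp_fragl. */
struct frag_sketch {
    unsigned int first;            /* offset of first octet of this piece */
    unsigned int last;             /* offset of last octet of this piece */
    struct frag_sketch *next;      /* next piece, in ascending offset order */
};

struct fragl_sketch {
    bool have_last;                /* has the final piece been seen? */
    unsigned int last;             /* offset of last octet of the packet */
    struct frag_sketch *frags;     /* pieces received so far */
};

/* Link f into the list, keeping the pieces sorted by starting offset. */
void frag_insert(struct fragl_sketch *fl, struct frag_sketch *f, bool is_final)
{
    struct frag_sketch **pp = &fl->frags;

    while (*pp != NULL && (*pp)->first < f->first)
        pp = &(*pp)->next;
    f->next = *pp;
    *pp = f;
    if (is_final) {
        fl->have_last = true;
        fl->last = f->last;
    }
}

/* The packet is complete when the pieces cover 0..last with no gaps. */
bool frag_complete(const struct fragl_sketch *fl)
{
    const struct frag_sketch *f;
    unsigned int next_needed = 0;

    if (!fl->have_last)
        return false;              /* total length is not yet known */
    for (f = fl->frags; f != NULL; f = f->next) {
        if (f->first > next_needed)
            return false;          /* a hole precedes this piece */
        if (f->last + 1 > next_needed)
            next_needed = f->last + 1;
    }
    return next_needed > fl->last;
}
```

.pp
Keeping the list sorted means the completion check is a single pass, and
it lets the fragments be chained in order once the packet is whole.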
.sh 3 "Reassembly Lifetime Control"
.pp
One function of the CLNP is to prevent
a proliferation of fragments awaiting reassembly from
consuming buffers in an end system for indefinite periods of time.
This function is called reassembly lifetime control.
It is accomplished by 
periodic traversal of
the list of \fIclnp_fragl\fR structures, decrementing the 
\fIcfl_ttl\fR field. 
This field is a copy of the NPDU time-to-live
field. If \fIcfl_ttl\fR reaches zero, all resources associated with the
fragment are released.
The procedure
\fIclnp_slowtimo()\fR, which is called by the system
clock every 500 milliseconds (every half-second),
performs the CLNP reassembly lifetime control.
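.pp
A minimal sketch of one such timer pass follows. The structure and the
function name are hypothetical, descriptors are assumed to come from
\fImalloc()\fR rather than from \fImbuf\fRs, and the release of the
fragments themselves is omitted.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, pared-down fragment list descriptor. */
struct fragl_ttl_sketch {
    unsigned char ttl;                 /* plays the role of cfl_ttl */
    struct fragl_ttl_sketch *next;     /* plays the role of cfl_next */
};

/*
 * One timer tick: age every packet awaiting reassembly and free those
 * whose lifetime has run out. Returns the number of packets dropped.
 */
int reass_tick(struct fragl_ttl_sketch **head)
{
    struct fragl_ttl_sketch **pp = head;
    int dropped = 0;

    while (*pp != NULL) {
        struct fragl_ttl_sketch *fl = *pp;

        if (fl->ttl > 0)
            fl->ttl--;
        if (fl->ttl == 0) {
            *pp = fl->next;            /* unlink before releasing */
            free(fl);
            dropped++;
        } else {
            pp = &fl->next;
        }
    }
    return dropped;
}
```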
.sh 2 "ER NPDU"
.pp
An ER NPDU is sent to the originator of a packet when a DT NPDU is
discarded and the error report function is not suppressed. Suppression
of the error report function is accomplished by setting the "no ER"
bit in the CLNP header.
A packet is discarded by \fIclnp_discard()\fR. 
Before it
returns the \fImbufs\fR used to store
the discarded packet to the \fImbuf\fR free list,
\fIclnp_discard()\fR 
determines if the error report function is suppressed. 
If not, 
an ER NPDU will be sent to the originator of the discarded packet by
calling \fIclnp_emit_er()\fR.
.pp
\fIClnp_emit_er()\fR will create an ER NPDU, address it to the 
originator of the discarded packet, route the NPDU, 
and transmit it, sending the header of the discarded NPDU as data. 
ER NPDUs may not be segmented. 
If the ER NPDU is too large for the outgoing network interface, 
the packet is truncated.
.sh 2 "Raw CLNP"
.pp
In order to test CLNP in isolation from higher layer
protocols, ARGO provides a \*(lqraw\*(rq interface to CLNP.
This raw interface is selected with the \fISOCK_RAW\fR parameter to
the
\fIsocket()\fR
system call.
When a \*(lqraw\*(rq socket is open,
and CLNP receives an NPDU,
CLNP must determine whether the incoming NPDU is destined for
the 
\*(lqraw\*(rq interface or for the interface to the 
OSI transport protocol entity.
ARGO addresses this problem by using non-standard NPDU types
for packets sent on \*(lqraw\*(rq sockets.
The type field in the CLNP NPDU header
is set to \fICLNP_RAW\fR (hex 1D) rather than \fICLNP_DT\fR
in NPDUs that originate from 
\*(lqraw\*(rq sockets.
This non-standard type value is used by \fIclnp_input()\fR
to decide which upper layer protocol should receive the packet.
See \fIclnptest(8)\fR for more information about the
\*(lqraw\*(rq CLNP interface.
.sh 2 "CLNP Echo"
.pp
In the DoD world, ICMP supports an \fIecho\fR service. 
This allows one to \*(lqping\*(rq a distant gateway and 
to receive an echo response (a packet in return) if the gateway is working.
There is no counterpart to \*(lqecho\*(rq in ISO 8473 (CLNP). 
ARGO provides this non-standard feature in its connectionless
network layer.
.pp
Like raw CLNP, implementing an echo function requires a non-standard
NPDU type value to allow
\fIclnp_input()\fR to differentiate between a DT NPDU to be forwarded
or passed to a higher layer protocol, and an NPDU that is to be echoed.
When requesting an echo, 
the CLNP type field is set to \fICLNP_EC\fR (hex 1E) rather
than \fICLNP_DT\fR. 
When \fIclnp_input()\fR receives a packet with type
\fICLNP_EC\fR, 
it swaps the source and destination addresses, sets the
type field to \fICLNP_ECR\fR (hex 1F) and forwards
the packet back to the sender. 
See also \fIclnpping(8)\fR.
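.pp
The echo turnaround can be sketched as follows. The structure is a
hypothetical flattened view of an NPDU, not the ARGO header layout. The
values of \fICLNP_EC\fR and \fICLNP_ECR\fR are those given above, while the
five-bit type mask is assumed from ISO 8473.

```c
#include <string.h>

#define ADDR_MAX  20
#define TYPE_MASK 0x1f      /* low five bits hold the type (assumed) */
#define CLNP_EC   0x1e      /* echo request, as given in the text */
#define CLNP_ECR  0x1f      /* echo reply, as given in the text */

/* Hypothetical flattened view of the fields the echo function needs. */
struct npdu_sketch {
    unsigned char type;             /* flags and type octet */
    unsigned char src[ADDR_MAX];    /* source NSAP address */
    unsigned char dst[ADDR_MAX];    /* destination NSAP address */
    unsigned char src_len, dst_len; /* valid octets in src[] and dst[] */
};

/*
 * Turn an echo request into an echo reply in place: swap the source and
 * destination addresses and rewrite the type, preserving the flag bits.
 * Returns 1 if the packet was an echo request, 0 if it was left alone.
 */
int echo_reply(struct npdu_sketch *p)
{
    unsigned char tmp[ADDR_MAX];
    unsigned char tmp_len;

    if ((p->type & TYPE_MASK) != CLNP_EC)
        return 0;

    memcpy(tmp, p->src, ADDR_MAX);
    tmp_len = p->src_len;
    memcpy(p->src, p->dst, ADDR_MAX);
    p->src_len = p->dst_len;
    memcpy(p->dst, tmp, ADDR_MAX);
    p->dst_len = tmp_len;

    p->type = (unsigned char)((p->type & ~TYPE_MASK) | CLNP_ECR);
    return 1;
}
```

.pp
After the swap, the reply can be handed to the ordinary forwarding path,
since its destination is now the original sender.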
.sh 2 "Timers"
.pp
The only timer used by CLNP is the 
500 millisecond timer, which is 
used for reassembly lifetime control.
See the section \*(lqReassembly Lifetime Control.\*(rq
.sh 1 "End System to Intermediate System Routing Protocol (ES-IS)"
.\" ROB
.sh 2 "Overview"
.pp
This section describes the implementation of the ES-IS routing protocol.
This protocol is used primarily to resolve NSAP address to SNPA address 
translations. It is also used to identify end systems
and intermediate systems on
the local subnetwork. 
All of this work is accomplished by transmitting
packets of the type End System Hello (ESH), Intermediate System Hello (ISH),
and Redirect (RD).
.pp
For the purpose of this section, the following definitions of end system (ES)
and intermediate system (IS) apply.
.ip \(bu 5
An \fIend system\fR is an open system that
is an OSI end system in the standard OSI sense
(that is, it supports a full OSI protocol suite in addition to the network layer)
and that
implements the functions of
the ES-IS protocol that are mandatory for end systems,
such as the Query Configuration function and the Record Redirect
function,
but that does not implement
the functions of the ES-IS protocol that are for intermediate systems.
.ip \(bu 5
An \fIintermediate system\fR is an open system that
is an OSI intermediate system in the standard OSI sense
(that is, it performs packet routing in the network layer)
and that
implements the functions of
the ES-IS protocol that are mandatory for intermediate systems,
such as the Request Redirect function, 
but not the functions of the ES-IS protocol that are for end systems.
.pp
While a system may be an ES or an IS or both according to the
standard OSI definitions, this is not the case in the context of
the ES-IS protocol.
.pp
An ARGO system is by default an end system, by the definitions given above.
An ARGO system can be made to function as an intermediate system
instead of an end system with the \fIclnlutil\fR program. 
See \fIclnlutil(8)\fR for more information.
.sh 2 "Report Configuration Function"
.pp
The report configuration function is used by end systems and intermediate
systems to inform each other of their reachability and current subnetwork
addresses. 
This function is invoked whenever the configuration timer
expires. 
This timer fires at a frequency of once every
\fIesis_config_time\fR seconds. 
By default, this value is 60 (seconds), 
but it may be changed with the \fIclnlutil\fR program.
.pp
The report configuration function is contained in the C function 
\fIesis_config()\fR. Called every \fIesis_config_time\fR seconds, 
\fIesis_config()\fR searches the list of active network interfaces,
calling \fIesis_shoutput()\fR for each interface that is up, has
broadcast ability, and has an ISO address configured.
.pp
The function \fIesis_shoutput()\fR has the responsibility of building and 
transmitting ESH and ISH packets.
It takes several arguments, including  a pointer to a network interface
and
a packet type (ESH or ISH).
If the packet type is ESH, then
each NSAP address configured on the specified interface is added to
the ESH NPDU. ISH NPDUs may only contain a single NSAP address\**.
.(f
\** Actually, ISH packets contain Network Entity Titles (NETs). ARGO
does not make a distinction between NETs and NSAPs.
.)f
After the packet is built, it is transmitted on the subnetwork. ESH packets
are sent to the multicast address \fIall intermediate systems\fR, whereas
ISH packets are sent to the multicast address \fIall end systems\fR.
.pp
Each ISH and ESH NPDU contains 
a holding timer setting. This setting (specified 
in seconds) is used by the receiver of the NPDU to set its
holding timer. When its holding timer expires, the information from
the NPDU is erased. The holding timer value sent on each ISH and ESH NPDU
is contained in the variable \fIesis_holding_time\fR. By default, this
timer setting is 120 seconds. This value may be changed with the 
\fIclnlutil\fR utility program.
.sh 2 "Record Configuration Function"
.pp
The Record Configuration function receives ESH or ISH NPDUs, extracts the
configuration information, and updates kernel-resident tables. 
The two functions \fIesis_eshinput()\fR and \fIesis_ishinput()\fR 
process incoming ESH and ISH NPDUs, respectively.
.pp
The ES-IS entity maintains a table that
associates SNPA addresses with NSAP addresses.
This table is called the \fISNPA cache\fR.
.pp
Whenever an ESH or ISH NPDU is received, 
an entry is made in the SNPA cache
via the \fIsnpac_add()\fR function. 
This entry is kept in the cache until the holding timer expires. 
In addition to adding an entry to the SNPA cache, \fIsnpac_add()\fR creates
a default ISO route toward the sender of the ISH.
One such route is kept so that the ES-IS entity has at most one
route to an IS at any time.
Note that ISHs from different sources will 
cause the route to the source of the earlier ISH to be 
overwritten.
The default route
will be removed when the ISH holding timer expires.
.pp
If, at the time an ESH or ISH NPDU is received, the SNPA cache
contains no entry for the NSAP address in the NPDU just received, 
an ESH or ISH (depending on the system type) NPDU is
transmitted to the sender of the NPDU just received.
.sh 2 "Resolving NSAP addresses to SNPA addresses: Query Configuration Function"
.pp
Whenever a device driver needs to resolve an NSAP address to 
an SNPA address, it calls \fIiso_snparesolve()\fR. This function first looks
up the NSAP address in the SNPA cache. If a match is found, the
corresponding SNPA address is returned. If a match is not found and the
system is an end system, and there is a known intermediate system, then
the SNPA address of the intermediate system is returned. It is assumed that
the intermediate system will forward the packet and transmit a redirect back
(see "Redirection Generation", below).
If a match is not found and the system is an end system, but there is no
known intermediate system, then \fIiso_snparesolve()\fR will return 
the multicast address \fIall end systems\fR. 
In all other cases, \fIiso_snparesolve()\fR will return an error.
This is known as the query configuration function. 
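.pp
The decision sequence just described can be sketched as follows. This is
not \fIiso_snparesolve()\fR: the cache is modeled as a plain array, the
names are hypothetical, and the \fIall end systems\fR multicast address is
the one quoted in the section on multicast addresses.

```c
#include <stddef.h>
#include <string.h>

#define SNPA_LEN 6

enum systype { SYS_ES, SYS_IS };

/* Hypothetical cache entry pairing an NSAP with an SNPA address. */
struct snpa_entry_sketch {
    unsigned char nsap[20];
    unsigned char nsap_len;
    unsigned char snpa[SNPA_LEN];
};

/* The "all end systems" multicast address quoted in this chapter. */
static const unsigned char all_es[SNPA_LEN] =
    { 0x09, 0x00, 0x2b, 0x00, 0x00, 0x04 };

/*
 * Resolve an NSAP to an SNPA following the cases described above.
 * known_is points at the SNPA of a known intermediate system, or is NULL.
 * Returns 0 and fills snpa_out on success, -1 otherwise.
 */
int snpa_resolve(const struct snpa_entry_sketch *cache, size_t n,
                 const unsigned char *nsap, unsigned char nsap_len,
                 enum systype type, const unsigned char *known_is,
                 unsigned char snpa_out[SNPA_LEN])
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (cache[i].nsap_len == nsap_len &&
            memcmp(cache[i].nsap, nsap, nsap_len) == 0) {
            memcpy(snpa_out, cache[i].snpa, SNPA_LEN);
            return 0;                   /* cache hit */
        }
    }
    if (type != SYS_ES)
        return -1;                      /* all other cases are errors */
    /* End system: fall back to a known IS, else ask all end systems. */
    memcpy(snpa_out, known_is != NULL ? known_is : all_es, SNPA_LEN);
    return 0;
}
```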
.sh 3 "Configuration Response Function"
.pp
In order for the query configuration function to be effective, the network
entity that receives a CLNP DT sent to the \fIall end systems\fR
multicast address must transmit an ESH back to the sender of the DT.
This is called the configuration response function and is accomplished by
calling \fIesis_shoutput()\fR from within \fIclnp_input()\fR.
.sh 2 "Redirection Generation"
.pp
When an intermediate system forwards a packet onto the same interface 
upon which 
the packet arrived, a redirect (RD) NPDU is generated. This NPDU is
transmitted by calling \fIesis_rdoutput()\fR from within \fIclnp_forward()\fR.
Note that end systems may forward packets but they do not generate RD NPDUs.
.sh 2 "Redirection Receipt"
.pp
RD NPDUs direct an end system to create an SNPA cache entry 
for an NSAP address, or, if such an entry exists, to change
the SNPA address associated with the NSAP address.
The receipt of RD NPDUs is handled by \fIesis_rdinput()\fR. 
This function
parses the RD NPDU and adds an entry to the SNPA cache for the corresponding
destination NSAP address.
If the redirect is toward an intermediate system,
meaning that the RD NPDU contains an SNPA address
of an intermediate system (gateway),
a route is created for the destination NSAP with the intermediate system as
the first hop, or gateway, in the route.
.sh 2 "Multicast Addresses"
.pp
As specified by the December 1987 NBS agreements, the address
\fIall end systems\fR is {0x09, 0x00, 0x2B, 0x00, 0x00, 0x04} and the address
\fIall intermediate systems\fR is {0x09, 0x00, 0x2B, 0x00, 0x00, 0x05}. 
These multicast addresses are only used on the 802.3 subnetwork (baseband).
Broadcast addresses are used on the 802.5 subnetwork (token ring). See
the comment in \fC/sys/netargo/iso_snpac.c\fR for more information on 
multicast addresses.
.sh 1 "Connection Oriented Network Service and Subnetwork Service"
.pp
The following sections describe the design of the Connection Oriented 
Network Service (CONS) and the Connection Oriented Subnetwork Service
(COSNS).
The CONS and COSNS are provided by two functionally separate but related
modules, a connection manager and the ISO 8208 (X.25) protocols.
The connection manager is also known in OSI terminology as a 
subnetwork dependent convergence function, or SNDCF.
In ARGO it is used for more than an SNDCF, and it is a sort of 
"glue" that binds a transport service, a network service, a
subnetwork service, and a device driver together, so 
hereinafter it is called "the glue".
This code performs some of the functions of ISO 8878,
which specifies how ISO 8208 (X.25) can be used to provide the OSI 
connection oriented network service.
The X.25 protocols are implemented in a coprocessor
made by Eicon Technology, Inc.
The device driver \fBecn\fR is the Unix kernel interface to this
coprocessor.
The sections that follow describe the glue and the \fBecn\fR device
driver.
.sh 2 "The Glue"
.pp
The glue provides 
services to several modules in the kernel:
.ip "Subnetwork service" 5
is provided to other network layer protocols, such as CLNP (ISO 8473).
The ARGO CLNP uses this service.  
The Internet IP could be made to use this service with 
minimal effort, because this service interface is made to look
like a standard Unix BSD link layer service (it has
a device driver interface).
.ip "Network service" 5
is provided to transport layer protocols, such as TP (ISO 8073).
This service interface looks like a standard Unix BSD 
network service (a procedure call interface).
.ip "Transport service" 5
could be provided to the socket module. 
While this is not provided with the ARGO software, the glue 
is designed to permit
such a service to be provided with little additional programming effort.
.pp
Higher layer protocols 
that use a connection-oriented
network or subnetwork service need to manage virtual
circuits in a similar fashion. 
Rather than put connection management functions into each higher
layer protocol (HLP) entity
that uses the CONS or COSNS,
in ARGO the connection management is in one module, the glue.
Other alternatives exist.
For example, in the OSI world,
one may place in the TP entity the function of connection management for TP,
and implement a network connection management subprotocol
of the transport layer (ISO 8073 DAD1, NCMS). 
In addition, connection management for CLNP may be implemented as part of 
the CLNP entity.
A subnetwork dependent convergence protocol (ISO 8878/A) may
be implemented to support connection management for CLNP.
The approach taken in ARGO is different from those suggested in ISO
for two reasons.
First, ARGO aims to minimize the amount of code written to perform a given
task.
Second, ARGO has several coexisting paths through the network layer,
which the ISO approach does not address.
For example, in both ISO 8878/A and in NCMS it is assumed that if
an incoming call arrives from NSAP \(*b 
while a call to NSAP \(*b is being placed,
the two calls are resolved to one virtual circuit.
This is not feasible in the ARGO scenario, since it may not be known
until after
the calls are established and higher level packets are exchanged 
whether the two calls are to be used
for the same path and for the same higher layer protocols.
A possible alternative approach is to use an NSAP-address for each path
through the network layer
(or protocol suite).
This was rejected in the ARGO design because it puts the burden
on the calling application entity or network entity to 
determine the proper NSAP-address to use to determine the protocol
suite to be used to reach the destination end system.
For this reason, none of the approaches suggested in ISO is adopted
here.
.pp
The glue provided in the ARGO
kernel does not provide the full OSI network service.
It provides that subset of the network service that is used
by ARGO TP and by ARGO CLNP.
The OSI connection-oriented network service elements that
are provided are described in Chapter Four,
in the section titled "Connection Oriented Network Service".
.pp
Each module using the glue has its own service
interface to the glue.
.\" When X.25 is used as a 
.\"transport service, the standard protocol switch table is used, and the procedure
.\"\fIcons_usrreq()\fR is the protosw entry for a
.\"service in the iso protosw table that provides the 
.\"SOCK_STREAM abstraction in the AF_ISO address family,
.\"with protocol ISOPROTO_X25.
.\"This service is called XTS in the glue code and hereafter
.\"in this document.
.\".pp
When the transport layer uses the glue as a network service,
the interface is the procedure
.(b
\fC
.TS
tab(+);
l s s s.
error = cons_output( isop, m, len, isdgm )
.T&
l l l.
 +struct isopcb +*isop;
 +struct mbuf +*m;
 +int+error, len, isdgm;
.TE
\fR
.)b
.pp
When the network layer uses the glue as a subnetwork service,
the interface is the device driver-like procedure
.(b
\fC
.TS
tab(+);
l s s s.
error = cosns_output( ifp, m, dst )
.T&
l l l.
 +struct ifnet +*ifp;
 +struct mbuf +*m;
 +struct sockaddr_iso +*dst;
 +int+error;
.TE
\fR
.)b
.pp
When the glue is used as a connection-oriented service 
(i.e., by TP 0, and by TP 4 during the transport 
connection establishment phase, during which
it is not yet known whether class 0 or class 4 will be used)
the following procedures are used:
.(b
\fC
.TS
tab(+);
l s s s.
error = cons_openvc( copcb, dstaddr, so )
.T&
l l l.
 +struct cons_pcb +*copcb;
 +struct sockaddr_iso +*dstaddr;
 +struct socket+*so;
.T&
l s s s.
 +++
error = cons_netcmd( cmd, isop, channel, isdgm )
.T&
l l l.
 +int+cmd;
 +struct isopcb +*isop;
 +int+channel, isdgm;
.TE
\fR
.)b
.pp
The procedure \fIcons_openvc()\fR places a call.
The procedure \fIcons_netcmd()\fR accepts, rejects, or clears
a call. 
There is no incoming call indication, because
the glue uses the passive open model for accepting calls.
The HLP simply sees a new incoming packet, and is given
a virtual circuit number (channel) along with the incoming packet.
If the HLP chooses to reject the call
it may do so, which will cause the virtual circuit (VC) to be cleared.
.pp
The glue may reject (clear) an incoming call for its own reasons.
The following table lists the reasons that the glue may
clear a call and the ISO 8208 diagnostic code used on the X.25 clear packet
in each case.
For a complete list of the permissible diagnostic codes, see
Figure 14-B of ISO 8208.
.in -5
.(b
.TS
center expand box tab(+);
l l.
Reason+Diagnostic code
=
The VC was opened for use with CLNP +Higher level initiated reset
or TP 4 and has been idle for the   +user resynchronization
maximum inactivity time.            +(0xfa)
_
The HLP closed                      +Higher level initiated disconnection
this network connection.            +- normal (0xf1)
_
The HLP rejected                    +Higher level initiated connection
this network connection.            +rejection - transient condition (0xf4)
_
The X.25 call packet contained      +Higher level initiated connection
facilities that are not supported   +rejection - incompatible
by the glue, or did not contain     +information in user data (0xf8)
necessary information, e.g. calling +     
or called DTE address.              +
_
The X.25 call packet contained      +Higher level initiated connection
call user data that does not        +rejection - unrecognizable protocol
indicate any HLP supported by ARGO  +identifier in user data
                                    +(0xf9)
_
The given destination               +OSI Network service problem: NSAP
NSAP-address is not supported       +address unknown (permanent
                                    +condition) (0xeb)
_
The X.25 packet or a facility       +Packet not allowed-
therein was too long                +packet too long. (0x27)
.TE
.)b
.in +5
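.pp
The table above can be summarized as a lookup from clearing reason to
diagnostic code.
The following is only a sketch: the code values are those listed in the table,
but the enumeration, its member names, and the function
\fIcons_diag_code()\fR are hypothetical and do not appear in the ARGO sources.
.nf
.in +5
\fC
```c
/* Reasons for which the glue may clear a call (hypothetical names). */
enum cons_clear_reason {
    CLR_IDLE_TIMEOUT,      /* CLNP or TP 4 VC idle too long */
    CLR_HLP_CLOSED,        /* HLP closed the network connection */
    CLR_HLP_REJECTED,      /* HLP rejected the network connection */
    CLR_BAD_FACILITIES,    /* unsupported or missing call information */
    CLR_UNKNOWN_PROTOCOL,  /* call user data names no supported HLP */
    CLR_UNKNOWN_NSAP,      /* destination NSAP-address not supported */
    CLR_PACKET_TOO_LONG    /* packet or facility too long */
};

/*
 * Map a clearing reason to the ISO 8208 diagnostic code given in
 * the table.  Returns -1 for a reason not listed there.
 */
static int
cons_diag_code(enum cons_clear_reason reason)
{
    switch (reason) {
    case CLR_IDLE_TIMEOUT:     return 0xfa;
    case CLR_HLP_CLOSED:       return 0xf1;
    case CLR_HLP_REJECTED:     return 0xf4;
    case CLR_BAD_FACILITIES:   return 0xf8;
    case CLR_UNKNOWN_PROTOCOL: return 0xf9;
    case CLR_UNKNOWN_NSAP:     return 0xeb;
    case CLR_PACKET_TOO_LONG:  return 0x27;
    }
    return -1;
}
```
\fR
.in -5
.fi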
.pp
The glue provides several functions common to all 
modules (HLPs) that use the glue.
Regardless of the HLP,
the DTE addresses and NSAP addresses are associated in the same 
manner.
The same network layer protocol identification scheme
(ISO PDTR 9577) is used for all HLPs.
Several different HLPs need to close inactive X.25
virtual circuits after a timer expires.
The glue insulates the 
device driver interface to the X.25 coprocessor 
from the HLP.
.pp
TP class 0 connections
.\" and the X.25 "transport service" 
do not share X.25 VCs
.\" with each other or among transport service-level circuits (sockets), 
so
.\" these two modules need to keep X.25
the glue needs to maintain 
a 1-1 correspondence between VCs
and sockets.
.\" For use by TP 0 and XTS,
For use by TP 0, 
one network-level pcb is needed for each socket, and that is a
\fIcons_pcb\fR, described below.
.pp
TP class 4 connections may share VCs, 
and TP 4 makes no correspondence between sockets and VCs.
CLNP regards VCs similarly to TP 4.
A given VC may be used simultaneously for many higher level connections,
but all higher level connections using a given VC must use the same
path or protocol suite.
In other words, a TP4 connection running over CONS may not share a
VC with a TP4 connection running over CLNS/COSNS.
.pp
To manage VCs and to maintain the separation of sharable and non-sharable
VCs, the glue uses the following protocol control block:
.(b
\fC
.TS
tab(+);
l s s s.
struct cons_pcb {
.T&
l l l.
 +struct isopcb+_co_isopcb;
+u_short+co_state; 
+u_char+co_flags; 
+u_short+co_ttl;
+u_short+co_init_ttl;
+int+co_channel;
+struct ifnet+*co_ifp;
+struct protosw+*co_proto; 
+struct dte_addr+co_peer_dte;
+struct ifqueue+co_pending;
};
.T&
l l s.
#define co_next+_co_isopcb.isop_next
#define co_prev+_co_isopcb.isop_prev
#define co_head+_co_isopcb.isop_head
#define co_laddr+_co_isopcb.isop_laddr
#define co_faddr+_co_isopcb.isop_faddr
#define co_lport+_co_isopcb.isop_laddr.siso_tsuffix
#define co_fport+_co_isopcb.isop_faddr.siso_tsuffix
#define co_route+_co_isopcb.isop_route
#define co_socket+_co_isopcb.isop_socket
.TE
\fR
.)b
.pp
The \fIcons_pcb\fR contains
an \fIisopcb\fR so that TP 0 
.\" and XTS 
may use the routines that manipulate \fIisopcb\fR structures for allocating
and 
deallocating PCBs, binding addresses to PCBs,
and finding routes.
.pp
A CONS PCB has states CLOSED, LISTENING, CLOSING, 
CONNECTING, ACKWAIT, and  OPEN.
This represents the state of the VC to the degree necessary to the glue.
The glue uses the passive open model for opening VCs.
The coprocessor device driver always accepts
incoming calls and passes an indication to the glue when
a call is accepted by the coprocessor.
If the user of the glue (the HLP) or the glue itself decides
that the VC is not desired, the VC is cleared.
.pp
The \fIcons_pcb\fR contains a bit mask, \fIco_flags\fR, with values:
.(b
\fC
.TS
tab(+);
l l l l.
#define+CONSF_OCRE+0x40+/* created on OUTPUT */
#define+CONSF_ICRE+0x20+/* created on INPUT */
#define+CONSF_DGM+0x04+/* for datagram use only */
.TE
\fR
.)b
.pp
The flag 
CONSF_DGM means that the VC is being used to provide a
datagram (connectionless, unreliable, unsequenced) 
service to the higher layer, and that requests for additional VCs
from the same higher layer entity
may be served by this VC, effectively 
multiplexing higher layer connections on this VC.
When this flag is set in a \fIcons_pcb\fR, there is no associated
\fIco_socket\fR pointer.
When CONSF_DGM is not set, there is an associated
\fIco_socket\fR pointer, and the VC is being used for
TP 0.
.pp
The flag 
CONSF_ICRE means that the VC was created by
an incoming call indication.
The flag 
CONSF_OCRE means that the VC was created 
on behalf of an outgoing call request.
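.pp
The meaning of these flags can be illustrated with a few predicates.
This is a sketch only: the flag values are those shown above, while the
function names are hypothetical and are not taken from the ARGO sources.
.nf
.in +5
\fC
```c
#define CONSF_OCRE 0x40  /* created on OUTPUT */
#define CONSF_ICRE 0x20  /* created on INPUT */
#define CONSF_DGM  0x04  /* for datagram use only */

/* 1 if higher layer connections may be multiplexed on this VC. */
static int
cons_vc_is_shared(unsigned char co_flags)
{
    return (co_flags & CONSF_DGM) != 0;
}

/* 1 if the VC has an associated socket, i.e. it is used for TP 0. */
static int
cons_vc_has_socket(unsigned char co_flags)
{
    return (co_flags & CONSF_DGM) == 0;
}

/* 1 if the VC was created by an incoming call indication. */
static int
cons_vc_from_incoming_call(unsigned char co_flags)
{
    return (co_flags & CONSF_ICRE) != 0;
}
```
\fR
.in -5
.fi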
.pp
The \fIstruct dte_addr\fR field, \fIco_peer_dte\fR,
contains the peer's DTE address.
The glue locates VCs by searching the list of protocol control
blocks for a PCB whose peer DTE address matches the desired one.
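.pp
The search can be sketched as a walk over a linked list of PCBs.
The structures below are simplified stand-ins, not the real
\fIcons_pcb\fR and \fIdte_addr\fR.
The layout of the DTE address, the bound \fIDTE_MAXLEN\fR, and the
function \fIfind_pcb_by_dte()\fR are assumptions made for illustration.
.nf
.in +5
\fC
```c
#include <stddef.h>
#include <string.h>

#define DTE_MAXLEN 15  /* assumed upper bound on DTE address digits */

/* Simplified stand-in for struct dte_addr (layout is assumed). */
struct dte_addr_sketch {
    unsigned char len;           /* number of digits in use */
    char digits[DTE_MAXLEN];     /* one ASCII digit per element */
};

/* Simplified stand-in for struct cons_pcb. */
struct cons_pcb_sketch {
    struct cons_pcb_sketch *next;
    struct dte_addr_sketch peer_dte;
    int channel;
};

/*
 * Return the first PCB on the list whose peer DTE address equals
 * the wanted one, or NULL if no PCB matches.
 */
static struct cons_pcb_sketch *
find_pcb_by_dte(struct cons_pcb_sketch *head,
    const struct dte_addr_sketch *want)
{
    struct cons_pcb_sketch *p;

    for (p = head; p != NULL; p = p->next) {
        if (p->peer_dte.len == want->len &&
            memcmp(p->peer_dte.digits, want->digits, want->len) == 0)
            return p;
    }
    return NULL;
}
```
\fR
.in -5
.fi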
.pp
The glue is given an NSAP-address by the HLP entity.
The glue finds the desired DTE address by searching the
ES-IS SNPA cache for an SNPA-address (DTE address) associated
with the NSAP-address given by the HLP entity.
This means that to use the CONS, an entry for each desired
peer must appear in the SNPA cache.
ARGO does not provide the ES-IS protocol for use with ISO 8208, so
"permanent" or static entries must be placed in this cache by hand,
using the utility program \fIclnlutil\fR.
.pp
When an incoming call is accepted, the peer's DTE address is
placed in the SNPA cache along with
an NSAP address generated as follows:
.np
If the incoming call contained the peer's NSAP-address
in an Address Extension Facility (AEF, available with 1984 X.25),
this NSAP-address is used, otherwise
.np
the glue creates a "type-37" address (the format defined by AFI 37
in ISO 8348/AD 2).
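.pp
The two cases above can be sketched as follows.
This sketch assumes that a type-37 address consists of the AFI octet 0x37
followed by the X.121 DTE address, padded on the left with zeros to 14 digits
and packed two decimal digits per octet.
That encoding, the structure \fInsap_sketch\fR, and the function names are
assumptions made for illustration; the ARGO code may build the address
differently.
.nf
.in +5
\fC
```c
#include <string.h>

#define NSAP_MAXLEN     20
#define AFI_X121_BIN    0x37  /* AFI 37: X.121 IDI */
#define X121_IDI_DIGITS 14    /* assumed IDI length in digits */

/* Simplified stand-in for an NSAP-address. */
struct nsap_sketch {
    unsigned char len;
    unsigned char bytes[NSAP_MAXLEN];
};

/* Build a type-37 address from a string of decimal DTE digits. */
static void
make_type37(const char *dte_digits, struct nsap_sketch *out)
{
    char padded[X121_IDI_DIGITS];
    size_t n = strlen(dte_digits);
    size_t i;

    if (n > X121_IDI_DIGITS)
        n = X121_IDI_DIGITS;  /* sketch only: excess digits dropped */
    memset(padded, '0', sizeof padded);
    memcpy(padded + (X121_IDI_DIGITS - n), dte_digits, n);

    out->bytes[0] = AFI_X121_BIN;
    for (i = 0; i < X121_IDI_DIGITS / 2; i++)
        out->bytes[1 + i] = (unsigned char)
            (((padded[2 * i] - '0') << 4) | (padded[2 * i + 1] - '0'));
    out->len = 1 + X121_IDI_DIGITS / 2;
}

/*
 * Use the NSAP-address from the AEF if the call carried one;
 * otherwise derive a type-37 address from the DTE address.
 */
static void
choose_peer_nsap(const struct nsap_sketch *aef, const char *dte_digits,
    struct nsap_sketch *out)
{
    if (aef != NULL && aef->len > 0)
        *out = *aef;
    else
        make_type37(dte_digits, out);
}
```
\fR
.in -5
.fi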
.pp
TP 4 can have its outgoing packets sent on more than one VC.
The glue presently contains no mechanism for fanning outgoing
packets onto several VCs.
However, it does not prohibit packets for TP 4 from arriving on any VC that
was opened with the protocol identifier for TP.
.pp
The glue has the ability to generate AEFs on outgoing calls, but
this ability is turned off,
since the public data network on which ARGO runs at Wisconsin
does not support 1984 X.25, and so it rejects packets containing
AEFs.
The use of AEFs can be reinstated by making a kernel with the 
option \fBX25_1984\fR or by adding the line
.nf
.in +5
\fC
#define X25_1984
\fR
.in -5
.fi
at the top of the file
\fC/sys/netargo/if_cons.c\fR
and rebuilding the kernel.