
.sh 1 "Analysis"
.lp
The performance of each of the systems as measured by the benchmark
programs will be discussed in this section. Previously planned and
newly suggested enhancements to KSOS-32 that address each performance
area will be referenced and explained.

.sh 2 "Prime"
.lp
This program repeatedly finds prime numbers using a fairly inefficient
implementation of the Sieve of Eratosthenes. It is completely
CPU-bound, doing no I/O during its main processing loop.
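.lp
As an illustration only, a CPU-bound loop of this kind might be
sketched in C as follows. The function names, the sieve limit and the
repetition count are assumptions made for this sketch; it is not the
source of the actual "Prime" benchmark.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch: count the primes <= limit with a simple sieve. */
static int count_primes(int limit)
{
    assert(limit >= 2);
    char *composite = calloc((size_t)limit + 1, 1);
    assert(composite != NULL);

    int count = 0;
    for (int i = 2; i <= limit; i++) {
        if (composite[i])
            continue;               /* already crossed out */
        count++;
        /* Cross out every multiple of i, starting at i * i. */
        for (long j = (long)i * i; j <= limit; j += i)
            composite[j] = 1;
    }
    free(composite);
    return count;
}

/* Repeat the sieve so that run time is dominated by CPU work.
 * No I/O is performed inside the loop. */
static int run_prime_benchmark(int limit, int repetitions)
{
    int last = 0;
    for (int r = 0; r < repetitions; r++)
        last = count_primes(limit);
    return last;
}
```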
.lp
The results show that for CPU-bound programs, KSOS-32 offers
performance equal to that of UNIX. This is an important result for two
reasons. First, it shows that the security mechanisms of KSOS impose no
additional overhead on a CPU-bound program. Second, it allows a good
prediction of the performance of a CPU-bound task under KSOS-32 on any
VAX processor that runs UNIX: the task will perform nearly identically
to the same task under UNIX on the same VAX processor model.
.lp
This allows the general statement that if UNIX on a given VAX runs a
CPU-bound task faster than some other security-kernel based system
(hardware and software combined), then KSOS-32 on the same VAX will
also run that task faster than that system.
.lp
Any statement that can be made about a CPU-bound program should be
applicable to the CPU-bound phases of any program.
.lp
The current KSOS-32 process scheduler is designed to give good
response time to interactive processes in a "fair" manner. It is very
similar in intent and behavior to the UNIX process scheduler. Other
scheduler designs have been proposed for KSOS-32 that would allow
better support to "real-time" processes that require consistent,
predictable response to interrupts and other events. Such a scheduler
could be added to KSOS-32 easily, whereas adding one to UNIX would
require considerable work.

.sh 2 "Openit"
.lp
This program repeatedly creates and closes a file. The create
operation implicitly opens the file for writing, but no file I/O is
performed.
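.lp
The loop being measured might look roughly like the following C sketch.
The function name, the file naming scheme and the assumption that each
iteration creates a distinct file are illustrative; the actual "Openit"
source is not reproduced here. Timing code is omitted.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical sketch: create and close `count` empty files in `dir`.
 * Returns the number of files successfully created. */
static int create_and_close(const char *dir, int count)
{
    char path[256];
    int created = 0;

    assert(dir != NULL && count >= 0);
    for (int i = 0; i < count; i++) {
        snprintf(path, sizeof path, "%s/openit.%d.%d",
                 dir, (int)getpid(), i);
        /* creat() implicitly opens the new file for writing. */
        int fd = creat(path, 0644);
        if (fd < 0)
            break;
        close(fd);                  /* no data is ever written */
        created++;
    }

    /* Clean up afterwards; this is not part of the measured work. */
    for (int i = 0; i < created; i++) {
        snprintf(path, sizeof path, "%s/openit.%d.%d",
                 dir, (int)getpid(), i);
        unlink(path);
    }
    return created;
}
```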
.lp
Table 1 shows that KSOS-32 is much slower than UNIX for file
creations. Examination of KSOS-32 and UNIX source code reveals that
the close operation is very fast for both systems and is certainly
negligible compared with the create operation. Therefore this program
is really a measure of the file creation operation, which for UNIX
involves finding an available "inode" and for KSOS-32 involves finding
an available "jnode". (These structures serve identical purposes and
are very similar in structure.)
.lp
This large performance difference is not unexpected as the 4.2 BSD UNIX
file system is the Berkeley Fast File System (FFS), while the KSOS-32
file system is very similar in disk layout and performance to the
Version 7 UNIX file system. The performance of the FFS is discussed in
[McKu83]. Due to the organization of the on-disk inode and free space
areas in the FFS, file creations are very fast.
.lp
Further examination of the KSOS-32 activity in this area would show
that KSOS-32 is searching for a free "jnode". This search scans an
on-disk structure, which requires I/O (to read the disk), and it is a
linear, O(n), search.
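.lp
The cost of such a search can be illustrated with a small in-memory
model. Everything in this sketch (the structure, the field names and
the number of entries per disk block) is an assumption made for
illustration and does not reflect the actual KSOS-32 on-disk format.

```c
#include <assert.h>
#include <stddef.h>

#define NODES_PER_BLOCK 8   /* assumed; depends on the on-disk format */

/* Hypothetical model of an on-disk allocation table.
 * in_use[i] != 0 means that entry i is already allocated. */
struct node_table {
    const unsigned char *in_use;
    size_t               count;
    size_t               block_reads;   /* simulated disk reads */
};

/* Scan from the start of the table.  Returns the index of the first
 * free entry, or -1 if every entry is in use. */
static long find_free_linear(struct node_table *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (i % NODES_PER_BLOCK == 0)
            t->block_reads++;   /* a new block must be read from disk */
        if (!t->in_use[i])
            return (long)i;
    }
    return -1;
}
```

In this model the number of simulated disk reads grows in proportion
to the position of the first free entry, which is the behavior the
linear search description implies.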
.lp
Comparing KSOS-32 to a version of System V UNIX or an older version of
BSD UNIX (pre-4.2) that does not use the FFS would be more
appropriate, as these systems used an on-disk organization that is
similar to that of KSOS-32. However, the benchmark programs would not run on
System V UNIX and older BSD systems are extremely hard to find
(because they performed very similarly to KSOS-32!).
.lp
Suggested file system enhancements will be discussed
together in the "Recommendations" section below.

.sh 2 "RW"
.pp
The "RW" program creates a file, writes a large number of blocks to
the file and reads them back. Only the writing and reading are timed.
This benchmark gives an indication of the performance of the disk
block allocation algorithm, as well as device I/O performance.
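.lp
A minimal C sketch of this write-then-read pattern is shown below. The
function name and the block size are assumptions, the timing code is
omitted, and on a system with a buffer cache the read phase may be
satisfied from memory rather than from the disk.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 512      /* assumed block size */

/* Hypothetical sketch: write nblocks blocks to path, then read them
 * back.  Returns the number of blocks read back intact, or -1 on a
 * write or seek error. */
static long write_then_read(const char *path, long nblocks)
{
    char out[BLOCK_SIZE], in[BLOCK_SIZE];
    long good = 0;

    assert(path != NULL && nblocks >= 0);
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Write phase: each block is filled with a block-specific byte. */
    for (long b = 0; b < nblocks; b++) {
        memset(out, (int)(b & 0xff), sizeof out);
        if (write(fd, out, sizeof out) != (ssize_t)sizeof out) {
            close(fd);
            return -1;
        }
    }

    /* Read phase: rewind and verify every block. */
    if (lseek(fd, 0, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }
    for (long b = 0; b < nblocks; b++) {
        if (read(fd, in, sizeof in) != (ssize_t)sizeof in)
            break;
        memset(out, (int)(b & 0xff), sizeof out);
        if (memcmp(in, out, sizeof in) == 0)
            good++;
    }
    close(fd);
    unlink(path);           /* clean up; not part of the measured work */
    return good;
}
```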
.lp
Table 1 shows that KSOS-32 is about an order of magnitude slower than
UNIX for writing and reading. This is most likely due to three
factors: the speed of the KSOS-32 device drivers, the absence of
buffering in the Kernel, and the slower search for available disk
blocks. That search requires I/O to scan the on-disk bit map, which
further slows the system due to disk arm movement, exacerbated by the
existing disk device drivers.
This search is faster than the Version 6 UNIX file system free block
search, but slower than the FFS search.
.lp
Possible enhancements to the KSOS-32 file system are described below.
Implementing them would bring KSOS-32 performance on this benchmark
roughly to the level of UNIX.

.sh 2 "Copy"
.pp
The "Copy" benchmark is very similar to the "RW" benchmark except that
the source for the blocks is a file. This program copies the large
file that was created in the "RW" benchmark to another file.
.lp
The relative performance figures show a very small degradation from
the "RW" benchmark. This is mostly due to the additional disk arm
movements required as a block is alternately read from one file and
then written to another file.


.sh 2 "Forkit"
.pp
This benchmark program tests process creation performance. In BSD UNIX
process creations are implemented via a "fork," wherein the parent
process is "cloned." This cloning includes completely duplicating the
virtual memory space of the parent, as well as the state of any open
files, etc. At this point the child and parent processes continue to
execute, but typically take different paths through the program logic.
Often the child will immediately "exec" to load a new program image,
discarding all of the context that was copied from the parent process.
KSOS-32 provides the "fork" and "exec" functionality, but also
provides a single step "spawn" which does not waste time copying the
parent's virtual address space, but immediately loads a new program
image. This feature was not utilized by this benchmark program.
.lp
For the "fork" operation, KSOS-32 is much slower than BSD UNIX.
Again this is no surprise. This is due to BSD UNIX's much better use
of the VAX virtual memory facilities to support the fork operation.
Specifically, UNIX initially duplicates only the page table of the
parent process and uses a "copy-on-write" algorithm. KSOS-32, on the
other hand, copies all of the contents of the memory segments at
"fork" time.
.lp
"Copy-on-write" is implemented by UNIX in the following manner. At
fork time, the page table of the parent is copied into the page table
of the child. Then, in the child's page table, all of the writable
pages are marked as "copy-on-write", meaning that until the child
tries to write to a page, it will be shared with the parent. When
the child process does try to write to the page, it will be copied
into a new page allocated for the child and the child's page table
will be modified to point to the new private page instead of the
shared page. This can cut the time required to duplicate the virtual
address space during the fork by a tremendous amount.
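.lp
The mechanism can be modeled in a few lines of C. This is a user-level
simulation under stated assumptions, not UNIX kernel code: the
structures, the function names, the page size and the reference count
are all hypothetical. The sketch also marks the parent's entries as
copy-on-write, which the description above does not mention, so that
writes by the parent are isolated in the same way.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 512       /* assumed page size */
#define NPAGES    4         /* assumed pages per process */

struct page {
    int  refs;              /* number of page tables pointing here */
    char data[PAGE_SIZE];
};

struct pte {
    struct page *page;
    int          cow;       /* set: copy the page before first write */
};

struct proc {
    struct pte table[NPAGES];
};

/* Give a process its own zero-filled, private pages. */
static void sim_init(struct proc *p)
{
    for (int i = 0; i < NPAGES; i++) {
        struct page *pg = calloc(1, sizeof *pg);
        assert(pg != NULL);
        pg->refs = 1;
        p->table[i].page = pg;
        p->table[i].cow = 0;
    }
}

/* Fork: copy only the page table.  Every page is shared and marked
 * copy-on-write; no page contents are copied here. */
static void sim_fork(struct proc *parent, struct proc *child)
{
    for (int i = 0; i < NPAGES; i++) {
        parent->table[i].cow = 1;
        child->table[i].page = parent->table[i].page;
        child->table[i].cow = 1;
        parent->table[i].page->refs++;
    }
}

/* Write one byte; copy the page first if it is still shared. */
static void sim_write(struct proc *p, int pageno, int offset, char value)
{
    assert(pageno >= 0 && pageno < NPAGES);
    assert(offset >= 0 && offset < PAGE_SIZE);
    struct pte *e = &p->table[pageno];

    if (e->cow) {
        if (e->page->refs > 1) {
            struct page *copy = malloc(sizeof *copy);
            assert(copy != NULL);
            memcpy(copy->data, e->page->data, PAGE_SIZE);
            copy->refs = 1;
            e->page->refs--;    /* drop our share of the old page */
            e->page = copy;     /* point at the new private page */
        }
        e->cow = 0;
    }
    e->page->data[offset] = value;
}
```

After sim_fork, a write by the child to one page copies only that
page; the remaining pages stay shared, which is where the time saving
at fork comes from.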
.lp
Specific recommendations for speeding up the fork operation will be
discussed in the next section.
