You asked for it...

How does this infernal contraption work?

Forward

First off, if you really want to play with this thing then you should probably grab the source code in this tarball. I will link to the files out of this document, but they are all located in the tarball for your viewing convnience. Note that in the links, some of the file extensions have been changed so that the web server will allow them to be exported. The files in the tarball are the actual files used by the scheduler.

I must appologize for all of this ugly code. As usual I waited until the last minute (or the last evening, anyway) before writing it, and it shows. As you will see I made several choices which were based more on time than on correctness. There are several places with vast room for improvement.

Overview

As you have most likely noticed, the scheduling simulator has a CGI front end. It consists of several CGI scripts and some executable files that, when put together, somehow manage to produce meaningful output. Once compiled the executables should be able to run anywhere, but the CGI scripts pretty much depend on a Unix-like environment.

The main script to drive everything is called sched.cgi. It is responsible for parsing the input, spitting out the HTML code and generating a list of simulated processes. The simulated process list is stored in four identical files which are used by the scripts that generate the images.

The image generation scripts run their input files through the program that does the actual scheduling simulation, and then pass the output through gnuplot to generate the graphs. There are two separate scripts, biggraph.cgi to generate the large graph and graph.cgi to generate the three smaller graphs.

Computer Science Stuff

If you don't have some computer science background then none of this will probably make any sense, but then only someone with a bit of CS knowledge is likely to be interested in a "CPU Scheduling Algorithm Simulator" anyway.

The basic premise of this simulator is that a computer sits around waiting for something interesting to do. Interesting things to do are called 'jobs'. When there is only one job waiting to be run, then it doesn't really matter how jobs are scheduled. Things start to get interesting when there are multiple jobs all waiting for time on the CPU.

An explanation of the way different scheduling algorithms work and the trade-offs between them is really beyond the scope of this document. So, I have put it in this document. It will suffice to say here that the purpose of the scheduling simulator is to provide a nice graphical comparison of the performance of the three most common scheduling algorithms with the same set of jobs.

The Main Script

The main sched.cgi script is mostly standard CGI stuff. Most of the script deals with input parsing and verification, which doesn't really have anything to do with the core functionality of the simulator. There are two really good parts to the script: The part where it generates the simulated jobs and the part where it spits out the HTML code for the images.

The Job Simulation

The simulated jobs are generated by a little C program called, surprisingly enough, jobgen. There is nothing fancy about the output, it is just a stream of random integers separated by linefeeds. A zero indicates that no new jobs are submitted in this time quantum. A nonzero value indicates that a job is submitted, and the value indicated the number of time quantums that the job will run for.

Interfacing to the Rest of the Scheduler

Once the job list is generated, the main script finishes spitting out the HTML code and exits. Each of the images is generated by another script using a client-pull methodology. This is great in that the images are generated on demand, but there are some problems with it as I will explain in a minute. The image generation scripts get called using a neat little bit of HTML code that looks something like this: <IMG NOSAVE SRC="graph.cgi?1+/tmp/jobs.tmp.71655.1"> This causes a CGI script to be run to generate each image on the fly. The command line options for the script are embedded in the URL as per the CGI specification.

The astute reader will notice that this creates a great bug via which the scheduler leaks files. Since the main script leaves files lying around in /tmp for the image generation scripts to use, it is up to those scripts to delete their input files. If some network glitch or user impatience causes the image generation scripts not to be called by the client web browser, then then files are never deleted. While this does in fact happen in the real world, it is not a frequent enough occurence that I have expended any time to fix it.

CGI has gotten a bad rap lately as being difficult to implement and even more difficult to debug. There is also much talk of security problems with CGI. Some of this is warranted and some of it is not. For Windows developers especially, CGI is probably a huge headache. I wouldn't really know, I am more of a Unix guy. I will say that there are many places where CGI winds up being a much more lightweight solution than some of the alternatives. At the time I wrote all of this there weren't really any alternatives to CGI in widespread use, so CGI was what I used.

Generating the Images

The two image generation scripts are biggraph.cgi and graph.cgi. They both work in similar ways. Basically, the sched program is called with appropriate command line options to simulate the various CPU scheduling algorithms. The output of sched is sent to gnuplot, which generates the graphs and sends the images to stdout.

The `sched` Program

The sched program consists of the following files:

linkedlist.h: A simple linked list implementation, using C++ templates
main.cpp: Handles most of the needed functionality
procs.h: Defines the Process Control Block structure
sched.cpp: The core scheduler functionality
sched.h: Defines said core functionality

The operation of the program should be more or less clear on inspection. Basically what happens is that the the program instantiates an object for the desired scheduling algorithm, which is specified on the command line. It then starts looking at the input on stdin, taking each line as a time quantum as described above. Every time quantum it decrements the time remaining for the first process in the run queue, and then calls the selected scheduler algorithm. It continues to run until there is no more input and there are no more processes left in the run queue. After all is said and done, it dumps its statistics to stdout.

The output from sched consists of three sections. The first section is an integer on a line by itself corresponding to the length of the longest process in this simulation. The second section is a single line with the average number of processes in the run queue for this simulation, expressed to two decimal places. The third section is a multiline histogram of the average total run time for each job sorted by job length. Each line in the histogram has four numbers: The job length, the average run time for jobs of that length, and the minimum and maximum run times for jobs of that length.

More to come....