First, a basic overview of what CPU scheduling is all about. Computers are funny things. Most of the time, especially with home computers, the CPU is sitting around doing nothing at all. When the CPU does have something to do, more often than not there are a whole bunch of things all going on at once. Very seldom is there exactly one task for the CPU to work on. When the system is lightly loaded, it doesn't much matter what scheduling algorithm is used, any task that comes along will get all the CPU cycles it wants. Things only get interesting when there are a whole bunch of things going on at once.
Now would be a good time to define some terms. I will presume that the reader is familiar with the basic stuff like 'program', 'task', and 'process' which all mean the same thing for the purposes of this discussion. The two terms I want to describe now are 'throughput' and 'latency'. Throughput is a measurement of how much real user work the computer actually gets done. This isn't talked about very often in the personal computer world but it is important for large scientific and industrial computing systems. If you are a large company paying for a big mainframe that sucks up lots of electricity and requires a staff of employees to take care of it, then you want to make sure that every hour you are paying to have this monster computer turned on it is accomplishing as much work as possible. So, you want it to have high throughput. Latency is more applicable to personal computers. The term latency means the same thing as the term 'response time', which is the amount of real time elapsing before a user action results in something that the user can see on the screen. Latency is the main reason that programs like Microsoft Word have splash screens. If the user clicks on the Word icon and nothing happens, they will click on it again. They have then told the computer to start the same program twice which will only make the computer slower. So, Word throws up a splash screen while it loads to reduce the latency: The user sees that their action has done something, and they will patiently wait around while the application starts up. This applies to other user interface stuff as well. On a very heavily loaded system, sometimes when you move the mouse it will take a couple of seconds for the pointer on the screen to move as well. The system can be said to have high latency.
Now on to the scheduling algorithms...
Most modern computer systems use a variation on the Round-Robin scheduling algorithm. The term 'preemptive multitasking' basically means the same thing. The way that the RR algorithm works is that the operating system lets each task run for a certain amount of time, called a time quantum, and then interrupts that task and passes control of the CPU on to some other task. For interactive systems, this scheme is generally best. As we will see when we get to multilevel queues things are a bit more complicated in the real world, but the RR algorithm is sort of the mother of all multitasking systems. For an interactive system this scheduling policy ensures that every application gets some time on the CPU, and it creates the illusion for the user that all of this stuff really is going on at the same time even though the CPU can really only work on one thing at once. Round-robin scheduling reduces latency on heavily loaded systems, because every new task gets a little bit of CPU time right along with all the other tasks, and it can display a splash screen or print out a message that says "Please wait" or do something else that lets the user know that their program is really running. The disadvantage to the Round Robin algorithm is that it has lower throughput. Every time the operating system interrupts a running task and gives the CPU to another task, it takes up a little but of CPU time. This happens hundreds of times every second. All of this time that the operating system is thinking about what process is going to get the CPU next, no real user work is getting done. Remember the big company mainframe that I talked about earlier? Well, some manager is going to get really mad when he hears that his expensive computer spent a hundred billion CPU cycles last month doing no work at all. In reality the CPU doesn't really take up all that much time to switch tasks, but there are other concerns such as memory usage as well. So Round-Robin has good latency or response time, but the throughput isn't so good.
The other two algorithms that the simulator does are FCFS (also called FIFO) and SJN (also SJF). In order of appearance, these stand for First-Come-First-Served, First-In-First-Out, Shortest-Job-Next, and Shortest-Job-First. None of these methods is generally used on home computers, although the Linux scheduler can run this way if you tell it to. The FCFS algorithm is the most basic one, and it is similar to the way older systems such as MS-DOS and the original Mac OS worked. Basically, the first program that comes along gets the CPU. No other process can have it until the first process is done. When it finishes, control passes to the next process in line. That's it. This results in the highest possible throughput because the currently running process gets 100% of the CPU time until it is finished. Which makes the manager of the expensive corporate mainframe really happy, but it pisses off the computer operators. Imagine that the computer is running a big statistical analysis that takes an hour to complete. Until that task is done, absolutely nothing else can happen. The mouse pointer can't move, the keyboard can't be typed on, and nobody can surf the web or play Quake III. Nowadays computers are seldom run like this because CPU time is reasonably cheap, but it used to be more common. Clever operators used to set up their big systems to run with a round-robin timesharing scheme during the day when people were at work using the system, and then at night the system would switch over to FCFS so that it would get as much work done as possible while there were no users requiring interactive applications. Okay, so FCFS has great throughput and terrible latency.
Somewhere along the line, someone got the great idea that they wanted to have great throughput, and have low latency as well. The most obvious way to do this is to run the shortest job before all the other jobs, so that it will finish as quickly as possible and get out of the way. This method is called Shortest Job Next. It is a really great idea, but it requires that you actually know how long each task is going to take. This can work for industrial or scientific computer systems because frequently you actually do know how long something will take. If you have some statistical program to analyze 100,000 survey responses, there are methods for working out how long it will take. The same thing goes for, say, scientific analysis of radio telescope data. Most of the time however there is no way to know how long a program will take. So SJN has good throughput and good latency, the only catch is that it is basically impossible. It does work well for special applications like the examples I mentioned above and for things like printer queues. Since printer queues generally spool jobs before sending them to the printer, the system can look at all the pending print jobs and pick the one with the fewest bytes in it to send to the printer next.
That covers the three basic scheduling algorithms. There are of course variations and combinations of these. For example, some books talk about HRRN or Highest Response Ratio Next. HRRN is a variation on SJN to solve a problem whereby long tasks may never get CPU time. If you imagine a system running with the SJN algorithm that has a steady stream of processes coming in, it may be the case that a really long process never runs because there is always a shorter task waiting for the CPU. The HRRN algorithm fixes this by adjusting the priority of processes which are waiting to be run. If a process which will take a long time waits around for a while as a bunch of shorter processes come and go, the system shortens the time that the scheduler thinks the long process will take. This of course doesn't make the long process complete any faster, but it does make it more likely to be scheduled. This repeats as long as the big job is waiting. Eventually, as the scheduler thinks the long job will be shorter and shorter, it is guaranteed to get the CPU. Just how long this will take depends on how the system is designed.
Lastly, there is the multilevel queue that I mentioned earlier. Multilevel queues are a timesharing scheme like Round-Robin, but there are several different RR queues instead of just the one. Each queue has time quantums of different lengths. As processes age, they are moved to a different queue. Say a system has three different queues. Processes in the first queue get one time quantum each time they run. For short processes this is enough, and they will finish quickly without ever leaving the first queue. If a process stays in this queue for a while and doesn't finish, then the system moves it to the second queue where is gets two time quantums each time it runs. After even longer, it is moved to the final queue where it gets for example five quantums each time it runs. The point of this is that short processes will be switched often creating the illusion of concurrent multitasking, whereas long processes get switched less often to reduce the amount of CPU time the system uses up and increasing throughput for those long tasks.
As I mentioned before, there are all sorts of variations on these basic algorithms. The thing to keep in mind is that the more complicated a scheduling algorithm gets, the more it lowers system throughput. It may be possible to write a really fancy scheduling algorithm that calculates the precise amount of time each task will take and determines its need for user interaction and then schedules tasks based on some fancy mathematical equation. If it takes 50% of the available CPU cycles to do it however, then the computer will have pretty poor performance overall.