CS 350
Project 3 – 10/11/03
Report Due: Monday 10/27/03 after class
Typo in shmat fixed, typo in “In general:” fixed
Objective:To study concurrent process synchronization (or the lack of it) using semaphores and shared memory. The “bounded buffer” problem will be considered for three cases: (1) single producer and single consumer unsynchronized, (2) single producer and single consumer with synchronization, and (3) multiple producers and multiple consumers with synchronization.
You may use any UNIX or LINUX system for these exercises, but your final program must be able to be testable on UNIX. You must check all significant system calls return values for error conditions and take appropriate action (see list of calls at end of handout). Penalties will be given for not doing these checks. Review project 1 for some basics on doing UNIX programming. All code is ANSI C.
In general:
Consider problems 1 and 2 as “warm-up” exercises leading to problem 3 that is the main focus of this project. In both problems 1 and 2, write a program that spawns a child using the fork call. The parent process will be the “producer” and the child process will be the “consumer”. Problem 1 is unsynchronized, and problem 2 will be synchronized using semaphores. A common shared memory “circular” buffer will be used to pass a produced byte of data from producer to consumer. The buffer will consist of an array 10 char types. A char type may be interpreted as a 1 byte signed integer and heretofore will be simply referred to as an integer. Each array element is a “buffer slot”. The data from producer to consumer will be an integer in this shared memory. The producer will deposit integers from the sequence 0, 1, 2, …, MAX-1 into the circular buffer in that order. When the integer MAX-1 is produced, the producer will terminate (“gracefully” of course). In this project assume MAX is 120, the last integer produced will be the MAX-1, and the total number of integers produced is MAX. The producer will also display the integers as they are produced. The consumer will cycle through the buffer slots, copying and displaying the integers and also inserting them into its own private integer array of length MAX Call this array “results[]”. After all integers have been consumed, the consumer will display the resulting array of integers it had consumed. If all is synchronized, this array should be identical to the sequence generated (both values and ordering).
Problem 3 is based on problem 2 with the main difference being that there will be an arbitrary number of producers and consumers. You will now prompt the user to enter number of producers, the number of consumers and the number of one byte integers that the producer will generate. Limit the number of producers and consumers to five each, and the maximum integer to be generated to 119 (MAX-1) as in parts 1 and 2. The original parent process will spawn the specified number of children processes (producers and consumers) using the fork call. The original producer (parent) should become the first producer after the forking is done. In addition in the forking loops, the original producer will create all other producers, and consumers. All producers will concurrently generate the integers from 0 to MAX-1 in the buffer. Once an item is produced by one of the producers, it cannot be re-produced by any other producer. All producers will ideally produce a mutually exclusive subset of the required integers. All consumers will concurrently extract and “consume” the produced data from the shared circular buffer. Once an item is consumed by one of the consumers, it cannot be re-consumed by the other consumer. All consumers will ideally consume a mutually exclusive subset of the integers that were produced and will store them in its own results[] array and display them as in problems 1 and 2. The totality or “union” of these subsets for either producers or consumers must be precisely the same as the required integers required to be created (0, 1, … MAX-1). The shared buffer index, pointed to by “outp”, is the buffer index of the next item to be consumed. outp is shared by all consumers. For example, the value pointed to by outp is *outp. The shared buffer index pointed to be “inp” is the buffer index of the next item to be consumed. inp is shared by all producers. For example the alue pointed to by inp is *inp. In addition, there will be two more shared variables: pr_countp is a pointer to an implicit variable which will be maintained (shared) by all producers in order to determine when all items have be produced, and whose value represents the total number of items produced by all producers so far. co_countp will be maintained by all consumers in order to determine when all items have been consumed, and whoxe value represents the total number of items consumed by all consumers so far. Thus, these shared variable will have to be carefully protected by a mutex semaphore. pr_count is incremented after every production by any of the producers. When count equals to MAX, all items have been consumed, and it is time for all producers to quit. co_count is incremented after every consumption by any of the consumers. When count equals to MAX, all items have been consumed, and it is time for all consumers to quit. The original parent (first producer) which is the parent of all producers and consumers will be responsible for waiting for all children (consumers and children producers) to complete (using the wait call) The parent will then return all system resources (semaphores and shared memory) back to the system. See details below.
Details of the Project:
Problem 1.
The producer and consumer are not synchronized. The producer should loop until it has deposited all the MAX integers in the buffer slots without regard for the possibility that the buffer may become full. It will then wait for termination of the consumer, return the shared memory to the system, and then terminate. The consumer also loops concurrently for MAX cycles reading integers from the buffer without regard for the possibility that all he data from the buffer may already have been consumed..
Problem 2.
Synchronize the producer and consumer using semaphores so that the producer will not produce anything if all ten buffer slots are full (occupied), and the consumer will not consume anything if all buffer slots contain no valid data (are empty, ie not occupied). The action will be the same as in problem 1, but this time it will be synchronized, in addition, all semaphores as well as shared memory must be returned to the system before the producer terminates. Figures 7.12 and 7.13 in he text gives pseudocode for this function. Be careful that the consumer does not consume for more that MAX cycles, because if it goes for more than MAX cycles it should block on empty buffer, and the program will have to be interrupted (control c), and all borrowed resources will have to be manually returned to the system (using ipcsfree).
Problem 3.
After the user inputs the number of producers, consumers and the number of integers to be produced, you should write a “fork loop” similar to that used in project 2. This loop will create the specified number of producer and consumer processes. The original parent will be the first producer (id of 0). Any particular producer process will freely try to produce data in the buffer in competition with other producers. Any particular consumer process will freely try to consume data from the buffer in competition with other consumers. The critical section for all processes will be where data is beint produced and consumed, and where shared variables are accesses. There must be mutual exclusion among all producers and all the consumers. In addition, as in problem 2, producers must block when the buffer gets full (10 items generated without being consumed), and any consumer must block if the buffer goes empty. Counting semaphores are used to control blocking on full and empty. In this problem, you not only must synchronize the consumers and producers, but you must also synchronize the consumers with each other, and producers with each other. As each producer cycles through the circular buffer, any item produced by a producer may no longer be produced again by any of the producers. As each consumer cycles through the circular buffer, any item consumed by a consumer, is no longer available for consumption by any other consumer. A single mutex type semaphore should be able to achieve mutual exclusion among all the processes in this problem. The mutex from problem 2 should be sufficient. In addition, you will need 4 shared variables referenced by pointers: inp, outp, pr_countp, and co_countp described above. inp and outp which point to buffer locations each gets incremented by one mod SIZE (SIZE is set to 10). You will also need one more shared variable pointer, next_prodp. next_prodp is used to keep track of the next integer to produce by all the producers. *next_prodp can be an index into a preloaded array of integers to be produce, or
*next_prodp can be the integer itself, in each case the various producers will have to increment *next_prodp.
These variables are created the same way that the shared buffer was created. When you create these shared variables, UNIX will give you only pointer reference to these variables, which may be called outp,pr_countp, etc. See tutorial below. The parent should also share these variables, because it will be the parent’s job to initialize these variables to zero before any productions take place. Remember that these pointer variables are not the variables themselves, but only pointers to these quantities. Thus to use them you must de-reference them as: *outp, and *pr_countp, etc., for example: *outp = *outp + 1; increments the out variable. All manipulation or accesses to these variables must be protected by the mutex semaphore and should be declared globally as volatile variables. Any semaphore related variables and constructs should also be global. The shared memory pointers variables and semaphores should be created and initialize before any forks by the parent. All child processes will automatically “inherit” these global variables since forked processes are clones of the parent.
There is one fundamental difference between problems 2 and 3. In problem 2, quitting was easy. The producer ran for exactly MAX cycles and quit. The consumer “knew” in advance how many integers would be produced and consumed for exactly MAX cycles and quit. It quit exactly when the buffer emptied. If it ran for MAX+1 cycles, it would have hung, because it would encounter an empty buffer and the semaphore fullsem would cause it to permanently block (hang) since the producer quit.
In problem 3 quitting is not so simple. Each of the producers do not know in advance how many integers it would produce, and each of the consumers do not know in advance how many integers it would consume. The number of produced and consumed items per process is random for any producer or consumer. Thus all producers must increment *pr_countp each time an item is produced. When *pr_countp hits MAX, say for producer n, producer n quits, Before quitting, producer n must make some arrangements for all other producers and the consumers to quit. Once the last item is produced, there is really no reason for the “full” and “empty” semaphores. In fact once the first process quits, there is a possibility that remaining processes can hang on the full or empty semaphores and there being no process remaining to issue the releasing signals. The following is a suggested procedure for gracefully ending the program:
Producers: Each producer is in an infinite loop. Each pass through the loop is controlled by semaphores as in part 2 of the project. Once in the critical section, the producer tests to see if pr_countp is less than the total number of items to be produced (MAX). If true the item is produced as in part 2, and *pr_countp is incremented. If false (ie., *pr_countp is equal to MAX), it is quitting time. But before doing a break out of the loop, it must first disable the “empty” and the “full” semaphores. Since production is complete, any other producers which may blocked on empty, will automatically pass through and detect the complete condition and also quit. Disabling the full semaphore will make sure that no consumer will hang on full. One way (suggested) of disabling the semaphores, is to “saturate” the semaphores with signals by signaling on them in a tight loop which loops a large number of times. Looping for as may cycles as the total number of items produced will work, although it is overkill. Maintain mutual exclusion during this exiting process by signaling on mutex only once before calling break. Remember that even though the producers have completed production, the consumers will continue for a while until the buffer is emptied.
Consumers: The consumers are also in an infinite loop similar to the one for the producers. Once in the critical section, the consumer tests to see if *co_countp is less than the total number of items to be produced (MAX). If true the item is consumed as in part 2, and *co_countp is incremented. If false it is quitting time (ie., *co_countp is equal to MAX). Since both the empty and full semaphores were already disabled by the producers, it is not necessary to do that here. All the consumer has to do is to maintain mutual exclusion during this exiting process by signaling on mutex only once before calling break.
Finally to randomize the consumer action, you should insert the following random delay in each of the consumer loops outside of all semaphore protected areas: sleep((rand()/10000));
Functions to be written: The main function should contain the calls which acquire and initialize all semaphores and shared memory/variables. It will also contain the forking loop. Package the children which are to be the producers as a function, void producer(int idp), and the children which are to be consumers as a function
void consumer(int idc) where the parameters idp and idc is passed to each producer and consumer so you can distinguish one producer (or consumer) from another. Let the original parent after it exits the fork loop be producer 0 (idp=0). After producer(0) exits, it will be in the original parent process and should continue waiting for all other children to complete by calling wait(NULL) calls for each one; it will then return all UNIX system resources and exit also In addition you will write the semaphore wait and signal functions p() and v() respectively as given in this handout.
For all project parts, after debugging these programs, run them a number of times observing their behavior. Write a description and explanation of the behavior, especially the first program in part 1. Capture sample runs for parts 1 and 2, and for part 3 capture the following cases in files:
number of producers number of consumers number if integers produced
5560
5260
2560
1160 (should look like part 2)
Make hardcopy/softcopy of results. Use the logging feature of telnet or its equivalent to log your runs.
A brief tutorial for using semaphores and shared memory in a UNIX environment in the context of this project::
You will need the following outside of main():
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h> /* for shared memory */
#include <sys/sem.h> /* for semaphores */
#include <sys/wait.h>
#include <sys/errno.h>
extern int errno; /* for reporting additional error information */
Shared Memory Functions:
To obtain buffer and variable memory for sharing:
#define SIZE 10 /* size of the shared buffer */
#define VARSIZE 1 /* size of shared variable=1byte*/
#define SHMPERM 0600 /* shared memory permissions */
int segid; /* id for shared memory bufer */
int outid; /* id for shared variable out, prob. 3 only */
int pr_countid; /* id for shared variable count, prob. 3 only */
/* in “pre-fork” parent code: */
segid = shmget (IPC_PRIVATE, SIZE, IPC_CREAT | IPC_EXCL | SHMPERM );
outid =shmget(IPC_PRIVATE,VARSIZE,IPC_CREAT|IPC_EXCL|SHMPERM); /* prob 3 only*/
pr_countid=shmget(IPC_PRIVATE,VARSIZE,IPC_CREAT|IPC_EXCL|SHMPERM); /* prob 3 only*/
/* similar calls for co_countid, and next_prodid … prob 3 only */
/* call shmget in parent before the fork */
Before the shared memory can be used, it must beattachedie., a pointer reference must be acquired in order to use it. If the pointers are declared globally, as below, then the parent and all spawned processes will inherit these pointers and thus can use the shared memory.
volatile char * buff; /* buff[ ] is the circular array used by parent and children in all problems */
volatile char* outp; /* prob. 3 only – pointer to out shared variable */