In programming assignments 3-5, you will be implementing a simple DBMS. For an overview of how you will construct the DBMS, read this page. For this assignment you will implement a simplified version of the Buffer Manager layer, without support for concurrency control or recovery. You will be given the code for the lower layer, the Disk Space Manager.
You will carry out this assignment, and subsequent ones, in teams of three. The teams will be selected by the instructor; you can find your team assignment here. You should begin by reading the chapter on Disks and Files to get an overview of buffer management. This material will also be covered in class. Your Buffer Manager code will provide an interface between the Disk Space Manager and the upper levels of code. The Disk Space Manager, implemented by the class DB, provides basic routines for allocating and deallocating file pages. We will not be implementing this class ourselves because this aspect of a system tends to be very system specific (also, we will be using a very simple model of pages). The Buffer Manager provides a simple set of routines for obtaining, pinning, and unpinning pages that are associated with a file (in later assignments we will be adding layers of the DBMS that organize pages into files). You should familiarize yourself with the routines available for low-level processing by reading db.h carefully.
To make a local copy of the code you need to implement, download the file bm.tar.Z. This is a compressed tar archive. To unpack it, do the following:
% uncompress bm.tar
% tar xvf bm.tar
This will create a directory BufMgr that contains the code provided to you as well as skeletons for the code you need to write. This code comes with a provided make file makefile. It is unlikely you will need to modify this file (the only files you should need to change are buf.h and buf.C). The code comes with two testing programs, db_tester and bm_tester. These test programs are automatically constructed by the makefile by simply typing "make" (you can remake each one by typing "make db_tester" and "make bm_tester"). db_tester focuses primarily on testing the Disk Space manager code (though it needs the Buffer Manager code to be working) and bm_tester primarily tests the Buffer Manager code. To recompile all of the code you first type "make clean" which will eliminate all current .o and executable files and then type "make" again to recompile.
The simplified Buffer Manager interface that you will implement in this assignment allows a client (a higher level program that calls the Buffer Manager) to allocate/de-allocate pages on disk, to bring a disk page into the buffer pool and pin it, and to unpin a page in the buffer pool.
The methods that you have to implement are described below (the skeletons for this class can be found in buf.h and buf.C):
class BufMgr {
public:
    // Allocate pages (frames) for the pool in main memory.
    BufMgr(int numbuf);

    // Should flush all dirty pages in the pool to
    // disk before shutting down and deallocate the
    // buffer pool in main memory.
    ~BufMgr();

    // Check if this page is in buffer pool. If it is, increment the pin_count
    // and return a pointer to this page. If the pin_count was 0 before the
    // call, the page was a replacement candidate, but is no longer a candidate;
    // be sure to remove this from the LRU list of candidates.
    // If the page is not in the pool, choose a frame (from the set of
    // replacement candidates) to hold this page, read the page (using
    // the appropriate DB class method) and pin it.
    // Also, must write out the old page in chosen frame if it is dirty
    // before reading new page. (You can assume that emptyPage == 0 for
    // this assignment.)
    Status pinPage(PageId PageId_in_a_DB, Page*& page, int emptyPage=0);

    // Should be called with dirty==TRUE if the client has modified the page.
    // If so, this call should set the dirty bit for this frame. Further,
    // if pin_count > 0 should decrement it, and if it becomes zero, should
    // update the LRU list by adding an entry for this frame.
    // If pin_count == 0 before this call, return error.
    Status unpinPage(PageId globalPageId_in_a_DB, int dirty=FALSE);

    // Call DB object to allocate a run of new pages and
    // find a frame in the buffer pool for the first page
    // and pin it. (This call allows a client of the Buffer Manager
    // to allocate pages on disk.) If buffer is full, i.e., you
    // can't find a frame for the first page, ask DB to deallocate
    // all these pages, and return error.
    Status newPage(PageId& firstPageId, Page*& firstpage, int howmany=1);

    // This method should be called to delete a page that is on disk.
    // This routine must call the DB class to deallocate the page.
    Status freePage(PageId globalPageId);

    // Used to flush a particular page of the buffer pool to disk
    // Should call the write_page method of the DB class
    Status flushPage(int pageId);

    // Flushes all pages of the buffer pool to disk
    Status flushAllPages();

    // Total number of Buffers
    unsigned int getNumBuffers();

    // Count of the number of unpinned Buffers
    unsigned int getNumUnpinnedBuffers();
};
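The pin-count rules in the comments for pinPage and unpinPage can be sketched in isolation, without any disk I/O. The fragment below is only an illustration: SketchStatus, SketchFrame, inLruList, pinResident, and unpinResident are hypothetical names, not part of the provided code, and the real Status type and error codes come from new_error.h. It also assumes that the pin_count == 0 error is checked before the dirty bit is touched, which the comments do not spell out.

```cpp
#include <cassert>

// Hypothetical stand-ins: the real Status type and error codes come from
// the course's new_error.h, which is not reproduced here.
enum SketchStatus { SKETCH_OK, SKETCH_ERROR };

// Hypothetical per-frame bookkeeping used only in this sketch.
struct SketchFrame {
    int  pin_count;   // number of outstanding pins on the page in this frame
    bool dirtybit;    // true once any client reports a modification
    bool inLruList;   // true while the frame is a replacement candidate
};

// Bookkeeping for pinning a page that is already in the pool.
void pinResident(SketchFrame& f) {
    if (f.pin_count == 0) {
        f.inLruList = false;  // it was a candidate, but is no longer one
    }
    ++f.pin_count;
}

// Bookkeeping for unpinning a page.
SketchStatus unpinResident(SketchFrame& f, bool dirty) {
    if (f.pin_count == 0) {
        return SKETCH_ERROR;  // unpinning an unpinned page is an error
    }
    if (dirty) {
        f.dirtybit = true;    // remember that the page must be written back
    }
    --f.pin_count;
    if (f.pin_count == 0) {
        f.inLruList = true;   // the frame becomes a replacement candidate
    }
    return SKETCH_OK;
}
```

Note that an unpin with dirty == FALSE never clears a dirty bit set by an earlier unpin; the bit is only reset once the page has been written to disk.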
The buffer pool is a collection of frames (page-sized sequence of main memory bytes) that is managed by the Buffer Manager. It should be stored as an array bufPool[numbuf] of Page objects.
In addition, you should maintain an array of buffer descriptors, one per frame. Each descriptor is a record with the following fields: page number, pin_count, and dirtybit.
The pin_count field is an integer, page number is a PageId object, and dirtybit is a boolean. This describes the page that is stored in the corresponding frame. A page is identified by a page number that is generated by the DB class when the page is allocated, and is unique over all pages in the database. The PageId type is defined as an integer type in minidb.h.
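One possible layout for a descriptor is sketched below. The field types follow the description above; the struct name FrameDesc, the field name pageNo, the INVALID_PAGE sentinel, and the helper countUnpinned are assumptions made for illustration and do not come from the provided headers. The typedef stands in for the definition in minidb.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef int PageId;              // stand-in for the integer typedef in minidb.h
const PageId INVALID_PAGE = -1;  // hypothetical marker for an empty frame

// One descriptor per frame, describing the page currently held there.
struct FrameDesc {
    PageId pageNo;     // page number assigned by the DB class
    int    pin_count;  // number of clients currently using the page
    bool   dirtybit;   // true if the page was modified while in the pool

    FrameDesc() : pageNo(INVALID_PAGE), pin_count(0), dirtybit(false) {}
};

// Hypothetical helper: counts frames that no client has pinned, which is
// the quantity getNumUnpinnedBuffers() is described as reporting.
unsigned int countUnpinned(const std::vector<FrameDesc>& descs) {
    unsigned int n = 0;
    for (std::size_t i = 0; i < descs.size(); ++i) {
        if (descs[i].pin_count == 0) {
            ++n;
        }
    }
    return n;
}
```

The descriptor array is indexed by frame number, so descs[i] always describes bufPool[i].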
A simple hash table should be used to figure out what frame a given disk page occupies. The hash table should be implemented (entirely in main memory) by using an array of pointers to lists of <page number, frame number> pairs. The array is called the directory and each list of pairs is called a bucket. Given a page number, you should apply a hash function to find the directory entry pointing to the bucket that contains the frame number for this page, if the page is in the buffer pool. If you search the bucket and don't find a pair containing this page number, the page is not in the pool. If you find such a pair, it will tell you the frame in which the page resides.
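The directory-and-bucket structure described above might look like the following sketch. The names PageTable, Pair, insert, lookup, and remove are illustrative choices, the table size of 7 is arbitrary, and the hash shown assumes non-negative page numbers; none of this is taken from the provided skeleton.

```cpp
#include <cassert>
#include <cstddef>

typedef int PageId;    // stand-in for the integer typedef in minidb.h
const int HTSIZE = 7;  // illustrative size only; any prime would do

// One <page number, frame number> pair in a bucket's linked list.
struct Pair {
    PageId pageNo;
    int    frameNo;
    Pair*  next;
};

class PageTable {
public:
    PageTable() {
        for (int i = 0; i < HTSIZE; ++i) dir[i] = NULL;
    }
    ~PageTable() {
        for (int i = 0; i < HTSIZE; ++i) {
            while (dir[i] != NULL) {
                Pair* dead = dir[i];
                dir[i] = dead->next;
                delete dead;
            }
        }
    }
    // Record that page p now lives in the given frame.
    void insert(PageId p, int frame) {
        Pair* n = new Pair;
        n->pageNo = p;
        n->frameNo = frame;
        n->next = dir[hash(p)];
        dir[hash(p)] = n;
    }
    // Return the frame holding page p, or -1 if p is not in the pool.
    int lookup(PageId p) const {
        for (const Pair* e = dir[hash(p)]; e != NULL; e = e->next) {
            if (e->pageNo == p) return e->frameNo;
        }
        return -1;
    }
    // Forget page p (for example, when its frame is reused); false if absent.
    bool remove(PageId p) {
        Pair** link = &dir[hash(p)];
        while (*link != NULL) {
            if ((*link)->pageNo == p) {
                Pair* dead = *link;
                *link = dead->next;
                delete dead;
                return true;
            }
            link = &(*link)->next;
        }
        return false;
    }
private:
    // Assumes p >= 0; the constants 3 and 1 are arbitrary example values.
    static int hash(PageId p) { return (3 * p + 1) % HTSIZE; }
    Pair* dir[HTSIZE];                     // the directory of bucket heads
    PageTable(const PageTable&);           // copying is disabled
    PageTable& operator=(const PageTable&);
};
```

When a frame is chosen for replacement, the old page's pair should be removed and a pair for the new page inserted, so the table always matches the contents of the pool.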
The hash function must distribute values in the domain of the search field uniformly over the collection of buckets. If we have HTSIZE buckets, numbered 0 through HTSIZE-1, a hash function h of the form h(value) = (a*value+b) mod HTSIZE works well in practice. HTSIZE should be chosen to be a prime number.
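As a small illustration of the formula above, the sketch below picks example constants HTSIZE = 11, a = 3, and b = 5; these values, and the name hashPage, are assumptions chosen for the example rather than requirements of the assignment.

```cpp
#include <cassert>

typedef int PageId;      // stand-in for the integer typedef in minidb.h
const int HTSIZE = 11;   // example prime table size
const int HASH_A = 3;    // example multiplier, not a multiple of HTSIZE
const int HASH_B = 5;    // example offset

// h(value) = (a*value + b) mod HTSIZE, always in the range 0..HTSIZE-1.
int hashPage(PageId value) {
    long long h = (static_cast<long long>(HASH_A) * value + HASH_B) % HTSIZE;
    if (h < 0) {
        h += HTSIZE;     // keep the result non-negative for negative inputs
    }
    return static_cast<int>(h);
}
```

Because HTSIZE is prime and a is not a multiple of it, any HTSIZE consecutive page numbers land in HTSIZE different buckets, which is one reason a prime table size is recommended.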
When a page is requested the buffer manager should do the following:
To implement the LRU replacement policy, you should maintain a counter that attaches a timestamp to a page whenever it is used (read into the buffer, pinned, or its data updated). When a frame is to be chosen for replacement, you should pick the frame containing the page whose timestamp is the oldest.
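A minimal sketch of timestamp-based victim selection follows. It assumes that only frames with pin_count == 0 are replacement candidates, as the pinPage comments state; the names FrameInfo, lastUsed, LruClock, and chooseVictim are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-frame record holding just what the policy needs.
struct FrameInfo {
    int           pin_count;  // frames with pin_count > 0 cannot be replaced
    unsigned long lastUsed;   // value of the counter at the most recent use
};

// A counter that hands out increasing timestamps.
class LruClock {
public:
    LruClock() : counter(0) {}
    unsigned long tick() { return ++counter; }
private:
    unsigned long counter;
};

// Return the index of the unpinned frame with the oldest timestamp,
// or -1 if every frame is pinned (the pool is full).
int chooseVictim(const std::vector<FrameInfo>& frames) {
    int victim = -1;
    for (std::size_t i = 0; i < frames.size(); ++i) {
        if (frames[i].pin_count != 0) {
            continue;  // pinned pages are never candidates
        }
        if (victim < 0 || frames[i].lastUsed < frames[victim].lastUsed) {
            victim = static_cast<int>(i);
        }
    }
    return victim;
}
```

A linear scan like this costs time proportional to the number of frames on every replacement. Keeping an explicit list of candidates ordered by last use, as the class comments suggest, avoids the scan at the cost of extra bookkeeping on each pin and unpin.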
Be sure to follow the error protocol described in new_error.h. Note that you will likely want to add new error codes and new error messages to the tables provided for you.
The make file compiles the code using the -g flag. This means that you can debug the executables produced using gdb. I have also set up the code with a command line debugging system. When running either of the test programs you can add command line arguments of db, bm, or gory. These turn on debugging flags in the Disk Space module (db), Buffer Manager module (bm) and some extended (gory details) flags (gory). Note that you may want to add debugging commands in your Buffer Manager code following this protocol.
Print out your versions of buf.h and buf.C. You should test your code using the test routines bm_tester and db_tester and print out the results. Next, write up a team report of how your code is implemented. This report should give an overview of how you completed the BufMgr class and any new classes you created. It should also discuss the algorithms you used to solve the problem. This report should be at least two pages long but no longer than four pages. Each team member should also write up an individual report (at least half a page but no more than a page) discussing their contributions to the coding process and how the overall team interaction went.
You must also submit your code electronically (but only once for each team). To do this, go to the link https://webapps.d.umn.edu/service/webdrop/rmaclin/cs4611-1-f2004/upload.cgi and follow the directions for uploading a file.
To make your code easier to check and grade please use the following procedure for collecting the code before uploading it:
rmaclin/prog03
Note that the suffix of all C++ code files (not .h files) should be ".cc". Only code files (only .cc and .h files) should be stored in this directory.
tar cf prog03.tar login/prog03