Assignment 5: Sort-Merge Join
Due: Friday, December 17, 1999 (NO LATE ASSIGNMENTS)
Instructor: Rich Maclin
40 points
In this assignment, you will implement the sort-merge join algorithm. You will carry out this assignment in teams with the same partner(s) as for the previous assignments.
You should begin by reading the chapter on Implementation of Relational Operations, in particular, the section on Sort-Merge Join.
class sortMerge {
  public:
    sortMerge(
        char       *filename1,      // Name of heapfile for relation R.
        int         len_in1,        // # of columns in R.
        AttrType    in1[],          // Array containing field types of R.
        short       t1_str_sizes[], // Array containing size of columns in R.
        int         join_col_in1,   // The join column number of R.
        char       *filename2,      // Name of heapfile for relation S.
        int         len_in2,        // # of columns in S.
        AttrType    in2[],          // Array containing field types of S.
        short       t2_str_sizes[], // Array containing size of columns in S.
        int         join_col_in2,   // The join column number of S.
        char       *filename3,      // Name of heapfile for merged results.
        int         amt_of_mem,     // Number of pages available for sorting.
        TupleOrder  order,          // Sorting order: Ascending or Descending.
        Status&     s               // Status of constructor.
    );
    ~sortMerge();
};
The sortMerge constructor joins two relations R and S, represented by the heapfiles filename1 and filename2, respectively, using the sort-merge join algorithm. Note that the columns of relation R are numbered from 0 to len_in1 - 1, and the columns of relation S from 0 to len_in2 - 1. You are to concatenate each matching pair of records and write the result into the heapfile filename3. The error layer for the sortMerge class is JOINS.
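The merge step can be easier to reason about on in-memory data first. The following is only a minimal sketch under stated assumptions, not the required implementation: Row and sortMergeJoin are hypothetical names, std::vector stands in for the heapfiles, and std::stable_sort stands in for the Sort class. It sorts both inputs on the join key, then merges them, rewinding to the start of the matching S group for each R row with the same key.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a tuple: a join key plus a payload.
struct Row {
    int key;
    std::string payload;
};

// Sorts both inputs on the join key (ascending), then merges them.
// Returns the concatenated payloads of every matching (R, S) pair.
std::vector<std::string> sortMergeJoin(std::vector<Row> r, std::vector<Row> s) {
    auto byKey = [](const Row& a, const Row& b) { return a.key < b.key; };
    std::stable_sort(r.begin(), r.end(), byKey);
    std::stable_sort(s.begin(), s.end(), byKey);

    std::vector<std::string> out;
    std::size_t i = 0, j = 0;
    while (i < r.size() && j < s.size()) {
        if (r[i].key < s[j].key) {
            ++i;                          // R is behind: advance R.
        } else if (r[i].key > s[j].key) {
            ++j;                          // S is behind: advance S.
        } else {
            const int key = r[i].key;
            const std::size_t mark = j;   // Start of the matching S group.
            // Pair every R row in the group with every S row in the group.
            for (; i < r.size() && r[i].key == key; ++i) {
                for (j = mark; j < s.size() && s[j].key == key; ++j) {
                    out.push_back(r[i].payload + s[j].payload);
                }
            }
            assert(j > mark);             // j now points just past the S group.
        }
    }
    return out;
}
```

In the heapfile setting, the role of mark would presumably be played by a saved RID, with the rewind done by repositioning the scan on S. For a Descending sort order the two comparisons would be reversed.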
You will need to use the following classes, which are given: Sort, HeapFile, and Scan. You will call the Sort constructor to sort a heapfile. To compare the join columns of two tuples, you should write a function tupleCmp that compares the key fields of two tuples. It should follow the strcmp return convention: a negative value if the first key is smaller, zero if the keys are equal, and a positive value if the first key is larger. Once a scan is opened on a heapfile, the scan cursor can be positioned at any record within the heapfile by calling the Scan method position with an RID argument. The next call to the Scan method getNext will proceed from the new cursor position.
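As an illustration of the strcmp-style return convention, here is one possible shape for the comparison. It is a sketch under assumptions: KeyKind is a hypothetical stand-in for the real AttrType from the Minibase headers, and the signature (a kind plus two pointers to the key fields) is chosen for the example only; your tupleCmp may need a different signature to locate the join column inside each tuple.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the attribute types; the real AttrType
// is defined in the Minibase include directory.
enum KeyKind { kInteger, kString };

// Compares the key fields that a and b point to.
// Returns a negative value, zero, or a positive value, like strcmp.
int tupleCmp(KeyKind kind, const void* a, const void* b) {
    assert(a != nullptr && b != nullptr);
    if (kind == kInteger) {
        int x, y;
        // memcpy avoids alignment problems when the key sits inside a record buffer.
        std::memcpy(&x, a, sizeof x);
        std::memcpy(&y, b, sizeof y);
        // Avoids x - y, which can overflow for large magnitudes.
        return (x > y) - (x < y);
    }
    // String keys are assumed to be NUL-terminated.
    return std::strcmp(static_cast<const char*>(a), static_cast<const char*>(b));
}
```

Returning only the sign for integers keeps the result well defined for any pair of values, while still matching what callers of a strcmp-like function expect.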
The structure files for the classes you will be using can be found in the directory:
/usr/local/minibase/minibase-2.0/include
You will need to copy files from the src directory for this assignment. To do this you need to follow the same steps as in assignments 1 and 2:
/usr/local/minibase/mini_hwk/assign/SM_Join/src
make setup
which will copy the appropriate files.
The files are:
You should turn in copies of your code together with copies of the output produced by running the tests provided.