Exercise 9

1 Cost Computations

Assume a table r with 10000 tuples and candidate key A whose range is [1 - 10000]. Assume 50 (fixed size) records per block and no buffering. Given the queries
  SELECT * FROM r WHERE A = 8000;
  SELECT * FROM r WHERE A <> 8000;
  SELECT * FROM r WHERE A > 8000;
  SELECT * FROM r WHERE A BETWEEN 8000 AND 9000;
Determine the number of I/Os for the following situations:
  1. no index, unordered file
  2. no index, ordered file
  3. secondary B+-tree index on r(A), assume a maximum fanout of 100
  4. primary B+-tree index on r(A), assume a maximum fanout of 100
  5. hash index on r(A)

2 Empirical Performance Measurements

The measurements shall be run on a mini-database that reflects the structure of the Wisconsin Benchmark.

2.1 Setup & Initialization

First, you have to create and populate two tables. To create the tables execute the DDL statements that are available via the world wide web (create_tables.sql). Tables r1 and r2 have the following characteristics. To load data use copy the files in the data directory (ctl are control files, dat are data files). Run the load_data script from a unix shell to load the data into the tables you have created.

2.2 Test Queries

Assume the following pairs of semantically equivalent statements (see queries.sql on the web):
Q1:
a) select unique2 from r2 s where s.unique1 = 8530;
b) select unique2 from r2 s where s.unique3 = 8530;
Q2:
a) select avg(unique2) from r2 s where s.unique1 > 1000;
b) select avg(unique2) from r2 s where s.unique3 > 1000;
Q3:
a) select avg(unique1) from r2 s where s.unique1 between 200 and 900;
b) select avg(unique1) from r2 s where s.unique3 between 200 and 900;
Q4:
a) select avg(r.four+s.four) from r1 r, r2 s

  where r.unique1=s.unique1 and s.unique1=8530 and r.unique1 > 200;
b) select avg(r.four+s.four) from r1 r, r2 s

  where r.unique3=s.unique3 and s.unique3=8530 and r.unique3 > 200;
Q5
a) select count(*) from r1 r, r2 s

  where r.unique1=s.unique1 and r.unique1 > 700;
b) select count(*) from r1 r, r2 s
  where r.unique3=s.unique3 and r.unique3 > 700;
Fill out the following performance measurement table. Use set timing on; to measure the time.
 Query    a)      b)  % Difference  Explanation of Difference
Q1        
Q2        
Q3        
Q4        
Q5        



Best regards,
Kristian Torp