# Exercise 9

### 1 Cost Computations

Assume a table r with 10000 tuples and candidate key A whose range is [1 - 10000]. Assume 50 (fixed size) records per block and no buffering. Given the queries
```  SELECT * FROM r WHERE A = 8000;
SELECT * FROM r WHERE A <> 8000;
SELECT * FROM r WHERE A > 8000;
SELECT * FROM r WHERE A BETWEEN 8000 AND 9000;```
Determine the number of I/Os for the following situations:
1. no index, unordered file
2. no index, ordered file
3. secondary B+-tree index on r(A), assume a maximum fanout of 100
4. primary B+-tree index on r(A), assume a maximum fanout of 100
5. hash index on r(A)

### 2 Empirical Performance Measurements

The measurements shall be run on a mini-database that reflects the structure of the Wisconsin Benchmark.

#### 2.1 Setup & Initialization

First, you have to create and populate two tables. To create the tables execute the DDL statements that are available via the world wide web (create_tables.sql). Tables r1 and r2 have the following characteristics.
• |r1| = 8.000 tuples
• Secondary B+-tree index on r1(unique1)
• |r2| = 16.000 tuples
• Secondary B+-tree index on r2(unique1)
• Primary hash index on r2(unique1)
• The values of unique1 and unique3 are identical for each tuple.
To load data use copy the files in the data directory (ctl are control files, dat are data files). Run the load_data script from a unix shell to load the data into the tables you have created.

#### 2.2 Test Queries

Assume the following pairs of semantically equivalent statements (see queries.sql on the web):
Q1:
a) select unique2 from r2 s where s.unique1 = 8530;
b) select unique2 from r2 s where s.unique3 = 8530;
Q2:
a) select avg(unique2) from r2 s where s.unique1 > 1000;
b) select avg(unique2) from r2 s where s.unique3 > 1000;
Q3:
a) select avg(unique1) from r2 s where s.unique1 between 200 and 900;
b) select avg(unique1) from r2 s where s.unique3 between 200 and 900;
Q4:
a) select avg(r.four+s.four) from r1 r, r2 s

where r.unique1=s.unique1 and s.unique1=8530 and r.unique1 > 200;
b) select avg(r.four+s.four) from r1 r, r2 s

where r.unique3=s.unique3 and s.unique3=8530 and r.unique3 > 200;
Q5
a) select count(*) from r1 r, r2 s

where r.unique1=s.unique1 and r.unique1 > 700;
b) select count(*) from r1 r, r2 s
where r.unique3=s.unique3 and r.unique3 > 700;
Fill out the following performance measurement table. Use set timing on; to measure the time.
 Query a) b) % Difference Explanation of Difference Q1 Q2 Q3 Q4 Q5

Best regards,
Kristian Torp