Exercise 9
1 Cost Computations
Assume a table r with 10000 tuples and candidate key A
whose range is [1 - 10000]. Assume 50 (fixed size) records per
block and no buffering. Given the queries
SELECT * FROM r WHERE A = 8000;
SELECT * FROM r WHERE A <> 8000;
SELECT * FROM r WHERE A > 8000;
SELECT * FROM r WHERE A BETWEEN 8000 AND 9000;
Determine the number of I/Os for the following situations:
-
no index, unordered file
-
no index, ordered file
-
secondary B+-tree index on r(A), assume a maximum fanout of
100
-
primary B+-tree index on r(A), assume a maximum fanout of 100
-
hash index on r(A)
2 Empirical Performance Measurements
The measurements shall be run on a mini-database that reflects the structure
of the Wisconsin Benchmark.
2.1 Setup & Initialization
First, you have to create and populate two tables. To create the tables
execute the DDL statements that are available via the world wide web (create_tables.sql).
Tables r1 and r2 have the following characteristics.
-
|r1| = 8.000 tuples
-
Secondary B+-tree index on r1(unique1)
-
|r2| = 16.000 tuples
-
Secondary B+-tree index on r2(unique1)
-
Primary hash index on r2(unique1)
-
The values of unique1 and unique3 are identical for each tuple.
To load data use copy the files in the data
directory (ctl are control files, dat are data files). Run the load_data
script from a unix shell to load the data into the tables you have created.
2.2 Test Queries
Assume the following pairs of semantically equivalent statements (see queries.sql
on the web):
-
Q1:
-
a) select unique2 from r2 s where s.unique1 = 8530;
-
b) select unique2 from r2 s where s.unique3 = 8530;
-
Q2:
-
a) select avg(unique2) from r2 s where s.unique1 > 1000;
-
b) select avg(unique2) from r2 s where s.unique3 > 1000;
-
Q3:
-
a) select avg(unique1) from r2 s where s.unique1 between 200 and 900;
-
b) select avg(unique1) from r2 s where s.unique3 between 200 and 900;
-
Q4:
-
a) select avg(r.four+s.four) from r1 r, r2 s
where r.unique1=s.unique1 and s.unique1=8530 and r.unique1
> 200;
-
b) select avg(r.four+s.four) from r1 r, r2 s
where r.unique3=s.unique3 and s.unique3=8530 and r.unique3
> 200;
-
Q5
-
a) select count(*) from r1 r, r2 s
where r.unique1=s.unique1 and r.unique1 > 700;
-
b) select count(*) from r1 r, r2 s
-
where r.unique3=s.unique3 and r.unique3 > 700;
-
Fill out the following performance measurement table. Use set timing
on; to measure the time.
Query |
a) |
b) |
% Difference |
Explanation of Difference |
Q1 |
|
|
|
|
Q2 |
|
|
|
|
Q3 |
|
|
|
|
Q4 |
|
|
|
|
Q5 |
|
|
|
|
Best regards,
Kristian Torp