Sprog til performance test af databaser

Semester: SW4

Vejleder: Kristian Torp

Beskrivelse

Database benchmarks bruges til at performance teste database systemer, f.eks. for at afdække om en database kan skalere til at indholde 300 GB data, der tilgåes af 15 samtidige brugere. Det er et kæmpe problem med disse benchmarks, at de tager for lang tid at udvikle og vedligeholde i gængse objekt-orienterede sprog som Java, C# og Python.

For at afhjælpe denne situation skal der i projektet designes og implementeres et nyt programmeringssprog, der kan understøtte applikations udviklere i hurtigt og effektivt at udvikle database benchmarks. For at gør det lettere fpr udviklerne at bruge sproget skal det ligne et eksiserende sprog som f.eks. Java. Der kan så opstille en række afbildninger mellem et Java program og et database (SQL) program.

Disse afbildninger kan f.eks være følgende.

Bemærk afbildninger ovenfor er bestemt ikke naturlove med et godt første bud. For at gøre det mere konkret er der til føjet en et længere eksempel i et Java lignende sprog der gemmer "objekter" i en database.

Eksempel

Dette er et længere eksempel på et Java lignende programmeringssprog der mapper mellem Java og SQL operationer på en database. Bemærk der er er en del uklarheder og spørgsmål i eksemplet, dem skal projektet svare på.

/*
 * SW4 project proposal. Examples of using Java++ as a language for
 * benchmarking relational databases.
 */

/**
 * The package specification tells us that all the SQL is executed on
 * the SQL database schema named 'example'.  The normal dot notaion
 * for packages cannot be used as nesting of schemas is not allowed in
 * SQL. */
package example;

import java.util.*; // to use Set 

/**
 * A base class that implements the runnable interface. That the class
 * implements this interface tells us that it can be used for
 * benchmarking, i.e., it can execute SQL statements.
 */
public class Person implements Runnable{

    /**
     * The instance varables will cause the table 'person' to be
     * created in the database with four columns (ssn ,firstName,
     * lastName, age).  'id' is a new reserved word used to uniquely
     * identify the objects as a row in a table. An alternative is to
     * always add an extra id column the tables created.
     */
    private long ssn; // Alternatively the ssn instance variable is
		      // tagged with a new keyword id

    /** First name of person */
    private String firstName;

    /** Last name of person */
    private String firstName;

    /** 
     * Breaks encapsulation but the variable is used to show that
     * direct access is possible to instance variables and this should
     * be considered how to handle this situation.
     */
    public int age; 

    /** 
     * An integer array of phone numbers for the person. How are
     * arrays mapped to the underlying database?
     */
    private int[10] phones.

    /**
     * Constructor, when a constructor is called one or more rows are
     * inserted into the underlying database.
     */
    public Person(String firstName, String lastName){
	this.firstName = firstName;
	this.lastName = lastName;
	this.age = 0;
    }

    /**
     * Destructor, when a descructor is called one or more rows are
     * deleted from the underlying database.
     */
    protected void finalize(){
	// What possible to do in a destructor that is relevant for a
	// benchmark?
	System.out.println("I'm dying");
    }  
  
    /**
     * Nice encapsulation of an instance variable to update its value.
     * When the method is called one or more rows in the underlying
     * database is updated. What to do if the lastName is not changed
     * should this fire an update anyhow? (This could be a compiler
     * option)
     * @param lastName the new last name of the person.
     */
    public void setLastName(String lastName){
	String temp; // local variables are not mapped to column in a
		     // table, they are "plain" programming language
		     // variables
   
	this.lastName = lastName; // SQL update now?


	// When is the SQL update fired if an instance variable is
	// changed more than once in the same method?
	this.lastName = lastName + "AnotherString"; // SQL update now?
    }

    /** 
     * Example of using Class meta data to loop through all objects.
     */ 
    public void updateAge(){
	// local variable, not stored in the database
	Person p;
  
	// loop through all the objects (rows) in a table and update
	// the age, this is Java 1.5 style like syntax.
	foreach p : Person{
	    p.age += 1;
	}

	// Is the the same loop?
	for (int i = 0; i < Person.size(); i++){
	    //Person(i).???? what is the syntax here?
	}

	// Again the same thing?
	Iterator it = Person.iterator();
	while (it.hasNext()){
	    p = it.next();
	    p.age += 1;
	}
    }  

    /**
     * The run method to implement the Runnable interface. What the
     * object does in its life time. Here each object lives in
     * approx. 10 seconds.  each second the age is increated and every
     * third time the lastName variable is updated.
     */
    public void run(){
	for (int i = 0; i < 10; i++){
	    sleep(1000); // should exceptions be handled or ignored?
	    p.age +=1;
	    if (i % 3) {
		setLastName(lastName + i);
	    }
	}
    }

    /**
     * The main method
     */
    public void main(String[] args){

	// A row is created in the database when the constructor is
	// called.
	Person p = new Person("Lars", "Larsen");

	p.age = 34; // this triggers an SQL update statement

	// C++ style of calling destructor to trigger a DELETE statement
	delete p;

        // Create a many objects in many threads. Each thread is a
        // separate connection to the underlying database
        for (int i = 0; i < 1000; i++){
	    Runnable r = new Person();
	    Thread t = new Thread(r);
	    t.start();
	}
    }
}

/**
 * Classes that implements the Timing interface stores how long time
 * is used for execution of methods, constructors, destructors, and
 * direct instance variable access.
 */ 
public interface Timing{
  //TODO: Which methods should there be in this interface or should it
  //just be a flag like the Serializable interface.
}

/** 
 * Classes that implementens this Logging interface stores the SQL
 * statements that each method, constructore, destructor
 * etc. executes. This is for the application developers later can see
 * what SQL statements are actually executed.
 */
public interface Logging{
    // TODO: which methods to implement?
}

/**
 * The person class inherits from the Person class above. This results
 * in a new table being created in the underlying database. In
 * addition this class implements the Timing and Logging interfaces.
 */
public class Employee extends Person implements Timing, Logging{
  // creates a table with five columns, four from the super class
  private int salary;

    /**
     * Constructor that calls super when is the insert then fired?
     */
    public Employee(){
	super(); // call super class constructor then a SQL insert
		 // should not be triggered in the super class
        this.salary = 10.000;
    }

    /**
     * Makes a database SELECT query. Should it be a static method?
     * Logically this leads to the SQL query SELECT * FROM Employee
     * WHERE Employee.id = id
     * @param id gets the Employee object identified by the id specified
     */
    public abstract static Employee get(int id);

    /**
     * Return all objects as a set of Employees. In 1.5 is should
     * @return Set
     */
    public abstract static Set get();

    /**
     * Checks if an Employee exists in the underlying database.  TODO:
     * How is this method implemented?
     * @param e the Employee to look for
     * @return true if employee exists, otherwise false.
     */
    public abstract static boolean exists(Employee e); 

    /**
     * Does a specific object already exist in the table?
     * TODO: how implemented?
     */
    public abstract static boolean exists(int id);


    public static void main(String[] args){

        // Create a many objects in many threads. Each thread is a
        // separate connection to the underlying database
        for (int i = 0; i < 100; i++){
	    Runnable r = new Employee();
	    Thread t = new Thread(r);
	    t.start();
	}

	// how to access timing info
	List timing = Employee.getTiming();
	List statements = Employee.getSQLStatements();
    }
}