- Plagiarism: Every student has to work individually on this project assignment.
All of the code for this project must be your own. You may not copy source code from other students or other sources that you find on the web. You may not share your code with other students. You may not host your code on a public code repository.
Each submission will be checked using plagiarism detection software.
Plagiarism will be reported to School and College Academic Misconduct Officers.
See the University’s page on Academic Misconduct for additional information.
- Start early and proceed in steps. Read the assignment description carefully before you start programming.
- The assignment is out of 100 points and counts for 40% of your final mark.
1 Goals and Important Points
In this assignment, you will implement the minimization procedure for conjunctive queries and a lightweight database engine for evaluating queries called Minibase.
The goals of this assignment are threefold:
- to understand the minimization procedure for conjunctive queries,
- to teach you how to translate conjunctive queries into relational algebra query plans,
- to learn the iterator model for relational operator evaluation and implement the most common operators (e.g., selection, projection, join, aggregation).
The assignment consists of three tasks:
Task 1 (30%): Implementation of the minimization procedure for conjunctive queries.
Task 2 (40%): Implementation of the iterator model and common RA operators.
Task 3 (20%): Optimisation of constructed query plans.
The remaining 10% of your mark is for code style and comments; more details later on.
Task 3 requires elements of independent thinking and creativity, which are necessary for distinction marks according to the Common Marking Scheme.
You should consider the set semantics for each task, meaning that every relation consists of distinct tuples and the input and output of each operator contain no duplicates.
You will start from a skeleton project consisting of two main classes:
- CQMinimizer, which should contain your code for Task 1
- Minibase, which should contain your code for Tasks 2 & 3.
Both classes define the expected command line interface. You are free to modify these classes but must preserve their command line interface as it is essential for marking.
The skeleton project also comes with a parser for our query language, so you do not have to write your own (unless you want to). The main classes show how to parse query strings into Java objects.
The rest of this document describes in detail how to complete the assignment. Take time to read it carefully. Note that some topics will be covered later in the course. Start working as early as possible. This is not a project that should be left to the last minute.
2 Setting Up Local Development Environment
You are free to use any text editor or IDE to complete the project. We will use Maven to compile your project. We recommend setting up a local development environment by installing Java 8 or later and using an IDE such as IntelliJ or Eclipse. To import the project into IntelliJ or Eclipse, make sure that you import as a Maven project (select the pom.xml file when importing).
3 Task 1: CQ Minimization
In Task 1, you will implement the minimization algorithm for conjunctive queries discussed in class. You will build a program that reads a query from the specified input file,minimizes the query, and writes a minimized query in the specified output file.
3.1 Query Language
Task 1 considers the language of conjunctive queries as presented in class, with queries expressed using the rule-based form:
Q(v)/head :- R1(t1), . . . , Rm(tm)/body
where R1, . . . Rm are relation names, v is a tuple of variables, t1, . . . , tm are tuples of terms, and each variable from v appears in the body. A term can be a positive integer constant (e.g., 42), a string constant enclosed in single quotation marks (e.g., ’ADBS’),or a variable. A valid variable name consists of lowercase letters, e.g., a, x , abc are valid variable names; note that this is different from the notation used in class where a, b, c denote constants. A valid relation name consists of uppercase letters, and the relation name in the head does not appear in the body. In our query language, ’ADBS’ is a string constant while ADBS (without quotation marks) is a valid relation name.
Examples of valid conjunctive queries are:
Q(x) :- R(x, ’This is a test string’)
Q(x, z) :- R(4, z), S(x, y), T(12, ’x’)
Q() :- R(y, z), R(z, x)
DBSTUDENTNAME(name) :- STUDENT(id, name), ENROLLED(id, ’ADBS’)
Examples of invalid conjunctive queries in our language are:
Q(y) :- R(x, ’test’) – y does not appear in body
Q() :- – empty body
Q(x, 3) :- R(4, z), S(x, y) – constant in head
Q(y) :- R(y), S() – empty relational atom in body
You can assume that only valid, correctly-typed conjunctive query will be provided as input. For instance, Q(x) :- R(x, 4), R(y, ’4’) is wrongly-typed as the second attribute of R does not have a unique type. Similarly, Q(x) :- R(x, y), R(x, y, z) is also wrongly-typed as the arity of R is not unique. Such queries will never appear as input.
For simplicity, you can assume that every relational atom appearing in the body or the head has no repeated variables. For instance, the atom R(x, x) is invalid according to this restriction, while R(x, xx) and R(3, 3) are valid; the query Q(x, x) :- R(x) is also invalid. But the same variable can appear in different relational atoms in the body.
You do not need to perform any type or naming checks on the input query.
3.2 Query Parser
The provided code contains a parser for our query language. The parser was generated using ANTLR (https://www.antlr.org/) based on the grammar defined in parser/Minibase.g4. Have a look at the grammar to understand the structure of the language constructs. The base directory contains Java classes matching these constructs.
You are free to modify and extend these classes.
The grammar also allows comparison atoms such as x > 3 and x != y to appear in the body and SUM aggregation to appear in the head. We will consider comparison atoms and aggregation in Tasks 2 & 3. For Task 1, we will assume that the body consists of relational atoms only and that the head consists of (non-repeated) variables only.
The parser/generated directory contains the parser classes generated by ANTLR.
Do not modify these generated classes directly. The class from parser/QueryParser.java uses the visitor pattern1 to construct a Query object from ANTLR objects.
To get you started, we provide a simple method in CQMinimizer.java that uses the parser to read a query from a file, print it out, and access fields of the Query object.
本网站支持淘宝 支付宝 微信支付 paypal等等交易。如果不放心可以用淘宝交易！
E-mail: email@example.com 微信:itcsdx