Database代写 | CMPSC 431W Database Management System A4

CMPSC 431W Database Management System Mar. 19, 2021

Assignment 4
Database Design Theory & Query Processing & Indexing

Assigned: Mar 19, 2021 Due: Apr 2, 2021. 11:59:59PM

Objectives

The purpose of this assignment is to go through some exercise of database design theory, basic query pro- cessing and indexing strategies. Specifically, this assignment will help you practice your knowledge on the following points:

• How to translate and ER diagram into a relational database schema. • How to find and process functional dependencies (FDs).
• How to decompose a relational schema so that it can fit into BCNF. • Basic knowledge of DBMS query processing pipeline.

• How to build indexes for tables in DBMS.
• How to choose indexes for specific workloads.

Part I: Database Design Theory [60pts]

1. [20pts] Translate the following ER diagram into a relational database schema. Specifically, you should write down the DDL statements to create each table capturing all the information and constraints in the given ER diagram.

EID SSN salary name

Affiliate

ISA

Subordinate ReportsTo

Offers

DID

dname building

CID cname capacity

Employee

Department

Supervisor

Staff

Faculty

Course

duty

research_field

Teaches since

Staff and Faculty do no overlap.
Staff and Faculty do not cover employees.

1

CMPSC 431W Database Management System Mar. 19, 2021

2. [15pts] A set of attribute X is called closed (with respect to a given set of FDs) if X+ = X. Consider a relation with schema R(A, B, C, D, E) and an unknown set of FDs. For each close attribute set constraint below, find a set of non-trivial FDs over R that is consistent with it.

(1) All sets of attributes are closed.
(2) The only closed sets of attributes are {} and {A,B,C,D,E}
(3) The only closed sets of attributes are {}, {A,B,D} and {A,B,C,D,E}

3. [10pts] Consider a relational schema R(A,B,C,D,E) and a set of FDs: A → D, BC → A, and DE → B. Decompose R into BCNF. Show all of your work and explain, at each step, which dependency violations you are correcting. You have to write down a description of your decomposition steps.

4. [15pts] Consider a table storing sales information for an e-commerce company whose schema is: Sales(name:varchar(20), discount:double, month:char(3), price:double)

Now, you want to verify some FDs on this table and decompose it into BCNF.

1. (1)  Write a SQL query to verify an FD: month → discount. Explain what you expect to see in the results when such FD holds and does not hold.
2. (2)  Same question as (1) but to verify FD: name, month → price.
3. (3)  Assuming both FDs in (1) and (2) holds, write down the schema for all resulting relations after a complete BCNF decomposition. Also, write down the SQL query to populate the new tables with data in the original ‘Sales’ table.

Part II: Query Processing & Indexing [40pts]
1. [10pts] List at lease one advantage and one drawback for storing data in a heap file and a sequential file

over a particular key.

2. [10pts] Explain the differences between logical and physical optimizations in the DBMS query processing pipeline with concrete examples for both of them.

3. [20pts] Consider a relation with the schema “Employee(ssn, firstName, lastName, salary, branchNum- ber)” and we have several query workloads running upon it.

You can assume the following fact of this table: (a) this table has more than a million records; (b) “(first- Name, lastName)” is a key; (c) employees are evenly distributed across 10 branches; (d) Salaries of all employees form a normal distributions around \$3,000.

Notes: You should be explicit of index type you are using when writing CREATE INDEX statements.

1. (1)  [4pts] Write a SQL statement to build an index to facilitate fast point lookup for employee’s first

name.

2. (2)  [4pts] Write a SQL statement to build an index to facilitate fast range search for employee’s salary.

2

CMPSC 431W Database Management System Mar. 19, 2021

(3) [5pts] Suppose you have a B-Tree index built on (firstName, salary) attribute, what types of queries over the employee table will be benefited from it.

(4) [7pts] Suppose you have a workload having equal amount of queries asking for: (a) employees in different branches; (b) employees whose salary are in an arbitrary given range; (c) employees has a specific name.

Right now, we want to accelerate this workload, but you are only allowed to build two indexes over the employee table. Explain what type and on which columns you want to build index on and briefly justify your choice.

3

E-mail: itcsdx@outlook.com  微信:itcsdx