数据库代写|INFO20003 Database Systems SAMPLE EXAM
这是一篇来自澳洲的关于数据库系统样本测验的代考,阅读时间: 15分钟 写作时间: 120分钟
Section 1 – ER Modelling
The Australian Department of Health and Human Services (DHHS) has a requirement to store the vaccination record of every individual (known as a patient) that resides in Australia who is eligible for a Medicare card. For families, children under the age of 15 are stored on the family card (refer Figure 1). Each Medicare card number is associated with only one family. Every Medicare card has a ‘valid to’ date stored as a month and year on the card (refer Figure 1).
Each family member holds a position number on the card. For example to identify Jessica Smith (refer Figure 1) both her Medicare number (1234567890) and position (4) on the Medicare card would be required.
Figure 1: Australian Medicare card (DHHS, Australia)
Each Medicare card is attached to one residential address, contact email, and phone number.
For all patients listed on a Medicare card we record their gender, birthdate, first name, last name, and if they have any known allergies (e.g. Penicillin, Cortisone, Codeine). If patients do have an allergy we need to know what the allergy is and the reaction (severe, moderate or mild).
The DHHS needs to record mandatory vaccinations (e.g. Measles, Polio, Whooping Cough, Diphtheria, Tuberculosis and Tetanus) as well as optional vaccination types (e.g. HPV, Flu,Hepatitis A, Hepatitis B, Cholera, Typhoid, Yellow Fever).
For each vaccination event that is given to patients we must record the vaccination type, date of vaccination and the vaccination batch number. Vaccine producers can produce many different types of vaccines and each vaccine can have many batches.
Patients can receive their vaccination from any registered doctor. Every doctor is identified by a unique medical practitioner number (MPN). We record the Medical Practitioner’s title (Dr,Mr, Mrs, Ms, Prof.), first name, last name, registered business address, email, and business phone numbers.
Q1. Draw a conceptual model in either Crow’s Foot or Chen’s notation for this case study (in your script book). Be sure to write down any assumptions you make. (20 marks)
Section 2 – SQL-DDL
Figure 2: Data model for SQL DDL
Q2. Write SQL statements to create the tables for the above data model. Be sure to specify primary and foreign keys. You do not need to specify whether the fields are NULL/NOT NULL.
Choose appropriate data types for attributes that are not obvious. (5 Marks)
Section 3 – SQL-DML
Figure 3 shows the schema for a small business database that contains data about employees, products, customers and orders.
Figure 3: Data model for SQL DML
Write a single SQL statement to correctly answer each of the following questions (3A – 3D).
DO NOT USE VIEWS or VARIABLES to answer questions. Query nesting is allowed.
The relations are repeated here for your convenience.
employees (empid, lastname, firstname, hiredate, address, phone,
FK
managerid)
orders (orderid,
FK
custid,
FK
empid, orderdate, shippeddate, freight, shipname)
customers (custid, companyname, contactname, address, phone)
orderdetails (
FK
orderid,
FK
productid, quantity, discount)
products (productid, productname, unitprice, discontinued)
Q3A. Write a query that returns customers (company names) and the details of their orders (orderid and orderdate), including customers who placed no orders. (3 marks)
Q3B. Write a query that returns the first name and last name of employees whose manager was hired prior to 01/01/2002. (4 marks)
Q3C. Write a query that returns customers whose company name is ‘Google’, and for each customer return the total number of orders and total quantities for all products that were not discontinued (‘1’ means discontinued, ‘0’ not discontinued). (5 marks)
Q3D. Write a query that returns the ID and company name of customers who placed orders in 2007 but not in 2008. (8 marks)
Section 4 – Query Processing – Joins
Given the schema of Question 3 (SQL-DML), consider the relations Orders and Employees.
Imagine that relation Employees has 1,000 pages and relation Orders 50,000 pages. Consider the following SQL statement:
SELECT *
FROM Employees NATURAL JOIN Orders
WHERE freight > 1000;
There are 502 buffer pages available in memory. Both relations are stored as simple heap files. Neither relation has any indexes built on it.
Q4A. What is the cost (in disk I/O’s) of performing this join using the Block-oriented Nested Loops Join algorithm? Provide the formulae you use to calculate your cost estimate. (3 marks)
Q4B. What is the cost (in disk I/O’s) of performing this join using the Hash Join algorithm?
Provide the formulae you use to calculate your cost estimate. (3 marks)
Q4C. In comparing the cost of different algorithms, we count I/O (page accesses) and ignore all other costs. What is the reason behind this approach? (2 marks)
Q4D. Which approach should be the least expensive for the given buffer size of 502 pages:
- Simple Nested Loops Join
- Page-oriented Nested Loops Join
- Block-oriented Nested Loops Join
- Hash Join
Please write the number of the correct response. No need to provide formulae for question 4D. (2 marks)
Section 5 – Query Processing – Indexing
Given the schema of Question 3 (SQL-DML), consider the relation OrderDetails. Imagine that the relation OrderDetails consist of 100,000 tuples stored in pages and each page stores 100 tuples. Imagine that the quantity attribute can take any value between 0 and 20 ([0, 20]), and imagine that discount can take any value between 0 and 100 ([0, 100]). Suppose that the following SQL query is executed frequently using this relation:
SELECT *
FROM OrderDetails
WHERE quantity > 15 AND discount > 90;
Your job is to analyse the following query plans and estimate the cost of the best plan utilizing the information given about different indexes in each part.
Q5A. Compute the estimated result size of the query, and the reduction factor of each filter. (3 marks)
Q5B. Compute the estimated cost of the best plan assuming that a clustered B+ tree index on quantity is (the only index) available. Suppose there are 200 index pages. (3 marks)
Q5C. Compute the estimated cost of the best plan assuming that an unclustered Hash index on discount is (the only index) available. (2 marks)
Q5D. If you are given complete freedom to create one index to speed up this query, which index would be the best one to answer this query? Please give complete information about the index, e.g. is it clustered or unclustered, is it hash or B+ tree, which attributes will it cover.(2 marks)