# 本次数据库代写是完成医院数据库的查询和B+树索引作业

Question 1
SELECT p.lastname, p.income, p.birthdate, op.sin, op.opdate
FROM Patient p, Operation op, Doctor doc
WHERE (p.city = ‘Vancouver’ OR p.lastname = ‘Smith’)
AND (p.city = ‘Vancouver’ OR p.firstname = ‘bob’)
AND p.msp = op.msp AND op.sin = doc.sin
AND (op.description = ‘appendectomy’ OR op.description = ‘amputation’)
AND doc.doctorname <> ‘Smith’ AND gender = ‘F’)
Sketch a relational algebra expression tree that represents an equivalent logical plan to your answer to the SQL query shown below. Your query should be derived by using the following heuristics:
▪ Replace Cartesian products with joins
▪ Push selections as far down the tree as possible
▪ Push projections as far down the tree as possible, except after selections, add attributes to the projections where necessary to achieve this
▪ Group join operators
Question 2
You are to estimate the cost of the relational algebra query shown below.
cname,cemail,vname,vcity,ceo,pname,ptype,description,sdate(
vname,vcity,ceo(vcountry=’Canada’  employees > 500 (Vendor)) ⨝
pid,pname,description,ptype(Product) ⨝
cid,cname,cemail(ccity = ‘Vancouver’  ccity = ‘Edmonton'(Customer)) ⨝ Sales)
Indexes
▪ Secondary B+ tree on
{vcountry, vcity}, of
height 3 in Vendor
▪ Primary B+ tree on
{ccity, cstreet,
cnumber}, of height 4
in Customer
▪ Linear hash index on
cid in Sales, no bucket
consists of more than
one block
Relation Statistics
Customer Product Vendor Sales
T(R) 200,000 100,000 50,000 10,000,000
B(R) 25,000 10,000 5,000 500,000
V(R, cid) 200,000 200,000
V(C, ccity) 100
V(R, pid) 100,000 100,000
V(R, vname) 50,000 50,000
V(V, vcountry) 5
V(V, employees) 5,000
▪ Primary B+ tree index of on pid in Sales
▪ For B+ tree indexes assume that the root node of the index is retained in main memory
▪ Records of all tables, including intermediate relations. should not span more than one block Attribute, Block and Main Memory Sizes
▪ cid, pid, employees – 4 bytes
▪ ptype, sdate – 8 bytes
▪ cname, ccity, pname, vname, vcity, ceo, vcountry – 16 bytes
▪ cemail – 32 bytes
▪ description – 64 bytes
▪ Block (and main memory frame) – 4,096 bytes
▪ There are 600 main memory frames available for the query
a) Estimate the cost for performing the selection and projection on Vendor, do not include the cost for writing out the result (if any): vname,vcity,ceo(vcountry=’Canada’  employees > 500 (Vendor)). The minimum and maximum values for the employees attribute are 0 and 5,000. [2]
b) State the size of the Vendor selection and projection result in records and bytes. [2]
c) Estimate the cost for performing the selection and projection on Customer, do not include the cost for writing out the result (if any): cid,cname,cemail(ccity = ‘Vancouver’  ccity = ‘Edmonton'(Customer)).
Histograms kept by the DB indicate that 25% of customers live in Vancouver and 15% live in Edmonton. [2]
d) State the size of the Customer selection and projection result in records and bytes. [2]
e) State the size of the Product projection result in records and bytes. [2]
f) Determine the most efficient join order using the dynamic programming approach described in class*. The metric you use to determine join order should be the least number of records. You should not consider the actual cost of performing the join in your calculation, or the cost of producing input for the join. You should only consider left-deep join trees. [10]
*You can do this informally, and do not have to submit the tables shown in the class example
(and Exercise 9). Don’t forget that the size in records of some of the tables is affected by previous operations.

E-mail: itcsdx@outlook.com  微信:itcsdx