W4111 – Introduction to Databases
Section 02, Fall 2019
This assignment requires written answers to questions. You may have to copy and paste some
SQL statements or scripts into the answer document you submit. You may also have to insert
There are 20 questions worth 5 points each. A homework assignment contributes 10 points to
your final score. Dividing your score on this assignment by 10 yields the points credited to your
Question: Suppose you have data that should not be lost on disk failure, and the application is
write-intensive. How would you store the data? (Note: Your answer should consider differences
between primarily sequential writes and primarily random writes).
Question: Both database management systems (DBMS) and operating systems (OS) provide
access to files and implement optimizations for file data access. A DBMS provides significantly
better performance for data in databases. How does the DBMS do this? What information does
it use? How does it optimize access? Provide some examples.
Question: Briefly explain CHS addressing and LBA for disks. Which approach is more common
and why? Is it possible to convert a CHS address to an LBA. If yes, how?
Question: Explain why the allocation of records to blocks affects database-system performance
Question: Give benefits and disadvantages of variable length record management versus fixed
length record management
Question: Build and draw a B+ tree after inserting the following values. Assume the maximum
degree of the B+ tree is 3.
Values: 3, 11, 12, 9, 4, 6, 21, 9, 15, 2
Question: Perform the same insertions in the same order for a hash index. Assume that:
. The size of the hash table is 13.
. The hash function is simple modulo.
. The size of a bucket is one entry.
. The size of each bucket is one value.
. The index algorithm uses linear probing to resolve conflicts/duplicates.
Question: When is it preferable to use a dense index rather than a sparse index? Explain your
Question: Since indexes improve search/lookup performance, which not create an index on
very combination of columns?
Question: Consider the table below. Add indexes that you think are appropriate for the table
and explain your choices. You may use MySQL Workbench to add the indexes. Paste the
resulting create statement in the answer section.
Choosing indexes is not possible without understanding use cases/access patterns. Define five
use cases and the index you define to support the use case. See the answer section for an
CREATE TABLE IF NOT EXISTS `customers` (
id` INT(11) NOT NULL AUTO_INCREMENT,
company` VARCHAR(50) NULL DEFAULT NULL,
last_name` VARCHAR(50) NULL DEFAULT NULL,
first_name` VARCHAR(50) NULL DEFAULT NULL,
email_address` VARCHAR(50) NULL DEFAULT NULL,
job_title` VARCHAR(50) NULL DEFAULT NULL,
business_phone` VARCHAR(25) NULL DEFAULT NULL,
home_phone` VARCHAR(25) NULL DEFAULT NULL,
mobile_phone` VARCHAR(25) NULL DEFAULT NULL,
fax_number` VARCHAR(25) NULL DEFAULT NULL,
address` LONGTEXT NULL DEFAULT NULL,
city` VARCHAR(50) NULL DEFAULT NULL,
state_province` VARCHAR(50) NULL DEFAULT NULL,
zip_postal_code` VARCHAR(15) NULL DEFAULT NULL,
country_region` VARCHAR(50) NULL DEFAULT NULL)
Use Case 1: A user wants to be able find a customer(s) by country_region; country_region and
state_province; country_region, state_province, city; just by city.
<Your index definitions go here>
Question: Assume that:
. The query processing engine can use four blocks in the buffer pool to hold disk blocks.
. The records are fixed size, and each block contains 3 records.
. There are two relations are R and S. Their formats on disk are:
Explain how a partition hash join would perform a natural join. You should illustrate your
explanation using diagrams of the form below for various steps in the processing:
The notation (n,X) means the record in file/relation X with value n for the key. Ti is a temporary
file/relation that is created during the processing. You will likely have to create more than one
Question: Give three reasons why a query processing engine might use a sort-merge join
instead of a hash join? What are the key differences sort-merge and hash join?
Rewrite/transform the following query into an equivalent query that would be significantly more
people.playerid, people.nameLast, people.throws,
batting.teamid, batting.yearid, ab, h, rbi
(people join batting using(playerid))
where teamid=’BOS’ and yearID=’1960′;
Question: Suppose that a B+-tree index on (dept_name, building) is available on relation
department. (Note: This data comes from the database at
What would be the best way to handle the following selection?
Question: Consider the following relational algebra expression
This is a project on the result of a natural join on R and S. Assume that column a comes from R,
column b comes from S and that c is the join column. Also assume that both R and S have
many large columns. Write an equivalent query that will be more efficient, and explain why.
Question: Explain the difference between a serial schedule and a serializable schedule.
Question: What are the benefits and disadvantages of strict two phase locking?
Question: Outline the no-steal and force buffer management policies.
本网站支持淘宝 支付宝 微信支付 paypal等等交易。如果不放心可以用淘宝交易！
E-mail: [email protected] 微信:itcsdx