In this project, you are asked to evaluate, explain, and compare the key technologies provided by
real-world data management systems. A list of systems is given, and you are expected to find out about these
systems, understand how they work, and synthesize a document that critically reviews them. Some of the
systems are from established companies, others are from startups, and some are academic research projects.
You will work in groups of two or three. For three-person groups, the list of systems is longer, and a
longer report is expected (details below). Here’s the list:
• Apache Druid
• Google BigQuery
• Amazon Redshift
Three-person groups should also cover:
• Google Napa
To get started, I recommend that you watch a tech talk given by people working on each of these systems.
You can find such talks for each of the listed systems at the CMU seminar archive via one of the following
These talks are just a starting point. You are expected to do your own research to find additional sources
(talks, white papers, research papers, benchmark reports, etc.) to inform your report.
For each system on the list, you should explain the technology, its motivation, and its applicability to database
system implementation. This part should be at least one page per system and no more than two pages per
system (see below for formatting guidelines).
You should then conclude with a comparative evaluation of the systems based on the technology innovations
they offer. Which of the systems do you find the most compelling, and why? How important is each
technology? Who would care about the benefits it provides? Can the benefits be quantified? How mature
is the technology? Would you invest in this technology (or a company based on the proposed system for
academic prototypes) if you were a venture capitalist? This part should be about 3 pages, covering all of
the systems, and should reflect the consensus of the group. (If you can’t reach a consensus, provide the
competing arguments.) I’m asking you to form opinions and to justify them. There may be no “right”
answer, particularly when predicting future trends; a well-reasoned response is all that is expected.
Some vendors will make strong claims (e.g., “100X speed improvement”). Assess these claims objectively:
are they marketing hype, or is there something truly innovative in the company’s solution? Similarly, a
research paper could potentially include biases in favor of a preferred or proposed approach (e.g., the system
sold by the authors’ employers). When reading papers, evaluate and comment on how objective the results
are. Simply quoting claims without some critical analysis is not enough.
The overall report should be a coherent document with a bibliography. Every source, including web
pages, should be cited. The main text should refer to the citations in conventional bibliographic style. It
is not sufficient to simply list all references used without indicating which information comes from which
reference. Your report should be a single-spaced PDF document with 11-point font and normal margins. It
should be submitted via courseworks by 11:59pm on the due date.
Projects are to be done in teams of two or three. The TAs will facilitate the matching of students if you
have trouble identifying a partner (see below). If for some reason (e.g., a partner drops the class well into
the semester) a student ends up working alone, the student is expected to address five (rather than seven;
you can choose which five) of the systems.
The TAs will be posting links to two Google forms on Ed. You should use one of these forms, depending
on whether you have your own group figured out, or whether you want the TAs to help match you up. You
can also post a message to Ed to find teammates. Either way, the Google form should be completed by
Warning. In past semesters, there have been a number of cases where students have plagiarized text and
thus violated the Computer Science Department’s academic honesty policy. We use software to automatically
identify large segments of copied text from any source (past submissions, current submissions, the web, etc.).
To help guide you, here are three ways that you might write a paragraph about database processing on