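This Data Warehouse assignment requires using the Kettle tool to run ETL (extract, transform and load) against the source system databases, producing a MySQL-based data warehouse that contains 3 fact tables and 5 dimension tables.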
Data Warehousing – Summer Semester 2019 Assignment 2
Data Warehouse Load Assignment (25 marks)
This assignment uses the same case study that you used in assignment one. We have provided you with the following extra details (from the LMS):
- Production System
  - The Product and ProductOrder Tables
- Sales System
  - The Customer Table (pre-processed)
  - The ProductPriceList Table
  - The Store Table
  - The Sale Table
  - The SaleItem Table
- A date dimension table that has been set up based on the data for the assignment
In groups of 2, you are required to carry out the ETL process to get the data from these source tables into the data warehouse that you designed. You will design 1 transformation per fact or dimension table in the data warehouse. You are then asked to write a report about the process (see requirements below).
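As a rough illustration of the "one transformation per fact or dimension table" idea, the sketch below mimics two such transformations in plain Python rather than in Kettle. Everything in it is an assumption made for illustration: the table names (`src_customer`, `src_sale`, `dim_customer`, `fact_sale`), the columns, the cleaning rules, and the use of an in-memory SQLite database as a stand-in for MySQL. It is not the assignment's schema or a model answer.

```python
import sqlite3


def build_demo():
    """Create hypothetical source and warehouse tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE src_customer (customer_id INTEGER, name TEXT, city TEXT);
        CREATE TABLE src_sale (sale_id INTEGER, customer_id INTEGER, amount REAL);
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
            customer_id INTEGER UNIQUE,                      -- business key
            name TEXT,
            city TEXT);
        CREATE TABLE fact_sale (customer_key INTEGER, amount REAL);
        INSERT INTO src_customer VALUES
            (1, ' alice smith ', 'melbourne'), (2, 'BOB LEE', NULL);
        INSERT INTO src_sale VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 1, 5.5);
    """)
    return conn


def load_customer_dimension(conn):
    """Dimension transformation: extract, clean, then load rows not yet present."""
    rows = conn.execute(
        "SELECT customer_id, name, city FROM src_customer").fetchall()
    loaded = 0
    for customer_id, name, city in rows:
        # Transform: trim whitespace, standardise case, default missing cities.
        clean_name = name.strip().title()
        clean_city = (city or "Unknown").strip().title()
        # Load: skip business keys that are already in the dimension.
        cur = conn.execute(
            "INSERT INTO dim_customer (customer_id, name, city) "
            "SELECT ?, ?, ? WHERE NOT EXISTS "
            "(SELECT 1 FROM dim_customer WHERE customer_id = ?)",
            (customer_id, clean_name, clean_city, customer_id))
        loaded += cur.rowcount
    conn.commit()
    return loaded


def load_sale_fact(conn):
    """Fact transformation: replace each business key with its surrogate key."""
    cur = conn.execute(
        "INSERT INTO fact_sale (customer_key, amount) "
        "SELECT d.customer_key, s.amount FROM src_sale s "
        "JOIN dim_customer d ON d.customer_id = s.customer_id")
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = build_demo()
    print("dimension rows loaded:", load_customer_dimension(conn))
    print("fact rows loaded:", load_sale_fact(conn))
```

The dimension is loaded before the fact table so that the surrogate-key lookup in the fact load can succeed; the same ordering concern applies when sequencing the corresponding Kettle transformations.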
This assignment is worth 25% of your assessment for the subject. Historically, the average mark for this assignment has been around 17/25 marks.
Case Study – Report Requirements
You will write a 2000-word report outlining the main issues and concerns you came across in performing the data load. A suggested outline for the sections of the report follows. (Please note that this structure is a guide only and can be changed if you wish; the word limits are suggestions.)
1. Executive summary: An executive summary is a short summary of the key information in the document and is written for a business person. It should contain clear details in summary form of the issues you came across in designing the ETL process. (200 words)
2. Design of the ETL Process: You should discuss and justify the design of your ETL processes. Include diagrams of the design of your ETL transformations (for each fact and dimension table in your data warehouse) to use as evidence for the discussion. (1200 words)
3. Design of the Data Warehouse: If you had to do some redesign work on your data warehouse, detail what you changed and give reasons for the changes. (300 words)
4. Data Dictionary: Describe the types of additional meta-data that would be required in the Data Dictionary that you developed in Assignment 1, based on the ETL process. You should list and describe the kinds of metadata you need to store in a data dictionary in addition to what was required for assignment 1 – do not simply reproduce the assignment 1 data dictionary (300 words). NOTE: we are not asking for a complete data dictionary here – we are just asking for the TYPES OF ADDITIONAL Meta Data required (see the meta data lecture).
5. Appendix 1 – Work Breakdown: Detail the breakdown of work of the team members for this assignment. This should be a detailed account of what each team member accomplished as part of the assignment.
Assessment Criteria
This assignment is due at 9:00am on Wednesday 6th February.
You will submit only an electronic version. We will endeavour to get your assignment marked within a 2 week period. You will receive feedback on your assignment via the marking scheme. The marker will also give written feedback in the form of comments on the marking sheet.
You are required to submit an electronic version of your work through the LMS (see the Assessment Link on the LMS and look for the link Assignment 2 Submission). You will be submitting this version of your work through TurnItIn (a plagiarism checking platform) and will be able to submit more than once. By submitting the assignment via the LMS you are also agreeing that the assignment is your own work (thus you do not need an assignment cover sheet). Please make sure the name of each team member appears on the front page of the assignment.
You will also submit a single zip file (see the Assignment 2 ZIP Submission link in the LMS). This file will contain all of the Pentaho transformations completed to load the data warehouse. When marking, some or all of these transformations may be executed!