Data Analysis Assignment Help | COMP7810 Assignment 1

This is a request for help with a data analysis assignment.

Question 1. Design a star schema for the sales information (20 points)

Let us consider the case of a real estate agency whose database is composed of the following
tables:

OWNER (IDOwner, Name, Surnames, Address, City, Phone)
ESTATE (IDEstate, Category, Area, City, Province, Rooms, Bedrooms, Garage, Meters)
CUSTOMER (IDCust, Name, Surname, Budget, Address, City, Phone)
AGENT (IDAgent, Name, Surname, Office, Address, City, Phone)
SALE (IDEstate, IDAgent, IDCust, IDOwner, Time, OfferedPrice, Status)
TIME (TimeID, Date, Month, Year)

Hint: include the fact table and the dimension tables.
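One way to read the hint is to make SALE the fact table and the other five tables its dimensions. The sketch below, written in Python with the standard `sqlite3` module, is only one possible design and not a model answer. The `dim_`/`fact_` prefixes, the column types, the helper names, and the renaming of SALE.Time to `TimeID` are all assumptions; OfferedPrice is treated as the measure and Status is kept on the fact row.

```python
import sqlite3

# Hypothetical star schema: table prefixes and column types are assumptions.
DDL = """
CREATE TABLE dim_owner    (IDOwner INTEGER PRIMARY KEY, Name TEXT, Surnames TEXT,
                           Address TEXT, City TEXT, Phone TEXT);
CREATE TABLE dim_estate   (IDEstate INTEGER PRIMARY KEY, Category TEXT, Area TEXT,
                           City TEXT, Province TEXT, Rooms INTEGER,
                           Bedrooms INTEGER, Garage INTEGER, Meters REAL);
CREATE TABLE dim_customer (IDCust INTEGER PRIMARY KEY, Name TEXT, Surname TEXT,
                           Budget REAL, Address TEXT, City TEXT, Phone TEXT);
CREATE TABLE dim_agent    (IDAgent INTEGER PRIMARY KEY, Name TEXT, Surname TEXT,
                           Office TEXT, Address TEXT, City TEXT, Phone TEXT);
CREATE TABLE dim_time     (TimeID INTEGER PRIMARY KEY, Date TEXT,
                           Month INTEGER, Year INTEGER);
-- Fact table: one row per sale offer, keyed by the five dimensions.
CREATE TABLE fact_sale (
    IDEstate INTEGER REFERENCES dim_estate(IDEstate),
    IDAgent  INTEGER REFERENCES dim_agent(IDAgent),
    IDCust   INTEGER REFERENCES dim_customer(IDCust),
    IDOwner  INTEGER REFERENCES dim_owner(IDOwner),
    TimeID   INTEGER REFERENCES dim_time(TimeID),
    OfferedPrice REAL,   -- numeric measure
    Status       TEXT    -- kept on the fact row
);
"""

def build_schema(conn):
    """Create the dimension and fact tables on the given connection."""
    conn.executescript(DDL)

def total_offered_by_year(conn):
    """Roll the measure up along the time dimension."""
    query = """
        SELECT t.Year, SUM(f.OfferedPrice)
        FROM fact_sale AS f JOIN dim_time AS t ON f.TimeID = t.TimeID
        GROUP BY t.Year ORDER BY t.Year
    """
    return conn.execute(query).fetchall()

conn = sqlite3.connect(":memory:")
build_schema(conn)
# Made-up rows, only to show the join between the fact table and a dimension.
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?, ?)",
                 [(1, "2020-03-01", 3, 2020), (2, "2021-07-15", 7, 2021)])
conn.executemany(
    "INSERT INTO fact_sale (TimeID, OfferedPrice, Status) VALUES (?, ?, ?)",
    [(1, 100000.0, "sold"), (1, 150000.0, "open"), (2, 200000.0, "sold")])
print(total_offered_by_year(conn))
```

Keeping every descriptive attribute in a dimension and only keys plus the measure in the fact table is what lets a query such as `total_offered_by_year` aggregate along any dimension with a single join.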

Question 2. Find a dataset that is personally interesting to you. It may be a publicly available
dataset, or a dataset that you have permission to use and share results from. There are many
places to find publicly available datasets, and simply searching Google for your preferred
topic plus “public dataset” may return many hits. Here are some additional resources to get you
started:

Kaggle Datasets (https://www.kaggle.com/datasets)
US Government datasets (https://catalog.data.gov/dataset)
Centers for Disease Control and Prevention (CDC) data (https://data.cdc.gov)
NASA datasets (https://nssdc.gsfc.nasa.gov)
World Bank Open Data (https://data.worldbank.org)

This should not be the dataset you will use for your group project. It requires your
independent work.

Perform data cleaning and basic data analysis on the dataset, using at least two
techniques learned in Lectures 3 and 4. You can use any tool (e.g., Excel) or write your own
code (e.g., in Python).
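As an illustration of what the cleaning step could look like in Python, the sketch below fills missing values with the median and flags outliers using the boxplot (1.5 × IQR) rule. It uses only the standard `statistics` module; the function names and the sample column are made up, and whether these techniques match the ones taught in the lectures is an assumption.

```python
import statistics

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], the usual boxplot fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Made-up column with one missing entry and one suspicious value.
raw = [10, 12, None, 11, 13, 12, 95]
filled = impute_median(raw)
outliers = iqr_outliers(filled)
cleaned = [v for v in filled if v not in outliers]
print(filled, outliers, cleaned)
```

Whether a flagged value should be dropped, capped, or kept depends on the dataset, so the decision is worth stating explicitly in the write-up.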

Describe your key findings from the dataset. Make sure you cite the source of the data! (80
points)

Assessment:

You don’t need to submit your code. You just need to:

a) briefly describe how you analysed the dataset (e.g., number of samples and features,
feature types, descriptive summary, boxplot, measures of dispersion, correlation,
regression) (30 points)

b) briefly describe any data cleaning methods you have applied to the dataset (e.g., for
handling missing values or removing noise). If you think your dataset does not need data
cleaning, please describe how you determined that it is already clean (e.g., a boxplot shows no
outliers). (30 points)

c) summarize your findings (e.g., two features are related to the dependent variable) (20
points)
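The analysis techniques named in item a) can be sketched with the Python standard library alone. The example below computes a descriptive summary, a Pearson correlation, and a least-squares line for two hypothetical features; the data, the variable names `area` and `price`, and the helper functions are assumptions made for illustration, not part of the assignment.

```python
import math
import statistics

def describe(values):
    """Descriptive summary: size, centre, and two measures of dispersion."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": median,
        "stdev": statistics.stdev(values),  # sample standard deviation
        "iqr": q3 - q1,
    }

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = statistics.mean(x), statistics.mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

# Made-up, perfectly linear data: price = 2 * area + 10.
area = [50, 60, 70, 80, 90]
price = [110, 130, 150, 170, 190]

summary = describe(area)
r = pearson(area, price)
slope, intercept = linear_fit(area, price)
print(summary, r, slope, intercept)
```

A finding for item c) could then be phrased in terms of these numbers, for example the sign and strength of the correlation and the fitted slope.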

