Course Project Instructions and Assessment Criteria: Machine Learning (2110)


Project Phases
Phase 1 Due at the End of Week 5 (18%):
– Data cleaning and preprocessing (dealing with missing values, dropping ID-like columns,
data aggregation, etc.) as appropriate.
– Data exploration and visualisation (charts, graphs, interactions, etc.) as appropriate.
Phase 2 Due at the End of Week 12 (27%):
– Predictive modelling of data as appropriate.
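
As an illustration of the Phase 1 cleaning steps, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for the example and not part of the project requirements: the column names, the toy rows, the "all values present, all distinct, none a float" heuristic for spotting ID-like columns, and median imputation for missing numeric values. Always review flagged columns by hand, since a genuine integer feature with no repeated values would also match the heuristic.

```python
from statistics import median

MISSING = (None, "")  # values treated as missing in this sketch


def is_id_like(values):
    """Heuristic (an assumption): all values present, all distinct, none a float."""
    present = [v for v in values if v not in MISSING]
    return (
        len(present) == len(values)
        and len(set(present)) == len(present)
        and not any(isinstance(v, float) for v in present)
    )


def clean(rows):
    """Drop ID-like columns, then fill missing numeric values with the column median."""
    columns = list(rows[0].keys())
    keep = [c for c in columns if not is_id_like([r[c] for r in rows])]
    cleaned = [{c: r[c] for c in keep} for r in rows]
    for c in keep:
        present = [r[c] for r in cleaned if r[c] not in MISSING]
        # Only impute columns whose observed values are all numeric.
        if present and all(isinstance(v, (int, float)) for v in present):
            fill = median(present)
            for r in cleaned:
                if r[c] in MISSING:
                    r[c] = fill
    return cleaned


# Hypothetical toy data, for illustration only.
rows = [
    {"customer_id": 101, "age": 34, "income": 52000.0, "segment": "A"},
    {"customer_id": 102, "age": None, "income": 61000.0, "segment": "B"},
    {"customer_id": 103, "age": 41, "income": None, "segment": "A"},
    {"customer_id": 104, "age": 29, "income": 48000.0, "segment": "B"},
]

cleaned = clean(rows)
print(cleaned)
```

Whether to impute, drop, or flag missing values depends on your dataset; the median is only one reasonable default for numeric columns.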

1. This is a loosely defined project to give you the maximum level of flexibility. In
particular, you will need to choose a project dataset yourself.
2. Dataset resources: There are no hard restrictions on the dataset that you can select
for your project. You can choose a public dataset from a popular data repository,
find another suitable dataset anywhere on the Internet, or use data from your work.
Please check the bottom of this page for some suggested resources for finding a
suitable dataset for your project.
3. Guidance for selecting a dataset: As friendly advice, you might want to select a
dataset in line with your future career plans and the particular industries you are
interested in. For example, if you plan on working in the banking industry (or
conducting academic research in this area), you might want to select a
finance-related dataset so that your course project can be a talking point during
your interviews.
4. Blacklisted datasets: The dataset you choose must be appropriate for a major
machine learning course project. For instance, the following datasets are
blacklisted:
– Boston Housing
– Iris
– Titanic
– US Adult Income Dataset
– Wine
– Wisconsin Breast Cancer

5. Minimum requirements: Your dataset must have at least 200 rows and at least 8
descriptive (that is, independent or explanatory) features after dropping all
unnecessary features but before one-hot encoding of any categorical
descriptive features. Please remember: “features”, “attributes”, and “variables” all
mean the same thing: they are just columns in your dataset. Likewise, “dependent”,
“target”, and “response” feature all mean the same thing: the variable you are
predicting (as part of a supervised machine learning problem).
6. Random sampling for very large datasets: There is no upper limit on the number of
rows, but if your dataset has more than 5000 rows, you might want to select a random
subset of at most 5000 rows so that you do not fry your laptop! That is, you will not
lose any points for using only a relatively small subset of rows if your dataset has
too many rows.
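
The minimum requirements in item 5 and the random sampling in item 6 can be checked in a few lines. The sketch below is a standard-library illustration only: the function names, the synthetic rows, the target column name "y", and the fixed seed are all assumptions made for the example. It assumes unnecessary columns have already been dropped and that no one-hot encoding has been applied yet.

```python
import random

MIN_ROWS = 200      # minimum number of rows (item 5)
MIN_FEATURES = 8    # minimum number of descriptive features (item 5)
MAX_ROWS = 5000     # suggested cap for very large datasets (item 6)


def meets_minimum_requirements(rows, target):
    """True if there are >= 200 rows and >= 8 descriptive features (target excluded)."""
    if not rows:
        return False
    n_features = len([c for c in rows[0] if c != target])
    return len(rows) >= MIN_ROWS and n_features >= MIN_FEATURES


def subsample(rows, max_rows=MAX_ROWS, seed=0):
    """Return a random subset of at most max_rows rows, sampled without replacement."""
    if len(rows) <= max_rows:
        return list(rows)
    # A fixed seed keeps the subset reproducible between runs.
    return random.Random(seed).sample(rows, max_rows)


# Synthetic stand-in for a real dataset: 6000 rows, 8 features x0..x7, target "y".
rows = [{**{f"x{j}": i * j for j in range(8)}, "y": i % 2} for i in range(6000)]

print(meets_minimum_requirements(rows, target="y"))
sample = subsample(rows)
print(len(sample))
```

Fixing the seed is a design choice that makes your results reproducible; record whichever seed you use in your report.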

