QBUS6180 Statistical Learning and Data Mining, Semester 2, 2021
1. Overview
In this project, your team will analyse marketing data from a bank and a retail company. Your team will have two tasks. The first will be to build machine learning models to predict the success of marketing campaigns. The second will be to uncover insights that can help your clients make better marketing decisions.
2. Problem description
As a team of data scientists and business analysts working for a marketing consulting company, you have been tasked with helping two clients, a bank and a fashion store, to leverage their data to increase the effectiveness of their marketing campaigns.
The two clients provided your team with data from their latest direct marketing campaigns. You have two tasks:
1. To develop statistical learning models to predict whether the marketing campaign will be successful with a customer.
2. To obtain at least three insights that can help the clients make decisions about their marketing campaigns. What types of customers are more responsive to marketing campaigns?
We will refer to these tasks as statistical learning and data mining, respectively.
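For the data mining task, one simple starting point is to compare response rates across customer groups. The sketch below is only an illustration under stated assumptions: the field names `segment` and `responded` and the toy records are hypothetical, and the real variable names should be taken from each dataset's data dictionary.

```python
from collections import defaultdict


def response_rate_by_group(rows, group_key, response_key):
    """Return {group value: share of rows with a positive response}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [responders, total rows]
    for row in rows:
        tally = counts[row[group_key]]
        tally[0] += int(row[response_key])
        tally[1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


# Hypothetical toy records, not taken from either client dataset.
customers = [
    {"segment": "student", "responded": 1},
    {"segment": "student", "responded": 0},
    {"segment": "retired", "responded": 1},
    {"segment": "retired", "responded": 1},
]

rates = response_rate_by_group(customers, "segment", "responded")
for segment, rate in sorted(rates.items()):
    print(f"{segment}: {rate:.0%} responded")
```

A difference in raw response rates is a lead rather than a conclusion; group sizes and confounding variables should be checked before it is reported as an insight.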
As part of the project, you need to write a report according to the instructions below.
3.1 Two datasets
This project involves two marketing datasets, one from a bank and another from a fashion store. The assignment requires you to work with both datasets, but you’ll be able to pick one out of the two for some parts of the report.
One dataset primarily has numerical variables, while the other emphasises categorical variables.
3.2 Bank dataset
The bank dataset is from a phone campaign to encourage clients to subscribe to a term deposit.
The dataset has two files, a training dataset and a second dataset without the response labels for the Kaggle competition.
Kaggle randomly splits this second file into validation (50%) and test (50%) cases, but you will not know which ones are which. You get a score equal to the competition metric (to be announced) computed on the validation cases when you submit to the competition. These scores are displayed on the Public Leaderboard and provide an ongoing ranking of teams.
You can use the scores of your submissions to help you select the best model.
You will select one of your submissions to be used as the final model at the end of the competition. Once the competition is over, Kaggle will rank the teams’ final submissions based on the test cases only, and those rankings will be displayed on the Private Leaderboard. Your goal is to score as well as possible on the Private Leaderboard at the end of the competition. Therefore, please be careful not to overfit the validation cases in an attempt to improve your public ranking.
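The public and private scoring described above can be mimicked locally to see why the two scores may differ. This is a rough sketch, not Kaggle's procedure: the actual split is hidden, the competition metric is still to be announced, and plain accuracy is used here only as a stand-in. The function names and toy labels are hypothetical.

```python
import random


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def leaderboard_scores(y_true, y_pred, seed=0):
    """Score one set of predictions on a random 50/50 split of the cases."""
    idx = list(range(len(y_true)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    public_idx, private_idx = idx[:half], idx[half:]

    def score(subset):
        return accuracy([y_true[i] for i in subset], [y_pred[i] for i in subset])

    return score(public_idx), score(private_idx)


# Hypothetical labels and predictions for eight unlabelled cases.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]

public, private = leaderboard_scores(y_true, y_pred)
print(f"public: {public:.2f}, private: {private:.2f}")
```

Because the same predictions can score differently on the two halves, choosing a model only by its public score risks picking one that happens to fit the validation cases. Cross-validation on the training data gives an additional check that does not depend on the leaderboard.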
Each row corresponds to a call made to a customer. The response variable, subscribed, is the last column in the dataset. It indicates whether the client subscribed to a term deposit, which was the objective of the campaign.
The data dictionary file describes the predictor variables.
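Since the response is the last column, the predictors and the response can be separated by position. The following is a minimal sketch using an in-memory stand-in for the training file: the predictor names `age` and `job`, the yes/no coding of subscribed, and the two sample rows are assumptions for illustration only.

```python
import csv
import io

# Hypothetical stand-in for the training file; the real file has more columns.
sample = "age,job,subscribed\n34,technician,no\n51,retired,yes\n"

reader = csv.reader(io.StringIO(sample))
header = next(reader)      # first line holds the column names
rows = list(reader)        # remaining lines: one call per row

response_name = header[-1]           # last column is the response
X = [row[:-1] for row in rows]       # predictor values (still strings)
y = [row[-1] for row in rows]        # response values

print(response_name, y)
```

With the real data, `io.StringIO(sample)` would be replaced by an opened file handle, and the predictors would still need type conversion and encoding before modelling.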
3.3 Fashion store dataset
The store dataset refers to a promotional e-mail campaign.
Each row refers to a different customer. The response variable, RESP, indicates whether the customer responded to the promotion.
The data dictionary file describes the predictor variables.