Introduction to Data Science Homework 1
Introduction
- This problem set puts your new regression (`lm`) skills to use.
- Do the written answer portion of the assignment in a Word document called `written_responses.docx` and zip it with the rest of your project files when you submit.
- If you are unsure of how to do certain pieces of the problem set (e.g. take means of the data), Google and StackExchange are the best resources.
NIMBY Taxation Background
This exercise uses data from the application found in “State Taxes and Interstate Hazardous Waste Shipments,” by Arik Levinson (American Economic Review, 89(3)). Levinson studies the shipment of toxic waste between U.S. states. In an attempt to keep waste out, state governments often implement what are commonly known as “NIMBY” (Not In My BackYard) taxes on waste dumping. These taxes may differ depending upon where the waste comes from (i.e., a state typically charges a lower tax for dumping waste produced within its own borders than it charges for waste imported from another state).
Levinson’s question is whether these taxes succeed in deterring waste flows. This is an important question in environmental economics because, in addition to functioning as NIMBY taxes, these taxes also behave like Pigouvian taxes: interstate waste flows are a form of emissions whereby polluters in the origin state impose externalities (e.g., risk of leakage into groundwater) on the residents of the destination state. If waste flows are not responsive to these taxes, we may worry about the ability of Pigouvian policy to address these externalities.
The problem in determining whether these taxes are effective arises because we (as the economists) do not see everything that makes a state attractive or unattractive as a potential destination for waste. Unobserved state attributes that make some states more appropriate destinations for waste may be correlated with the tax that the state’s legislature sets, i.e. there might be omitted variable bias. To be more precise, state legislatures in states that are good destinations for waste may set higher taxes to deter waste flows. In the end, those states might still receive a higher volume of waste (despite their higher taxes), but a lower volume than they would have received in the absence of those taxes. By failing to adequately control for destination state attributes, we may mistakenly arrive at the result that NIMBY taxes encourage more waste flows.
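To make the omitted variable story concrete, here is a minimal simulation sketch in base R. Everything in it is made up for illustration: the variable `attract` stands in for the unobserved destination attributes, and the coefficients (0.8, -0.5, 1.5) are arbitrary assumptions, not estimates from Levinson's data. By construction the true tax effect is negative, yet the regression that omits `attract` should return a positive tax coefficient.

```r
# Toy simulation of omitted variable bias; all numbers are made up for illustration.
set.seed(42)
n <- 5000

# Unobserved attractiveness of the destination state (hypothetical variable).
attract <- rnorm(n)

# Legislatures in attractive states set higher taxes.
tax <- 1 + 0.8 * attract + rnorm(n, sd = 0.5)

# The true effect of the tax on (log) shipments is negative: -0.5 by construction.
shipments <- 2 - 0.5 * tax + 1.5 * attract + rnorm(n)

naive      <- lm(shipments ~ tax)            # omits attractiveness
controlled <- lm(shipments ~ tax + attract)  # includes it

# Compare the estimated tax coefficients from the two models.
print(c(naive = coef(naive)[["tax"]], controlled = coef(controlled)[["tax"]]))
```

In the real data the attractiveness of a state is not observed, so it cannot simply be added as a regressor the way `attract` is here.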
The purpose of this exercise is for you to demonstrate this result using actual data. The data are available in the data folder as levinson_data.xls.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.6 ✓ dplyr 1.0.8
## ✓ tidyr 1.2.0 ✓ stringr 1.4.0
## ✓ readr 2.1.2 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(broom)
library(readxl)
ps4_dataset <- readxl::read_xls("data/levinson_data.xls")
ps4_dataset
## # A tibble: 16,128 × 79
## `orig id` `dest id` shipments dist dist2 oinc oarea opop odens ocap
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 1 16.1 0 0 23.6 0.0508 4.04 79.6 1.10
## 2 2 1 10.5 358. 127812. 21.1 0.0521 2.35 45.1 0.835
## 3 3 1 0 1418. 2011561 27.5 0.114 3.66 32.2 2.26
## 4 4 1 0 1945. 3784079 35.8 0.156 29.8 191. 2.26
## 5 5 1 5.53 1129. 1275081 30.1 0.104 3.29 31.8 2.26
## 6 6 1 8.63 978. 956302. 41.7 0.00485 3.29 678. 0.439
## 7 7 1 9.24 761. 578994. 34.9 0.00196 0.666 341. 0
## 8 8 1 13.7 361. 130444. 27.5 0.0540 12.9 240. 0.347
## 9 9 1 15.0 181. 32916. 29.0 0.0579 6.48 112. 0.400
## 10 10 1 13.3 733. 537517 26.2 0.0559 2.78 49.7 0.122
## # … with 16,118 more rows, and 69 more variables: o65 <dbl>, ocoll <dbl>,
## # ogen <dbl>, dinc <dbl>, darea <dbl>, dpop <dbl>, ddens <dbl>, dage65 <dbl>,
## # dcollege <dbl>, dcapmil <dbl>, dwest <dbl>, dne <dbl>, dsouth <dbl>,
## # yr90 <dbl>, yr91 <dbl>, yr92 <dbl>, yr93 <dbl>, yr94 <dbl>, yr95 <dbl>,
## # samest <dbl>, tax <dbl>, d1 <dbl>, d2 <dbl>, d3 <dbl>, d4 <dbl>, d5 <dbl>,
## # d6 <dbl>, d7 <dbl>, d8 <dbl>, d9 <dbl>, d10 <dbl>, d11 <dbl>, d12 <dbl>,
## # d13 <dbl>, d14 <dbl>, d15 <dbl>, d16 <dbl>, d17 <dbl>, d18 <dbl>, …
Variable Names and Descriptions
Note that the actual variable names in the dataframe do not have subscripts; I include them here so you know whether they refer to the origin state, destination state, or year.
- Subscript $i$: origin state
- Subscript $j$: destination state
- Subscript $t$: year
Origin-destination variables
- $shipments_{i,j,t}$: natural log of shipments from state $i$ to state $j$ in year $t$
- $dist_{i,j}$: distance (miles) from state $i$ to state $j$
- $dist2_{i,j}$: distance squared (miles squared) from state $i$ to state $j$
- $tax_{i,j,t}$: disposal tax per ton faced by waste shipped from origin state $i$ to destination state $j$ in year $t$
- $samest_{i,j}$: equal to 1 if origin state $i$ and destination state $j$ are the same, equal to 0 otherwise
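The definitions above imply two relationships you can check before running any regressions: the same-state indicator should equal 1 exactly when the origin and destination ids match, and the squared-distance column should equal the distance column squared. The sketch below checks both on a toy data frame. The helper `check_pair_vars` and the column names `orig_id` and `dest_id` are hypothetical; the dataset itself uses `orig id` and `dest id`, with a space.

```r
# Toy rows; orig_id and dest_id are hypothetical stand-ins for the
# dataset's `orig id` and `dest id` columns.
toy <- data.frame(
  orig_id = c(1, 2, 3),
  dest_id = c(1, 1, 1),
  dist    = c(0, 350, 1400),
  dist2   = c(0, 350^2, 1400^2),
  samest  = c(1, 0, 0)
)

# Return TRUE/FALSE for each documented relationship.
check_pair_vars <- function(df, tol = 1e-6) {
  c(
    samest_matches_ids = all(df$samest == as.numeric(df$orig_id == df$dest_id)),
    dist2_is_square    = all(abs(df$dist2 - df$dist^2) <= tol * pmax(1, df$dist^2))
  )
}

checks <- check_pair_vars(toy)
print(checks)
```

To run the same check on the real data, you would first rename the two id columns, for example with `dplyr::rename(ps4_dataset, orig_id = "orig id", dest_id = "dest id")`.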
Origin state characteristics
- $oinc_i$: origin state $i$ income in 1989 (thousands of dollars)
- $opop_i$: origin state population in 1990 (millions)
- $oarea_i$: origin state area (millions of square miles)
- $odens_i$: origin state population density (persons per square mile)
- $o65_i$: origin state percent over age 65
- $ocoll_i$: origin state percent with a college degree
- $ocap_i$: origin state waste disposal capacity
- $ogen_i$: origin state natural log of waste generation
Destination state characteristics
- $dinc_j$: destination state $j$ income in 1989 (thousands of dollars)
- $dpop_j$: destination state population in 1990 (millions)
- $darea_j$: destination state area (millions of square miles)
- $ddens_j$: destination state population density (persons per square mile)
- $dage65_j$: destination state percent over age 65
- $dcollege_j$: destination state percent with a college degree
- $dcapmil_j$: destination state waste disposal capacity
- $dwest_j$: equal to 1 if destination state is in the West region, equal to 0 otherwise
- $dne_j$: equal to 1 if destination state is in the Northeast region, equal to 0 otherwise
- $dsouth_j$: equal to 1 if destination state is in the South region, equal to 0 otherwise
- $d1_j$ through $d48_j$: equal to 1 if destination state $j$ is the same state as the variable name (1, 2, …, 48), equal to 0 otherwise (only one of these will be equal to one for a given row of the dataframe)
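The destination indicators behave like a factor that has been expanded into one column per state. The sketch below uses made-up data with only four destinations to show two things: exactly one indicator equals 1 in each row, and including every indicator alongside an intercept is redundant, so `lm` reports `NA` for one of them. The toy data frame and its values are assumptions for illustration only.

```r
# Toy example: 4 destination states, 5 observations each (made-up data).
set.seed(1)
dest_id <- rep(1:4, each = 5)
toy <- data.frame(
  dest_id = dest_id,
  tax     = rnorm(20, mean = 10),
  y       = rnorm(20)
)

# Build one indicator column per destination, named d1..d4 as in the dataset.
dummies <- model.matrix(~ factor(dest_id) - 1, data = toy)
colnames(dummies) <- paste0("d", 1:4)
toy <- cbind(toy, dummies)

# Exactly one indicator equals 1 in every row.
one_per_row <- all(rowSums(dummies) == 1)

# With an intercept, the four indicators are collinear,
# so lm() reports NA for one of them.
fit_all <- lm(y ~ tax + d1 + d2 + d3 + d4, data = toy)

# factor() drops a reference level automatically and gives the same tax estimate.
fit_fac <- lm(y ~ tax + factor(dest_id), data = toy)

print(coef(fit_all))
print(coef(fit_fac))
```

Either way of writing the model gives the same coefficient on `tax`; the choice only changes which destination serves as the reference category.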
Years
- $yr90_t$: equal to 1 if year of waste shipment is 1990, equal to 0 otherwise
- $yr91_t$: equal to 1 if year of waste shipment is 1991, equal to 0 otherwise
- $yr92_t$: equal to 1 if year of waste shipment is 1992, equal to 0 otherwise
- $yr93_t$: equal to 1 if year of waste shipment is 1993, equal to 0 otherwise
- $yr94_t$: equal to 1 if year of waste shipment is 1994, equal to 0 otherwise
- $yr95_t$: equal to 1 if year of waste shipment is 1995, equal to 0 otherwise
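With this many indicator columns, typing a regression formula by hand is error-prone. One possible approach, sketched below, is to assemble the regressor names with `paste0` and build the formula with `reformulate`. The choice of controls is an assumption made for illustration, not the specification the assignment asks for; the year names follow the columns shown in the printed tibble (`yr90` to `yr95`), and `d1` is left out as the reference destination.

```r
# Assemble regressor names; which controls belong in the model is an assumption here.
pair_vars <- c("tax", "dist", "dist2", "samest")
year_vars <- paste0("yr", 90:95)   # year indicators as named in the printed tibble
dest_vars <- paste0("d", 2:48)     # d1 omitted as the reference destination

# A specification without destination indicators, and one with them.
naive_formula <- reformulate(c(pair_vars, year_vars), response = "shipments")
fe_formula    <- reformulate(c(pair_vars, year_vars, dest_vars), response = "shipments")

print(naive_formula)
```

Either formula can then be passed to `lm`, for example `lm(fe_formula, data = ps4_dataset)`. If any indicator turns out to be redundant given the others, `lm` will report `NA` for its coefficient rather than fail.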