Question 7: Wine Data – Regularized Regression
(7a) Using wine_data_train, conduct ridge regression with quality as the binary response variable and all other variables in wine_data_train as the predicting variables.
(7a.1) 3 pts – Use 10-fold cross validation on the misclassification error to select the optimal lambda value. What optimal lambda value did you obtain? Hint: Make sure to change the value of type.measure in order to perform cross validation on the misclassification error. If needed, you can take a look at the help file by typing ?cv.glmnet.
(7a.2) 1.5 pts – Fit a glmnet object with nlambda = 100. Call it ridge_model.
(7a.3) 1 pt – Display the estimated coefficients at the optimal lambda value.
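The workflow described in Question 7 can be sketched roughly as follows. This is a minimal illustration on simulated data, not a worked solution: wine_data_train is not available here, so the matrix x, the response y, the seed, and the object names cv_fit, ridge_fit, lambda_opt, and ridge_coefs are all assumptions made for the example.

```r
# Minimal sketch on simulated data; x, y, and the seed are illustrative
# assumptions standing in for the wine training data.
library(glmnet)

set.seed(1)
n <- 200
x <- matrix(rnorm(n * 5), nrow = n, ncol = 5,
            dimnames = list(NULL, paste0("x", 1:5)))
y <- rbinom(n, size = 1, prob = plogis(x[, 1] - 0.5 * x[, 2]))

# alpha = 0 selects the ridge penalty; type.measure = "class" makes
# cv.glmnet score each lambda by misclassification error.
cv_fit <- cv.glmnet(x, y, family = "binomial", alpha = 0,
                    type.measure = "class", nfolds = 10)
lambda_opt <- cv_fit$lambda.min

# Fit the ridge path with 100 lambda values, then read off the
# coefficients at the lambda chosen by cross validation.
ridge_fit <- glmnet(x, y, family = "binomial", alpha = 0, nlambda = 100)
ridge_coefs <- coef(ridge_fit, s = lambda_opt)

print(lambda_opt)
print(ridge_coefs)
```

With a data frame rather than a matrix, one common approach is to build the predictor matrix with model.matrix(quality ~ ., data = wine_data_train)[, -1], which drops the intercept column that glmnet adds on its own. Because the folds are assigned at random, the selected lambda depends on the seed.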
Question 8: Wine Data – Prediction
(8a) 6 pts – Using model3, all_subsets_model, stepwise_model, and ridge_model, give a binary classification to each of the rows in wine_data_test, with 1 indicating a good quality wine. Use 0.5 as your classification threshold.
(8b) 4.5 pts – For each model, display its accuracy, sensitivity, and specificity metrics. Hint: confusionMatrix() from the caret package could be used to calculate these metrics.
(8b.1) Which model has the largest accuracy?
(8b.2) Which model has the largest sensitivity?
(8b.3) Which model has the largest specificity?
(8c) 1 pt – In this context, should sensitivity or specificity matter more? Explain. Hint: Remember that sensitivity is the proportion of all 1s in the test set that are correctly classified as 1s, while specificity is the proportion of all 0s in the test set that are correctly classified as 0s.
(8d) 1 pt – Based on 8b and 8c, which model performed the best?
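The definitions in the hint for (8c) can be checked by hand on a small example. The sketch below uses base R only; the predicted probabilities and true labels are made-up toy values, and the variable names are hypothetical.

```r
# Toy example: these probabilities and labels are invented for illustration
# and do not come from the wine data.
pred_prob <- c(0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3, 0.55, 0.05)
actual    <- c(1,   0,   1,   1,   0,   0,   1,   0,   0,    0)

# Classify as 1 when the predicted probability exceeds the 0.5 threshold.
pred_class <- as.integer(pred_prob > 0.5)

# Cells of the confusion matrix.
tp <- sum(pred_class == 1 & actual == 1)  # true positives
tn <- sum(pred_class == 0 & actual == 0)  # true negatives
fp <- sum(pred_class == 1 & actual == 0)  # false positives
fn <- sum(pred_class == 0 & actual == 1)  # false negatives

accuracy    <- (tp + tn) / length(actual)  # share of all rows classified correctly
sensitivity <- tp / (tp + fn)              # share of true 1s classified as 1
specificity <- tn / (tn + fp)              # share of true 0s classified as 0

print(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity))
```

If caret's confusionMatrix() is used instead, note that it treats the first factor level as the positive class by default, so passing positive = "1" is usually needed for sensitivity to refer to the 1s.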
Multiple Choice Questions 32-33
The following table shows the R output of a logistic regression model, where the variables Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width are predictors and the variable virginica is the response. Using the following R output from a fitted logistic regression model, answer Questions 32 and 33.

Call:
glm(formula = virginica ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, family = binomial, data = iris)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.01105  -0.00065   0.00000   0.00048   1.78065

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)    -42.638     25.708  -1.659   0.0972 .
Sepal.Length    -2.465      2.394  -1.030   0.3032
Sepal.Width     -6.681      4.480  -1.491   0.1359
Petal.Length     9.429      4.737   1.990   0.0465 *
Petal.Width     18.286      9.743   1.877   0.0605 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 190.954  on 149  degrees of freedom
Residual deviance:  11.899  on 145  degrees of freedom
AIC: 21.899
Which methods below can be applied for variable selection when p > n? Select ALL correct answers. Note: p is the number of predicting variables and n is the sample size.
Question 5 – Wine Data – Full Model
(5a) 2 pts – Using wine_data_train, fit a logistic regression model with quality as the response variable and all other variables as predicting variables. Include an intercept. Call it model3. Display the summary table for the model.
(5b) 2 pts – Conduct a multicollinearity test on model3. Using a VIF threshold of 10, what can you conclude?
(5c) 2 pts – Estimate the dispersion parameter for model3. Does overdispersion seem to be a problem in this model?
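For (5c), one common way to estimate the dispersion parameter is to divide the residual deviance by the residual degrees of freedom. The sketch below shows that calculation in base R on simulated data; the predictors x1 and x2, the response y, and the object names fit, phi_deviance, and phi_pearson are assumptions for illustration and do not stand in for the actual model3.

```r
# Simulated binary data; x1, x2, and y are illustrative assumptions.
set.seed(1)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, size = 1, prob = plogis(0.5 + x1 - x2))

fit <- glm(y ~ x1 + x2, family = binomial)

# Deviance-based estimate: residual deviance over residual degrees of freedom.
phi_deviance <- deviance(fit) / df.residual(fit)

# Pearson-based alternative: sum of squared Pearson residuals over the same df.
phi_pearson <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

print(c(deviance_based = phi_deviance, pearson_based = phi_pearson))
```

An estimate well above 1 (a threshold of about 2 is a frequently cited rule of thumb) is usually read as a sign of overdispersion, though for ungrouped 0/1 responses the deviance-based estimate is only a rough diagnostic.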
In Poisson regression, when the model is a good fit, the sum of squared deviance residuals approximately follows the
Question 1: Bike Data – Exploratory Analysis
(1a) 2 pts – Using bike_data_train, create a histogram of the variable bikes. Based on this plot, what generalized linear regression model(s) discussed in this course could be used to model this response variable? Explain.
(1b) 2 pts – Using bike_data_train, create a scatterplot of bikes versus each numeric predicting variable (high_temp, low_temp, and precipitation) (3 scatterplots total). Do these variables appear useful in predicting the number of bikes crossing the Brooklyn Bridge on a given day? Include your reasoning.
2.3 A referendum is a general vote by the electorate on a single political question which has been referred to for a direct decision. (1)
Instructions
The R Markdown file / Jupyter Notebook file includes the questions, the empty code chunk sections for your code, and the text blocks for your responses. Answer the questions below by completing the R Markdown file / Jupyter Notebook file. You must answer the questions using one of these files. You may make slight adjustments to get the file to knit/convert but otherwise keep the formatting the same. Once you’ve finished answering the questions, submit your responses in a single knitted/converted HTML file.
There are 21 questions divided among 8 sections. The number of points for each question is provided. Partial credit may be given if your code is correct but your conclusion is incorrect or vice versa.

Next Steps:
Save the .Rmd/.ipynb file in your R working directory, the same directory into which you will download the “brooklyn_bridge_bike_counts.csv” and “white_wine_quality.csv” data files. Having all files in the same directory will help in reading the two data files.
Read the question and create the R code necessary within the code chunk section immediately below each question. Knitting this file will generate the output and insert it into the section below the code chunk.
Type your answer to the questions in the text block provided immediately after the response prompt.
Once you’ve finished answering all questions, knit this file and submit the knitted file as HTML on Canvas.

Example Question Format:
(8a) This will be the exam question – each question is already copied from Canvas and inserted into individual text blocks below; you do not need to copy/paste the questions from the online Canvas exam.
# Example code chunk area. Enter your code below the comment and between the ```{r} and ```
Response to question (8a)
This is the section where you type your written answer to the question. Depending on the question asked, your typed response may be a number, a list of variables, a few sentences, or a combination of these elements.

Data
brooklyn_bridge_bike_counts.csv
white_wine_quality.csv

Submission Templates
You may use either the R Markdown or Jupyter Notebook Starter Template:
6414_Final_Part2_Submission_Summer2021.Rmd
6414_Final_Part2_Submission_Summer2021.ipynb

Ready? Let’s begin. We wish you the best of luck!
2.9 A territory with its own institutions and populations is known as a state. (1)