Q4. Multicollinearity and Outliers (12 points)

4a) (2 points) Diagnose multicollinearity in *model2*, created in Question 2b. Is multicollinearity a concern?

4b) (2 points) Use Cook’s distance to count the outliers in the data based on model2.
i) Plot the Cook’s distance for each observation.
ii) Using the threshold 4/n, state clearly the number of outliers.

4c) (2 points) Remove the outliers (identified in 4b) from the dataset “trainData”. Create a linear regression model using the dataset without the outliers. Use all the predictors. Call it model4.
i) (1 point) How much of the variability in the response is explained by the linear combination of the predictors in model2 and in model4? Which model performed better on this basis?
ii) (2 points) How does the presence or absence of these outliers affect the model’s regression coefficients? Do you observe any significant changes? Explain.
iii) (3 points) In a real-world scenario, what extreme situations can lead to outliers in the supply chain dataset? Based on your answer, is it advisable to remove the outliers from the dataset?
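
For reference, the 4/n rule named in part 4b can be illustrated with a minimal sketch. It assumes a regression with a single predictor and made-up data, not the trainData set or model2 from the question, and the function names `cooks_distances` and `flag_outliers` are hypothetical. For one predictor the leverage has a closed form, so Cook's distance is D_i = (e_i^2 / (p * MSE)) * h_i / (1 - h_i)^2, with p = 2 estimated coefficients.

```python
# Illustrative sketch only: one predictor and invented data (not trainData).

def cooks_distances(x, y):
    """Cook's distance for each point of a simple linear regression y ~ x."""
    n = len(x)
    p = 2  # estimated coefficients: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    # Ordinary least squares fit
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage of each point (closed form for a single predictor)
    leverages = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (p * mse)) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


def flag_outliers(x, y):
    """Indices of observations whose Cook's distance exceeds 4/n."""
    threshold = 4 / len(x)
    return [i for i, d in enumerate(cooks_distances(x, y)) if d > threshold]


# Invented data: roughly linear, with the last response far off the trend.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 16.2, 18.0, 40.0]
print(flag_outliers(x, y))
```

With this invented data only the last observation (index 9) exceeds the threshold. A model with several predictors, such as model2, would need the full hat matrix, which a statistics library would normally supply.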
Which example would NOT be considered an ecosystem service?