Q4. Decision Tree and Random Forest Models (7 points)

Using the dataset “trainData”, fit the following classification models using all the predictors in “trainData” and “Fraudulent” as the response variable.

i) Decision Tree model (call it model4).

ii) Random Forest model (call it model5).

Display the summary of both models and state the average accuracy for both resampled models. Which model performed better in terms of mean accuracy?

POISSON REGRESSION

We will use the dataset “poisson_data” for this question.

## Features:

- **Transaction_Hour (Numerical):** Hour of the day when the transaction occurred (0-23)
- **Previous_Frauds (Numerical):** Number of previous fraudulent transactions by the user (0-5)
- **Account_Age_Days (Numerical):** Age of the account in days (1-5000)
- **Fraud_Count (Numerical):** Number of frauds (response variable)

Q6. Poisson Regression (Use poisson_data for this question) (5 points)

a. i) (2 points) Fit a Poisson regression model using all the predictors from “poisson_data” and “Fraud_Count” as the response variable. Call it “pois_model1” and display the model summary.

ii) (1 point) Interpret the coefficient of “Previous_Frauds” in pois_model1 with respect to the log expected “Fraud_Count”.

b. (2 points) Calculate the estimated dispersion parameter for “pois_model1” using both the deviance and Pearson residuals. Is this an overdispersed model using a threshold of 2.0? Justify your answer.
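As a rough illustration of the dispersion calculation in part b, the sketch below divides the Poisson deviance and the Pearson chi-square by the residual degrees of freedom. It is written in plain Python rather than the course software. The function name `poisson_dispersion` and the counts and fitted means are made up for illustration and do not come from “poisson_data”. In practice the fitted means would come from pois_model1.

```python
import math

def poisson_dispersion(y, mu, n_params):
    """Return (deviance-based, Pearson-based) dispersion estimates."""
    # Residual degrees of freedom: observations minus estimated coefficients
    df_resid = len(y) - n_params
    # Pearson chi-square: sum of squared Pearson residuals (y - mu) / sqrt(mu)
    pearson_chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    # Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)),
    # where y * log(y / mu) is taken as 0 when y = 0
    deviance = 2.0 * sum(
        (yi * math.log(yi / mi) if yi > 0 else 0.0) - (yi - mi)
        for yi, mi in zip(y, mu)
    )
    return deviance / df_resid, pearson_chi2 / df_resid

# Hypothetical observed counts and fitted means (illustration only)
y = [0, 1, 3, 2, 0, 5, 1, 2]
mu = [0.5, 1.2, 2.4, 1.9, 0.8, 3.6, 1.1, 2.5]

# Assumes 4 estimated coefficients: intercept plus three predictors
disp_dev, disp_pearson = poisson_dispersion(y, mu, n_params=4)

# Flag overdispersion if either estimate exceeds the 2.0 threshold
overdispersed = disp_dev > 2.0 or disp_pearson > 2.0
print(disp_dev, disp_pearson, overdispersed)
```

Values near 1 are consistent with the Poisson assumption that the variance equals the mean. Values above the chosen threshold suggest overdispersion.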

Question 2: Logistic Regression Model (Use trainData for this question) (13 points)

a. i) (2 points) Create a logistic regression model using “Fraudulent” as the response variable and “Account_Age_Days” and “International_Transaction” as predictor variables. Call it model1. Display the summary of the model.

ii) (2 points) Interpret the coefficient of “Account_Age_Days” for model1 with respect to the log-odds and odds of the response.

iii) (2 points) What does the value of the intercept represent in terms of baseline fraud probability?

b. i) (2 points) Using the “trainData” dataset, create a logistic regression model using “Fraudulent” as the response variable and all other variables in “trainData” as predictors (call it model2) and display the summary of model2.

ii) (2 points) What does the summary of model2 suggest about the likelihood of fraud for international transactions compared to domestic transactions?

iii) (3 points) What do the null and residual deviance indicate in the summary of the model? Based on the null and residual deviance values, how well does this model fit the data? What does the change in deviance indicate? Use an alpha level of 0.01.
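For the coefficient and intercept interpretations in part a, the following minimal sketch shows how a log-odds coefficient converts to an odds ratio and how an intercept converts to a baseline probability. The helper names `odds_ratio` and `baseline_probability` and both coefficient values are hypothetical, chosen only for illustration. They are not estimates from “trainData”.

```python
import math

def odds_ratio(beta):
    """Multiplicative change in the odds for a one-unit increase in the predictor."""
    return math.exp(beta)

def baseline_probability(intercept):
    """Probability of the response when every predictor equals zero."""
    return 1.0 / (1.0 + math.exp(-intercept))

# Hypothetical coefficient values (illustration only, not fitted to trainData)
beta_account_age = -0.0004   # change in log-odds per additional day of account age
intercept = -2.0             # log-odds when all predictors are zero

or_per_day = odds_ratio(beta_account_age)
# Percent change in the odds for one extra day, holding other predictors fixed
pct_change_in_odds = (or_per_day - 1.0) * 100.0
p0 = baseline_probability(intercept)

print(or_per_day, pct_change_in_odds, p0)
```

A negative coefficient gives an odds ratio below 1, so the odds of the response fall as the predictor rises. The baseline probability applies only to an observation with every predictor at zero, which may lie outside the observed range of the data.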