Two models, A and B, were created, fitted to the data, and evaluated on the test set. Each model's R² and RMSE are shown below.
Model A: R² = 0.58, RMSE = 22
Model B: R² = 0.55, RMSE = 19
Which statement is most accurate?
An analyst used data from a pandas DataFrame to fit a multiple linear regression model that predicts annual_spend from visits_per_month and is_premium_member. The variable is_premium_member is 1 if the customer is a premium member and 0 otherwise. The fitted model is:
annual_spend = 120 + 15*(visits_per_month) + 40*(is_premium_member)
Interpret the coefficient for is_premium_member (40).
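One way to read a coefficient on a 0/1 variable is to evaluate the fitted equation for two customers who differ only in that variable. The following is a minimal sketch; `predict_annual_spend` is a hypothetical helper name, and the visit count of 4 is an arbitrary illustration value, not part of the question.

```python
def predict_annual_spend(visits_per_month, is_premium_member):
    # Fitted equation quoted in the question
    return 120 + 15 * visits_per_month + 40 * is_premium_member

visits = 4  # arbitrary; any fixed value yields the same difference

non_member = predict_annual_spend(visits, 0)
member = predict_annual_spend(visits, 1)

# Difference in predicted spend with visits_per_month held constant
difference = member - non_member
print(non_member, member, difference)  # 180 220 40
```

Because visits_per_month is held fixed, the gap between the two predictions equals the coefficient itself.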
Which actions generally improve the credibility of a regression model for business decision-making? Select all that apply.
Fill in the best CRISP-DM phase for each activity.
Defining the business success metric (e.g., reduce churn by 2 points) belongs to: [Dropdown1]
Fixing missing values and parsing dates belongs to: [Dropdown2]
Selecting features and training a regression model belongs to: [Dropdown3]
Interpreting RMSE and whether it meets the business threshold belongs to: [Dropdown4]
You need to compute “days since last purchase.” purchase_date is currently a string like “2025-09-14”. Which of the following is the best first step?
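Date arithmetic on a string column generally requires converting it to a datetime type first. The sketch below assumes pandas is in use; the DataFrame contents and the reference date `as_of` are hypothetical values chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample; the real table is not shown in the question
df = pd.DataFrame({"purchase_date": ["2025-09-14", "2025-08-01"]})

# Convert the ISO-style strings to datetime64 values
df["purchase_date"] = pd.to_datetime(df["purchase_date"], format="%Y-%m-%d")

# Hypothetical reference date to measure from
as_of = pd.Timestamp("2025-09-30")

# Subtracting datetimes gives timedeltas; .dt.days extracts whole days
df["days_since_last_purchase"] = (as_of - df["purchase_date"]).dt.days
print(df)
```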
A VP wants a table with: rows = region, columns = quarter, values = total revenue. Which of the following provides the best approach?
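A layout with one variable on the rows, another on the columns, and an aggregated value in the cells can be sketched with pandas `pivot_table`. The column names follow the question, but the sample rows are hypothetical, and filling empty cells with 0 is an assumption rather than a stated requirement.

```python
import pandas as pd

# Hypothetical sample data with the three columns named in the question
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "East"],
    "quarter": ["Q1", "Q2", "Q1", "Q1", "Q1"],
    "revenue": [100, 150, 200, 50, 25],
})

# Rows = region, columns = quarter, cells = summed revenue
table = sales.pivot_table(
    index="region",
    columns="quarter",
    values="revenue",
    aggfunc="sum",
    fill_value=0,  # assumption: show 0 where a region has no sales in a quarter
)
print(table)
```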
Using the same model:
revenue = 12,000 + 4.5 * ad_spend
What is the best interpretation of the intercept (12,000)?
A dataset of customer orders has 50,000 rows. The discount_code column is missing for 70% of rows. What is the most business-appropriate initial action?
A histogram of order_value shows most orders between $10–$80, but a long tail reaching $1,500. Which of the options provides the best interpretation?
Your transaction table should have one row for each combination of order_id and line_item_id. In other words, multiple rows can have the same order_id, and multiple rows can have the same line_item_id; but no two rows should have the same order_id and the same line_item_id. What is the best check?
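Uniqueness of a composite key can be checked by looking for rows that repeat the same pair of values. The sketch below assumes pandas; the sample rows are hypothetical and deliberately include one repeated (order_id, line_item_id) pair.

```python
import pandas as pd

# Hypothetical sample: the last two rows share the same key pair
tx = pd.DataFrame({
    "order_id": [1, 1, 2, 2],
    "line_item_id": [1, 2, 1, 1],
})

key = ["order_id", "line_item_id"]

# keep=False flags every row involved in a repeated key, not just the later copies
dupes = tx[tx.duplicated(subset=key, keep=False)]

# With the default keep="first", only the extra copies are counted
n_violations = int(tx.duplicated(subset=key).sum())

print(dupes)
print(n_violations)  # 1
```

A count of zero would indicate that no two rows share both order_id and line_item_id.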