Statistics; Predicting Response at BookBinders: Logistic Regression


As a direct marketer of specialty books, the BookBinders Book Club has achieved steady growth in its customer base. Yet while sales have grown steadily, profits began falling as the database grew, the company diversified its book selection, and the number of offers sent to customers increased. The falling profits have led Dave Lawton, BookBinders’ marketing director, to experiment with different database marketing approaches in order to improve BookBinders’ mailing yields and profits.


Dave began a series of live market tests, each involving a random sample of customers from the database.  An offer for the current book selection is sent to the sample and then the sample customers’ responses, either purchase or no purchase, are recorded and used to calibrate a response model for the current offering. The response model’s results are then used to “score” the remaining customers in the database and select customers from the full customer database for the ‘rollout’ mailing campaign.


Dave’s first market tests relied on RFM (recency – frequency – monetary) analysis. Direct marketers have used this approach to predict customer behavior for more than 50 years. The approach is intuitive and easy to implement, and it produced significant improvements in response rates and profits compared with mass mailings to BookBinders’ full database. Despite this initial success, Dave is eager to evaluate the effectiveness of alternative approaches. BookBinders offers books in different categories, including cooking, art and children’s books, and the number of previous book purchases in each category is recorded in each customer’s record in the database. RFM analysis does not use this or other customer information such as gender, and Dave suspects that a more sophisticated modeling approach could outperform RFM.


Logistic regression offers a powerful method for modeling response. It is similar to linear regression – the key difference is that the dependent variable is binary (for example, purchase or no purchase) rather than continuous. For each customer, logistic regression predicts a probability, between 0 and 1, of purchase or response, which can then be used to decide whom to target. Like linear regression, it can accommodate both continuous and categorical predictors, including interaction terms. Its use in database marketing has grown as software has become more readily available and as familiarity with the approach has increased.
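To make the "probability between 0 and 1" concrete, here is a minimal sketch of how a fitted logistic model turns a customer's predictor values into a purchase probability. The predictor names, the intercept and the coefficient values are hypothetical and chosen only for illustration; they are not estimates from the BookBinders data.

```python
import math

def purchase_probability(intercept, coefficients, customer):
    """Logistic response: p = 1 / (1 + exp(-(b0 + sum of b_k * x_k)))."""
    z = intercept + sum(coefficients[name] * customer[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and customer record (assumptions, not case results).
coefficients = {"months_since_last_purchase": -0.10, "art_books_purchased": 1.0}
customer = {"months_since_last_purchase": 10, "art_books_purchased": 1}

p = purchase_probability(-2.0, coefficients, customer)
print(f"Predicted probability of purchase: {p:.3f}")
```

Whatever the inputs, the logistic transform keeps the prediction strictly between 0 and 1, which is what lets it be read as a response probability.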


Dave has just received a dataset containing the responses of a random sample of 50,000 customers to a new offering from BookBinders titled “The Art History of Florence.”   Dave is eager to assess the potential value of logistic regression as a method for predicting customer response and has asked you to complete the following analyses.

  1. Logistic Regression
    Estimate a logistic regression model using BUYER as the dependent variable and the following as predictor variables: (Use “Analyze/Regression/Binary Logistic” in SPSS. Save the predicted probabilities by clicking on the “Save” button and then on “Probabilities” under “Predicted Values”).

Technical Note:

PURCH is excluded from the set of predictor variables – including it will lead to perfect collinearity since PURCH (the number of books purchased) is equal to the sum of the number of books purchased in the 7 categories. By including the number of purchases in each category, there is no need to include the total number of purchases.

  1. Summarize and interpret the results (so that a marketing manager can understand them). Which variables are significant?  Which seem to be ‘important’?  Interpret the coefficients for each of the predictors.
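One common way to explain logistic regression coefficients to a manager is to convert each one to an odds ratio, exp(coefficient): the factor by which the odds of purchase are multiplied when that predictor rises by one unit, holding the others fixed. The sketch below assumes hypothetical coefficient values; substitute the estimates from your own SPSS output.

```python
import math

def odds_ratios(coefficients):
    """Map each coefficient b to exp(b), the multiplicative change in the odds
    of purchase for a one-unit increase in that predictor."""
    return {name: math.exp(b) for name, b in coefficients.items()}

# Hypothetical coefficients for illustration only (not estimated from the case data).
example_coefficients = {
    "art_books_purchased": 1.0,           # positive: raises the odds of purchase
    "months_since_last_purchase": -0.10,  # negative: lowers the odds of purchase
    "no_effect_example": 0.0,             # zero: odds unchanged
}

for name, ratio in odds_ratios(example_coefficients).items():
    print(f"{name}: odds ratio = {ratio:.3f}")
```

An odds ratio above 1 means the predictor is associated with higher odds of buying, below 1 with lower odds, and exactly 1 with no change.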


  1. Decile Analysis of Logistic Regression Results
    1. Assign each customer to a decile based on his or her predicted probability of purchase. (Hint: use “largest values” to create deciles)
    2. Create a bar chart plotting response rate summarized by decile. (Hint: Use deciles as the “Category Axis” and mean “Bought” on the vertical axis)
    3. Generate a report showing number of customers, the number of buyers of “The Art History of Florence” and the response rate to the offer by decile. (Hint: use “Case Summaries,” be sure to uncheck “Display Cases”)
    4. Generate a report showing the mean values of the following variables by probability of purchase decile:
      Total $ spent
      Months since last purchase, and
      Number of books purchased for the seven categories (i.e., children, youth, cookbooks, do-it-yourself, reference, art and geography). (Hint: use “Case Summaries,” be sure to uncheck “Display Cases”)
    5. Summarize and interpret the decile analysis results.
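For readers who want to check the SPSS output by hand, the decile steps above can be sketched as follows. This is a minimal illustration, assuming decile 1 holds the customers with the highest predicted probabilities and that purchases are coded 1 (bought) or 0 (did not buy); the function names and the toy data are hypothetical.

```python
def assign_deciles(probabilities):
    """Return a decile (1..10) per customer; decile 1 = highest predicted probabilities."""
    n = len(probabilities)
    order = sorted(range(n), key=lambda i: probabilities[i], reverse=True)
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles

def decile_report(deciles, bought):
    """Count customers and buyers and compute the response rate for each decile."""
    report = {d: {"customers": 0, "buyers": 0} for d in range(1, 11)}
    for d, b in zip(deciles, bought):
        report[d]["customers"] += 1
        report[d]["buyers"] += b
    for row in report.values():
        row["response_rate"] = row["buyers"] / row["customers"] if row["customers"] else 0.0
    return report

# Toy data: 20 customers, with the 4 highest-scored customers being the buyers.
toy_probabilities = [i / 20 for i in range(20)]
toy_bought = [1 if i >= 16 else 0 for i in range(20)]
toy_report = decile_report(assign_deciles(toy_probabilities), toy_bought)
for d, row in toy_report.items():
    print(d, row["customers"], row["buyers"], round(row["response_rate"], 2))
```

A model that discriminates well shows response rates that fall steadily from the first decile to the last.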




  1. Lift and Cumulative Lift
    1. Use the information from the decile report above (number of customers, number of buyers and response rate by decile) to create a chart showing the lift and cumulative lift for each decile. Recall that the lift for a decile is the response rate for that decile divided by the overall response rate.  Similarly, cumulative lift is the cumulative response rate (summing up to and including that decile) divided by the overall response rate.  You may want to use Excel for these calculations.
    2. Create a graph showing the cumulative lift by decile.
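The lift definitions above can be sketched in a few lines as an alternative to a spreadsheet. The input is assumed to be a list of (customers, buyers) pairs ordered from the best decile to the worst; the four-group counts below are made up for illustration and are not the case results.

```python
def lift_table(decile_counts):
    """decile_counts: list of (customers, buyers), ordered best decile first.
    Returns a list of (lift, cumulative_lift) tuples, one per decile."""
    total_customers = sum(c for c, _ in decile_counts)
    total_buyers = sum(b for _, b in decile_counts)
    overall_rate = total_buyers / total_customers
    rows = []
    cum_customers = cum_buyers = 0
    for customers, buyers in decile_counts:
        cum_customers += customers
        cum_buyers += buyers
        lift = (buyers / customers) / overall_rate
        cumulative_lift = (cum_buyers / cum_customers) / overall_rate
        rows.append((lift, cumulative_lift))
    return rows

# Hypothetical counts (four groups instead of ten, to keep the example short).
example_counts = [(100, 30), (100, 10), (100, 5), (100, 5)]
lift_rows = lift_table(example_counts)
for lift, cumulative_lift in lift_rows:
    print(f"lift = {lift:.2f}, cumulative lift = {cumulative_lift:.2f}")
```

By construction the cumulative lift of the final group is always 1, since at that point the whole sample has been included.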
  2. Gains, Cumulative Gains and ‘Banana’ Charts
    1. Use the information from the decile report above (number of customers, number of buyers and response rate by decile) to create a chart showing the gains and cumulative gains for each decile. Recall that the gains for a decile are the proportion of all responders who fall in that decile.  Similarly, cumulative gains are the sum of gains up through that decile.  You may want to use Excel for these calculations.
    2. Create a ‘banana’ chart showing the cumulative gains by decile along with a reference line corresponding to ‘no model’. Interpret the Banana chart.
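The gains calculation and the "no model" reference line can be sketched the same way. Under random selection, mailing x% of customers is expected to reach x% of buyers, so the reference line is simply the cumulative share of customers. The counts below are hypothetical and use four groups rather than ten for brevity.

```python
def gains_table(decile_counts):
    """decile_counts: list of (customers, buyers), ordered best decile first.
    Returns (gain, cumulative_gain, no_model_baseline) per decile, where the
    baseline is the cumulative share of customers mailed."""
    total_customers = sum(c for c, _ in decile_counts)
    total_buyers = sum(b for _, b in decile_counts)
    rows = []
    cum_customers = cum_buyers = 0
    for customers, buyers in decile_counts:
        cum_customers += customers
        cum_buyers += buyers
        rows.append((
            buyers / total_buyers,            # gain for this decile
            cum_buyers / total_buyers,        # cumulative gain
            cum_customers / total_customers,  # 'no model' reference line
        ))
    return rows

# Hypothetical counts, for illustration only.
gains_rows = gains_table([(100, 30), (100, 10), (100, 5), (100, 5)])
for gain, cumulative_gain, baseline in gains_rows:
    print(f"gain = {gain:.2f}, cumulative = {cumulative_gain:.2f}, no model = {baseline:.2f}")
```

Plotting cumulative gain against the baseline gives the banana shape: the wider the gap between the curve and the diagonal, the better the model concentrates buyers in the top deciles.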
  3. Profitability Analysis

Use the following cost information to assess the profitability of using logistic regression to determine which customers will receive a specific offer:


Cost to mail offer to customer:          $0.50
Selling price (shipping included):       $18.00
Wholesale price paid by BookBinders:     $9.00
Shipping costs:                          $3.00
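The 8.3% breakeven response rate used in the steps that follow can be reproduced from these figures: a mailing breaks even when the expected margin per piece mailed equals the cost of mailing it. The sketch below assumes the margin per sale is the selling price less the wholesale price and shipping cost; the variable names are illustrative.

```python
MAIL_COST = 0.50        # cost to mail the offer to one customer
SELLING_PRICE = 18.00   # shipping included
WHOLESALE_PRICE = 9.00  # paid by BookBinders
SHIPPING_COST = 3.00

# Margin on each book sold, before the cost of the mailing itself.
margin_per_sale = SELLING_PRICE - WHOLESALE_PRICE - SHIPPING_COST

# Breakeven: response_rate * margin_per_sale == MAIL_COST
breakeven_rate = MAIL_COST / margin_per_sale
print(f"Margin per sale: ${margin_per_sale:.2f}")
print(f"Breakeven response rate: {breakeven_rate:.1%}")
```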

  1. Create a new variable (call it MAIL) with a value of 1 if the customer’s predicted probability is 0.083 or greater, and 0 otherwise. Since the breakeven response rate is 8.3%, this variable will be used to determine who gets mailed the offer and who doesn’t.
  2. Generate a report summarizing the number of customers, the number of buyers of “The Art History of Florence” and the response rate to the offer by the MAIL variable.
  3. What would the gross profit (in dollars, and also as a percentage of gross sales) and the return on marketing expenditures have been if BookBinders had mailed the offer to buy “The Art History of Florence” only to customers with a predicted probability of buying of 8.3% or greater?
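The profitability questions above can be sketched as a small calculation. The case does not spell out the formulas, so this sketch makes two labeled assumptions: gross profit is the margin on books sold minus the mailing cost, and return on marketing expenditures is gross profit divided by the mailing cost. The mailed and buyer counts are hypothetical; use the counts for MAIL = 1 from your own report.

```python
MAIL_COST = 0.50
SELLING_PRICE = 18.00
WHOLESALE_PRICE = 9.00
SHIPPING_COST = 3.00

def mailing_profitability(customers_mailed, buyers):
    """Profit summary for a mailing (definitions are assumptions, see above)."""
    gross_sales = buyers * SELLING_PRICE
    mailing_cost = customers_mailed * MAIL_COST
    margin = buyers * (SELLING_PRICE - WHOLESALE_PRICE - SHIPPING_COST)
    gross_profit = margin - mailing_cost  # assumed definition
    return {
        "gross_sales": gross_sales,
        "mailing_cost": mailing_cost,
        "gross_profit": gross_profit,
        "profit_pct_of_sales": gross_profit / gross_sales,
        "return_on_marketing": gross_profit / mailing_cost,  # assumed definition
    }

# Hypothetical counts for the MAIL = 1 group (not the case results).
summary = mailing_profitability(customers_mailed=10_000, buyers=2_000)
for key, value in summary.items():
    print(f"{key}: {value:,.3f}")
```

Running the same function on the full sample and on the MAIL = 1 group gives a direct comparison of mailing everyone versus mailing only customers above the breakeven probability.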



  1. What are the key learning points from this assignment? What are the managerial implications of your findings?
