How to Solve Business Problems Using Applied Data Science

International Dean Bassirou Chitou kicked off the weekly academic seminar on January 28, 2020 with the assertion: “To be successful in life, you have to develop skills that are transferable from one field to another.”

During the seminar, Chitou presented his paper, which uses statistics and econometric analysis to draw conclusions about how businesses in competitive markets can improve customer retention. The paper tackled a case study in which the CEO of a telephone company realized that 30% of his customers were switching to competing companies. To resolve this issue, he analyzed the company’s customer database and used the data to design a retention program. According to Chitou, there are two ideal responses to this business problem:

    • Maximizing customer retention with efficiency and minimal costs
    • Correctly identifying which customers the company risks losing

To arrive at these responses, Chitou used a five-step statistical methodology:

    1. Transforming the raw data into CSV format
    2. Describing the data and finding missing data
    3. Splitting the data into two sets, namely the “training” and the “testing” sets
    4. Creating a data preprocessing pipeline
    5. Building the desired model and assessing its machine learning performance
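
The first four steps can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code from Chitou's paper: the records, the column names (customer_id, monthly_minutes, churned), the 30% test fraction, and the choice of mean imputation as the preprocessing step are all assumptions made for the example.

```python
import csv
import io
import random
import statistics

# Hypothetical customer records; column names and values are made up for illustration.
RAW_RECORDS = [
    (1, "320", 0), (2, "", 1), (3, "150", 1), (4, "410", 0), (5, "275", 0),
    (6, "90", 1), (7, "", 0), (8, "385", 0), (9, "120", 1), (10, "300", 0),
]
FIELDS = ["customer_id", "monthly_minutes", "churned"]


def to_csv(records):
    """Step 1: serialize the raw records as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(FIELDS)
    writer.writerows(records)
    return buffer.getvalue()


def load_rows(csv_text):
    """Read CSV text back into a list of dicts, one per customer."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def count_missing(rows):
    """Step 2: count empty cells per column."""
    return {field: sum(1 for row in rows if row[field] == "") for field in FIELDS}


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Step 3: shuffle reproducibly, then hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def impute_minutes(train, test):
    """Step 4 (one preprocessing step): fill gaps with the training-set mean."""
    known = [float(r["monthly_minutes"]) for r in train if r["monthly_minutes"] != ""]
    fill = statistics.mean(known)
    for row in train + test:
        value = row["monthly_minutes"]
        row["monthly_minutes"] = float(value) if value != "" else fill
    return fill


rows = load_rows(to_csv(RAW_RECORDS))
print(count_missing(rows))
train_rows, test_rows = train_test_split(rows)
impute_minutes(train_rows, test_rows)
```

The fill value is computed from the training rows only and then applied to both sets, so no information from the testing set leaks into the preprocessing step.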

Chitou’s study used logistic regression and random forest classification models, and he emphasized accuracy, precision, recall, and F1 score as the four business metrics of interest. Based on the results, he recommended that the CEO focus on retaining customers more efficiently, improve the quality of low-cost services, and give out bonuses to incentivize high employee performance.
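
To make the four metrics concrete, the sketch below computes them from predicted and actual churn labels using their standard definitions. The label lists are invented for illustration and are not results from the study; the function name classification_metrics is likewise hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # loyal customers correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # loyal customers wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners the model missed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for eight customers: actual outcome vs. model prediction.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = classification_metrics(actual, predicted)
print(metrics)
```

In a retention setting, recall measures how many of the customers who actually left were identified in advance, while precision measures how many of the customers targeted by a retention program were really at risk.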

Next, the question-and-answer session allowed participants to better understand this modeling approach and to express their interest in various research topics. Chitou advised the students to always understand a study’s main outcome and the nature of its dependent variables before running any models.