DBPapers

APPLICATION OF MACHINE LEARNING FOR CHURN PREDICTION BASED ON TRANSACTIONAL DATA (RFM ANALYSIS)

Y. Aleksandrova
Thursday 11 October 2018 by Libadmin2018

ABSTRACT

Machine learning covers a wide set of supervised and unsupervised algorithms for solving prediction, classification and anomaly detection problems. One area of application is customer churn prediction. To build a model that predicts customer switching, data scientists use a variety of demographic, social, transactional and behavioural metrics and features. At the same time, most small Bulgarian companies still lack such varied and complete customer data. They rely mainly on information provided by the ERP system, which generates mostly transaction-oriented data. Small and medium-sized enterprises are not, at this stage, planning major investments in marketing research or additional customer-related data sources, and are therefore limited to modelling and forecasting on transactional data.
The main goal of the current study is to propose a combination of RFM analysis and machine learning algorithms for churn prediction based mainly on transactional data. The dataset is extracted from the ERP system of a regional concrete production company in Bulgaria. RFM scores are calculated for every customer over a period of 6 months before the end date of examination. The target value for the prediction models is a churn metric indicating whether or not the customer made a transaction in the 6 months following the RFM analysis. Several machine learning algorithms were applied: Two-Class Boosted Decision Trees, Two-Class Neural Networks, Two-Class Decision Jungle, Two-Class SVM and Two-Class Logistic Regression. The experiments were performed in Azure Machine Learning Studio. The results show that, despite the limitations of RFM scores and metrics, companies can use machine learning algorithms to predict customer churn with sufficient confidence. The best-performing models for churn prediction were Two-Class Decision Jungle, Two-Class Boosted Decision Trees and Two-Class Neural Networks. There are no notable differences when the raw recency, frequency and monetary values are used instead of their scores (R, F, M and RFM).
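The data preparation described above (RFM values over a 6-month period, plus a churn label from the following 6 months) can be illustrated with a minimal sketch. It is not the study's implementation: the sample transactions, the cutoff date, the 182-day approximation of "6 months", the rank-based 1 to 5 scoring and the way R, F and M are combined into a single RFM code are all assumptions made for illustration. The classification step, which the study ran in Azure Machine Learning Studio, is not reproduced here.

```python
from datetime import date, timedelta

# Hypothetical sample data: (customer_id, transaction_date, amount).
TRANSACTIONS = [
    ("A", date(2017, 2, 10), 1200.0),
    ("A", date(2017, 5, 20), 800.0),
    ("A", date(2017, 9, 1), 500.0),
    ("B", date(2017, 1, 15), 300.0),
    ("C", date(2017, 6, 25), 2500.0),
    ("C", date(2017, 6, 28), 1500.0),
    ("C", date(2017, 11, 3), 900.0),
    ("D", date(2017, 3, 5), 100.0),
    ("E", date(2017, 4, 10), 600.0),
]

CUTOFF = date(2017, 6, 30)    # assumed end date of the RFM period
WINDOW = timedelta(days=182)  # "6 months" approximated as 182 days


def rfm_and_churn(transactions, cutoff, window):
    """Raw recency/frequency/monetary values plus a churn label per customer."""
    stats, active_later = {}, set()
    for customer, day, amount in transactions:
        if cutoff - window < day <= cutoff:        # RFM observation period
            s = stats.setdefault(
                customer, {"last": day, "frequency": 0, "monetary": 0.0}
            )
            s["last"] = max(s["last"], day)
            s["frequency"] += 1
            s["monetary"] += amount
        elif cutoff < day <= cutoff + window:      # following 6 months
            active_later.add(customer)
    return {
        customer: {
            "recency": (cutoff - s["last"]).days,  # days since last purchase
            "frequency": s["frequency"],
            "monetary": s["monetary"],
            # churn = 1 when no transaction occurs in the following period
            "churn": 0 if customer in active_later else 1,
        }
        for customer, s in stats.items()
    }


def score(values, bins=5, lower_is_better=False):
    """Rank-based score from 1 (worst) upward; tied values share a score."""
    n = len(values)
    result = {}
    for key, v in values.items():
        if lower_is_better:
            beaten = sum(1 for other in values.values() if other > v)
        else:
            beaten = sum(1 for other in values.values() if other < v)
        result[key] = 1 + beaten * bins // n
    return result


def add_scores(table):
    """Attach R, F, M scores and a combined RFM code (assumed: digits R, F, M)."""
    r = score({c: row["recency"] for c, row in table.items()}, lower_is_better=True)
    f = score({c: row["frequency"] for c, row in table.items()})
    m = score({c: row["monetary"] for c, row in table.items()})
    for c, row in table.items():
        row.update(R=r[c], F=f[c], M=m[c], RFM=100 * r[c] + 10 * f[c] + m[c])
    return table


table = add_scores(rfm_and_churn(TRANSACTIONS, CUTOFF, WINDOW))
for customer, row in sorted(table.items()):
    print(customer, row)
```

Each resulting row holds both the raw values (recency, frequency, monetary) and the scores (R, F, M, RFM), so either set could serve as the feature vector for a two-class classifier, with `churn` as the target.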

Keywords: machine learning, RFM analysis, churn prediction, classification and prediction, data mining


© Copyright 2001 International Multidisciplinary Scientific GeoConference & EXPO SGEM. All Rights Reserved.

Creative Commons License