Customer Churn for Telecommunication Industries

Using IBM Intelligent Miner

By

Adnan Rahi

CS 595

What is Customer Relationship Management?

Customer relationship management (CRM) is the business practice that allows a company with more than a few customers to better serve those customers and manage its interactions with them. Managing a few customers is easy. You know the customers by name, and you and each customer’s account manager probably know in some detail what their interests are and what their business is. You probably also know what they like and dislike about your company and its products and services. If you have 10 million customers, you would ideally like to have the same kind of relationship with them, but it is not cost effective to assign multiple people from your company to each customer. This is where the technologies and best practices of customer relationship management help. In some circles, CRM is also known as customer profitability management.

The bottom line is that if you want to improve customer profitability, you almost always have to first improve the relationship that your company has with that customer. Often, the best way to improve profitability is to improve customer loyalty.

CRM with an Example

The table below shows a simple example of a customer value matrix that compares each customer's current value with their potential value. Building this matrix should be one of the first things that you do with data mining, and you should have it available in order to segment and control your customer population. Keep in mind that this is only the very first step in segmenting your customers, and it gets more complicated from here. The reality is that if you do even this first step, you will probably be far ahead of your competitors in understanding your customers.

Note that you may have many more customer segments than the six shown in this table, but you will usually have these main segments. Segment 1 has your best customers: they will remain your best customers throughout their lives, and their current value matches their potential. Segment 2 is similar, except that these customers are likely to have low lifetime value despite their high value today, probably because they are not loyal and are likely to switch to a competitor at some time in their customer life. Segment 6 represents your low value customers, whom you will treat with your least expensive service. Segments 4 and 5 represent customers who, with the right care and service, can be transitioned to high value customers, either near term or long term.

Segment / Current value / Lifetime value / Potential value / Potential lifetime value / Current service level / Best service level
1 / High / High / High / High / Gold / Gold
2 / High / Low / High / High / Gold / Gold
3 / High / Low / High / Low / Gold / Bronze
4 / Low / Low / Low / High / Bronze / Gold
5 / Low / Low / High / High / Bronze / Gold
6 / Low / Low / Low / Low / Bronze / Bronze

Thus, at its most basic level, sales and marketing investment should go toward keeping valuable customers and moving less valuable customers toward more value.
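The matrix can also be read as a pair of simple rules. The sketch below is only an illustration in Python, and it assumes that the pattern visible in the six rows generalizes: the current service level follows current value, and the best service level follows potential lifetime value. The names MATRIX, service_level, and segments_to_upgrade are hypothetical and are not part of any CRM product.

```python
# Rows copied from the customer value matrix:
# (segment, current value, lifetime value, potential value,
#  potential lifetime value, current service level, best service level)
MATRIX = [
    (1, "High", "High", "High", "High", "Gold", "Gold"),
    (2, "High", "Low", "High", "High", "Gold", "Gold"),
    (3, "High", "Low", "High", "Low", "Gold", "Bronze"),
    (4, "Low", "Low", "Low", "High", "Bronze", "Gold"),
    (5, "Low", "Low", "High", "High", "Bronze", "Gold"),
    (6, "Low", "Low", "Low", "Low", "Bronze", "Bronze"),
]


def service_level(value):
    """Assumed rule: a High value earns Gold service, anything else Bronze."""
    return "Gold" if value == "High" else "Bronze"


def segments_to_upgrade(matrix):
    """Segments that get Bronze service today but merit Gold service."""
    return [row[0] for row in matrix if row[5] == "Bronze" and row[6] == "Gold"]


def segments_to_downgrade(matrix):
    """Segments that get Gold service today but merit only Bronze service."""
    return [row[0] for row in matrix if row[5] == "Gold" and row[6] == "Bronze"]


# Check that the assumed rules reproduce both service columns of the table.
for row in MATRIX:
    assert service_level(row[1]) == row[5]  # current value -> current level
    assert service_level(row[4]) == row[6]  # potential lifetime value -> best level

print("upgrade:", segments_to_upgrade(MATRIX))
print("downgrade:", segments_to_downgrade(MATRIX))
```

Under these assumed rules, segments 4 and 5 are the candidates for better service, which matches the description of them as customers who can be moved to high value.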

What is Churn Management?

Churn management is a term used in the telecommunications industry to describe the process of ensuring that profitable customers stay with a particular company. Advanced techniques of churn management include the ability to predict whether a given individual is likely to move to another service provider, and to be able to define the correct actions to keep that profitable customer.

The Cost of Churn

Churn costs European and U.S. telcos close to US$4 billion each year, and the global cost of customer defection may well approach a staggering US$10 billion per year. Annual churn rates of 25 to 30 percent are the norm, and carriers at the upper end of this spectrum will get no return on investment on new subscribers. Why? Because it typically takes three years to recover the cost (approximately US$400 in the United States and US$700 in Europe) of replacing each lost customer with a new one (customer acquisition).
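A rough calculation shows why a three-year payback period is a problem at these churn rates. The sketch below assumes, purely for illustration, that every subscriber churns independently at the same constant annual rate; the function name fraction_remaining is hypothetical, and real churn is rarely this uniform.

```python
PAYBACK_YEARS = 3  # time to recover the acquisition cost, as stated in the text


def fraction_remaining(annual_churn_rate, years):
    """Share of a new-subscriber cohort still active after `years`.

    Assumes a constant annual churn rate applied independently each year.
    """
    return (1.0 - annual_churn_rate) ** years


for rate in (0.25, 0.30):
    share = fraction_remaining(rate, PAYBACK_YEARS)
    print(f"annual churn {rate:.0%}: {share:.1%} of new subscribers "
          f"remain after {PAYBACK_YEARS} years")
```

Under this simplified model, fewer than half of newly acquired subscribers stay long enough to repay their acquisition cost, and at a 30 percent churn rate only about a third do.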

In the European and Asian markets in particular, the number of new market entrants is adding to the churn phenomenon. In Europe, 30 new telcos entered the market in 1998, seeking the 15 percent market share that analysts say they will need to survive. The growth in the number of subscribers has eased this situation in the past, but as market growth slows and average revenue-per-user declines, increased competition is likely.

Why is Churn a Problem?

The problem confronting telcos' management is that it is very difficult to determine which subscribers leave the company and why. It is therefore even more difficult to predict which customers are likely to leave the company, and more difficult still to devise cost-effective incentives that will persuade likely "churners" to stay.

Churn is such a massive problem that it affects other aspects of customer relationship management (CRM), such as customer acquisition. A manager must ask himself, "Am I recruiting the right people or are they likely to churn before I have made a return on my investment?" "How is churn affecting the lifetime value of my customer base?" and "Can we get a complete view of our customer information, so that we can profile likely churners?"

It perhaps comes as something of a surprise to learn that telcos, who are among the biggest users of IT systems and sit on a goldmine of customer information, are faced with such problems when it comes to acting on that information to manage their customer relationships. The reason for this is that telcos' IT departments tend to focus on meeting day-to-day operational goals, such as providing and maintaining the switching system needed to allow calls to take place, and the billing systems to charge for calls made. In many cases the technology is not in place to support the complex requests for information from the sales and marketing departments, who must address the issue of churn. In addition, the organization may lack the expertise to support complex data mining and analytical/predictive tasks, which are essential in combating churn (as well as in many other aspects of CRM). The volumes of data that are needed to undertake such tasks are huge and sometimes difficult or impossible to access and consolidate using conventional operational-system tools.

Data Mining Techniques Used for Churn

IBM Intelligent Miner for Data is used to create the data mining models. I used two techniques: classification with a decision tree and demographic clustering. The decision tree handles both categorical and continuous data and is very useful for segmenting customers. Demographic clustering groups records based on a similarity score.

The Data

The data consists of 5000 downloaded records. There are 21 fields, as shown in the table below.

Data / Type
State / discrete
account length / continuous
area code / continuous
phone number / discrete
international plan / discrete
voice mail plan / discrete
number vmail messages / continuous
total day minutes / continuous
total day calls / continuous
total day charge / continuous
total eve minutes / continuous
total eve calls / continuous
total eve charge / continuous
total night minutes / continuous
total night calls / continuous
total night charge / continuous
total intl minutes / continuous
total intl calls / continuous
total intl charge / continuous
number customer service calls / continuous
Churn / discrete

Using Decision Tree to Segment the Customers

IBM Intelligent Miner lets you apply the decision tree method for classification through a step-by-step wizard. To apply the decision tree, you need to specify the input fields and the class label. The input fields are used by the mining function. The class label is the field that holds the classification, which the tree explains in terms of the attributes and values of the input fields. Values in the class label field must be categorical or discrete numeric.

For the input fields I used account length, international plan, number customer service calls, total charge, total international calls, total international charge, and voice mail plan. The class label is the churn field.
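To make the roles of an input field and the class label concrete, the sketch below shows one common way a decision tree chooses a split on a continuous input field: it tries the midpoints between observed values and keeps the one that minimizes the weighted Gini impurity of the class label. This is an assumption for illustration only; the split criterion that Intelligent Miner actually uses is not described here, the function names are hypothetical, and the toy records are made up rather than taken from the 5000-record data set.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(rows, field, threshold, label="churn"):
    """Weighted Gini impurity after splitting rows at field < threshold."""
    left = [r[label] for r in rows if r[field] < threshold]
    right = [r[label] for r in rows if r[field] >= threshold]
    n = len(rows)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_threshold(rows, field, label="churn"):
    """Midpoint between adjacent observed values with the lowest impurity."""
    values = sorted({r[field] for r in rows})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    if not candidates:
        return None  # the field has a single value, so no split is possible
    return min(candidates, key=lambda t: split_impurity(rows, field, t, label))


# Made-up toy records: one input field and the class label.
FIELD = "number customer service calls"
toy = [
    {FIELD: 0, "churn": "no"},
    {FIELD: 1, "churn": "no"},
    {FIELD: 2, "churn": "no"},
    {FIELD: 3, "churn": "no"},
    {FIELD: 4, "churn": "yes"},
    {FIELD: 5, "churn": "yes"},
]

print(best_threshold(toy, FIELD))  # prints 3.5 for this toy data
```

Because candidate thresholds are midpoints between integer counts, a split on a count field naturally lands on a value such as 3.5.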

The Data Mining Model

When the decision tree was run against the 5000 customers, it produced a model with 8 segments (each segment corresponds to a leaf at the end of the decision tree). Each segment is defined by customer characteristics drawn from the input fields listed above.

In the table below the segments are described.

Segment number / Description of the customers in the segment / Churn rate % / Number of records affected
1 / Total charge < 74, Number customer service calls < 3.5, International plan = “yes”, and Total intl calls < 2.5 / 100 / 73
2 / Total charge < 74, Number customer service calls < 3.5, International plan = “yes”, Total intl calls < 2.5, and Total intl charge < 3.525 / 1 / 257
3 / Total charge < 74, Number customer service calls < 3.5, International plan = “yes”, Total intl calls >= 2.5, and Total intl charge >= 3.525 / 100 / 61
4 / Total charge < 74, Number customer service calls < 3.5, and International plan = “yes” / 2.7 / 3844
5 / Total charge < 57 and Number customer service calls >= 3.5 / 100 / 165
6 / Total charge < 74, Number customer service calls >= 3.5, and Total charge >= 57 / 4.6 / 197
7 / Total charge >= 74 and Voice mail plan = “no” / 100 / 299
8 / Total charge >= 74 and Voice mail plan = “yes” / 7.3 / 104
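The segment table can be turned into an estimate of where the churners are concentrated. The sketch below copies the churn rates and record counts from the table, reading the rate for segment 6 as 4.6 percent, and multiplies them out. The names SEGMENTS and implied_churners are hypothetical, and the results are only as accurate as the rounded rates in the table.

```python
# (segment number, churn rate %, number of records), copied from the table.
# The rate for segment 6 is read as 4.6.
SEGMENTS = [
    (1, 100.0, 73),
    (2, 1.0, 257),
    (3, 100.0, 61),
    (4, 2.7, 3844),
    (5, 100.0, 165),
    (6, 4.6, 197),
    (7, 100.0, 299),
    (8, 7.3, 104),
]


def implied_churners(rate_pct, records):
    """Approximate number of churners implied by a rate and a record count."""
    return rate_pct / 100.0 * records


total_records = sum(records for _, _, records in SEGMENTS)
total_churners = sum(implied_churners(rate, n) for _, rate, n in SEGMENTS)
overall_rate_pct = 100.0 * total_churners / total_records

# Rank the segments by how many churners each one is estimated to contain.
ranked = sorted(SEGMENTS, key=lambda s: implied_churners(s[1], s[2]), reverse=True)

print(f"{total_records} records, overall churn about {overall_rate_pct:.1f}%")
for segment, rate, records in ranked:
    print(f"segment {segment}: about {implied_churners(rate, records):.0f} churners")
```

The record counts add up to the 5000 customers, and the implied overall churn rate is about 14.4 percent. Segments 7 and 5 hold the most churners, while segment 4 contributes many churners only because it is so large.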

Using Demographic Clustering

Demographic clustering groups records based on their similarity score. I used it to find clusters that answer questions such as “who are the churn customers”, “which customers are likely to churn”, “which customers are likely to stay”, and “who are the gold customers”. To partition a database so that records with similar characteristics are grouped together, you must specify active fields, which the mining function uses for the clustering. You can also specify supplementary fields, which are used to gain statistical information on the clusters that are found. Supplementary fields are not used for the clustering; however, they appear as parts of the clusters in the clustering results viewer. The fields in a cluster are ordered by importance, so you may see supplementary fields ranked among the active fields. This means that those supplementary fields would have influenced the creation of the clusters, perhaps more than the fields you specified as active, if they had been used for the clustering.

Field role / Fields
Active / Account length, International plan, Number customer service calls, Number vmail messages, Total day calls, Total eve calls, Total intl calls, Total night calls, Voice mail plan
Supplementary / Area code
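The difference between active and supplementary fields can be sketched with a much simplified similarity-based clustering. This is not Intelligent Miner's algorithm, whose scoring is not described here. It is a hypothetical single-pass scheme in which two records are similar on a field when their values match (or, for numeric fields, fall within an assumed tolerance), and a record joins the existing cluster with the highest average similarity or starts a new one. All names, the tolerance, the threshold, and the four records are made up for illustration.

```python
ACTIVE = ["international plan", "voice mail plan", "total intl calls"]
SUPPLEMENTARY = ["area code"]
TOLERANCE = {"total intl calls": 1}  # numeric fields agree within this distance


def similarity(a, b, fields, tolerance):
    """Fraction of the given fields on which records a and b agree."""
    agree = 0
    for field in fields:
        if field in tolerance:
            agree += int(abs(a[field] - b[field]) <= tolerance[field])
        else:
            agree += int(a[field] == b[field])
    return agree / len(fields)


def cluster(records, fields, tolerance, threshold=0.5):
    """Assign each record to its most similar cluster, or open a new one."""
    clusters = []
    for record in records:
        best, best_score = None, threshold
        for members in clusters:
            score = sum(similarity(record, m, fields, tolerance)
                        for m in members) / len(members)
            if score >= best_score:
                best, best_score = members, score
        if best is None:
            clusters.append([record])
        else:
            best.append(record)
    return clusters


# Made-up records, not taken from the 5000-record data set.
records = [
    {"international plan": "no", "voice mail plan": "no",
     "total intl calls": 3, "area code": 415},
    {"international plan": "no", "voice mail plan": "no",
     "total intl calls": 4, "area code": 408},
    {"international plan": "yes", "voice mail plan": "yes",
     "total intl calls": 9, "area code": 510},
    {"international plan": "yes", "voice mail plan": "yes",
     "total intl calls": 8, "area code": 415},
]

clusters = cluster(records, ACTIVE, TOLERANCE)  # only active fields are used
for number, members in enumerate(clusters):
    # Supplementary fields are only summarized after the clusters exist.
    area_codes = sorted({m["area code"] for m in members})
    print(f"cluster {number}: {len(members)} records, area codes {area_codes}")
```

The area code never enters the similarity score, yet it can still be summarized per cluster afterwards, which is the role that supplementary fields play.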

Who are the Churn Customers?

Running the above model identified 4 clusters.

Cluster # / Cluster % / Intl plan = “yes” % / Voice mail plan = “yes” % / Highest account length (overall: 100-120) / Highest number of intl calls (overall: 2-3) / Highest number of customer service calls (overall: 1)
2 / 63 / 0 / 0 / 40-60 / 3-4 / 4
1 / 22 / 100 / 0 / 120-140 / 2-3 / 2
0 / 9 / 90 / 0 / 160-180 / 6-7 / 4
3 / 7 / 100 / 100 / 100-120 / 2-3 / 0

The table above shows, for each field, the overall value and the value within each cluster. For example, the most common account length overall is 100-120, which means that the largest group of customers who churn have an account length of 100-120, whereas the most common account length in the first cluster listed (cluster 2) is about 40-60.

The figure below shows the distribution of account length for the first cluster. The solid bars represent the percentage of account length for all the data, and the red transparent bars represent the distribution in that cluster.
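The comparison that such a figure makes can be sketched as follows: bin the account lengths, compute the percentage of records in each bin for all the data and for one cluster, and report the most common bin in each. The bin width of 20 matches the ranges in the cluster table, but treating each bin as including its lower edge and excluding its upper edge is an assumption, the function names are hypothetical, and the values are made up.

```python
from collections import Counter


def bin_label(value, width=20):
    """Label of the bin holding value, e.g. 45 -> '40-60' (lower edge included)."""
    low = (int(value) // width) * width
    return f"{low}-{low + width}"


def distribution(values, width=20):
    """Percentage of values that fall in each bin."""
    counts = Counter(bin_label(v, width) for v in values)
    return {label: 100.0 * n / len(values) for label, n in counts.items()}


def modal_bin(values, width=20):
    """The bin that holds the largest share of the values."""
    dist = distribution(values, width)
    return max(dist, key=dist.get)


# Made-up account lengths, not taken from the 5000-record data set.
all_lengths = [45, 52, 58, 101, 105, 110, 118, 119, 130, 165]
cluster_lengths = [45, 52, 58, 101]

print("all data:", modal_bin(all_lengths), distribution(all_lengths))
print("cluster: ", modal_bin(cluster_lengths), distribution(cluster_lengths))
```

With these made-up values the most common bin is 100-120 for all the data and 40-60 for the cluster, the same kind of contrast that the cluster table reports.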

Some Snap Shots!