
Improving Net Promoter Score using machine learning

Aravind Parthasarathy

27 Apr 2021

This article was contributed by Prodapt, a member of TM Forum.

Customers today expect seamless, hassle-free interactions with their digital service provider (DSP). A dissatisfied or frustrated customer will quickly switch providers, so it’s of utmost importance for DSPs to monitor their Net Promoter Score (NPS).

However, only about 15-20% of customers respond to the NPS survey after interacting with customer support. A rule-based approach to estimating the remaining NPS scores involves tedious, complex steps and results in a system that is neither scalable nor reusable. Thus, the most innovative DSPs are trying to address this problem with a machine learning (ML) approach.

This article details how DSPs can build an ML model to predict NPS scores for the roughly 80% of the customer base who don’t respond to the survey. We also show how these predicted scores can be used to continuously improve customer service operating procedures.

Fig: Key steps in building ML model for NPS prediction

Key steps towards building an ML model for NPS prediction and improvement

1 - Exploratory data analysis

Formulate an effective model-building strategy by discovering patterns and anomalies in the 15-20% of NPS responses obtained.

Perform statistical analysis - Identify properties such as delimiters, chat timestamps, and the average length of an interaction. Do this separately for the agent and customer parts of the text to discover significant patterns and spot anomalies.

Visualize analysis results to draw insights - Survey data such as NPS is almost always imbalanced. It’s important to visualize the distribution of the obtained survey data to understand its structure and skewness.
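As a minimal sketch of this first look at the data, the snippet below buckets survey scores and measures average utterance length. It assumes the standard NPS convention (9-10 promoter, 7-8 neutral, 0-6 detractor); the function names and the sample scores and chat turns are hypothetical and only illustrate the idea.

```python
from collections import Counter
from statistics import mean


def nps_bucket(score: int) -> str:
    """Map a 0-10 survey score to an NPS bucket (standard convention, assumed here)."""
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "neutral"
    return "detractor"


def class_distribution(scores):
    """Return the share of each bucket among the survey responses."""
    counts = Counter(nps_bucket(s) for s in scores)
    total = sum(counts.values())
    return {label: counts[label] / total
            for label in ("promoter", "neutral", "detractor")}


def average_words(utterances):
    """Average number of whitespace-separated words per utterance."""
    return mean(len(u.split()) for u in utterances)


# Hypothetical survey scores and chat turns, for illustration only.
sample_scores = [10, 9, 3, 2, 0, 7, 1, 4, 8, 2]
customer_turns = ["my bill is wrong again", "I was charged twice this month"]
agent_turns = ["I apologise for the inconvenience, let me check your account details"]

distribution = class_distribution(sample_scores)
print(distribution)  # a skew towards detractors is visible at a glance
print(average_words(customer_turns), average_words(agent_turns))
```

The resulting shares can be passed to any plotting tool to visualize the skew before choosing a balancing strategy.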

2 - ML model building

Vectorize the chat transcripts and correct class imbalance before building classifier models, i.e. models that sort the data into different buckets. In this instance, the classifier will tell whether a customer is a promoter, a detractor, or neutral:

Use TFIDF to vectorize the corpus of chat transcripts - Term frequency-inverse document frequency (TFIDF) is recommended for creating word vectors because it weighs how often a word appears in a transcript against how common that word is across the whole text corpus.
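To make the weighting concrete, here is a small sketch of the textbook formulation, tf × log(N / df), written in plain Python. The function name and the three sample transcripts are hypothetical. Library implementations typically add smoothing and normalization, so their numbers will differ from this simplified version.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) using tf * log(N / df) weighting."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: number of transcripts that contain each term.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([
            (counts[term] / len(doc)) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


# Hypothetical chat snippets, for illustration only.
transcripts = [
    "my bill is wrong",
    "my internet is slow",
    "my bill is late",
]
vocab, vectors = tfidf_vectors(transcripts)

# Words present in every transcript ("my", "is") get zero weight,
# while rarer words such as "wrong" carry the most signal.
for term, weight in zip(vocab, vectors[0]):
    print(f"{term:10s} {weight:.3f}")
```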

Correct class imbalance - Data imbalances (skewed data, such as many detractors and neutrals but few promoters) are typical in NPS scenarios. Ensure that promoters, neutrals, and detractors are represented equally when fitting the model. Techniques such as under-sampling, over-sampling, or the synthetic minority over-sampling technique (SMOTE) can be used to achieve this. SMOTE increases the number of cases in the dataset in a balanced way: it generates synthetic samples from the minority class, which are then used as input to the classifier model.
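The core idea behind SMOTE can be sketched in a few lines: pick a minority sample, pick one of its nearest minority neighbours, and create a new point somewhere on the line between them. The code below is a simplified illustration of that interpolation step, not a full SMOTE implementation; the function name, the parameters `k` and `seed`, and the two-dimensional sample vectors are assumptions made for the example.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical feature vectors for the under-represented promoter class.
promoters = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.3]]
new_points = smote_sketch(promoters, n_new=4)
balanced_promoters = promoters + new_points
print(len(balanced_promoters))  # 3 original + 4 synthetic samples
```

In practice, resampling of this kind is usually applied to the training split only, so that evaluation still reflects the real class distribution.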

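Identify the best-fit model using a confusion matrix and F1 score. A confusion matrix is a table that compares predicted classes with actual classes to describe the performance of a classification model. The F1 score is the harmonic mean of precision and recall, which makes it more informative than plain accuracy on imbalanced data. Then use the selected model to predict whether each of the remaining 80% of customers will be a promoter, neutral, or detractor based on their chat utterances.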
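The two evaluation tools can be sketched as follows for the three NPS classes. The label lists are hypothetical, and averaging the per-class F1 scores with equal weight (macro averaging) is an assumption here, since the article does not state which averaging is behind its overall F1 figure.

```python
LABELS = ["detractor", "neutral", "promoter"]


def confusion_matrix(actual, predicted, labels=LABELS):
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def f1_per_class(matrix):
    """F1 = 2*TP / (2*TP + FP + FN), computed for each class."""
    scores = []
    for i in range(len(matrix)):
        tp = matrix[i][i]
        fp = sum(row[i] for row in matrix) - tp  # predicted as i, actually other
        fn = sum(matrix[i]) - tp                 # actually i, predicted as other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(matrix):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(matrix)
    return sum(scores) / len(scores)


# Hypothetical held-out survey labels versus model predictions.
actual = ["detractor", "detractor", "detractor", "neutral", "neutral", "promoter"]
predicted = ["detractor", "detractor", "neutral", "neutral", "promoter", "promoter"]

matrix = confusion_matrix(actual, predicted)
for label, row in zip(LABELS, matrix):
    print(f"{label:10s} {row}")
print(f"macro F1: {macro_f1(matrix):.3f}")
```

Comparing candidate classifiers on the same held-out survey responses with these numbers gives a like-for-like basis for choosing the best-fit model.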

Sample image of confusion matrix and key accuracy metrics

3 - Retrain customer service operating procedures

Now that we have NPS scores for the entire customer base, we can build further, more specific ML models that give more insight for retraining the customer service operating procedures.

Leverage “customer text summarizer” and “agent text readability” scores

Performing “cross-correlation analysis”, i.e. measuring the similarity between the predicted NPS score, the text summary, and the readability index, helps to identify gaps and retrain operating procedures.

Customer text summarizer - An ML-generated summary of the customer utterances in a conversation brings out the essence of long chat transcripts. A Gensim-based extractive text model is recommended for this functionality, as it builds the new summary by selecting phrases and sentences from the source data.
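To illustrate what “extractive” means, the sketch below scores each sentence by the average corpus frequency of its words and keeps the top-scoring sentences unchanged. It is a plain word-frequency stand-in, not Gensim’s algorithm (Gensim’s summarization module was removed in version 4.0); the stop-word list, function name, and sample text are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "is", "i", "my", "to", "and",
             "it", "of", "for", "was", "on", "in"}


def content_words(text):
    """Lower-cased words with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=1):
    """Keep the highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


# Hypothetical customer side of a chat, for illustration only.
customer_text = (
    "Hello there. My bill was charged twice. "
    "The second bill charge appeared on Monday. Thanks."
)
summary = summarize(customer_text)
print(summary)  # My bill was charged twice.
```

Greetings and sign-offs score low because their words are rare in the conversation, which is why the billing complaint surfaces as the summary.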

Agent text readability scoring - Working out readability scores for the agent utterances helps in retraining the operating procedures of agents/chatbots. A readability score denotes the grade level at which a person can understand the text. Readability tests such as the Coleman-Liau index can be computed using a text-statistics-based model.
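The Coleman-Liau index is defined as 0.0588 × L − 0.296 × S − 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words. A minimal sketch follows; the word and sentence splitting rules are simplifying assumptions, and the two agent replies are hypothetical.

```python
import re


def coleman_liau_index(text):
    """Approximate US grade level needed to understand the text."""
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not words:
        return 0.0
    letters = sum(ch.isalpha() for ch in text)
    # Count runs of terminal punctuation as sentence ends (at least one sentence).
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    letters_per_100 = letters / len(words) * 100
    sentences_per_100 = sentences / len(words) * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


# Hypothetical agent replies, for illustration only.
plain_reply = "We will fix it now. It is easy."
dense_reply = ("Reprovisioning necessitates comprehensive "
               "authentication verification procedures.")

print(round(coleman_liau_index(plain_reply), 2))
print(round(coleman_liau_index(dense_reply), 2))
```

A much higher score for the second reply signals agent wording that many customers may struggle to follow, which is the kind of gap the operating procedures can then address.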

Perform “topic modelling” and correlate topics with customers’ positive and negative sentiments

It is crucial to first understand the areas in which the virtual agents are performing poorly. For example, are there more issues with handling billing-related queries, service change requests, or tech support? Categorizing all such activities and performing topic modelling (a type of statistical modelling for discovering the abstract “topics” that occur in a collection of documents) can significantly help in uncovering the areas where the agents need to be retrained.

To do this, first identify the key topics discussed by customers (for each NPS level). ML techniques such as non-negative matrix factorization or latent Dirichlet allocation (LDA) can be used for this exercise. The next step is to cross-correlate the identified topics with NPS scores to find which topics are strongly correlated with customers’ positive sentiments and which with negative sentiments.
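The cross-correlation step can be sketched as follows, assuming each conversation has already been assigned a dominant topic by a topic model. The sketch encodes NPS classes as -1, 0, and 1 and computes the Pearson correlation between each topic’s presence and that value; the encoding, the function names, and the sample data are all hypothetical.

```python
# Assumed numeric encoding of the predicted NPS class.
NPS_VALUE = {"detractor": -1, "neutral": 0, "promoter": 1}


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def topic_nps_correlation(conversations):
    """Correlate each topic's presence (0/1) with the numeric NPS value."""
    topics = sorted({topic for topic, _ in conversations})
    nps = [NPS_VALUE[label] for _, label in conversations]
    return {
        topic: pearson([1 if t == topic else 0 for t, _ in conversations], nps)
        for topic in topics
    }


# Hypothetical (dominant topic, predicted NPS class) pairs.
conversations = [
    ("billing", "detractor"),
    ("billing", "detractor"),
    ("billing", "neutral"),
    ("tech support", "promoter"),
    ("tech support", "promoter"),
    ("service change", "neutral"),
]

correlations = topic_nps_correlation(conversations)
for topic, r in sorted(correlations.items(), key=lambda item: item[1]):
    print(f"{topic:15s} {r:+.2f}")
```

Topics with strongly negative values, such as billing in this made-up sample, are the first candidates for retraining agents or chatbots.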

Results achieved by a leading DSP in the US after leveraging the ML approach mentioned above

Predicted NPS scores for 100% of customer chat interactions with an overall F1 score of around 82%

Significantly improved operating procedures for chatbots and agents, thereby improving overall customer satisfaction

Contributing Authors
Prashanth Suresh Babu - Sr. Project Manager, Delivery, Prodapt
Sumit Thakur - Sr. Manager, Strategic Insights, Prodapt
