Guest Article by Mohamed Zaki and Benjamin Lucas based on research recently funded by a Marketing Science Institute (MSI), Customer Experience Initiative grant for the project titled: “CX Analytics: A Data-Driven Measurement System for Customer Experience and Emotional Complexity”.

Customer Experience (CX) and Services

Unsatisfactory customer experiences cost US enterprises around $83 billion each year through abandoned purchases and defections (Forbes, 2013). As such, managing the end-to-end customer experience and customers' emotional needs has become a top managerial priority (Zorfas and Leemon, 2016). In basic terms, the more managers understand about the experiences customers have with their product and service offerings, the better they can measure those experiences in the future and shape them positively. For example, a manager of a service organization may identify that a customer had a positive overall experience, which included a positive social component (e.g. positive interactions with staff, other customers, or their own family and friends during a service experience), but also that the customer engaged in needlessly complex cognition in response to a negatively evaluated attribute during the decision process (e.g. because of an overly complicated pricing system, menu or display of offerings). With this insight, the manager would know which specific aspect of the service experience to leverage further (social) and which needs improvement (facilitating more fluid customer cognition during the purchase decision process). This of course applies to both online and offline services.

Data-driven Text Mining in Marketing and Service Research

Because accounts of customer experiences given by the customers themselves (e.g. online via review platforms and social media) are deeper and more multifaceted than what can be captured by measuring volume and valence alone (Archak, Ghose and Ipeirotis, 2011), marketing researchers are beginning to embrace techniques employing data-driven learning to capture richer insights. For example, Tirunillai and Tellis (2014) used Latent Dirichlet Allocation (LDA) to measure latent satisfaction and quality dimensions. The LDA procedure assumes a theory of meaning deriving from co-occurrence patterns within and across documents (within a corpus) (Farrell, 2016). In basic terms, this allows researchers to identify “linguistic regularities” associated with certain objects, activities and concepts (Roy et al., 2015). This focus on ‘automatic’ data-driven text mining is similar to the approach employed in earlier research by Lee and Bradlow (2011), who automated product attribute identification for ranking and comparing brands.
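To make the co-occurrence idea concrete, the following is a minimal sketch that counts how often pairs of terms appear in the same document. It is not LDA itself, only the kind of raw co-occurrence signal that topic models build on. The sample reviews, the stop word list and the function name `cooccurrence_counts` are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical customer reviews, invented for illustration only.
reviews = [
    "friendly staff and quick shuttle transfer",
    "shuttle transfer was late and staff were rude",
    "parking price was confusing",
    "confusing price list at the parking desk",
]

# A deliberately tiny stop word list (an assumption, not a standard resource).
STOP_WORDS = {"and", "was", "were", "at", "the"}


def cooccurrence_counts(documents):
    """Count in how many documents each unordered pair of terms co-occurs."""
    counts = Counter()
    for doc in documents:
        # Use a set so a term repeated within one document is counted once.
        terms = sorted({t for t in doc.lower().split() if t not in STOP_WORDS})
        # Sorting makes every pair an alphabetically ordered tuple.
        counts.update(combinations(terms, 2))
    return counts


pair_counts = cooccurrence_counts(reviews)
for pair, count in pair_counts.most_common(5):
    print(pair, count)
```

In this toy corpus, terms about the shuttle tend to appear together, as do terms about parking and pricing. Regularities of this kind are what a topic model would group into themes.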

On the subject of customer experience in services, Villarroel Ordenes et al. (2014) were among the first to use text analytics on customer feedback to capture key aspects of the CX. The authors employed a linguistics-based text mining approach to automate the analysis of sentiment conveyed in customer feedback relating to car park and transfer services (see also Francisco's guest article on current text mining tools).

How Should Text Mining Approaches Evolve for CX Analytics?

One of the challenges in selecting a text mining approach is deciding how the documents under investigation will be represented (i.e. how will a computer ‘read’ them?). Individual terms are commonly represented within document vectors as input for analyses such as classification, whereas some researchers opt to score terms using dictionary-based scoring. Both approaches (not using dictionaries in the former, using dictionaries in the latter) are often implemented in such a way that terms are treated in isolation from one another. Whilst it is possible to produce useful results with such approaches, deeper analysis requires recognition of linguistic patterns, the most basic example being n-gram designations, where contiguous sequences of letters or words are treated as individual features.
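As a rough illustration of the difference between treating terms in isolation and using n-grams, the sketch below builds a simple count-based document vector from word unigrams and bigrams. The function names `word_ngrams` and `document_vector` and the example sentence are assumptions made for this illustration, not part of any particular toolkit.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the contiguous word n-grams of a text as tuples."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def document_vector(text, max_n=2):
    """Count all word n-grams from length 1 up to max_n as document features."""
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(" ".join(gram) for gram in word_ngrams(text, n))
    return features


vector = document_vector("the pricing was not good")
print(vector["good"])      # the unigram alone looks positive
print(vector["not good"])  # the bigram keeps the negation attached
```

With unigrams only, the feature "good" is indistinguishable from its use in a positive review. Adding the bigram "not good" as its own feature preserves part of the pattern that isolated terms lose.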

Furthermore, most of the current dictionary-based models use external public resources, such as WordNet, the largest online database of English terms, which then requires additional resources devoted to arranging terms into useful representations of constructs under investigation.

Another shortcoming of dictionary-based approaches is that significant semantic information is lost. A clear example is where customers use ambiguous language, such as “my ears were burning” to denote something very unpleasant to hear. A basic sentiment scoring program that refers to a sentiment dictionary may not recognize this phrase as conveying what is most likely negative sentiment.
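The limitation can be seen in a minimal sketch of a dictionary-based scorer. The dictionary below is a tiny, hypothetical one written for this example (real sentiment lexicons are far larger), and `dictionary_score` is an illustrative name rather than an existing library function.

```python
import re

# Tiny hypothetical sentiment dictionary: term -> polarity.
SENTIMENT_DICTIONARY = {
    "great": 1,
    "best": 1,
    "love": 1,
    "rude": -1,
    "terrible": -1,
    "unpleasant": -1,
}


def dictionary_score(text):
    """Sum the polarity of each term, treating every term in isolation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(SENTIMENT_DICTIONARY.get(token, 0) for token in tokens)


print(dictionary_score("The staff were rude"))   # a dictionary term is found
print(dictionary_score("My ears were burning"))  # no dictionary term is found
```

Because none of the individual words in the idiom appears in the dictionary, the second sentence receives a neutral score even though a human reader would most likely interpret it as negative.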

Summarized comparison between a simple dictionary-based text mining procedure (top), and our latest procedure (bottom). The main differences are more sophisticated document representation, analysis based on combined document clustering and topic modelling, and a focus on building dashboard tools rather than just producing ad-hoc reports.

As another example, positive emotions can sometimes be detected from what can safely be assumed to be a literal statement (e.g. “I had the best time ever at Disney Land!”), but at other times this is not so clear. Take, for example, the use of “I love how…” to start both very positive sentences and very negative, sarcastic ones. Additionally, certain patterns that most likely convey positive emotions contain no positive sentiment words at all, for example the phrase “I promise you won’t be disappointed!”. Once again, pattern-based detection is needed.
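A very simple form of pattern-based detection can be sketched with regular expressions. The patterns and polarities below are hand-written assumptions covering only the phrases mentioned in this article; this is an illustration of the idea, not the procedure used in our research.

```python
import re

# Hypothetical hand-written patterns: (compiled pattern, polarity).
PATTERNS = [
    (re.compile(r"\b(?:won't|will not) be disappointed\b"), 1),
    (re.compile(r"\bears were burning\b"), -1),
]


def pattern_score(text):
    """Sum the polarity of every multi-word pattern found in the text."""
    # Normalize case and typographic apostrophes before matching.
    normalized = text.lower().replace("’", "'")
    return sum(polarity for pattern, polarity in PATTERNS
               if pattern.search(normalized))


print(pattern_score("I promise you won’t be disappointed!"))
print(pattern_score("My ears were burning"))
```

Fixed patterns like these capture idioms and negated phrases that contain no sentiment words on their own. They would not resolve sarcastic uses of “I love how…”, which depend on what follows the phrase and generally call for richer, context-aware representations.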

What’s Next?

Drawing together the above discussion, future customer experience text mining research will focus on approaches centered around topic modeling, based on sophisticated representations of documents to extract relevant themes and their constituent communicative patterns. Current CX measurement systems cannot be readily adapted to provide a prescription for firms to understand and act upon the underlying causal factors of certain aspects of customer experience. This is vital, since it defines managerial responses to CX. Using text mining and pattern-based representations will facilitate the development of more sophisticated managerial dashboards and social media analytics tools, and allow the extraction of rich insights beyond simple sentiment scores and trending keywords.

As we are currently in the era of digital disruption, organizations should start better capitalizing on the verbatim data extracted from customer feedback systems to measure CX accurately and in a useful manner.

Mohamed Zaki

Research Associate
University of Cambridge (Cambridge Service Alliance, Institute for Manufacturing)


Benjamin Lucas

Assistant Professor
Maastricht University (School of Business and Economics, Department of Marketing and Supply Chain Management, and Business Intelligence and Smart Services Institute)




Archak, N., Ghose, A., & Ipeirotis, P. G. (2011). Deriving the pricing power of product features by mining consumer reviews. Management Science, 57(8), 1485-1509.

Farrell, J. (2016). Corporate funding and ideological polarization about climate change. Proceedings of the National Academy of Sciences, 113(1), 92-97.

Forbes (2013). Four Ways To Improve The Buyer Experience Starting Today. Retrieved from:

Lee, T. Y., & Bradlow, E. T. (2011). Automated marketing research using online customer reviews. Journal of Marketing Research, 48(5), 881-894.

Roy, B. C., Frank, M. C., DeCamp, P., Miller, M., & Roy, D. (2015). Predicting the birth of a spoken word. Proceedings of the National Academy of Sciences, 112(41), 12663-12668.

Tirunillai, S., & Tellis, G. J. (2014). Mining marketing meaning from online chatter: Strategic brand analysis of big data using latent dirichlet allocation. Journal of Marketing Research, 51(4), 463-479.

Villarroel Ordenes, F., Theodoulidis, B., Burton, J., Gruber, T., & Zaki, M. (2014). Analyzing customer experience feedback using text mining: A linguistics-based approach. Journal of Service Research, 17(3), 278-295.

Zorfas, A. and Leemon, D. (2016). An Emotional Connection Matters More than Customer Satisfaction. Harvard Business Review Online. Retrieved from: