As consumers, we all “channel surf” at will. We sort through the best available promotions on one channel, infer the popularity and consumability of a product from crowd reviews and ratings on one or more other channels, compare the price and cost of ownership on another, perhaps check availability on yet another, and may eventually buy the product on a different channel altogether.
Depending on who the individual is, which age demographic he/she belongs to, and many other factors, the channels used for these purposes can be any or all of physical store, web, mobile, social media, customer phone service (or call center), catalog, etc. While the high-touch channels among these are typically used during the awareness, consideration, and preference evaluation stages of the purchasing process, the consumer might then buy the product through another, cheaper channel.
Enterprises traditionally designed channels based on market segmentation studies, whether it was about delivering the right products to the right segments, or creating awareness of products and influencing the appropriate segment, or for any other purpose among the various stages of the typical purchasing process (awareness, consideration, preference, purchase, and post-sale service). A key underlying assumption for such an approach was that consumers with similar demographic characteristics tend to shop and buy in the same way, through the same few channels (for a deeper analysis on the subject, read the beautiful article: The Customer Has Escaped by Paul F. Nunes and Frank V. Cespedes, Harvard Business Review, November 2003).
However, as companies started to understand more about how consumers shop and not just buy (the “channel surfing” behavior) and to realize the resultant losses (or avoidable costs) from stranded assets that support underutilized channels, they have been striving to design channels that cater to consumer behavior rather than trying to hold consumers captive. Instead of forcing consumers into predetermined channels, the strategy is to let the consumer navigate across channels seamlessly as it suits him/her, and sometimes even to go the extra mile by directing the consumer to alternate channels for greater user empowerment. For example, companies have become more open to directing consumers to a third-party source to compare their product with those of competitors (though this is not a new concept; it has been shown beautifully and funnily in the 1947 movie Miracle on 34th Street)!
In this context, I have been asking myself whether we, the Data Science and Analytics professionals, have been providing effective support to organizations in the execution of this strategy.
For instance, Personalization is one of the key initiatives at most companies for leveraging data for better consumer engagement. No doubt, much advancement has been achieved in personalization, be it in product/service/promotion recommendations for creating awareness, in a customized configuration for purchase, or in the delivery of a post-purchase service. However, are we leveraging the customer-channel bonding in the most appropriate way to deliver the best value to the customer, such that the conversion also generates higher value for the organization?
If we look at the most common approach to recommender systems for personalization, it involves a user-item matrix (an item can be a product, service, movie, etc.), where each element in the matrix is an affinity score between user i and item j. The affinity score can simply be 1 or empty (watched/not-watched, purchased/not-purchased, liked/not-liked, etc.) or it can be a numeric score (or empty) computed as a weighted average of multiple actions (liked, watched/browsed, disliked, shared on social media, added-to-cart, purchased, reviewed, recommended to others, etc.). This score is usually computed from information aggregated across various channels, as these actions do not necessarily happen on the same channel. Such aggregated scoring is bound to lose valuable information on how the consumer interacted with the product on each individual channel; this “lost in translation” effect results in missed opportunities for enterprises to better understand the customer-channel bonding, and leads to recommending the same information and services/products/promotions to a user on all or multiple channels.
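To make the “lost in translation” point concrete, here is a minimal sketch in Python. The action weights, the event log, and the function name `affinity` are all hypothetical and chosen only for illustration, and a simple average of per-action weights is just one possible reading of the weighted score described above.

```python
from collections import defaultdict

# Hypothetical action weights; the article does not prescribe any values.
ACTION_WEIGHTS = {
    "browsed": 1.0,
    "added_to_cart": 3.0,
    "purchased": 5.0,
    "disliked": -2.0,
}

# Hypothetical interaction log: (user, item, channel, action).
EVENTS = [
    ("u1", "tv", "store", "browsed"),
    ("u1", "tv", "web", "added_to_cart"),
    ("u1", "tv", "web", "purchased"),
    ("u1", "tv", "mobile", "disliked"),
]


def affinity(events, key):
    """Average the action weights of all events that map to the same key."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user, item, channel, action in events:
        k = key(user, item, channel)
        totals[k] += ACTION_WEIGHTS[action]
        counts[k] += 1
    return {k: totals[k] / counts[k] for k in totals}


# Two-dimensional view: the channel is dropped before scoring.
aggregated = affinity(EVENTS, lambda u, i, c: (u, i))

# Three-dimensional view: the channel stays part of the key.
per_channel = affinity(EVENTS, lambda u, i, c: (u, i, c))

print(aggregated)   # one blended score for (u1, tv)
print(per_channel)  # separate scores for store, web, and mobile
```

In this toy log the blended score for the user and the TV is mildly positive, while the per-channel view shows a strong positive signal on the web and a negative one on mobile, which is exactly the distinction the aggregated score hides.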
In other words, with the two-dimensional approach to user-item personalization, user-channel bonding is ignored (or rather not extracted properly), and the opportunity for a deeper three-dimensional user-item-channel personalization is given up.
How do we address this gap? One quick win (not necessarily the most effective solution) is to construct a three-dimensional matrix of user-item-channel affinity scores. The user-item matrix itself is largely sparse (even after aggregating information from across channels), and the three-dimensional matrix will be even sparser, because it keeps the user-item affinity separate for each channel. So yes, it is a challenge to extract meaningful user-item insights from a sparser matrix, but that should not deter the analysts. One way to overcome this is to include more information for each channel, the information that would have been ignored during aggregation. For example, how much time has the customer spent browsing the product/service on this channel? What time of day does the user interact with the channel (the user may use mobile and web at different times)? Did the customer search for this product directly on this channel, or was he/she led to it in hops? There is no limit to the level of information that can be gleaned from the user-channel interaction.
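One way to picture such a three-dimensional structure, sketched here in Python under stated assumptions: only observed (user, item, channel) cells are stored, and each cell carries the extra channel-level signals mentioned above. The class `ChannelSignal`, its fields, and the sample values are hypothetical, not a prescribed schema.

```python
from dataclasses import dataclass


@dataclass
class ChannelSignal:
    """Hypothetical per-channel record for one user-item pair."""
    affinity: float       # affinity score on this channel
    dwell_seconds: float  # time spent browsing the item on this channel
    hour_of_day: int      # typical hour of interaction (0-23)
    direct_search: bool   # True if searched directly, False if reached in hops


# Sparse storage: only observed (user, item, channel) cells are kept.
tensor = {
    ("u1", "tv", "web"): ChannelSignal(4.0, 310.0, 21, True),
    ("u1", "tv", "mobile"): ChannelSignal(-2.0, 15.0, 8, False),
    ("u2", "phone", "store"): ChannelSignal(1.0, 600.0, 14, False),
}


def density(cells, n_users, n_items, n_channels=1):
    """Fraction of all possible cells that are actually observed."""
    return len(cells) / (n_users * n_items * n_channels)


users = {u for u, _, _ in tensor}
items = {i for _, i, _ in tensor}
channels = {c for _, _, c in tensor}

# Collapsing the channel axis gives the usual two-dimensional matrix.
matrix_cells = {(u, i) for u, i, _ in tensor}
density_2d = density(matrix_cells, len(users), len(items))
density_3d = density(tensor, len(users), len(items), len(channels))


def channel_slice(tensor, channel):
    """Return the user-item cells observed on a single channel."""
    return {(u, i): s for (u, i, c), s in tensor.items() if c == channel}


print(density_2d, density_3d)
print(channel_slice(tensor, "web"))
```

Even in this tiny example the three-dimensional structure is sparser than its two-dimensional projection, while each surviving cell is richer because it keeps the channel-specific signals.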
Armed with the three-dimensional matrix, we can compute the recommended items/information for a given user separately for each channel (the popular Alternating Least Squares (ALS) matrix factorization algorithm can be handy for this purpose); that is only the first step. The user-item affinities compiled this way can then be cross-referenced across channels, and a set of rules can guide the recommender toward the best action to take in the context.
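The two steps above (factorize each channel separately, then cross-reference with a rule) can be sketched roughly as follows. This is a toy, pure-Python, rank-1 version of alternating least squares with a plain L2 penalty, written only to show the mechanics; it is not the Spark MLlib implementation. The data, the function names, and the rule "push each item only on the channel where it scores highest" are all hypothetical.

```python
def als_rank1(ratings, n_iters=20, reg=0.1):
    """Toy rank-1 ALS on observed cells only. ratings: {(user, item): score}."""
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    u_f = {u: 1.0 for u in users}
    v_f = {i: 1.0 for i in items}
    for _ in range(n_iters):
        # Hold item factors fixed; solve each user's regularized least squares.
        for u in users:
            num = sum(r * v_f[i] for (uu, i), r in ratings.items() if uu == u)
            den = reg + sum(v_f[i] ** 2 for (uu, i) in ratings if uu == u)
            u_f[u] = num / den
        # Hold user factors fixed; solve each item's regularized least squares.
        for i in items:
            num = sum(r * u_f[u] for (u, ii), r in ratings.items() if ii == i)
            den = reg + sum(u_f[u] ** 2 for (u, ii) in ratings if ii == i)
            v_f[i] = num / den
    return u_f, v_f


def predict_unseen(ratings, user):
    """Predicted scores for items the user has not interacted with."""
    u_f, v_f = als_rank1(ratings)
    if user not in u_f:
        return {}
    seen = {i for (u, i) in ratings if u == user}
    return {i: u_f[user] * v_f[i] for i in v_f if i not in seen}


# Hypothetical per-channel user-item affinity scores.
BY_CHANNEL = {
    "web": {
        ("u1", "tv"): 5.0, ("u2", "tv"): 4.0, ("u2", "soundbar"): 5.0,
        ("u3", "tv"): 5.0, ("u3", "soundbar"): 4.0,
    },
    "mobile": {
        ("u1", "case"): 4.0, ("u2", "case"): 5.0,
        ("u2", "soundbar"): 1.0, ("u3", "soundbar"): 1.0,
    },
}

# Step 1: one factorization per channel.
per_channel = {ch: predict_unseen(r, "u1") for ch, r in BY_CHANNEL.items()}


def best_channel_per_item(per_channel):
    """Illustrative rule: keep each item only on its highest-scoring channel."""
    best = {}
    for channel, scores in per_channel.items():
        for item, score in scores.items():
            if item not in best or score > best[item][1]:
                best[item] = (channel, score)
    return best


# Step 2: cross-reference the channels with the rule.
plan = best_channel_per_item(per_channel)
print(per_channel)
print(plan)
```

In this made-up data the soundbar is a candidate for user u1 on both channels, but the users who resemble u1 rate it highly on the web and poorly on mobile, so the rule routes the recommendation to the web. A real rule set would of course weigh more than a single predicted score.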
For coding enthusiasts: the ALS matrix factorization algorithm is part of Spark MLlib, Spark's machine learning library. The advantage of using Spark for this problem (in contrast to Apache Mahout, R, or other tools) is that Spark operations are performed on RDDs (Resilient Distributed Datasets: fault-tolerant, partitioned collections that can be cached in memory and are often loaded from large volumes of data stored in HDFS), and the records of an RDD can be arbitrary objects such as tuples. While a 2D data frame of a matrix as consumed by R or other tools is a single object, Data Scientists can explore whether an RDD can hold the ratings of several user-item matrices at once, for example by tagging each rating record with its channel. If such a possibility is established, the RDD can hold multiple user-item matrices, and the channel slices of the 3-D user-item-channel matrix can each be handed to the Spark MLlib algorithm, which itself factorizes a two-dimensional user-item matrix.