In most IoT solutions, device data is collected continuously as streams and persisted into a data lake (or a data store, for simplicity of discussion); what happens next? Usually, a data scientist picks up that data, analyzes it for patterns, discovers insights for action, and builds models to predict the future behavior of the devices.

What happens if the data is not rich enough for analysis? It leads to sparse patterns, less meaningful insights, and weak predictions. That is exactly the problem data scientists in the IoT space often experience. To address the sparsity challenge, data scientists have to collaborate closely with domain experts and/or learn the domain themselves. While the former adds considerable cycles of time and human effort across disparate organizational functions, the latter can cost data science experts valuable time and add frustration (besides, domain knowledge is something that field experts earn over many years).

So, is there a better way to solve the problem?

Yes, and that is by deploying real-time streaming analytics that can:

  • Contextualize the stream data with the device’s or system’s metadata
  • Correlate with ecosystem data, for example second-party (2P) and third-party (3P) data
  • Annotate the data with thresholds and other metadata
  • Label the raw data with outcome metrics based on domain expertise
  • Compute derived metrics from raw metrics, leading to a richer set of features

And, all this is possible in real-time during processing and analytics of data streams (either at the Edge or on the Cloud). That is, allow the domain expert to interact with live data in motion via simple-to-use interactive technologies (complicated interfaces and complex programmer environments limit the domain user’s intent and interactions).

The domain expertise based enrichment of live data in motion adds enormous value, and when persisted into storage, it can further be used by data scientists for greater impact.
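
To make the enrichment steps above a little more concrete, here is a minimal sketch of a stream enricher in plain Python. Everything specific in it is an assumption made for illustration: the field names (device_id, temp_c), the DEVICE_META lookup table, the max_temp_c threshold, and the "overheat" labeling rule are hypothetical stand-ins for what a domain expert would actually supply.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical device metadata, e.g. loaded from an asset registry.
DEVICE_META = {
    "ahu-1": {"site": "plant-a", "model": "X100", "max_temp_c": 80.0},
}


def enrich(events: Iterable[dict], meta: Dict[str, dict]) -> Iterator[dict]:
    """Yield each raw event with context, a threshold, a derived metric, and a label."""
    prev_temp: Dict[str, float] = {}
    for ev in events:
        device = ev["device_id"]
        m = meta.get(device, {})
        out = dict(ev)
        # Contextualize: attach device/system metadata.
        out["site"] = m.get("site")
        out["model"] = m.get("model")
        # Annotate: carry the applicable threshold along with the reading.
        limit = m.get("max_temp_c")
        out["max_temp_c"] = limit
        # Derive: change since the previous reading of the same device.
        last = prev_temp.get(device)
        out["temp_delta_c"] = None if last is None else ev["temp_c"] - last
        prev_temp[device] = ev["temp_c"]
        # Label: a (hypothetical) domain rule turned into an outcome label.
        over = limit is not None and ev["temp_c"] > limit
        out["label"] = "overheat" if over else "normal"
        yield out


raw = [
    {"device_id": "ahu-1", "ts": 0, "temp_c": 72.5},
    {"device_id": "ahu-1", "ts": 60, "temp_c": 81.0},
]
enriched = list(enrich(raw, DEVICE_META))
for row in enriched:
    print(row)
```

Because the enricher is a generator, it processes one event at a time, which is the shape the logic would take whether it runs at the Edge or on the Cloud; the enriched rows are what would then be persisted for the data scientists.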


We talked about IoT being the connection of the physical and the digital worlds. That is, connecting those things that were physical in nature hitherto and now find a need to be connected to the digital world. This phenomenon about things/objects/entities is also influencing the enterprises in how they are transforming and shaping themselves to survive and thrive in the fast evolving world.

The enterprises across consumer, commercial, public, and industrial sectors that were born in the pre-Internet era (Honeywell, ABB, GE, Philips, Siemens, and so on) are making moves to position themselves as digitally transformed companies. More subtle are the moves being made by the Internet era companies (Google, Amazon, etc.) to integrate themselves with the physical world. Just as the enterprises from physical world have come to realize that they cannot compete unless digital technologies are leveraged to deliver value added products and services on top of physical assets (sometimes even in a freemium model with large physical assets being given free), the digital companies too realize that they cannot continue to enrich the end consumer’s life only through pure software products/services. They recognize the need to blur the line by playing in the field of physical objects that humans touch every day (for instance, besides the much publicized Nest and self-driving car, it can be interesting for you to observe how many of the Alphabet companies are associated with the physical world). Even more interesting are the approaches of enterprises that are born to build physical assets in the digital era (Tesla, for example).

In tandem with the above phenomenon, there are a number of plays across the IoT market in which numerous enterprises see an opportunity to position themselves. (First, a disclaimer: the list below is by no means a comprehensive list of areas or players, nor is it an endorsement of any enterprise; large dedicated teams of expert market analysts exist across organizations that spend their full time analyzing these areas and enterprises to enable M&As, partnerships, or commercial relationships.)

Semiconductor Chips: intelligence has to be added to the physical assets so that they can perform one or more of the sense, connect, store, transmit, and compute functions. Such functionality can be added with semiconductor hardware, and so it has become imperative for chip manufacturers (Intel, Nvidia, ARM, Qualcomm, Broadcom, etc.) to partner with various OEMs of the install base. Another angle for them is the increase in complex compute needs in data centers and on clouds as billions more devices pump data.

Install base of Physical Things: enterprises across consumer, commercial, public, and industrial sectors with their traditional install base are already in the game, and are now looking to wrest control of the majority pie with deeper integration across the value chain (for example by building the platforms and the value added software services).

Connectivity: Network service providers (AT&T, Verizon, Sprint, etc.); the larger the install base, the better for them, as it brings more subscriptions, and so they prefer as many physical assets as possible to be connected natively to the Internet rather than to a local Edge device. Companies like Cisco (with its Jasper offering) also have partnerships with network service providers in the cellular connectivity play.

Platform: the most misrepresented piece in the IoT space. An Edge device on premise (home, building, plant, etc.), with the ability to receive data from sensors/devices, to store and process the data, and then to transmit command and control signals to those end points, is often positioned as a platform, sometimes referred to as an Edge Platform (Cisco, Dell, etc. play in this space). Even a smart thermostat is occasionally referred to as an IoT platform. And then there are large-scale PaaS platforms on the cloud that enable the end-to-end flow (Connectivity to Commercialization, with all things data in between). GE Predix, Honeywell Sentience, etc. fall under this category (more on the evolution of such cloud platforms in later months). So, your definition of an IoT platform can vary depending on where you are standing.

Cloud: service providers (Azure, AWS, Google Cloud, etc.) offer the infrastructure (IaaS), and they also often offer some out-of-box IoT specific services (PaaS) along with IaaS. To enable IoT platforms on the cloud, there is also a range of technology and commercial plays (Docker Swarm, Kubernetes, Pivotal Cloud Foundry, etc.) to make the platforms compatible or portable to multiple cloud service providers in order to serve expanded markets and geographies.

Security: cuts across all aspects of IoT, starting from securing the local devices to protecting the data and services on the cloud, while also ensuring access only to authorized people and systems. The security play has different implications, from both technology and commercial perspectives, for consumer IoT vs. industrial IoT solutions, with the latter having far more stringent cybersecurity needs. Numerous enterprises are making their mark in identity management, device management/authentication, data encryption, secured data access, etc., both on the Edge and on the cloud.

Data: its lifecycle management including ingestion, engineering, exploration, quality/integrity assurance & management, governance & compliance, storage, access at scale, and so on, within itself offers an area of unlimited opportunities.

Analytics: plethora of technologies for data discovery, machine learning, deep learning, AI, NLP, NLU, NLG, Multimedia (audio, video, image) analytics, real-time streaming analytics at industrial scale, augmented reality, virtual reality, self-service, visualization, and so on play a critical role in IoT. Numerous startups are emerging in these areas.

Beyond all these, a wide range of technologies that are significantly impacting the IoT include Blockchain, Drones, Robotics, and Biometrics. The list keeps increasing.

And the many commercial players that influence and shape the IoT world include System Integrators (SIs), Value-added Resellers (VARs), Commercial Service Providers, Data Aggregators (2nd party and 3rd party data providers), etc. They also include the enterprises that focus on building and delivering pointed IoT solutions embedding processes, methodologies, and practices for a specific domain (buildings, factories, electric grids, agriculture, infrastructure, transportation, oil & gas, etc.).

Global IoT spending is expected to reach a total of nearly $1.4 trillion by 2021, per the IDC spending guide, as enterprises continue to invest in the hardware, software, services, and connectivity that enable the IoT.

If so, what areas and which players do you think would derive or capture the most value?

In conversations with family and friends, I have come to realize the benefit for broader audience in explaining some foundational concepts on “Internet of Things” (IoT), before delving any deeper. So, this appetizer is for the IoT starters. IoT Ninjas: while this one may not appeal to your taste, upcoming servings will do. Stay tuned.

Internet of Things, as you may have heard, is about connecting “things” to a central entity to gather data and extract insights for actions. There are several concepts, some simple and some complex, hidden in that definition.

First, the connection of things in IoT implies the connecting of two worlds, physical and digital, that have been traditionally disconnected. Ha…sounds like a cliché again. Right? Well, in simple words, it means connecting tangible physical things that were hitherto not instrumented to generate data and/or to receive feedback insights. That is, the “things” in IoT are primarily not the computers, laptops, tablets, smart phones, or other items that were born digital, but rather are items that were first born physical and now have use cases to turn them into digital: daily items we see at home such as thermostats, A/C machines, washers, dryers, toasters and other kitchenware, watches and other wearables, water fixtures, toilets, etc. The list is endless (of course, it is a separate discussion as to the value of connecting a physical entity to the digital world and the willingness of consumer to pay for insights and recommended actions). More than the consumer world, the real impact with IoT is already being realized in the industrial and commercial worlds. Most of the physical things in the manufacturing, mining, oil & gas, and other heavy industries are mechanical or electrical components or a combination of both; same with physical things in city and transportation infrastructure, utilities, and other public/commercial sectors. Rarely were these items invented and instrumented for data. Now being made possible with emerging technologies, harnessing data from these things will elevate one or more of safety, reliability, security, productivity, efficiency, and optimization for an individual and/or enterprise and/or the society as a whole. More on that later.

Now, what makes an IoT solution complete? Connect things, Collect data, Compute analytics, Consume insights and/or Command systems, Control things, and Commercialize the solution. The first mile (Connect) is paramount for IoT and the last mile (Control) augments IoT. Without these two, it becomes a traditional data & analytics play. On the other hand, without the data & analytics pieces, connecting things does not hold much value. That is, for IoT, data is the blood, connection and collection are the heart, and analytics is the brain. All pieces have to come together to make the IoT solution meaningful.

So, what does connecting things to a central entity mean? For simplicity’s sake, there are two primary variations:

  1. Connect multiple sensor devices deployed at a facility to a central computing node (referred to as Edge in IoT parlance) that is still local to the facility. For example, an instrumented light bulb most likely will not have computing infrastructure and so will transmit data to an Edge node in the building. The Edge can collect data from all such light bulbs, compute analytics for insights, and send commands back to light bulb (or its controller).
  2. Connect more complex things to a much larger central computing entity such as a cloud platform or a data center. In this case, the “complex thing” can be a combination of sensor, edge, and gateway software, all three embedded into one device such as smart thermostat. Or it can be a combination of edge and gateway software in one device, while several sensors are deployed separately across a large facility. In that case, on one side the edge talks to the multiple sensors that do not have native connectivity to the cloud directly (as in above # 1), and the gateway talks to the cloud on the other side.
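
As a rough sketch of the first variation (and of the edge-plus-gateway split in the second), the toy class below plays the role of an Edge node for instrumented light bulbs: it collects readings, computes a simple per-bulb average, decides on a command to send back, and prepares a compact summary that a gateway could forward to the cloud. The class name, the wattage readings, the 9-watt limit, and the "dim"/"hold" commands are all hypothetical choices for illustration, not part of any real product.

```python
from collections import defaultdict
from statistics import mean


class EdgeNode:
    """Toy Edge node: collects bulb readings, computes analytics, issues commands."""

    def __init__(self, max_avg_watts: float):
        self.max_avg_watts = max_avg_watts   # hypothetical policy limit
        self.readings = defaultdict(list)    # bulb_id -> list of watt readings

    def collect(self, bulb_id: str, watts: float) -> None:
        # Collect: the bulb has no compute of its own, it only transmits data.
        self.readings[bulb_id].append(watts)

    def compute_commands(self) -> dict:
        # Compute + Command: decide locally, without a round trip to the cloud.
        return {
            bulb_id: "dim" if mean(values) > self.max_avg_watts else "hold"
            for bulb_id, values in self.readings.items()
        }

    def summary_for_cloud(self) -> dict:
        # Gateway side: forward a compact summary instead of every raw reading.
        all_values = [v for values in self.readings.values() for v in values]
        return {"bulbs": len(self.readings), "avg_watts": mean(all_values)}


edge = EdgeNode(max_avg_watts=9.0)
for bulb_id, watts in [("bulb-1", 8.0), ("bulb-1", 8.5), ("bulb-2", 10.0), ("bulb-2", 11.0)]:
    edge.collect(bulb_id, watts)

commands = edge.compute_commands()
summary = edge.summary_for_cloud()
print(commands)
print(summary)
```

The point of the split is that the command decision stays local to the facility, while only the aggregated summary needs to travel to the larger central computing entity.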

Another distinction to note in IoT is the approach in computing analytics for insights. It is an alloy of operations technology (OT) computing paradigm supported by the information technology (IT) systems. IoT is primarily about extracting insights from thing data and acting on those insights before the value diminishes rapidly in time. So, primary focus of IoT is (near) real-time operations in manufacturing plants, buildings, utilities, physical infrastructure, etc. The analytics requires the compute footprint and configuration to handle in runtime (hundreds of) millions of data events per second generated by millions of connected things.

Stay tuned for more discussion about the IoT Connect and Compute Platforms, and their on-going evolution for greater business value.

A lot of knowledge has been shared on this topic on various forums. And, a significant body of literature has been created by world class experts and researchers on leadership and product management. And yet, I believe there is always something new to share, especially insights gained from my own struggles and failures in product management (summarized through recent reflections as I transition into a product management leadership role at a much larger global company).

And so, here we go… I have come to realize the benefit of remembering and practicing the following key principles to succeed as a product manager, aka product owner:

There is no such thing as a perfect product: First and foremost to remember is the hard truth that, no matter how well you define the functionality, build the product with a great UI, and deliver a great user experience, you will always fall short of 100% of customer expectations. No matter how well you do, you will receive some negative feedback. First, customers’ product assessment can be influenced by various biases, including the Negativity Bias. Second, a single product can never solve a hundred use cases from a hundred different customers. So, the key is to take the feedback in stride, evaluate the requirements objectively, prioritize ruthlessly, and keep iterating to continuously enhance the product.

You need not and will not have all the answers by yourself: The one misconception I used to primarily suffer from in the early stages of my PM career was the need to go find all the answers by myself. And, that created a lot of self-induced pressure. The truth is, there is a reason the stakeholder ecosystem around you will have subject matter experts for each piece of the puzzle, called the product, that you eventually have to put together. It is not your job to always find the answer by yourself, but rather it is your job to gather the relevant experts into the room, huddle with them, foster-mediate-guide the debate among them to a conclusion. Be it the best technology component to use, or best UI design, or best pricing/subscription model, or best marketing/messaging for the product, always leverage the relevant experts.

Situational leadership: You will be tempted to drive all the meetings all the time. You want to drive every discussion on the technology architecture; you want to be front driving the UI discussions; and you want to be in control of everything. Yes, the PM owns the product, and in some organizations the PM is expected to drive and control everything related to the product. I disagree. My dictum is simple: “the more you try to control, the more you lose control”. You have to understand well your own as well as others’ core competencies, and realize how to leverage those differently depending on the context. So, as a PM you should sometimes let the other expert drive, while you can be the front seat passenger navigating the route. And, when you are in that front seat, relax somewhat, focus more on the navigation and speed and less on whether the driver is using one foot or two feet to control the brake and the accelerator!

If you have read this, then you must be a PM or an aspiring PM candidate. Share your experiences and arguments!

More often than not, the analytics users and developers focus on answering “what” questions: “what happened?” or “what is happening?”, depending on the nature of data being handled, be it historical or real-time streaming. Regardless of the temporal aspect, the “what” analysis provides only a small part of a bigger story. Don’t get me wrong; the “what” analysis is a much required foundation for further deeper analytics. But as a business analytics professional, if you provide insights to your key stakeholders (executives, operations managers, etc.) from only the “what” analysis, you are shooting yourself in the foot!

So, what would complete the analytics story?

A primary question that has to be addressed by the analytics for creating greater business value is: “How much impact is there on my key business outcomes?”. The business outcomes for the impact analysis can be financial metrics such as revenue and costs, and qualitative yet valuable metrics such as the customer experience and brand value.

For example, if your job is in product analytics, it is not enough to measure the “what has been happening” or “what happened” trends of product awareness, trial rate, conversion, purchase rate (first or repeat), etc. It is also not sufficient to perform quantitative analytics on various stages of consider-evaluate-buy-experience-advocate-bond-buy(repeat) consumer journey. It is a must to mine deeper insights by measuring “how much impact” those trends have on the business itself. If your job is in web analytics or any other domain monitoring and analyzing various KPIs, it is not enough to measure “what is happening (in real-time)” or “what has been happening” or “what happened”. The value of such “what analysis” is akin to the skin deep beauty.

The same applies even if your job is to develop analytics tools/platforms that help users perform the above variety of analytics. If the product (tool or platform) you are building enables the users to conduct only the “what” analytics, you are doing a huge disservice to your customers (assuming the customers are still buying that stuff!). And, if you are the customer exploring an analytics tool to buy, you have the right to demand more than the “what analysis” capabilities from your potential vendor. Remember, the right to remain silent does not apply here!

Now, if it is the degree of impact on the business outcomes that is of higher importance, how can one connect those business outcomes to the metrics being actually measured and analyzed? That’s where the business domain knowledge comes into play. For example, see the DuPont model for the Retail industry. The DuPont model establishes a relationship among various value drivers with the financial levers which in turn are connected to the business outcomes. On the other end, the measurable KPIs (for example: page visits, visitors, sessions, duration, bounce rate, etc. in web analytics) can be (and dare I also say “must be”!) quantitatively connected to the value drivers. That can be done through various statistical and/or machine learning techniques applied on the historical data.
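
As a small worked example of connecting a measured KPI to a value driver, the sketch below fits an ordinary least-squares line between weekly web sessions (the KPI) and weekly orders (the value driver), then translates the slope into a "how much" statement in revenue terms. The numbers are made up and deliberately noise-free, the revenue_per_order figure is hypothetical, and a single-variable fit is only the simplest of the statistical techniques alluded to above; real data would be noisier and would need more care before reading the slope as impact.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: KPI (sessions) vs. value driver (orders).
weekly_sessions = [100, 200, 300, 400]
weekly_orders = [12, 22, 32, 42]

slope, intercept = fit_line(weekly_sessions, weekly_orders)

# "How much?": translate the KPI movement into a business outcome.
revenue_per_order = 50.0  # hypothetical financial lever
impact_per_100_sessions = slope * 100 * revenue_per_order

print(f"orders per extra session: {slope:.3f}")
print(f"revenue impact of 100 extra sessions: {impact_per_100_sessions:.2f}")
```

In this toy data set, every additional 100 sessions is associated with about 10 more orders, which at the assumed order value is roughly 500 in revenue; that last sentence is the kind of statement a stakeholder can act on, in contrast to a bare sessions trend.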

Such “How much?” analytics, with established relationships to assess the degree of impact being made on the business outcomes, are the ones that truly generate business value.

In fact, the process of establishing the relationships among various metrics and outcomes through “How much?” analytics will become a valuable foundation for “Advanced What Analytics” such as predictive (“What might happen?”) and prescriptive (“What should happen for optimal outcome?”) for more actionable insights.

That is a nice circle of “What – How Much – What” analytics to complete.

Your thoughts please!

Acknowledgment: DuPont model for Retail developed with inputs from my ex-colleagues (Bernadette, Kim, Robyn, and Victoria), the true industry experts! Any errors are mine.

Biggest Weak Link in Personalization? Missing Customer-Channel Bonding

We all as consumers “channel surf” at will. We quickly sort through best available promotions via one channel, infer popularity and consumability of a product from crowd reviews/ratings via one or more other channels, measure the price and cost of ownership on another channel, may determine the availability via another channel, and then may eventually buy the product on a different channel.

Depending on who the individual is, which age demographic he/she belongs to, and many other factors, the channels leveraged for the above purposes can be any or all of physical store, web, mobile, social media, customer phone service (or call center), catalog, etc. While the services of high-touch channels among these are typically leveraged during the awareness, consideration, and preference evaluation stages of the purchasing process, the consumer might buy the product at another cheaper channel.

Enterprises traditionally designed channels based on market segmentation studies, whether it was about delivering the right products to the right segments, or creating awareness of products and influencing the appropriate segment, or for any other purpose among the various stages of the typical purchasing process (awareness, consideration, preference, purchase, and post-sale service). A key underlying assumption for such an approach was that consumers with similar demographic characteristics tend to shop and buy in the same way, through the same few channels (for a deeper analysis on the subject, read the beautiful article: The Customer Has Escaped by Paul F. Nunes and Frank V. Cespedes, Harvard Business Review, November 2003).

However, as companies started to understand more about how consumers shop and not just buy (the “channel surfing” behavior) and realize the resultant losses (or avoidable costs) from stranded assets that support the underutilized channels, they have been striving to design channels that cater more to the consumer behavior rather than trying to hold the consumers captive. Instead of forcing consumers into predetermined channels, the strategy is to let the consumer navigate across channels seamlessly as it suits him/her, and in fact sometimes go an extra mile by encouraging/directing the consumer to alternate channels for greater user empowerment. For example, companies have become more open to directing the consumers to a third party source to compare their product with that of the competitors (though this is not a new concept; it has been shown beautifully and funnily in the 1947 movie Miracle on 34th Street)!

In this context, I have been questioning myself on whether we, the Data Science and Analytics professionals, have been providing an effective support to the organizations in the execution of this strategy.

For instance, Personalization is one of the key initiatives at most companies in leveraging the data for better consumer engagement. No doubt, so much advancement has been achieved in personalization, be it in product/service/promotion recommendations for creating awareness or be it in a customized configuration for purchase or be it in the delivery of a post-purchase service. However, are we leveraging the customer-channel bonding in the most appropriate way to deliver best value to the customer such that the conversion can also generate higher value to the organization as well?

If we look at the most common approach for recommender systems for personalization, it involves a user-item matrix (an item can be a product, service, movie, etc.), where each element in the matrix can be an affinity score between user i and item j. The affinity score element can simply be 1 or empty (watched/not-watched, purchased/not-purchased, liked/not-liked, etc.) or it can be a numeric score (or empty) computed as a weighted average of multiple actions (liked, watched/browsed, disliked, shared on social media, added-to-cart, purchased, reviewed, recommended to others, etc.). This score is usually computed from the information aggregated from various channels, as these multiple actions do not necessarily happen on the same channel. Such an aggregated scoring is bound to lose valuable information on how the consumer interacted with the product on a specific individual channel; this “lost in translation” results in missed opportunities for enterprises to understand better the customer-channel bonding, and leads to recommending to a user the same information and services/products/promotions on all or multiple channels.


In other words, with the two dimensional approach for user-item personalization, user-channel bonding is ignored (or rather not extracted properly) at the cost of a deeper three-dimensional user-item-channel personalization.

How do we address this gap? One quick win (not necessarily the most effective solution) can be through an approach to construct a three-dimensional matrix with user-item-channel affinity scores. The user-item matrix itself is largely a sparse matrix (even after aggregation of information from across channels), and this three-dimensional matrix will be even more sparse to maintain the user-item affinity separate for each channel. And so, yes, it is a challenge to expect meaningful user-item insights from a sparser matrix, but then that should not deter the analysts. One approach to overcome this is to include more information for each channel, the information that would have been ignored during aggregation; for example, how much time has the customer spent browsing the product/service on this channel? What time of day does the user interact with the channel (user may use mobile and web at different times)? Did the customer directly search for this product on this channel or was led to it in hops? There is no limit to the level of information that can be gleaned from the user-channel interaction.
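
To see what the aggregation loses, consider the minimal sketch below. It computes an affinity score as the average of per-action weights, first keyed by (user, item) as in the usual two-dimensional matrix, and then keyed by (user, item, channel). The action weights, the event fields, and the sample events are hypothetical; a sparse dictionary stands in for the matrix so that empty cells simply do not exist.

```python
from collections import defaultdict

# Hypothetical weights for the actions a user can take on an item.
ACTION_WEIGHTS = {"viewed": 1.0, "added_to_cart": 3.0, "purchased": 5.0}


def affinity(events, key):
    """Average action weight per key; returns a sparse {key: score} mapping."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ev in events:
        k = key(ev)
        totals[k] += ACTION_WEIGHTS[ev["action"]]
        counts[k] += 1
    return {k: totals[k] / counts[k] for k in totals}


events = [
    {"user": "u1", "item": "tv", "channel": "web", "action": "viewed"},
    {"user": "u1", "item": "tv", "channel": "web", "action": "viewed"},
    {"user": "u1", "item": "tv", "channel": "store", "action": "purchased"},
]

# Two-dimensional view: channels are aggregated away.
user_item = affinity(events, key=lambda e: (e["user"], e["item"]))
# Three-dimensional view: one score per user-item-channel cell.
user_item_channel = affinity(events, key=lambda e: (e["user"], e["item"], e["channel"]))

print(user_item)
print(user_item_channel)
```

The two-dimensional score for this user and item is a single middling number, while the three-dimensional view shows a browse-only relationship on the web and a purchase relationship in the store, which is exactly the customer-channel bonding that aggregation hides.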

Armed with the three-dimensional matrix, we can compute the recommended items/information for a given user separately for each channel (the popular Alternating Least Squares Matrix Factorization algorithm can be handy for the purpose); that is only the first step. The user-item affinities compiled this way can then be cross-referenced across channels, and a set of rules can guide the recommender toward the best action to take in the context.
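
The per-channel step can be sketched as follows. To stay dependency-free, this sketch swaps in a deliberately tiny rank-1 alternating least squares written in plain Python rather than a library implementation: with one factor fixed, each user (or item) factor has a closed-form ridge-regression update over the observed cells only. The function names, the regularization value, and the ratings are hypothetical, and a production system would use more latent factors and a distributed implementation.

```python
from collections import defaultdict


def als_rank1(ratings, n_iters=20, reg=0.1):
    """Rank-1 ALS on observed cells. ratings: {(user, item): score}."""
    p = {u: 1.0 for u, _ in ratings}   # user factors
    q = {i: 1.0 for _, i in ratings}   # item factors
    for _ in range(n_iters):
        # Fix item factors; each user factor is a 1-D ridge solution.
        for u in p:
            obs = [(q[i], r) for (uu, i), r in ratings.items() if uu == u]
            p[u] = sum(r * qi for qi, r in obs) / (sum(qi * qi for qi, _ in obs) + reg)
        # Fix user factors; each item factor is a 1-D ridge solution.
        for i in q:
            obs = [(p[u], r) for (u, ii), r in ratings.items() if ii == i]
            q[i] = sum(r * pu for pu, r in obs) / (sum(pu * pu for pu, _ in obs) + reg)
    return p, q


def recommend_per_channel(tensor):
    """tensor: {(user, item, channel): score} -> {channel: {user: best unseen item}}."""
    by_channel = defaultdict(dict)
    for (user, item, channel), score in tensor.items():
        by_channel[channel][(user, item)] = score
    recs = {}
    for channel, ratings in by_channel.items():
        p, q = als_rank1(ratings)
        recs[channel] = {}
        for user in p:
            unseen = [i for i in q if (user, i) not in ratings]
            if unseen:
                recs[channel][user] = max(unseen, key=lambda i: p[user] * q[i])
    return recs


# Hypothetical affinities: item "b" is popular on the web, item "c" in the store.
web = {("u1", "a"): 5, ("u2", "a"): 5, ("u2", "b"): 4,
       ("u2", "c"): 1, ("u3", "b"): 5, ("u3", "c"): 1}
store = {("u1", "a"): 5, ("u2", "a"): 5, ("u2", "b"): 1,
         ("u2", "c"): 4, ("u3", "b"): 1, ("u3", "c"): 5}
tensor = {(u, i, "web"): s for (u, i), s in web.items()}
tensor.update({(u, i, "store"): s for (u, i), s in store.items()})

recs = recommend_per_channel(tensor)
print(recs)
```

Here the same user gets a different top recommendation on each channel, and it is these per-channel results that a rule set could then cross-reference to pick the best action for the context.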

For coding enthusiasts: The ALS Matrix Factorization algorithm is part of the Spark MLlib machine learning library; the advantage of using Spark for this problem (in contrast to Apache Mahout or R or other tools) lies in the fact that Spark operations can be performed on an RDD (Resilient Distributed Dataset, a distributed collection that can be held in memory across a cluster and built from large volumes of HDFS-stored data), and an RDD can be a wrapped structure with several objects within. While a 2D data frame of a matrix as consumed by R or other tools can be a single object, Data Scientists can explore whether an RDD can simultaneously hold multiple data frames. If such a possibility is established, the RDD can hold multiple user-item matrices, and thus the 3-D user-item-channel matrix can be operated upon by the Spark MLlib algorithm for optimal outcomes.

(A detailed version of this article appeared as a multi-part blog series at part-1, part-2, part-3, part-4).

Happy recovery from the holidays! I am sure you were one of those numerous gleeful shoppers in the holiday season shopping via web, mobile, physical, and other channels, while also wishing that retailers could do more to enhance your shopping experience!

No doubt that retailers continuously strive for innovative solutions that can enrich customer engagement; particularly the engagement that is seamless across channels, engagement with advance knowledge (predictive) about consumer preferences, and engagement to deliver an offer to the customer at the right moment (real-time). Now, how about a solution that can achieve all the three dimensions (i.e., cross-channel, real-time, and personalized) of the engagement simultaneously?

For example, consider the situation when a consumer is in vicinity of a physical store, and then is at store entrance, and then is inside the store shopping through aisles. There have been various advancements in technologies for retailers to deploy to detect customers in the vicinity of a store location and then track their movements within the store. Some of these technologies include Wi-Fi or Bluetooth based sensors for Geo-positioning, embedding/integrating RFID chips inside customer loyalty cards, and so on. Similarly, tracking and recording the activity of online customers across various webpages has been in vogue via Clickstream technologies (a detailed discussion of tracking technologies is beyond the scope here; focus is around the technologies that help in real-time monitoring of each individual customer to realize an instantaneous engagement, as well as in discovering the pattern of customer activity in a zone; we start with the assumption that regardless of the sensor technology deployed, data from such sources will be available for ingestion to generate a continuous streaming event data).

Given this, how can retailers combine the insights from real-time analysis of dynamic tracking information with the predicted preferences (based on historical data about what customers viewed/liked/added-to-cart/purchased) in order to recommend the right promotions/products in a very short time window before the customer walks out of the store? Or how can retailers influence the consumer in the vicinity to come into the store, through a personalized message, and thus improve the conversion in real-time?

For retailers to be able to engage the consumers at the “right moment” armed with “insights in real-time to the seconds and/or minutes” and supported with “accurately predicted” individual consumer preferences is no longer a pipe-dream. This article discusses the relevant technologies that not only have to solve individual disparate problems, but also have to come together and act in unison in order to translate this complex business case into a reality.

  1. The dynamic tracking data is streamed continuously into a streaming analytics engine which continuously analyzes the location data to obtain the insights about the individual consumer activity in real-time (how this is achieved via the “event processing network – EPN” and data ingestion layer, is discussed below);
  2. The data science has to deliver the individual consumer’s predicted (personalized) purchase preferences as accurately as possible for greater conversion; while recommender systems are not new, what is more significant is that the data science, while predicting the preferences, has to take into account the cross-channel correlated data from web, mobile, store POS, and other channels;
  3. Most importantly, the comprehensive solution has to ensure that these disparate pieces of technologies (i.e., streaming analytics and predictive analytics) are able to communicate and exchange relevant information, possibly via a Business Process Management System (BPMS), to deliver the right recommendation at the right time via the right channel.

Dynamic Tracking and Streaming Analytics: In order to build a model for real-time monitoring of activity, one approach is to first create an Activity Flow Diagram for all of the possible scenarios in which the activity can be defined when a customer is in a particular zone. For example, say a zone is divided into five sub-zones: Z0 (in vicinity outside the store), Z1 (in the entrance/checkout area), Z2 (clothing and apparel aisles), Z3 (electronics aisles), and Z4 (food and grocery aisles). See Figure. One can construct the Activity Flow Diagram across these various states (Z0, Z1, Z2, Z3, Z4) by leveraging the Activity Tracker application within the Streaming Analytics engine; the model will enable the system to monitor the customer movement from zone to zone in real-time. The Activity Tracker application can also be leveraged to monitor in real-time various KPIs, such as the duration spent by the customer in each zone. The Duration KPI for each zone can be assigned thresholds (e.g., poor, fair, good) to trigger operational actions, such as texting promotional messages; for example, the shorter the duration a consumer spends in Zone 0 and Zone 1 the better, and the longer the duration a consumer spends in Zone 2 or Zone 3 or Zone 4 the better. The other KPIs can be the total number of customers in each zone in a given time window, the number of customers that move from Za to Zb as against Za to Zc, and so on.
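
The Duration KPI and the zone-to-zone counts can be illustrated with a short, self-contained sketch. This is not the Activity Tracker application itself; it is plain Python that assumes time-ordered zone-entry events of the form (customer, zone, timestamp in seconds), and the 60-second and 300-second thresholds are hypothetical.

```python
from collections import Counter

LOWER_IS_BETTER = {"Z0", "Z1"}   # vicinity and entrance/checkout


def rate(zone, seconds, short=60, long=300):
    """Map a dwell time to poor/fair/good; thresholds are hypothetical."""
    if zone in LOWER_IS_BETTER:
        if seconds <= short:
            return "good"
        return "fair" if seconds <= long else "poor"
    if seconds >= long:
        return "good"
    return "fair" if seconds >= short else "poor"


def track(entries):
    """entries: time-ordered (customer, zone, ts) zone-entry events.

    Returns (dwells, transitions): completed visits with their rating,
    and a count of each zone-to-zone move.
    """
    last = {}                 # customer -> (zone, entry ts)
    dwells = []
    transitions = Counter()
    for customer, zone, ts in entries:
        if customer in last:
            prev_zone, prev_ts = last[customer]
            seconds = ts - prev_ts
            dwells.append((customer, prev_zone, seconds, rate(prev_zone, seconds)))
            transitions[(prev_zone, zone)] += 1
        last[customer] = (zone, ts)
    return dwells, transitions


entries = [
    ("c1", "Z0", 0), ("c2", "Z0", 10), ("c1", "Z1", 30), ("c1", "Z2", 50),
    ("c2", "Z1", 400), ("c2", "Z3", 420), ("c1", "Z1", 500),
]
dwells, transitions = track(entries)
for d in dwells:
    print(d)
print(dict(transitions))
```

In this sample, customer c2 lingers outside the store long enough for the Zone 0 duration to be rated "poor", which is the kind of threshold crossing that could trigger a promotional text.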

For the Streaming Analytics engine to monitor the customer flow and provide the necessary insights, a source connector or a stream processing component can first ingest the data coming from various sources (e.g., sensors); the output of the ingestion can become the source stream to the Streaming Analytics engine. Using an event-driven architecture (EDA), the stream processing engine tackles large volumes of raw events in real-time to uncover valuable insights. It does this by correlating events from diverse data sources and by aggregating low-level events into business-level events so as to detect meaningful patterns and trends.
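As a rough illustration of aggregating low-level events into a business-level measure, the sketch below counts distinct customers per zone over tumbling time windows. The ping format, the 60-second window, and the function name are assumptions for this example; a production engine would do this incrementally over an unbounded stream.

```python
from collections import defaultdict


def customers_per_zone(pings, window_seconds=60):
    """Count distinct customers per (window start, zone) in tumbling windows."""
    seen = defaultdict(set)
    for customer, zone, ts in pings:
        # Align each timestamp to the start of its window.
        window_start = (ts // window_seconds) * window_seconds
        seen[(window_start, zone)].add(customer)
    return {key: len(customers) for key, customers in seen.items()}


# Hypothetical raw sensor pings: (customer_id, zone, timestamp_seconds).
PINGS = [
    ("c1", "Z2", 5), ("c1", "Z2", 15), ("c2", "Z2", 40),
    ("c1", "Z3", 70), ("c2", "Z2", 95),
]

counts = customers_per_zone(PINGS)
for (window_start, zone), n in sorted(counts.items()):
    print(window_start, zone, n)
```

Repeated pings from the same customer inside one window collapse into a single count, which is the point of lifting raw sensor events to a zone-level KPI.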

Once the real-time monitoring of a customer in a zone is in place, the second piece of the puzzle is the predictive analytics for recommendations/messages, personalized for each individual customer and delivered while the customer is in the zone.

Predictive Analytics for recommended promotions/products: Some of the popular approaches for product/item recommendations are based on collaborative filtering techniques. The fundamental assumption in these techniques, hence the name “collaborative,” is that an individual consumer tends to view/like/purchase the same items as other consumers with similar patterns of views/likes/purchases. The algorithms/models in the realm of collaborative filtering vary in how efficiently they extract the “product/item similarity” between two consumers. For example, the Matrix Factorization method, particularly the Alternating Least Squares (ALS) method, has been a preferred choice among collaborative filtering techniques. For a more detailed discussion of this method, and how cross-channel information of individual consumers can be captured via this method, see here and here.
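For readers who want to see the mechanics, the following is a small NumPy sketch of ALS for explicit ratings, fitted only on observed entries with L2 regularization. The ratings matrix, the rank, and the regularization value are made up for illustration, and this is a teaching sketch rather than the implementation referenced above; large-scale systems typically rely on a distributed library instead.

```python
import numpy as np


def als(ratings, mask, rank=2, reg=0.1, iterations=20, seed=0):
    """Factor ratings into U @ V.T, fitting only entries where mask is True."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iterations):
        # Hold item factors fixed; solve a ridge regression per user.
        for u in range(n_users):
            obs = mask[u]
            if obs.any():
                Vo = V[obs]
                U[u] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ ratings[u, obs])
        # Hold user factors fixed; solve a ridge regression per item.
        for i in range(n_items):
            obs = mask[:, i]
            if obs.any():
                Uo = U[obs]
                V[i] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ ratings[obs, i])
    return U, V


# Hypothetical user x item ratings; 0 marks an item the user has not touched.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)
observed = R > 0

U, V = als(R, observed, iterations=50)
predictions = U @ V.T
rmse = np.sqrt(np.mean((predictions[observed] - R[observed]) ** 2))
print("RMSE on observed entries:", round(float(rmse), 3))
print("Predicted scores for untouched items:", predictions[~observed])
```

Cross-channel data could enter such a model simply as additional observed interactions (web views, mobile clicks, POS purchases) in the same user-item matrix, although how to weight the channels is a modelling decision this sketch does not address.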

Regardless of the approach, the recommender system will produce, for each individual user, an ordered set of recommended items drawn from the items not yet “touched” by the user but “touched” by other similar users. It is important to note that collaborative filtering techniques work well for user-item affinity where the items (consumer products, movies, etc.) have sufficient longevity to be “touched” by a considerable number of users. But what if the items are promotions, coupons, or similar things whose life-cycle is too short to map relevant promotions to users directly? One approach is to triangulate the “user <=> products” and “promotions <=> products” relationships and thereby derive the “user <=> promotions” affinity.
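One simple way to read the triangulation is as a matrix product: if user-to-product affinities and promotion-to-product coverage are both known, multiplying them yields a user-to-promotion score. The sketch below assumes hypothetical affinity scores and a binary coverage matrix; a real system might weight or normalize these differently.

```python
import numpy as np

# Hypothetical user <=> product affinities (rows: users, columns: products),
# for example scores produced by a collaborative filtering model.
user_product = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.6],
])

# Hypothetical promotion <=> product coverage (rows: promotions),
# 1 where the promotion applies to the product.
promo_product = np.array([
    [1, 0, 0],
    [0, 1, 1],
])

# user <=> promotion affinity: sum of a user's affinity over covered products.
user_promo = user_product @ promo_product.T

# Promotions ordered from most to least relevant for each user.
ranked = np.argsort(-user_promo, axis=1)
print(user_promo)
print(ranked)
```

Because a new promotion only needs its product coverage to be scored, it can be recommended from day one without any interaction history of its own.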

With the ordered set of predicted products/promotions ready and the EPN tool deployed to monitor the activity of an individual consumer in the zone in real-time, how do we combine the two to deliver the right message at the right time via the right medium? We will discuss this in the following section.

Combining Streaming Analytics with Predictive Recommendations: The data ingestion layer in a Streaming Analytics platform can ingest and fuse data from multiple sources. For this solution to be successfully deployed, as the EPN tool tracks the consumer in a store, the business process management system (BPMS) within the Operational Intelligence (OI) platform performs a few steps to achieve the lock-step (see Figure for functional architecture for the solution):

  • Consume the output from the EPN tool with information such as the presence of the consumer in a specific zone, the duration of a consumer in a specific zone, etc.
  • Parse the data with the ordered set of predicted recommendations for the particular consumer
  • Formulate appropriate messages, in context, through various built-in decision rules

For example, when the EPN tool determines that a particular consumer is in Z2 (clothing and apparel), the BPMS tool can extract the relevant recommendation (product or promotion) for the items relevant to that zone.
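The three steps above can be sketched as a single decision rule. Everything named here (the event dictionary, the zone-to-category map, the 30-second dwell threshold, and the message wording) is a hypothetical stand-in for what a BPMS would hold in its own rule definitions.

```python
# Assumed mapping from in-store zones to product categories.
ZONE_CATEGORY = {"Z2": "clothing", "Z3": "electronics", "Z4": "grocery"}


def pick_message(event, recommendations, min_dwell_seconds=30):
    """Return a message for the consumer's current zone, or None.

    event: tracker output, e.g. {"customer": ..., "zone": ..., "dwell_seconds": ...}
    recommendations: customer -> ordered list of (item, category) pairs.
    """
    category = ZONE_CATEGORY.get(event["zone"])
    # Rule 1: only message in merchandise zones, after a minimum dwell time.
    if category is None or event["dwell_seconds"] < min_dwell_seconds:
        return None
    # Rule 2: take the highest-ranked recommendation matching the zone.
    for item, item_category in recommendations.get(event["customer"], []):
        if item_category == category:
            return f"Recommended for you in {category}: {item}"
    return None


# Hypothetical ordered recommendations from the predictive model.
RECOMMENDATIONS = {
    "c1": [("headphones", "electronics"), ("denim jacket", "clothing")],
}

event = {"customer": "c1", "zone": "Z2", "dwell_seconds": 45}
print(pick_message(event, RECOMMENDATIONS))
```

Keeping the rule a pure function of the tracker event and the precomputed recommendations means the streaming and predictive components only need to agree on those two data shapes.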

While the dimensions of “real-time” and “predictive” are combined through this process, the third dimension of “channel” can also be achieved at the data usage level. That is, by leveraging the insights about a particular consumer’s affinity to items, gained from online activity information, the right message is delivered to the consumer when in a specific zone inside a physical store. The inverse can also be achieved with the mix of these technologies. That is, with the help of the insights gained from the activity in various zones in a physical store, a consumer can be engaged with a personalized message via mobile and online channels.

SUMMARY: Overall, the article discussed how a mix of technologies performs the individual tasks of a complex business use case and also acts in lock-step in an OI platform to deliver a unified message for an enriched real-time cross-channel customer engagement:

  • Ingest high velocity data from various sensor sources for real-time analysis
  • Glean insights from the real-time monitoring of a consumer in a zone
  • Produce personalized recommendations for consumers based on historically aggregated data from both in-store and online/mobile activity
  • Combine insights from real-time activity with predicted recommendations for more relevant messages, in context
  • Deliver messages to the consumer instantaneously for greater conversion
  • Trigger enhanced engagement (e.g., sending a store representative for interactive engagement) that can dynamically adapt based on the situation at-hand.