Automating Online-Offline Data Merger for Integrated Marketing

Chenshuo Sun, Anindya Ghose, and Xiao Liu, 2018, 18-136-11

Increasingly, firms aspire to monetize user data in order to better understand consumer behavior and offer curated services. For example, precise prediction of user engagement from mobile apps improves advertising ROI. Accurate clustering and similar user detection enable better consumer segmentation and targeting.

Past efforts have used offline shopping data (e.g., Nielsen scanner panel data) to predict what consumers intend to purchase next, or have analyzed online social network proximity to find users with similar interests. Each of these studies relies on a single aspect of consumer behavior. However, today’s consumers follow an omni-channel approach in their path to purchase. Apple Retail, for example, allows customers to sample new products in-store even though they may end up buying the same products online. As a consequence, firms can collect valuable consumer data across their offline and online channels.

Although firms have made significant investments in data analytics, they are now grappling with how to automatically link and query the entirety of the data to better understand their customers. They need methods and tools that enable automated integration of disparate data sources in order to paint a 360-degree view of consumer behavior. Artificial intelligence (AI) can facilitate solutions to this problem.


Chenshuo Sun, Anindya Ghose, and Xiao Liu propose an innovative AI methodology, multi-view representation learning, to create an online-to-offline (O2O) data merging scheme. They focus on two questions: (1) Does the O2O scheme allow one to find similar users and segment users better? (2) How could marketers utilize the O2O scheme to predict user behavior more precisely?

They examine these questions using a unique data set that combines online app behavior (both installation and engagement) with offline location-visit behavior. Their results show that the proposed complementary-based data merger, which leverages independent online and offline behavioral data to build a holistic representation of consumer behavior, significantly outperforms both single-source behavioral data and alternative data merging methods. The mechanism is that when online data are sparse, the proposed method lets offline behavior complement its online counterpart.
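To make the complementarity idea concrete, here is a minimal sketch in Python. It is not the authors' multi-view representation learning method. It is a simple stand-in that assumes each user has one vector of online app-engagement counts and one vector of offline location-visit counts, and it scales down the online view when that view is sparse so the offline view carries more of the comparison. The function names, the weighting rule, and the toy data are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector stays all zero."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else [0.0] * len(vec)


def density(vec):
    """Fraction of non-zero entries, used here as a crude sparsity signal."""
    return sum(1 for x in vec if x != 0) / len(vec) if vec else 0.0


def merge_views(online, offline):
    """Concatenate the two views, shrinking the online view when it is sparse.

    Assumption (illustrative only): the online view is weighted by the
    square root of its density and the offline view keeps full weight.
    """
    w_online = math.sqrt(density(online))
    on = [w_online * x for x in l2_normalize(online)]
    off = l2_normalize(offline)
    return on + off


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


# Toy users (made-up numbers): sparse, non-overlapping online behavior,
# but similar offline location visits.
user_a = {"online": [3, 0, 0, 0], "offline": [5, 2, 0]}
user_b = {"online": [0, 0, 0, 1], "offline": [4, 2, 0]}

online_only = cosine(user_a["online"], user_b["online"])
merged = cosine(
    merge_views(user_a["online"], user_a["offline"]),
    merge_views(user_b["online"], user_b["offline"]),
)
print(f"online-only similarity: {online_only:.3f}")
print(f"merged similarity:      {merged:.3f}")
```

In this toy case the two users look unrelated on online data alone, while the merged representation surfaces their similar offline behavior. A learned multi-view model would replace the fixed weighting rule used here.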

Moreover, they conclude that the choice of data merging method should account for the characteristics of the data. If these are ignored, combining multiple data sources improperly may fail to generate additional value and may sometimes even backfire.

Put into Practice

On the substantive front, their report quantifies the conventional wisdom that capitalizing on consumers’ omni-channel behavioral data pays off: it helps business data owners achieve better user segmentation and engagement prediction.

Chenshuo Sun is a Ph.D. student in information systems, Anindya Ghose is the Heinz Riehl Professor of Business, and Xiao Liu is Assistant Professor of Marketing, all at the Stern School of Business, New York University.

Related Links

Mobile Targeting Using Customer Trajectory Patterns
Anindya Ghose, Beibei Li, and Siyuan Liu [Report] (2017)

New Frontiers in Mobile Marketing Using Data Analytics
Anindya Ghose [Webinar] (2017)

