
Overview
A digital engagement provider faced challenges enabling actionable insights from customer interactions over WhatsApp. While they had robust outbound and inbound journeys powered by WhatsApp Cloud API, visibility into engagement metrics was lacking. Xenonstack partnered with the client to deliver a modern, cost-effective analytics solution leveraging a serverless data lake architecture built on AWS. The solution—using Apache Iceberg, Glue, Athena, and QuickSight—provides marketing and operations teams with daily-refreshed dashboards for campaign optimization, without requiring real-time infrastructure.
Client Needs and Obstacles
Customer Information
- Industry: Digital Engagement / MarTech
- Location: Italy
- Company Size: Mid-size enterprise
Business Challenges
The client's enterprise customers requested a comprehensive analytics dashboard to monitor and improve their WhatsApp-based customer engagement strategies. Specifically, they sought visibility into:
- Interaction Analytics: How customers engage across outbound campaigns and inbound service journeys.
- Behaviour Analytics: Patterns across messages, session drop-offs, click-throughs, and escalations.
- Promo Analytics: Effectiveness of promotional messaging in terms of views, clicks, and conversion behaviors.
A key requirement was cost-efficiency over real-time responsiveness. Customers explicitly stated that a daily data refresh was acceptable, removing the need for real-time streaming or low-latency pipelines. The system also had to scale with growing campaign volume without requiring major re-architecture.
Technical Challenges
- Interaction and campaign data were fragmented across multiple services with no centralized storage or transformation layer.
- Prior reporting workflows involved manual data extraction and transformation processes.
- The team lacked automation and scheduling capabilities for recurring analytical workloads, leading to delays and resource strain.
- Real-time ingestion and querying using tools like Amazon Redshift and low-latency pipelines were deemed cost-prohibitive.
- WhatsApp campaigns and user journeys evolved rapidly, introducing frequent changes in log structures.
Xenonstack's Approach
Solution Overview
Xenonstack designed a cost-effective analytics solution for tracking WhatsApp customer interactions and promotional campaign performance. The enterprise was actively running outbound campaigns and handling inbound service journeys over WhatsApp using the WhatsApp Cloud API but lacked visibility into customer behavior, engagement, and conversion patterns.
The client's enterprise customers asked for rich analytics dashboards that let marketing and operations teams track campaign performance, behavioral trends, interaction funnels, and promotional offer engagement metrics. However, they explicitly ruled out real-time analytics due to cost concerns, requesting a solution with daily data refresh cycles that still delivers comprehensive insights.
To meet these goals, Xenonstack designed a modern serverless data architecture leveraging AWS components. Data from WhatsApp journeys and promo campaigns is ingested via Kinesis Data Streams and Firehose into an S3-based Raw Vault. AWS Glue ETL jobs process and transform this data into a curated Iceberg-based Business Vault. AWS Athena is used to query the Business Vault, with insights presented in rich, interactive dashboards built on Amazon QuickSight.
The solution includes KPIs such as message delivery rates, user engagement, drop-offs by funnel stage, promo redemption rates, and campaign response timelines. These metrics enable the enterprise to fine-tune outbound messages, identify journey bottlenecks, and improve promotional targeting.
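To make the KPI definitions concrete, here is a minimal Python sketch of how a delivery rate and per-stage funnel drop-offs could be derived from flattened status events. The event shape, the field names (`message_id`, `status`), and the sample data are assumptions for illustration only; the case study does not publish the actual data model, and the production metrics are computed in Glue and Athena rather than in plain Python.

```python
from collections import defaultdict

# Hypothetical flattened status events; field names are assumptions.
EVENTS = [
    {"message_id": "m1", "status": "sent"},
    {"message_id": "m1", "status": "delivered"},
    {"message_id": "m1", "status": "read"},
    {"message_id": "m2", "status": "sent"},
    {"message_id": "m2", "status": "delivered"},
    {"message_id": "m3", "status": "sent"},
    {"message_id": "m4", "status": "sent"},
    {"message_id": "m4", "status": "failed"},
]


def messages_by_status(events):
    """Map each status to the set of distinct message IDs that reached it."""
    seen = defaultdict(set)
    for event in events:
        seen[event["status"]].add(event["message_id"])
    return seen


def rate(numerator, denominator):
    """Ratio that falls back to 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def funnel_drop_offs(stage_counts):
    """Fraction lost between each consecutive pair of funnel stages."""
    drops = {}
    for (prev_name, prev_n), (name, n) in zip(stage_counts, stage_counts[1:]):
        drops[f"{prev_name}->{name}"] = rate(prev_n - n, prev_n)
    return drops


by_status = messages_by_status(EVENTS)
delivery = rate(len(by_status["delivered"]), len(by_status["sent"]))
funnel = [(s, len(by_status[s])) for s in ("sent", "delivered", "read")]
drops = funnel_drop_offs(funnel)
```

A promo redemption rate could reuse the same `rate` helper with redeemed offers as the numerator and delivered offers as the denominator.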
By using a fully serverless, event-driven architecture, the solution eliminated the need for heavy infrastructure management while ensuring scalability and cost control. With daily data refresh and self-service dashboard access, business teams now operate with data agility—shortening the feedback loop for campaign improvements and driving higher ROI from WhatsApp communications.
AWS Services Used
Amazon S3
- Primary data lake storage for both Raw and Business Vaults.
- Raw interaction logs from Kinesis Firehose are stored in the S3 Raw Vault.
- Transformed datasets, modeled using Apache Iceberg, are stored in the Business Vault and queried via Athena.
- S3 provides cost-effective, scalable, and durable storage for the entire pipeline.
Amazon DynamoDB
- Used as the store for WhatsApp campaign configuration, customer journey metadata, conversations, conversation delivery statuses received from WhatsApp, and custom datasets that enterprises upload for use in customer journeys (for example, store locations).
- Real-time changes in campaign status are captured using DynamoDB Streams, which trigger the data ingestion pipeline.
Amazon Kinesis (Data Streams & Firehose)
- DynamoDB Streams are piped into Kinesis Data Streams, which buffers and manages real-time ingestion.
- Kinesis Firehose delivers logs into the S3 Raw Vault with minimal latency and no infrastructure management.
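As an illustration of the ingestion step, the sketch below flattens a DynamoDB change record into a JSON line of the kind that could be landed in the Raw Vault. DynamoDB change records carry attribute values in a typed form such as `{"S": "..."}` or `{"N": "..."}`; the sample record, the attribute names, and the `_change_type` field are hypothetical, and set and binary types are deliberately left out. This is a sketch under those assumptions, not the client's actual transformation code.

```python
import json
from decimal import Decimal


def unmarshal(attr):
    """Convert one DynamoDB-typed attribute value into a plain Python value.

    Set and binary types (SS, NS, BS, B) are omitted in this sketch.
    """
    type_tag, value = next(iter(attr.items()))
    if type_tag == "S":
        return value
    if type_tag == "N":
        number = Decimal(value)
        is_whole = number == number.to_integral_value()
        return int(number) if is_whole else float(number)
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "M":
        return {k: unmarshal(v) for k, v in value.items()}
    if type_tag == "L":
        return [unmarshal(v) for v in value]
    raise ValueError(f"unsupported attribute type: {type_tag}")


def to_raw_event(record):
    """Flatten one change record into a JSON line for the Raw Vault."""
    image = record["dynamodb"].get("NewImage", {})
    event = {name: unmarshal(attr) for name, attr in image.items()}
    event["_change_type"] = record["eventName"]  # hypothetical audit field
    return json.dumps(event, sort_keys=True)


# Hypothetical change record; attribute names are assumptions.
SAMPLE = {
    "eventName": "MODIFY",
    "dynamodb": {
        "NewImage": {
            "campaign_id": {"S": "c-42"},
            "sent_count": {"N": "120"},
            "active": {"BOOL": True},
            "tags": {"L": [{"S": "promo"}]},
        }
    },
}

line = to_raw_event(SAMPLE)
```

Keeping the Raw Vault record close to the source shape, plus a change-type marker, leaves the heavier modeling decisions to the daily Glue jobs.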
AWS Glue
- Runs daily ETL jobs to transform raw logs into structured data models using PySpark.
- Glue handles schema mapping, partitioning, and compaction for Iceberg tables in the Business Vault.
- Integrated with the Glue Data Catalog for metadata discovery and governed querying.
Amazon Athena
- Provides serverless SQL querying over Iceberg tables stored in Amazon S3.
- Used by both backend systems and business dashboards to access curated campaign metrics.
- Integrated with the Glue Data Catalog for table discovery and query optimization.
Amazon QuickSight
- Used to build interactive dashboards for business users.
- Embedded within the client's platform to surface daily-refreshed campaign insights and funnel performance.
- Supports multi-tenant access and role-based data views.
AWS Lambda
- Powers backend logic for data transformations, campaign state updates, and API triggers.
- Stateless execution ensures scalability and cost efficiency.
Amazon API Gateway + AWS AppSync
- Manages communication between the front-end configuration interface and backend services.
- Enables GraphQL queries to fetch campaign metadata and journey state.
Amazon CloudFront + AWS WAF
- CloudFront delivers secure access to the client's user interface.
- WAF provides web application firewall protection, blocking malicious traffic and supporting regulatory compliance.
Amazon OpenSearch
- Used for real-time indexing of customer interactions to support operational monitoring and quick lookups.
- Will be extended in future phases to support RAG (Retrieval-Augmented Generation) for GenAI fallback systems.
Amazon Bedrock (Planned)
- Will be used to integrate GenAI capabilities, including natural language-based journey configuration, automated analysts, and decision agents.
- Bedrock will power foundation model workflows without requiring custom model hosting.
Architecture Diagram
Fig - Architectural Diagram
How the solution was implemented
The architecture is split into two logical layers:
Application Layer (Left Side):
- The client's platform serves as the front-end for campaign configuration and user interaction.
- Clerk handles user authentication and authorization.
- API Gateway connects users to backend services, protected by AWS WAF.
- Backend logic and configurations are handled via Lambda Functions, Amplify, and AppSync, which interface with DynamoDB for storing campaign metadata and journey states.
- Changes in DynamoDB are tracked through DynamoDB Streams and indexed into Amazon OpenSearch for near real-time operational analytics.
Analytics Layer (Right Side):
- DynamoDB Streams feed data into Kinesis Data Streams, from which Kinesis Firehose delivers it into an S3 Raw Vault.
- AWS Glue ETL Jobs run daily to transform raw interaction logs into structured datasets in an Apache Iceberg-based Business Vault on S3.
- Amazon Athena queries the Business Vault, and results are visualized via Amazon QuickSight dashboards, embedded directly into the client's portal.
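For readers unfamiliar with Iceberg on Athena, the following sketch builds the kind of `CREATE TABLE` statement that could define a Business Vault table partitioned by tenant and day. The table name, column list, and S3 location are hypothetical, and the case study does not say whether tables were created through Athena DDL or through Glue jobs, so treat this as one possible approach rather than the project's actual definition.

```python
def iceberg_ddl(table, columns, partition_by, location):
    """Build an Athena CREATE TABLE statement for an Iceberg table."""
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    parts = ", ".join(partition_by)
    return (
        f"CREATE TABLE {table} (\n  {cols}\n)\n"
        f"PARTITIONED BY ({parts})\n"
        f"LOCATION '{location}'\n"
        "TBLPROPERTIES ('table_type' = 'ICEBERG')"
    )


# Hypothetical table, columns, and bucket used only for illustration.
ddl = iceberg_ddl(
    table="business_vault.message_events",
    columns=[
        ("enterprise_id", "string"),
        ("campaign_id", "string"),
        ("message_id", "string"),
        ("status", "string"),
        ("event_ts", "timestamp"),
    ],
    # day(...) is an Iceberg partition transform on the timestamp column.
    partition_by=["enterprise_id", "day(event_ts)"],
    location="s3://example-business-vault/message_events/",
)
print(ddl)
```

Partitioning on the tenant identifier and the event day matches the "Enterprise ID / date" strategy the case study mentions, so a dashboard query filtered to one enterprise and a date range only needs to read the matching partitions.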
Methodology used
Agile Delivery Model
- The project followed 2-week sprints with bi-weekly demos to stakeholders.
- Continuous feedback loops enabled iterative improvement of dashboard KPIs and backend performance.
DevOps Best Practices
- Infrastructure as Code (IaC) using AWS CloudFormation/Terraform.
- CI/CD pipelines for Lambda, Glue, and frontend deployments.
Migration approach
This was a new implementation, not a migration. However, the data modeling approach evolved from a traditional flat-file schema to a layered vault model (Raw Vault and Business Vault). Historical campaign logs from DynamoDB were backfilled into the Raw Vault to ensure historical completeness.
Integration with existing systems
Integrated seamlessly with:
- WhatsApp Cloud API for campaign interactions.
- Clerk for user authentication.
- Existing DynamoDB campaign configuration store.
- Embedded QuickSight dashboards in the client's internal portal, eliminating the need for separate BI tools or manual reporting.
- OpenSearch to allow operational users to query recent events without needing access to the analytics pipeline.
Security and compliance considerations
Authentication and Authorization
- Handled via Clerk and enforced using IAM roles and policies.
Perimeter Protection
- API Gateway is secured with AWS WAF.
Data Security
- S3 buckets use server-side encryption (SSE-KMS).
- Data-in-transit is secured with TLS/SSL.
Audit & Logging
- CloudWatch used for logging and Glue job monitoring.
- Athena and QuickSight logs used for tracking usage and cost.
Compliance
- Solution aligns with GDPR data minimization principles and includes access controls and auditability.
Deployment and testing strategy
Deployment
- Environments: Dev, QA, Production
- CI/CD pipelines (GitHub Actions) deployed Glue scripts and infrastructure, while Amplify deployed Lambda functions, GraphQL schema updates, and the front-end portal.
Testing
- Unit tests for Lambda and ETL logic
- Integration tests using synthetic data
- User acceptance testing with marketing and operations teams on QuickSight dashboards
Timeline and major milestones
Phase | Timeline | Key Milestones |
---|---|---|
Discovery & Planning | Week 1–2 | Business workshops, technical discovery |
Architecture Design | Week 3 | Finalization of Iceberg and ETL pipeline blueprint |
MVP Development | Week 4–6 | Raw Vault, basic ETL, basic dashboards |
UAT & Feedback Loop | Week 7–8 | Stakeholder demos, performance testing |
Full Rollout | Week 9+ (Ongoing) | Embedded dashboards, OpenSearch, IAM, cost tuning |
Technical Innovations and Industry Standards
The solution designed by Xenonstack for the client leverages multiple AWS best practices to deliver a scalable, cost-efficient, and secure analytics platform for WhatsApp engagement.
AWS Best Practices
The architecture follows AWS's serverless-first approach, using managed services like AWS Lambda, DynamoDB, Glue, and Athena to eliminate infrastructure management while ensuring high availability and fault tolerance. Storage is decoupled from compute using Amazon S3 and Iceberg tables, enabling scalable and cost-efficient data querying.
Innovative Implementation Highlights
A key innovation was the use of a dual-layer vault structure — a Raw Vault for ingesting unprocessed logs and a Business Vault powered by Apache Iceberg for analytics-ready data. This modular architecture allows schema evolution without impacting analytics consumers, improving maintainability and data agility.
The integration of OpenSearch for real-time indexing alongside a batch-processed Iceberg warehouse is another unique hybrid design that supports both operational and analytical workloads.
Well-Architected Framework Alignment
The solution aligns with the AWS Well-Architected Framework, particularly in the pillars of cost optimization, security, and operational excellence. IAM policies, WAF, and encryption practices ensure secure data handling, while the modular design supports scalability and cost control.
DevOps and CI/CD Practices
Infrastructure was provisioned using Infrastructure as Code (IaC) and deployed through automated CI/CD pipelines, promoting repeatability and reducing deployment risks. Agile delivery with sprint-based iterations ensured continuous improvement and stakeholder alignment.
Outcomes and Value Delivered
Business Outcomes and Success Metrics
The implementation of Xenonstack's AWS-powered analytics platform delivered substantial cost savings, operational efficiencies, and strategic advantages for the client and its enterprise customers.
Cost Savings
- By adopting a serverless architecture (using AWS Glue, Athena, S3, and Lambda), the customer avoided the need for always-on infrastructure like Redshift or EMR.
- Estimated infrastructure cost reduction: ~60% compared to real-time streaming or traditional data warehouse solutions.
- Athena and Glue usage is optimized through partitioning and batch scheduling, ensuring compute costs are incurred only during daily processing windows.
Revenue Increases / New Revenue Streams
- Enabled the client to offer campaign analytics as a value-added service to enterprise clients, unlocking a new monetizable feature tier.
- Clients using the dashboards have improved customer targeting, leading to increased campaign ROI and better upsell opportunities.
Time-to-Market Improvements
- Prior to this solution, it took 1–2 weeks to generate actionable campaign insights.
- With the automated pipeline and daily dashboard updates, feedback loops are now within 24 hours, enabling faster iteration and time-to-market for new campaigns.
Operational Efficiencies
- Eliminated 6–8 hours/week of manual report generation by automating ETL and dashboard updates.
- Reduced need for DevOps involvement due to serverless and auto-scaling infrastructure, improving engineering velocity and reducing overhead.
Competitive Advantages Gained
- Delivered self-service analytics for business users without dependency on technical teams.
- The use of Iceberg enabled schema evolution without downtime, giving the client agility in adapting to evolving customer data.
- The client now positions itself as a data-driven campaign partner, differentiating from competitors that offer only execution platforms.
ROI and Payback Period
- The solution achieved full ROI within the first 3 months, primarily through reduced operational costs and monetization of analytics features.
- Ongoing operational costs are minimal due to pay-per-use pricing on Glue, Athena, and S3, ensuring long-term sustainability.
Technical Benefits
The architecture designed by Xenonstack brought significant technical advancements to the client's analytics capabilities, with measurable improvements in performance, scalability, security, and developer productivity.
Performance Improvements (with Metrics)
- Query latency reduced by over 70%, with typical Athena queries on Iceberg tables completing in under 3 seconds, compared to several minutes with previous flat file scans.
- ETL processing time was optimized through partitioning and parallelism in AWS Glue, completing daily batch jobs in under 15 minutes for millions of records.
Scalability Enhancements
- Entire architecture leverages serverless AWS services (S3, Lambda, Glue, Athena, Kinesis), allowing it to auto-scale with traffic and data volume without manual intervention.
- Data ingestion pipeline handles high message volumes via Kinesis, with seamless scaling to accommodate campaign spikes.
Reliability and Availability Improvements
- The solution architecture is built on highly available AWS services, with no single point of failure.
- Daily ETL failures are handled with built-in retry mechanisms in Glue, while Athena and Amazon QuickSight keep queries and reports available even during processing windows.
Security Posture Strengthening
- Fine-grained access control enforced via IAM roles, AWS WAF, and Clerk for user authentication.
- Data encryption both in transit (TLS) and at rest (SSE-KMS) is implemented for all S3 and DynamoDB resources.
- Operational audit trails are available via CloudWatch for both infrastructure and data access.
Reduced Technical Debt
- Replacing manual pipelines and ad-hoc reports with automated, repeatable workflows has significantly reduced maintenance burden.
- Schema evolution via Apache Iceberg eliminated the need for manual code changes on schema updates, improving long-term flexibility and maintainability.
Improved Development Velocity
- New campaign metrics and dashboards can be rolled out within days instead of weeks, thanks to modular data vault architecture and agile CI/CD workflows.
- Business users access insights independently, reducing reporting dependency on engineers by over 90%.
Key Insights and Challenges Overcome
Significant Challenges Encountered During Implementation
Cost Confusion Around Athena Queries
- Early query testing revealed unexpectedly high costs due to unoptimized data access patterns in Athena. Querying unpartitioned datasets and wide scans of large S3 files drove up usage and cost quickly.
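A rough back-of-the-envelope model shows why unpartitioned scans become expensive. Athena bills by data scanned, so the sketch below compares a full scan with a query pruned to one tenant and one day. The price per terabyte, the per-query minimum, the dataset size, and the tenant and day counts are all assumptions for illustration, and the model assumes data is spread evenly across partitions, which real workloads rarely are.

```python
TB = 1024 ** 4
PRICE_PER_TB_USD = 5.0                # assumed list price; check your region
MIN_BYTES_PER_QUERY = 10 * 1024 ** 2  # assumed 10 MB minimum billed per query


def query_cost_usd(bytes_scanned):
    """Estimated cost of a single query for the given bytes scanned."""
    billed = max(bytes_scanned, MIN_BYTES_PER_QUERY)
    return billed / TB * PRICE_PER_TB_USD


def scanned_bytes(total_bytes, tenants, days, tenant_filter=False, day_filter=False):
    """Bytes read, assuming data is spread evenly over tenant/day partitions."""
    fraction = 1.0
    if tenant_filter:
        fraction /= tenants
    if day_filter:
        fraction /= days
    return total_bytes * fraction


# Hypothetical dataset: 1 TB spread over 50 enterprises and 365 days.
full_scan_cost = query_cost_usd(scanned_bytes(TB, 50, 365))
pruned_cost = query_cost_usd(
    scanned_bytes(TB, 50, 365, tenant_filter=True, day_filter=True)
)
print(f"full scan: ${full_scan_cost:.2f}, pruned: ${pruned_cost:.6f}")
```

Under these assumptions a single pruned query costs a small fraction of a cent, while every dashboard refresh that triggers a full scan pays the full-table price again.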
Schema Evolution Complexity
- The campaign and interaction logs were subject to frequent changes. New event types, metadata fields, and message formats caused traditional schema-bound systems to break or require manual refactoring.
Adoption of QuickSight by Non-Technical Users
- Business users initially found QuickSight dashboards overwhelming due to technical language, cluttered KPIs, and unclear visual priorities.
Kinesis Throughput Tuning
- During stress tests, Kinesis Firehose buffering configurations were insufficient for peak message bursts, leading to temporary ingestion lags and retries.
How These Challenges Were Addressed
Athena Cost Optimization
- Apache Iceberg was implemented with columnar storage and strategic partitioning (e.g., by Enterprise ID / date), which dramatically reduced the amount of data scanned per query. Glue jobs were also tuned to optimize file sizes and compaction routines.
Schema Evolution Handling
- Apache Iceberg's schema evolution capabilities were used to support backward- and forward-compatible schema changes without reprocessing historical data. This enabled dynamic tracking of new fields as campaigns evolved.
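The schema evolution approach can be sketched as two small steps: detect fields in incoming events that the table does not yet have, then emit an additive `ALTER TABLE ... ADD COLUMNS` statement, leaving historical rows untouched. The table name, field names, and the type-inference rules below are hypothetical simplifications; the actual Glue jobs are not described in enough detail to reproduce them.

```python
def infer_type(value):
    """Map a Python value to a SQL column type (simplified assumption)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    return "string"


def new_columns(known_columns, event):
    """Return (name, type) pairs for event fields missing from the table."""
    return [
        (name, infer_type(value))
        for name, value in sorted(event.items())
        if name not in known_columns
    ]


def add_columns_sql(table, columns):
    """Build an additive schema change; existing data files are not rewritten."""
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return f"ALTER TABLE {table} ADD COLUMNS ({cols})"


def project(event, columns):
    """Shape an event to the table schema; missing fields become None."""
    return {name: event.get(name) for name in columns}


known = {"message_id": "string", "status": "string"}
event = {"message_id": "m1", "status": "read", "button_id": "promo_10", "latency_ms": 840}

added = new_columns(known, event)
sql = add_columns_sql("business_vault.message_events", added)
old_row = project({"message_id": "m0", "status": "sent"}, ["message_id", "status", "button_id"])
```

Because the change is additive, older events simply read as null for the new columns, which is what lets dashboards keep working while campaigns introduce new fields.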
QuickSight Usability
- User feedback loops were introduced to simplify the UX. Dashboards were redesigned with cleaner visual layouts, contextual tooltips, and business-oriented KPIs to boost adoption.
Kinesis Buffering Fixes
- Buffer sizes and flush intervals for Kinesis Firehose were adjusted based on peak traffic simulations. Additional monitoring was implemented using CloudWatch to proactively detect lag. For unpredictable workloads, on-demand Kinesis data streams were used instead of provisioned streams.
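The interaction between buffer size and flush interval is easier to reason about with a small simulation. Firehose delivers a batch when either the size threshold or the time threshold is reached, whichever comes first. The sketch below models only that rule; the thresholds and record sizes are made-up values, and real delivery behavior involves additional factors (compression, retries, dynamic buffering) that are not modeled here.

```python
def simulate_flushes(records, max_bytes, max_seconds):
    """Simulate size-or-time buffering.

    records: list of (arrival_time_seconds, size_bytes), sorted by time.
    Returns a list of (flush_time, flushed_bytes, flushed_count).
    """
    flushes = []
    buf_bytes, buf_count, buf_start = 0, 0, None
    for ts, size in records:
        # Time-based flush: the interval expired before this record arrived.
        if buf_count and ts - buf_start >= max_seconds:
            flushes.append((buf_start + max_seconds, buf_bytes, buf_count))
            buf_bytes, buf_count, buf_start = 0, 0, None
        if buf_start is None:
            buf_start = ts
        buf_bytes += size
        buf_count += 1
        # Size-based flush: the buffer filled up before the interval expired.
        if buf_bytes >= max_bytes:
            flushes.append((ts, buf_bytes, buf_count))
            buf_bytes, buf_count, buf_start = 0, 0, None
    if buf_count:
        # Whatever remains is delivered when its interval runs out.
        flushes.append((buf_start + max_seconds, buf_bytes, buf_count))
    return flushes


# Hypothetical traffic: a short burst followed by a quiet period.
burst = simulate_flushes(
    [(0, 400), (1, 400), (2, 400), (100, 100)], max_bytes=1000, max_seconds=60
)
steady = simulate_flushes(
    [(0, 100), (30, 100), (70, 100)], max_bytes=1000, max_seconds=60
)
```

In the burst case the size threshold triggers delivery after two seconds, while the trailing record waits for the full interval. Replaying peak-traffic patterns through a model like this is one way to pick candidate buffer settings before load testing.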
Adjustments Made to the Original Plan
- Originally scoped for only outbound WhatsApp campaigns, the solution was extended to include two-way customer interaction analytics based on evolving stakeholder needs.
- Real-time querying was initially in scope, but due to cost concerns the team switched to a daily refresh of the Iceberg data lakehouse.
- Dashboard delivery timelines were adjusted by 1 week to incorporate user onboarding sessions and improve early adoption outcomes.
Best Practices Identified
- The team prioritized customer needs (daily insights, not real-time) over tech trends, avoiding costly or over-engineered real-time pipelines. This ensured high alignment between architecture and ROI from day one.
- Separating the data pipeline into Raw Vault and Business Vault layers allowed flexibility, easier debugging, and faster schema iterations without disrupting downstream consumers.
- Rapid campaign changes required dynamic data models. Choosing Apache Iceberg upfront enabled safe, non-breaking schema evolution, which was crucial for agility.
Practices That Contributed to Success
- Frequent demos and UAT sessions with marketing and operations teams helped prioritize dashboard KPIs and refine usability.
- Building on fully managed services like Glue, Athena, S3, and Kinesis reduced operational overhead and improved reliability without sacrificing scalability.
- All infrastructure and ETL jobs were deployed using CI/CD pipelines and Infrastructure as Code (IaC), ensuring repeatability, version control, and rapid rollback if needed.
Approaches That Could Benefit Other Implementations
Hybrid Analytics Model
- Combining OpenSearch for operational metrics and Iceberg for strategic insights offered the best of both worlds: real-time visibility and deep analytical power.
Self-Service Dashboards Embedded in Internal Portals
- Providing access to insights directly inside the client's platform eliminated the need for third-party BI tools and empowered business teams to explore data independently.
Scalable Design with Cost Controls
- Implementing query partitioning and optimizing Glue jobs allowed the solution to scale affordably, which is vital for high-volume marketing and customer interaction data.
Strategic Roadmap and Future Enhancements
In the upcoming phases, the client aims to evolve from a reactive analytics platform into an AI-augmented decisioning engine. The roadmap includes:
GenAI-Based Journey Configuration
- Marketing users will be empowered to configure WhatsApp journeys using natural language prompts, powered by Amazon Bedrock Foundation Models. This enables campaign creation and modification without technical dependencies.
RAG-Powered Fallback Handling
- Retrieval-Augmented Generation (RAG) workflows will be implemented to dynamically respond to customer queries when predefined journey flows fail. This enhances personalization and recovery, improving user experience and campaign retention.
AI-Driven Analysts & Recommendation Agents
- Automated agents will analyze campaign performance metrics (e.g., drop-offs, conversion rates) and proactively recommend "next best campaigns" to meet business KPIs like engagement or sales targets. These agents will act as intelligent co-pilots for marketing teams.
Additional AWS Services to Be Implemented
To support these enhancements, the following AWS services are planned:
Amazon Bedrock
- For integrating foundation models (Claude, Titan, or Llama) to enable natural language prompt engineering, campaign configuration, and conversational agents.
Amazon OpenSearch
- To serve as a real-time vector store and retrieval engine for the RAG-based fallback flow, ensuring fast access to relevant campaign FAQs, past conversations, or product data.
Future Optimization Plans
- Vector Index Optimization: Improve latency and relevance scoring in OpenSearch for faster, more contextual fallback responses.
- Dynamic Budget Allocation Models: Use AI agents to forecast optimal spend per campaign/channel based on historic ROI and sales targets.
- Athena Cost Governance: Continue partition tuning and implement Athena Workgroups with usage limits to control long-term query spend.
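As a sketch of the planned Athena cost governance, the helper below assembles a workgroup definition with a per-query scan limit. The dictionary keys follow the shape of the Athena `CreateWorkGroup` request as exposed by boto3's `create_work_group`; the workgroup name, result location, limit value, and the 10 MB lower bound enforced here are assumptions. The sketch only builds the request and does not call AWS.

```python
MIN_CUTOFF_BYTES = 10_000_000  # assumed minimum accepted per-query limit


def workgroup_request(name, output_location, max_bytes_per_query):
    """Build keyword arguments for an Athena workgroup with a scan limit."""
    if max_bytes_per_query < MIN_CUTOFF_BYTES:
        raise ValueError("per-query scan limit is below the assumed minimum")
    return {
        "Name": name,
        "Description": "Dashboard queries with a per-query scan limit",
        "Configuration": {
            "ResultConfiguration": {"OutputLocation": output_location},
            # Prevent clients from overriding the workgroup settings.
            "EnforceWorkGroupConfiguration": True,
            # Publish query metrics so spend can be monitored and alarmed on.
            "PublishCloudWatchMetricsEnabled": True,
            # Queries that scan more than this many bytes are cancelled.
            "BytesScannedCutoffPerQuery": max_bytes_per_query,
        },
    }


# Hypothetical workgroup for embedded dashboards, capped at about 1 GB per query.
request = workgroup_request(
    name="dashboards",
    output_location="s3://example-athena-results/dashboards/",
    max_bytes_per_query=1_000_000_000,
)
```

With boto3 available and credentials configured, the result could be passed as `boto3.client("athena").create_work_group(**request)`. Giving each tenant tier or consumer its own workgroup would also allow query spend to be tracked separately.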
Ongoing Partnership Activities
Xenonstack will continue to serve as a strategic AWS and AI/ML partner, assisting the client with:
- GenAI capability development and governance
- Prompt engineering and agent fine-tuning
- Continuous optimization of Iceberg-based analytics pipelines
- Regular business reviews and roadmap alignment with AWS