Introduction to Analytics Platform
An analytics platform is an integrated solution designed to meet the needs of users, usually large data-driven companies, that a relational database management system alone cannot satisfy when it comes to contextual analysis of all the stored data. It combines tools for building analytics systems with an execution engine, a DBMS to store and manage the data, data mining processes, and a mechanism for obtaining and preparing data that is not yet stored.
Challenges for Building eCommerce Analytics Platform
An e-commerce portal wants to build a near real-time analytics dashboard that shows the number of orders being shipped at any moment, so that it can improve the performance of its product supply.
Solution and Tools for Building the eCommerce Analytics Platform
Let’s take a quick look at the tools used to solve this problem -
- Apache Spark - A fast engine for large scale data processing.
- Scala language - A statically typed, compiled general-purpose programming language that runs on the JVM; Spark itself is written in Scala.
- Kafka - A distributed streaming platform built around a publish-subscribe messaging system.
- CloudxLab - It provides a real cloud-based environment for learning various tools.
Building Near Real Time Analytics Platform
The stages of the solution, i.e. the near real-time analytics platform, are as follows -
- When a user buys a product, the product id, together with the order status and time, is published to a Kafka topic.
- Spark Streaming code reads the data from the Kafka topic in windows of a few seconds and counts the orders in each status within every window.
- As soon as Spark Streaming has computed the total count for each distinct order status, the result is pushed to a new Kafka topic.
- A Node.js server consumes messages from that new Kafka topic as soon as they become available and emits each consumed message to the browser via Socket.io.
- When socket.io-client in the browser receives a new “message” event, it processes the data carried by the event.
- If the order status in the received data is “shipped”, the data point is appended to a HighCharts series and displayed in the browser.
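The counting and filtering logic in the stages above can be sketched in plain Scala. This is only an illustration under stated assumptions: it uses standard Scala collections in place of Spark Streaming and Kafka, and the `OrderEvent` shape, the field names, and the helper names `countByWindow` and `shippedSeries` are hypothetical, not taken from the original pipeline.

```scala
// Hypothetical event shape: the field names are assumptions for illustration.
final case class OrderEvent(productId: String, status: String, timestampMs: Long)

object OrderWindowCounts {
  // Assign each event to a fixed, non-overlapping window (keyed by window start time)
  // and count how many events of each order status fall into that window.
  // This mimics, on an in-memory Seq, what the streaming job does per window.
  def countByWindow(events: Seq[OrderEvent], windowMs: Long): Map[Long, Map[String, Int]] =
    events
      .groupBy(e => e.timestampMs - (e.timestampMs % windowMs))
      .map { case (windowStart, inWindow) =>
        windowStart -> inWindow.groupBy(_.status).map { case (status, es) => status -> es.size }
      }

  // Keep only the "shipped" count of each window, ordered by time.
  // These (windowStart, count) pairs are what the dashboard would plot as a series.
  def shippedSeries(counts: Map[Long, Map[String, Int]]): Seq[(Long, Int)] =
    counts.toSeq
      .collect { case (start, byStatus) if byStatus.contains("shipped") => (start, byStatus("shipped")) }
      .sortBy(_._1)
}
```

For example, calling `OrderWindowCounts.countByWindow(events, 5000L)` groups the events into five-second windows, and passing the result to `shippedSeries` yields the points to chart. In the real platform the grouping would be done by Spark Streaming on data read from Kafka, and the filtering on “shipped” would happen in the browser.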