Streaming Data Lake and Analytics with Apache NiFi - XenonStack

Data Lake Services with Apache NiFi

Apache NiFi provides an easy-to-use, powerful, and reliable system for processing and distributing data across several resources. It runs in both standalone mode and cluster mode.

Apache NiFi routes and processes data from any source to any destination, and it can transform the data along the way.

It is a UI-based platform in which we define the source to collect data from, the processors that convert the data, and the destination where the data is stored.

Big Data Analytics and Ingestion

Data Collection and Data Ingestion are the processes of fetching data from any data source which we can perform in two ways –

Enterprises today generate data from many different sources. To build a real-time data lake, we need to integrate these sources into one stream.

In this blog, we share how to ingest, store, and process Twitter data using Apache NiFi. In upcoming blogs, we will cover data collection and ingestion from the sources below:

  • Data Ingestion from Logs
  • Data Ingestion from IoT Devices
  • Data Collection and Ingestion from RDBMS (e.g., MySQL)
  • Data Collection and Ingestion from Zip Files
  • Data Collection and Ingestion from Text/CSV Files

Objectives for Enterprise Data Lake

  • A Central Repository for Big Data Management
  • Reduce costs by offloading analytical systems and archiving cold data
  • Testing Setup for experimenting with new technologies and data
  • Automation of Data pipelines
  • MetaData Management and Catalog
  • Tracking measurements with alerts on failure or violations
  • Data Governance with clear distinction of roles and responsibilities
  • Data Discovery, Prototyping, and experimentation

Goals of Data Ingestion

Apache NiFi Architecture

Each processor in NiFi has relationships such as success, retry, failure, and invalid data, which we use when connecting one processor to another. These connections make it possible to route data to another processor or storage location even when a processor fails to handle it.
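
The way relationships steer data can be sketched in plain Python. This is a conceptual illustration only and not NiFi's API: the `validate_tweet` function, the relationship names it returns, and the in-memory queues are hypothetical stand-ins for a processor and its outgoing connections.

```python
import json
from collections import defaultdict


def validate_tweet(raw: str) -> str:
    """Hypothetical processor: return the relationship to route this input to."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return "failure"      # input could not be parsed at all
    if not isinstance(record, dict) or "text" not in record:
        return "invalid"      # parsed, but a required field is missing
    return "success"


def run_flow(flowfiles):
    """Place each input on the queue connected to its relationship."""
    queues = defaultdict(list)    # relationship name -> downstream queue
    for raw in flowfiles:
        queues[validate_tweet(raw)].append(raw)
    return dict(queues)


routed = run_flow(['{"text": "hello"}', '{"id": 1}', 'not json'])
for relationship, items in routed.items():
    print(relationship, len(items))
```

Because every outcome has its own queue, nothing is dropped silently: failed or invalid inputs can be sent to a separate destination for inspection or retried later.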

Benefits of Apache NiFi

  • Real-time/Batch Streaming
  • Supports both standalone and cluster modes
  • Extremely Scalable, extensible platform
  • Visual Command and Control
  • Better Error handling

Features of Apache NiFi

  • Guaranteed Delivery – A core philosophy of NiFi is that guaranteed delivery is a must, even at very high scale. NiFi achieves this through efficient use of a purpose-built persistent write-ahead log and content repository.
  • Data Buffering with Back Pressure and Pressure Release – Apache NiFi buffers all queued data and can apply back pressure when a queue reaches a specified limit, or age off data once it reaches a specified age (when its value has perished).
  • Prioritized Queuing – Apache NiFi allows one or more prioritization schemes to be set for how data is retrieved from a queue. The default is oldest first, but a queue can be configured to pull newest first, largest first, or by some other custom scheme.
  • Flow-Specific QoS – At some points in a data flow the data is critical and loss-intolerant; at others it must be processed and delivered within seconds to be of any value. Apache NiFi enables fine-grained, flow-specific configuration of these concerns.
  • Data Provenance – Apache NiFi automatically records, indexes, and makes available provenance data as objects flow through the system, even across fan-in, fan-out, transformations, and more. This information is critical for compliance, troubleshooting, optimization, and other scenarios.
  • Recovery / Recording a Rolling Buffer of Fine-Grained History – Apache NiFi’s content repository is designed to act as a rolling buffer of history. Data is removed from the content repository only as it ages off or as space is needed.
  • Visual Command and Control – Apache NiFi enables data flows to be established visually and in real time, providing a UI-based approach to building different data flows.
  • Flow Templates – NiFi lets us create templates of frequently used data flows. Templates also help in migrating data flows from one machine to another.
  • Security – Apache NiFi supports multi-tenant authorization. The authority level of a given data flow applies to each component, giving the admin user fine-grained access control. This means each NiFi cluster can handle the requirements of one or more organizations.
  • Parallel Streams to Multiple Destinations – With Apache NiFi we can move data to multiple destinations at the same time. After processing the data stream, we can route the flow to various destinations using NiFi’s processors. This is helpful when we need to back up our data to multiple destinations.
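
To make the buffering, back pressure, and oldest-first queuing ideas concrete, here is a minimal sketch in plain Python. It is not NiFi code: the `Connection` class, its parameters, and the way back pressure is modeled (the queue simply refuses new items at its threshold) are simplifying assumptions made for illustration.

```python
import heapq
import itertools


class Connection:
    """Hypothetical bounded queue between two processors."""

    def __init__(self, back_pressure_threshold, max_age_seconds):
        self.threshold = back_pressure_threshold
        self.max_age = max_age_seconds
        self._heap = []                      # ordered by enqueue time (oldest first)
        self._counter = itertools.count()    # tie-breaker for equal timestamps

    def is_full(self):
        return len(self._heap) >= self.threshold

    def offer(self, flowfile, enqueued_at):
        """Return False (back pressure) when the queue is at its threshold."""
        if self.is_full():
            return False
        heapq.heappush(self._heap, (enqueued_at, next(self._counter), flowfile))
        return True

    def poll(self, now):
        """Return the oldest item that has not expired, discarding expired ones."""
        while self._heap:
            enqueued_at, _, flowfile = heapq.heappop(self._heap)
            if now - enqueued_at <= self.max_age:
                return flowfile
        return None


conn = Connection(back_pressure_threshold=2, max_age_seconds=60)
accepted = [conn.offer("a", 0), conn.offer("b", 50), conn.offer("c", 55)]
first = conn.poll(now=100)    # "a" is 100 s old and has expired; "b" is 50 s old
print(accepted, first)
```

Swapping the heap key (for example, using negative size for largest first) is how a different prioritization scheme would fit into the same structure.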

Apache NiFi Clustering

When we need to move a large amount of data, a single instance of Apache NiFi is not enough. To handle this, we can cluster the NiFi servers, which helps us scale.

We only need to create the data flow on one node; a copy of that data flow is then made on each node in the cluster.

Apache NiFi 1.0.0 introduced the Zero-Master Clustering paradigm. Earlier versions of Apache NiFi were based on a single “Master Node” (more formally known as the NiFi Cluster Manager).

In those versions, if the master node was lost, data continued to flow, but the application was unable to show the topology of the flow or any stats. With Zero-Master Clustering, we can make changes from any node of the cluster.

If the master node disconnects, an active node is automatically elected as the new master node.

Each node has the same data flow, so all nodes perform the same tasks, but each operates on a different set of data.

In an Apache NiFi cluster, one node is elected as the master (Cluster Coordinator), and the other nodes send heartbeat/status information to it. The master node is responsible for disconnecting nodes that do not send any heartbeat/status information.

The master node is elected via Apache ZooKeeper. When the master node gets disconnected, Apache ZooKeeper elects an active node as the new master node.
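
The heartbeat bookkeeping described above can be sketched as follows. This is a toy model and not NiFi or ZooKeeper code: the `ClusterCoordinator` class, the timeout value, and the `elect_coordinator` function (which just picks the lowest node name) are hypothetical stand-ins.

```python
class ClusterCoordinator:
    """Hypothetical coordinator that tracks node heartbeats."""

    def __init__(self, heartbeat_timeout):
        self.heartbeat_timeout = heartbeat_timeout
        self.last_seen = {}    # node id -> time of the last heartbeat

    def heartbeat(self, node_id, now):
        self.last_seen[node_id] = now

    def disconnect_stale(self, now):
        """Remove and return the nodes that have been silent for too long."""
        stale = sorted(node for node, seen in self.last_seen.items()
                       if now - seen > self.heartbeat_timeout)
        for node_id in stale:
            del self.last_seen[node_id]
        return stale


def elect_coordinator(active_nodes):
    """Stand-in for the ZooKeeper election: pick one active node deterministically."""
    return min(active_nodes) if active_nodes else None


coordinator = ClusterCoordinator(heartbeat_timeout=15)
for node in ("node-1", "node-2", "node-3"):
    coordinator.heartbeat(node, now=0)
coordinator.heartbeat("node-1", now=20)
coordinator.heartbeat("node-3", now=20)

dropped = coordinator.disconnect_stale(now=20)    # node-2 has been silent for 20 s
new_leader = elect_coordinator(coordinator.last_seen)
print(dropped, new_leader)
```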

Building Data Lake using Apache NiFi 

Fetching Tweets with NiFi’s Processor

NiFi’s ‘GetTwitter’ processor is used to fetch tweets. It uses the Twitter Streaming API to retrieve them. In this processor, we define the endpoint to use. We can also apply filters by location, hashtags, or particular IDs.

  • Twitter Endpoint – Sets the endpoint from which data is pulled. Available options –
    • Sample Endpoint – Fetches public tweets from all over the world.
    • Firehose Endpoint – The same as the streaming API, but it guarantees delivery of 100% of the tweets matching the filters.
    • Filter Endpoint – Used when we want to filter by hashtags or keywords.
  • Consumer Key – Consumer Key provided by Twitter.
  • Consumer Secret – Consumer Secret provided by Twitter.
  • Access Token – Access Token provided by Twitter.
  • Access Token Secret – Access Token Secret provided by Twitter.
  • Languages – Languages for which tweets should be fetched.
  • Terms to Filter – Hashtags for which tweets should be fetched.
  • IDs to follow – Twitter user IDs that should be followed.
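
As a rough illustration of how these properties fit together, the sketch below models them as a small Python data class with a sanity check. It is not part of NiFi: the `GetTwitterConfig` class, its field names, and the rule that the Filter Endpoint needs terms or IDs are assumptions made for this example, and the credential values are placeholders.

```python
from dataclasses import dataclass


@dataclass
class GetTwitterConfig:
    """Hypothetical holder for the processor properties listed above."""
    endpoint: str                  # "Sample Endpoint", "Firehose Endpoint" or "Filter Endpoint"
    consumer_key: str
    consumer_secret: str
    access_token: str
    access_token_secret: str
    languages: str = ""            # comma-separated, e.g. "en,fr"
    terms_to_filter: str = ""      # comma-separated hashtags or keywords
    ids_to_follow: str = ""        # comma-separated Twitter user IDs

    def problems(self):
        """Return a list of human-readable configuration problems."""
        found = []
        for name in ("consumer_key", "consumer_secret",
                     "access_token", "access_token_secret"):
            if not getattr(self, name):
                found.append(f"{name} is required")
        if self.endpoint == "Filter Endpoint" and not (
                self.terms_to_filter or self.ids_to_follow):
            found.append("Filter Endpoint needs terms or IDs to filter on")
        return found


config = GetTwitterConfig(
    endpoint="Filter Endpoint",
    consumer_key="key",                   # placeholder values, not real credentials
    consumer_secret="secret",
    access_token="token",
    access_token_secret="token-secret",
)
issues = config.problems()
print(issues)
```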

Fetching Tweets With Apache Nifi Processor

Now the GetTwitter processor is ready to transmit data (tweets). From here we can move the data stream anywhere, such as Amazon S3, Apache Kafka, ElasticSearch, Amazon Redshift, HDFS, Hive, or Cassandra. NiFi can move data to multiple destinations in parallel.

Data Lake using Apache NiFi and Apache Kafka

For this, we use the NiFi processor ‘PublishKafka_0_10’.

In the Scheduling tab, we can configure how many concurrent tasks to execute and set the processor’s schedule.

In the Properties tab, we can set the Kafka broker URLs, topic name, request size, etc. The processor writes data to the given topic. For best results, we can create the Kafka topic manually with a defined number of partitions.

Data in Apache Kafka can then be processed with Apache Beam, Apache Flink, or Apache Spark.

Data Integration Using Apache Nifi and Apache Kafka

Configuring Processor for Data Integration Using Apache Nifi and Apache Kafka

Integration Using Apache NiFi to Amazon RedShift using Amazon Kinesis

Data Integration Using Apache NiFi to Amazon RedShift with Amazon Kinesis Firehose Stream

Next we integrate Apache NiFi with Amazon Redshift. NiFi uses an Amazon Kinesis Firehose delivery stream to store data in Amazon Redshift.

A delivery stream can move data to Amazon Redshift, Amazon S3, or Amazon ElasticSearch Service. We specify the destination when creating the Amazon Kinesis Firehose delivery stream.

Since we want to move data to Amazon Redshift, we first configure the Amazon Kinesis Firehose delivery stream. When delivering data to Amazon Redshift, the data is first delivered to an Amazon S3 bucket, and then the Amazon Redshift COPY command is used to load it into the Amazon Redshift cluster.

We can also enable data transformation when creating the Kinesis Firehose delivery stream, and we can back up the data to an Amazon S3 bucket other than the intermediate bucket.

For this, we use the PutKinesisFirehose processor. It uses the Kinesis Firehose delivery stream to deliver data to Amazon Redshift. Here we configure the AWS credentials and the Kinesis Firehose delivery stream.

Configuring Processor For Data Integration Using Apache NiFi to Amazon RedShift with Amazon Kinesis Firehose Stream

Big Data Integration Using Apache NiFi to Amazon S3

Data Integration Using Apache Nifi to Amazon S3

PutKinesisFirehose sends data to Amazon Redshift and uses Amazon S3 as the intermediary. If we only want to use Amazon S3 as storage, NiFi can also send data to Amazon S3 alone.

For this, we use the NiFi processor PutS3Object. In it, we configure our AWS credentials, bucket name, path, etc.

Configuring Processor for Data Integration using Apache Nifi to Amazon S3

Partitioning in Amazon S3 Bucket

The most important aspect of storing data in S3 is partitioning. We can partition our data using NiFi Expression Language in the Object Key field. Here we have used day-wise partitioning.

Tweets are therefore stored in per-day folders. This partitioning approach is beneficial for Twitter analysis, for example when we want to analyze the tweets for a particular day or week.

With partitioning, we don’t need to scan all the tweets stored in S3; we just define our filters using the partitions.

Expression Used: ${now():format('yyyy/MMM/dd')}/${filename}

It will create a path in our S3 bucket like this: Year/Month/Date/filename.
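
To see what keys this expression produces, here is a small Python equivalent. It is only an illustration of the key layout, not something NiFi runs: the `partitioned_key` and `day_prefix` helpers and the sample filename are made up for this example, and `%b` is assumed to yield English month abbreviations (the default locale behaviour).

```python
from datetime import datetime


def partitioned_key(filename, when):
    """Build a Year/Month/Date/filename key like the expression above."""
    # %Y/%b/%d mirrors yyyy/MMM/dd: four-digit year, abbreviated month name, two-digit day.
    return f"{when.strftime('%Y/%b/%d')}/{filename}"


def day_prefix(when):
    """Key prefix that selects the objects stored for a single day."""
    return when.strftime("%Y/%b/%d") + "/"


key = partitioned_key("tweet-0001.json", datetime(2017, 3, 9))
print(key)                                                   # 2017/Mar/09/tweet-0001.json
print(key.startswith(day_prefix(datetime(2017, 3, 9))))      # True
```

Listing objects by such a prefix is what lets an analysis job read one day's tweets without scanning the whole bucket. One design consideration: abbreviated month names do not sort chronologically, so a numeric pattern such as yyyy/MM/dd may be preferable when sorted listings matter.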

Data Lake Services using Apache NiFi to Hive

For transferring data to Hive, NiFi has two processors: PutHiveStreaming, which expects the incoming FlowFile to be in Avro format, and PutHiveQL, which expects the incoming FlowFile to be the HiveQL command to execute.

We will use PutHiveStreaming to send data to Hive. Twitter output data is JSON, so we first convert it to Avro format and then send it to PutHiveStreaming.

In PutHiveStreaming, we configure the Hive Metastore URI, database name, and table name. The table we are using must already exist in Hive.

Data Integration Using Apache Nifi to Hive

Data Lake using Apache NiFi to ElasticSearch

Data Integration Using Apache Nifi to Elastic Search

Next we visualize the incoming data in Kibana. For that, we route the data to ElasticSearch.

Defining ElasticSearch http-basic

To route data to ElasticSearch, we use the NiFi processor PutElasticsearchHttp. It moves the data to the defined ElasticSearch index. Here we set the ElasticSearch URL, index, type, etc.

Configuring Processor For Data Integration Using Apache Nifi to Elastic Search

This processor writes the Twitter data to the ElasticSearch index. First, we need to create the index in ElasticSearch and manually define the mapping for some fields, such as ‘created_at’, because we need it to be of type ‘date’.
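
A manual mapping for ‘created_at’ could look roughly like the sketch below. The mapping body is an assumption: its exact shape depends on the ElasticSearch version, and the date pattern is our reading of the layout Twitter uses for ‘created_at’ (for example "Wed Aug 27 13:08:45 +0000 2008"). The Python parser is included only to check that layout against a sample value.

```python
import json
from datetime import datetime

# Assumed layout of Twitter's created_at field, e.g. "Wed Aug 27 13:08:45 +0000 2008".
TWITTER_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"

# Hypothetical mapping body; the pattern is the Java-style counterpart of the format above.
mapping = {
    "properties": {
        "created_at": {"type": "date", "format": "EEE MMM dd HH:mm:ss Z yyyy"}
    }
}


def parse_created_at(value):
    """Parse a created_at string into a timezone-aware datetime."""
    return datetime.strptime(value, TWITTER_DATE_FORMAT)


parsed = parse_created_at("Wed Aug 27 13:08:45 +0000 2008")
print(json.dumps(mapping, indent=2))
print(parsed.isoformat())
```

Without an explicit mapping, a string in this layout would most likely be indexed as text, which would prevent Kibana from using it as a time field.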

Big Data Visualization in Kibana for Data Lake Services

Setting Up Dashboard in Kibana

First, we need to add the created index to Kibana.

Data Visualization in Kibana

Kibana Dashboard

Integrating Apache Spark and NiFi for Data Lake Services

Integrating Apache Spark and Nifi for Data Lake

Apache Spark is widely used for large-scale data processing. Spark can process data in both batch mode and streaming mode.

Data transmission from Apache NiFi to Apache Spark uses site-to-site communication, and an output port is used to publish data from the source.

Integrating Apache Spark and Nifi For Data Lake

In the above data flow, we used the TailFile processor, configured to tail the ‘nifi-app.log’ file. It sends all the information to the output port ‘spark’. We can then use this output port when writing the Spark job.

In the same way, we can send our Twitter records to any output port, which can then be used for Spark Streaming.

Integrating Apache Flink With Apache NiFi for Data Lake

Integrating Apache Flink With Apache Nifi For Data Lake

Apache Flink is an open-source stream processing framework developed by the Apache Software Foundation. We can use it for stream processing, network/sensor monitoring, error detection, etc.

Data transmission from Apache NiFi to Apache Flink also uses site-to-site communication, and an output port is used to publish data from the source.

Integrating Apache Flink with Apache Nifi For Data Lake

  • NiFiSource to pull data from the NiFi output port
  • NiFiSink to push data to NiFi input port
  • NiFiDataPacket to represent data to/from NiFi

Performance & Scaling Results For Apache NiFi

We tested the data flows on a four-node Apache NiFi cluster, using the NiFi processor GenerateFlowFile for load testing. This processor creates FlowFiles with random data or custom content. We tested data transmission to Amazon S3 and Apache Kafka.

The results shown in the table are the data processed by NiFi per five minutes.

Scaling Results For Apache Nifi

Note: These tests were performed using Amazon EC2 instances (m4.large). For Kafka, we used a three-node Apache Kafka cluster.

How Can XenonStack Help You?

XenonStack provides data lake services for building real-time and streaming big data analytics applications for IoT and predictive analytics, using cloud data warehouses, for enterprises and startups.

<div class="wpforms-container wpforms-container-full optin-monster-forms" id="wpforms-7646"><form id="wpforms-form-7646" class="wpforms-validate wpforms-form" data-formid="7646" method="post" enctype="multipart/form-data" action="/blog/data-lake-services/"><noscript class="wpforms-error-noscript">Please enable JavaScript in your browser to complete this form.</noscript><div class="wpforms-page-indicator progress" data-indicator="progress" data-indicator-color="#72b239" data-scroll="1"><span class="wpforms-page-indicator-page-title" ></span><span class="wpforms-page-indicator-page-title-sep" style="display:none;"> - </span><span class="wpforms-page-indicator-steps">Step <span class="wpforms-page-indicator-steps-current">1</span> of 3</span><div class="wpforms-page-indicator-page-progress-wrap"><div class="wpforms-page-indicator-page-progress" style="width:33.333333333333%;background-color:#72b239;"></div></div></div><div class="wpforms-field-container"><div class="wpforms-page wpforms-page-1 "><div id="wpforms-7646-field_10-container" class="wpforms-field wpforms-field-pagebreak" data-field-id="10"></div><div id="wpforms-7646-field_24-container" class="wpforms-field wpforms-field-html form-popup-header-wrapper" data-field-id="24"><div id="wpforms-7646-field_24"><div class="form-popup-header"> <h2>Contact Us<span>!</span></h2> <p>Digitally Transform your Organization with <strong>XenonStack</strong><p> </div></div></div><div id="wpforms-7646-field_21-container" class="wpforms-field wpforms-field-radio custom-radio-btn-wrapper wpforms-list-2-columns wpforms-conditional-trigger" data-field-id="21"><label class="wpforms-field-label wpforms-label-hide" for="wpforms-7646-field_21">Sevices <span class="wpforms-required-label">*</span></label><ul id="wpforms-7646-field_21" class="wpforms-field-required wpforms-image-choices wpforms-image-choices-modern"><li class="choice-1 depth-1 wpforms-image-choices-item"><label class="wpforms-field-label-inline" 
for="wpforms-7646-field_21_1" tabindex="0"><span class="wpforms-image-choices-image"><img src="" alt="Application Modernisation " title="Application Modernisation "></span><input type="radio" id="wpforms-7646-field_21_1" class="wpforms-screen-reader-element" name="wpforms[fields][21]" value="Application Modernisation " tabindex="-1" required ><span class="wpforms-image-choices-label">Application Modernisation </span></label></li><li class="choice-2 depth-1 wpforms-image-choices-item"><label class="wpforms-field-label-inline" for="wpforms-7646-field_21_2" tabindex="0"><span class="wpforms-image-choices-image"><img src="" alt="Big Data Analytics" title="Big Data Analytics"></span><input type="radio" id="wpforms-7646-field_21_2" class="wpforms-screen-reader-element" name="wpforms[fields][21]" value="Big Data Analytics" tabindex="-1" required ><span class="wpforms-image-choices-label">Big Data Analytics</span></label></li><li class="choice-3 depth-1 wpforms-image-choices-item"><label class="wpforms-field-label-inline" for="wpforms-7646-field_21_3" tabindex="0"><span class="wpforms-image-choices-image"><img src="" alt="Cloud Migrations" title="Cloud Migrations"></span><input type="radio" id="wpforms-7646-field_21_3" class="wpforms-screen-reader-element" name="wpforms[fields][21]" value="Cloud Migrations" tabindex="-1" required ><span class="wpforms-image-choices-label">Cloud Migrations</span></label></li><li class="choice-4 depth-1 wpforms-image-choices-item"><label class="wpforms-field-label-inline" for="wpforms-7646-field_21_4" tabindex="0"><span class="wpforms-image-choices-image"><img src="" alt="Data Visualization" title="Data Visualization"></span><input type="radio" id="wpforms-7646-field_21_4" class="wpforms-screen-reader-element" name="wpforms[fields][21]" value="Data Visualization" tabindex="-1" required ><span class="wpforms-image-choices-label">Data Visualization</span></label></li><li class="choice-5 depth-1 wpforms-image-choices-item"><label 
class="wpforms-field-label-inline" for="wpforms-7646-field_21_5" tabindex="0"><span class="wpforms-image-choices-image"><img src="" alt="Robotic Process Automation" title="Robotic Process Automation"></span><input type="radio" id="wpforms-7646-field_21_5" class="wpforms-screen-reader-element" name="wpforms[fields][21]" value="Robotic Process Automation" tabindex="-1" required ><span class="wpforms-image-choices-label">Robotic Process Automation</span></label></li></ul></div><div id="wpforms-7646-field_23-container" class="wpforms-field wpforms-field-pagebreak" data-field-id="23"><div class="wpforms-clear wpforms-pagebreak-left"><button class="wpforms-page-button wpforms-page-next" data-action="next" data-page="1" data-formid="7646">Next</button></div></div></div><div class="wpforms-page wpforms-page-2 " style="display:none;"><div id="wpforms-7646-field_25-container" class="wpforms-field wpforms-field-html form-popup-header-wrapper" data-field-id="25"><div id="wpforms-7646-field_25"><div class="form-popup-header"> <h2>How can we get in Touch</h2> <p>Fill the form and we will revert back to you soon.<p> </div></div></div><div id="wpforms-7646-field_20-container" class="wpforms-field wpforms-field-name col-12 col-sm-10 col-md-8 form-group" data-field-id="20"><label class="wpforms-field-label" for="wpforms-7646-field_20">Name <span class="wpforms-required-label">*</span></label><input type="text" id="wpforms-7646-field_20" class="wpforms-field-large wpforms-field-required" name="wpforms[fields][20]" placeholder="Name" required></div><div id="wpforms-7646-field_2-container" class="wpforms-field wpforms-field-email col-12 col-sm-10 col-md-8 form-group" data-field-id="2"><label class="wpforms-field-label" for="wpforms-7646-field_2">Email <span class="wpforms-required-label">*</span></label><input type="email" id="wpforms-7646-field_2" class="wpforms-field-large wpforms-field-required" name="wpforms[fields][2]" placeholder="Email" required></div><div 
id="wpforms-7646-field_3-container" class="wpforms-field wpforms-field-text col-12 col-sm-10 col-md-8 form-group" data-field-id="3"><label class="wpforms-field-label" for="wpforms-7646-field_3">Organization <span class="wpforms-required-label">*</span></label><input type="text" id="wpforms-7646-field_3" class="wpforms-field-large wpforms-field-required" name="wpforms[fields][3]" placeholder="Organization" required></div><div id="wpforms-7646-field_12-container" class="wpforms-field wpforms-field-pagebreak next-btn-wrapper" data-field-id="12"><div class="wpforms-clear wpforms-pagebreak-left"><button class="wpforms-page-button wpforms-page-next" data-action="next" data-page="2" data-formid="7646">Next</button></div></div></div><div class="wpforms-page wpforms-page-3 last next-btn-wrapper" style="display:none;"><div id="wpforms-7646-field_34-container" class="wpforms-field wpforms-field-html form-popup-header-wrapper" data-field-id="34"><div id="wpforms-7646-field_34"><div class="form-popup-header"> <h2>Share Your Requirements</h2> </div></div></div><div id="wpforms-7646-field_7-container" class="wpforms-field wpforms-field-checkbox col-12 col-sm-12 col-md-12 custom-checkbox form-group wpforms-list-2-columns wpforms-conditional-field wpforms-conditional-show" data-field-id="7" style="display:none;"><label class="wpforms-field-label" for="wpforms-7646-field_7">Application Modernization Services <span class="wpforms-required-label">*</span></label><ul id="wpforms-7646-field_7" class="wpforms-field-required"><li class="choice-1 depth-1"><input type="checkbox" id="wpforms-7646-field_7_1" name="wpforms[fields][7][]" value="Application Re-platform " required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_7_1">Application Re-platform </label></li><li class="choice-2 depth-1"><input type="checkbox" id="wpforms-7646-field_7_2" name="wpforms[fields][7][]" value="Application Migration " required ><label class="wpforms-field-label-inline" 
for="wpforms-7646-field_7_2">Application Migration </label></li><li class="choice-3 depth-1"><input type="checkbox" id="wpforms-7646-field_7_3" name="wpforms[fields][7][]" value="Cloud Native Transformation" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_7_3">Cloud Native Transformation</label></li><li class="choice-4 depth-1"><input type="checkbox" id="wpforms-7646-field_7_4" name="wpforms[fields][7][]" value="Application Assessment" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_7_4">Application Assessment</label></li><li class="choice-5 depth-1"><input type="checkbox" id="wpforms-7646-field_7_5" name="wpforms[fields][7][]" value="Application Re-engineering" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_7_5">Application Re-engineering</label></li></ul></div><div id="wpforms-7646-field_28-container" class="wpforms-field wpforms-field-checkbox col-12 col-sm-12 col-md-12 custom-checkbox form-group wpforms-list-2-columns wpforms-conditional-field wpforms-conditional-show" data-field-id="28" style="display:none;"><label class="wpforms-field-label" for="wpforms-7646-field_28">Data Visualization Services <span class="wpforms-required-label">*</span></label><ul id="wpforms-7646-field_28" class="wpforms-field-required" data-choice-limit="1"><li class="choice-1 depth-1"><input type="checkbox" id="wpforms-7646-field_28_1" data-rule-check-limit="true" name="wpforms[fields][28][]" value="Data Visualization Cloud Services" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_28_1">Data Visualization Cloud Services</label></li><li class="choice-2 depth-1"><input type="checkbox" id="wpforms-7646-field_28_2" data-rule-check-limit="true" name="wpforms[fields][28][]" value="Dashboard and User Experience Design" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_28_2">Dashboard and User Experience Design</label></li><li class="choice-3 
depth-1"><input type="checkbox" id="wpforms-7646-field_28_3" data-rule-check-limit="true" name="wpforms[fields][28][]" value="Data Visualization Integration " required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_28_3">Data Visualization Integration </label></li><li class="choice-4 depth-1"><input type="checkbox" id="wpforms-7646-field_28_4" data-rule-check-limit="true" name="wpforms[fields][28][]" value="Analytics and Reporting Solutions " required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_28_4">Analytics and Reporting Solutions </label></li></ul></div><div id="wpforms-7646-field_35-container" class="wpforms-field wpforms-field-checkbox col-12 col-sm-12 col-md-12 custom-checkbox form-group wpforms-list-2-columns wpforms-conditional-field wpforms-conditional-show" data-field-id="35" style="display:none;"><label class="wpforms-field-label" for="wpforms-7646-field_35">Data Visualization Tools <span class="wpforms-required-label">*</span></label><ul id="wpforms-7646-field_35" class="wpforms-field-required" data-choice-limit="1"><li class="choice-1 depth-1"><input type="checkbox" id="wpforms-7646-field_35_1" data-rule-check-limit="true" name="wpforms[fields][35][]" value="Microsoft Power BI" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_35_1">Microsoft Power BI</label></li><li class="choice-2 depth-1"><input type="checkbox" id="wpforms-7646-field_35_2" data-rule-check-limit="true" name="wpforms[fields][35][]" value="Amazon QuickSight" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_35_2">Amazon QuickSight</label></li><li class="choice-3 depth-1"><input type="checkbox" id="wpforms-7646-field_35_3" data-rule-check-limit="true" name="wpforms[fields][35][]" value="Google Data Studio" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_35_3">Google Data Studio</label></li><li class="choice-4 depth-1"><input type="checkbox" 
id="wpforms-7646-field_35_4" data-rule-check-limit="true" name="wpforms[fields][35][]" value="Tableau" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_35_4">Tableau</label></li></ul></div><div id="wpforms-7646-field_29-container" class="wpforms-field wpforms-field-checkbox col-12 col-sm-12 col-md-12 custom-checkbox form-group wpforms-list-2-columns wpforms-conditional-field wpforms-conditional-show" data-field-id="29" style="display:none;"><label class="wpforms-field-label" for="wpforms-7646-field_29">Big Data Services <span class="wpforms-required-label">*</span></label><ul id="wpforms-7646-field_29" class="wpforms-field-required"><li class="choice-1 depth-1"><input type="checkbox" id="wpforms-7646-field_29_1" name="wpforms[fields][29][]" value="Modern Data Integration " required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_29_1">Modern Data Integration </label></li><li class="choice-2 depth-1"><input type="checkbox" id="wpforms-7646-field_29_2" name="wpforms[fields][29][]" value="Big Data Governance and Security" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_29_2">Big Data Governance and Security</label></li><li class="choice-3 depth-1"><input type="checkbox" id="wpforms-7646-field_29_3" name="wpforms[fields][29][]" value="Enterprise Data Strategy" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_29_3">Enterprise Data Strategy</label></li><li class="choice-4 depth-1"><input type="checkbox" id="wpforms-7646-field_29_4" name="wpforms[fields][29][]" value="Data Catalog" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_29_4">Data Catalog</label></li><li class="choice-5 depth-1"><input type="checkbox" id="wpforms-7646-field_29_5" name="wpforms[fields][29][]" value="Data Discovery " required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_29_5">Data Discovery </label></li></ul></div><div 
id="wpforms-7646-field_8-container" class="wpforms-field wpforms-field-checkbox col-12 col-sm-12 col-md-12 custom-checkbox form-group wpforms-list-2-columns wpforms-conditional-field wpforms-conditional-show" data-field-id="8" style="display:none;"><label class="wpforms-field-label" for="wpforms-7646-field_8">Data Ingestion Tools <span class="wpforms-required-label">*</span></label><ul id="wpforms-7646-field_8" class="wpforms-field-required"><li class="choice-1 depth-1"><input type="checkbox" id="wpforms-7646-field_8_1" name="wpforms[fields][8][]" value="Amazon Kinesis" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_8_1">Amazon Kinesis</label></li><li class="choice-2 depth-1"><input type="checkbox" id="wpforms-7646-field_8_2" name="wpforms[fields][8][]" value="Apache Kafka" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_8_2">Apache Kafka</label></li><li class="choice-3 depth-1"><input type="checkbox" id="wpforms-7646-field_8_3" name="wpforms[fields][8][]" value="Google Pub/Sub" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_8_3">Google Pub/Sub</label></li><li class="choice-4 depth-1"><input type="checkbox" id="wpforms-7646-field_8_4" name="wpforms[fields][8][]" value="Apache Pulsar" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_8_4">Apache Pulsar</label></li></ul></div><div id="wpforms-7646-field_30-container" class="wpforms-field wpforms-field-checkbox col-12 col-sm-12 col-md-12 custom-checkbox form-group wpforms-list-2-columns wpforms-conditional-field wpforms-conditional-show" data-field-id="30" style="display:none;"><label class="wpforms-field-label" for="wpforms-7646-field_30">Data Processing Tools <span class="wpforms-required-label">*</span></label><ul id="wpforms-7646-field_30" class="wpforms-field-required"><li class="choice-1 depth-1"><input type="checkbox" id="wpforms-7646-field_30_1" name="wpforms[fields][30][]" value="Apache Spark" 
required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_30_1">Apache Spark</label></li><li class="choice-2 depth-1"><input type="checkbox" id="wpforms-7646-field_30_2" name="wpforms[fields][30][]" value="Apache Flink" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_30_2">Apache Flink</label></li><li class="choice-3 depth-1"><input type="checkbox" id="wpforms-7646-field_30_3" name="wpforms[fields][30][]" value="Apache Beam" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_30_3">Apache Beam</label></li><li class="choice-4 depth-1"><input type="checkbox" id="wpforms-7646-field_30_4" name="wpforms[fields][30][]" value="Amazon EMR" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_30_4">Amazon EMR</label></li><li class="choice-6 depth-1"><input type="checkbox" id="wpforms-7646-field_30_6" name="wpforms[fields][30][]" value="Google Cloud Dataproc" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_30_6">Google Cloud Dataproc</label></li></ul></div><div id="wpforms-7646-field_31-container" class="wpforms-field wpforms-field-checkbox col-12 col-sm-12 col-md-12 custom-checkbox form-group wpforms-list-2-columns wpforms-conditional-field wpforms-conditional-show" data-field-id="31" style="display:none;"><label class="wpforms-field-label" for="wpforms-7646-field_31">Cloud Services <span class="wpforms-required-label">*</span></label><ul id="wpforms-7646-field_31" class="wpforms-field-required"><li class="choice-1 depth-1"><input type="checkbox" id="wpforms-7646-field_31_1" name="wpforms[fields][31][]" value="Cloud Governance and Security" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_31_1">Cloud Governance and Security</label></li><li class="choice-2 depth-1"><input type="checkbox" id="wpforms-7646-field_31_2" name="wpforms[fields][31][]" value="Cloud Native Microservices" required ><label class="wpforms-field-label-inline" 
for="wpforms-7646-field_31_2">Cloud Native Microservices</label></li><li class="choice-3 depth-1"><input type="checkbox" id="wpforms-7646-field_31_3" name="wpforms[fields][31][]" value="Cloud Infrastructure Automation" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_31_3">Cloud Infrastructure Automation</label></li><li class="choice-4 depth-1"><input type="checkbox" id="wpforms-7646-field_31_4" name="wpforms[fields][31][]" value="Managed Cloud Services" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_31_4">Managed Cloud Services</label></li><li class="choice-5 depth-1"><input type="checkbox" id="wpforms-7646-field_31_5" name="wpforms[fields][31][]" value="Cloud Data Migration" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_31_5">Cloud Data Migration</label></li></ul></div><div id="wpforms-7646-field_32-container" class="wpforms-field wpforms-field-checkbox col-12 col-sm-12 col-md-12 custom-checkbox form-group wpforms-list-2-columns wpforms-conditional-field wpforms-conditional-show" data-field-id="32" style="display:none;"><label class="wpforms-field-label" for="wpforms-7646-field_32">IT Infrastructure <span class="wpforms-required-label">*</span></label><ul id="wpforms-7646-field_32" class="wpforms-field-required"><li class="choice-1 depth-1"><input type="checkbox" id="wpforms-7646-field_32_1" name="wpforms[fields][32][]" value="AWS" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_32_1">AWS</label></li><li class="choice-2 depth-1"><input type="checkbox" id="wpforms-7646-field_32_2" name="wpforms[fields][32][]" value="Google" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_32_2">Google</label></li><li class="choice-3 depth-1"><input type="checkbox" id="wpforms-7646-field_32_3" name="wpforms[fields][32][]" value="Azure" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_32_3">Azure</label></li><li 
class="choice-4 depth-1"><input type="checkbox" id="wpforms-7646-field_32_4" name="wpforms[fields][32][]" value="Private Cloud" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_32_4">Private Cloud</label></li><li class="choice-6 depth-1"><input type="checkbox" id="wpforms-7646-field_32_6" name="wpforms[fields][32][]" value="Data Center" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_32_6">Data Center</label></li></ul></div><div id="wpforms-7646-field_33-container" class="wpforms-field wpforms-field-checkbox col-12 col-sm-12 col-md-12 custom-checkbox form-group wpforms-list-2-columns wpforms-conditional-field wpforms-conditional-show" data-field-id="33" style="display:none;"><label class="wpforms-field-label" for="wpforms-7646-field_33">AI Services <span class="wpforms-required-label">*</span></label><ul id="wpforms-7646-field_33" class="wpforms-field-required"><li class="choice-1 depth-1"><input type="checkbox" id="wpforms-7646-field_33_1" name="wpforms[fields][33][]" value="Computer Vision Services" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_33_1">Computer Vision Services</label></li><li class="choice-2 depth-1"><input type="checkbox" id="wpforms-7646-field_33_2" name="wpforms[fields][33][]" value="Robotic Process Automation" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_33_2">Robotic Process Automation</label></li><li class="choice-3 depth-1"><input type="checkbox" id="wpforms-7646-field_33_3" name="wpforms[fields][33][]" value="Enterprise Operational Analytics" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_33_3">Enterprise Operational Analytics</label></li><li class="choice-4 depth-1"><input type="checkbox" id="wpforms-7646-field_33_4" name="wpforms[fields][33][]" value="AI Based Development" required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_33_4">AI Based Development</label></li><li 
class="choice-5 depth-1"><input type="checkbox" id="wpforms-7646-field_33_5" name="wpforms[fields][33][]" value="AI Strategy Consulting " required ><label class="wpforms-field-label-inline" for="wpforms-7646-field_33_5">AI Strategy Consulting </label></li></ul></div><div id="wpforms-7646-field_36-container" class="wpforms-field wpforms-field-checkbox col-12 col-sm-12 col-md-12 custom-checkbox form-group wpforms-list-2-columns wpforms-conditional-field wpforms-conditional-show" data-field-id="36" style="display:none;"><label class="wpforms-field-label" for="wpforms-7646-field_36">Robotic Process Automation Platform</label><ul id="wpforms-7646-field_36"><li class="choice-2 depth-1"><input type="checkbox" id="wpforms-7646-field_36_2" name="wpforms[fields][36][]" value="Blue Prism " ><label class="wpforms-field-label-inline" for="wpforms-7646-field_36_2">Blue Prism </label></li><li class="choice-4 depth-1"><input type="checkbox" id="wpforms-7646-field_36_4" name="wpforms[fields][36][]" value="UiPath" ><label class="wpforms-field-label-inline" for="wpforms-7646-field_36_4">UiPath</label></li><li class="choice-5 depth-1"><input type="checkbox" id="wpforms-7646-field_36_5" name="wpforms[fields][36][]" value="Automation Anywhere" ><label class="wpforms-field-label-inline" for="wpforms-7646-field_36_5">Automation Anywhere</label></li></ul></div><div id="wpforms-7646-field_11-container" class="wpforms-field wpforms-field-pagebreak" data-field-id="11"><div class="wpforms-clear wpforms-pagebreak-left"></div></div></div></div><div class="wpforms-field wpforms-field-hp"><label for="wpforms-7646-field-hp" class="wpforms-field-label">Name</label><input type="text" name="wpforms[hp]" id="wpforms-7646-field-hp" class="wpforms-field-medium"></div><input type="hidden" name="wpforms[recaptcha]" value=""><div class="wpforms-submit-container" style="display:none;"><input type="hidden" name="wpforms[id]" value="7646"><input type="hidden" name="wpforms[author]" value="3"><input 
type="hidden" name="wpforms[post_id]" value="344"><button type="submit" name="wpforms[submit]" class="wpforms-submit om-trigger-conversion mon-btn" id="wpforms-submit-7646" value="wpforms-submit" aria-live="assertive" data-alt-text="Submitting..." data-submit-text="Submit">Submit</button></div></form></div> <!-- .wpforms-container -->