Data Mart can be called as a mini Data Warehouse design that shows the contents to be needed only to the client-side, i.e. it holds the overview of the data. The primary purpose of using Data Mart is to get the only information required that can be not very easy to get from the Warehouse. Mostly it exists for the business purposes of large scale and within a large scale businesses most of the time data retrieving occur as an issue, so to overcome that Data Mart is used.
Let's take an example of Banking System
Within a city, there are different branches for different areas or localities. Still, the central office exists at a particular location where the data of all the branches merges. So to fetch the particular information it will be the best option to get it from the nearby branch rather than the main one because the main office will require more time to provide one’s information. Still, it will be easy for the nearby branch to fulfil one’s query. So here, the main branch acts as a Data Warehouse and the local offices' act as a Data Mart.
What is a Data Mart ETL?
Here, ETL provides extraction, transformation and loading. In this, one can perform changes in the form of calculations, concatenations, etc. Data Mart ETL requires various inputs from stakeholders like testers, developers, etc. So to create a data warehouse needs data to be collected from multiple sources and then load them at a single position by merging all the required entities is what a database within the data warehouse and this all has to be done with the ETL process in Data Warehouse. To further continue with the banking example, by using the Data Transformation using ETL process, the employee can easily make changes and can extract, transform the data quickly of any of the customers. This helps one to maintain, update the record and can promptly respond to the daily queries.
ETL Process in Data Warehouse
Step 1: Data is being fetched from various sources to add it to the Data Warehouse. Step 2: In the second step, the data is being processed under ETL for transformation and cleansing. Step 3: And then finally add it to the Warehouse at the end of the ETL.
Reasons for Creating Data Mart
ETL process in Data Warehouse provides various reasons to use the best ETL process.
It is easy to create.
It provides the best overview with in-depth down knowledge to the contents been fetched.
It Reduces the cost than that of Warehouse implementation.
Users are provided with a clear view of the data rather than that of Data Warehouse.
Customarily accessing data becomes accessible.
Response time of the end-users also improves.
It focuses on a single functional or staging area.
It is flexible.
It Reduces data volume.
Data can be stored on different hardware/software platforms.
Types of Data Mart
The three main types of Data Marts are: 1. Dependent: This Data Mart ETL directly fetches the information from the central data. It follows a top-down approach. 2. Independent: it is created without the help of data Central Data Warehouse. These are created within the organizations for the small groups. It is neither connected with any of the venture Data Warehouse or any Data Mart. 3. Hybrid: With a top-down approach, it collects the data from various sources as well as from the Data Warehouse. It improves the speed and end-users focus on the benefits with such a procedure.
Steps to Implement Data Mart
Implementing the ETL process in Data Warehouse is the best but complicated procedure. It involves several steps to start with:
1. Design Phase
It is the first phase of implementation that firstly takes the initiative to gather the information regarding requests and then the logical and physical design is created. It involves various tasks:-Task 1: To identify the data, firstly gather the business and technical requirements. Task 2: Then select the best suitable data subset. Task 3: Finally, design the logical and physical structure to it. What Technologies are Needed? With the help of the tools, one can create the UML and ER diagrams that may append the meta-data into a logical and physical database.
2. Constructing Phase
In this phase, physical and logical structures are created. It involves the following task:Task 1: The only level of it to implement the physical database design that is created in the earlier phase like, the database schemas as tables, indexes, etc. are created. What Technologies are Needed? To construct the data mart, one will need the relational database management system. Also, RDBMS provides various features for the data mart access like Security, multiuser support, etc.
3. Populating Phase
Before entering this phase, one needs to clean and format the data that has been fetched from the various sources and finally then put the correct data into the particular data mart. ETL process in Data Warehouse includes various tasks rather than the steps:Task 1: To extract the source data. Task 2: To perform the various operations like cleaning and transforming the data. Task 3: To load the data into the data mart. Task 4: And at last, metadata is created and stored. What Technologies are Needed? It can be done by using Continuous ETL (Extract Transform Load) tools. It helps in performing all the tasks performed within the populating phase and then loads the data back into the data mart.
It is a phrase that can be used to put the data into use by responding to the queries of the end-users. In this end-users submit their questions and in return, the results displayed by fetching the results from the database. It includes various tasks rather than the steps:Task 1: It setups the meta-layer that helps in translating the database structures and the object names into the term of business. Task 2: It maintains the database structures. Task 3: Setups the API and also the interface if needed. What Technologies are Needed? The data can be accessed using a GUI or Command-line. By using GUI, the data can be shown in the form of graphs.
The final phase that covers various errands: Task 1: To add and manage the data into the data mart. Task 2: To optimise the system to attain enhanced performance. Task 3: To plan various scenarios and to ensure the availability during system failures. What Technologies are Needed? One can use GUI or Command line for the management of data mart.