If a large ETL job fails partway through, the data may be left partially written or not written at all, leaving the lake in an inconsistent state.
If the structure of the data coming into the lake changes, compatibility issues arise, since the lake does not enforce a schema.
A data lake provides no data versioning, so we cannot go back to a previous version if required.
A data lake does not provide metadata management functionality on its own.
How Delta Lake Improves on the Data Lake
It supports concurrent read/write operations and enables efficient CRUD operations and rollbacks.
It can evolve a table's schema as needed, or adapt to changes in the schema of incoming data.
Data stored in the lake is versioned, so we can go back to a previous version if required.
Data consistency is guaranteed, since concurrent jobs can safely read from and write to the same data.
It keeps track of the order of operations on the data, which helps build data lineage with external tools such as Atlas.
ACID transactions: Delta Lake guarantees the ACID properties, so a failed job never leaves behind partial or corrupted data.
Consistent data when mixing streaming and batch: if the business logic changes, we can still ensure the data remains consistent in both sinks.
Data versioning: Delta Lake versions the data on every write, so earlier versions of the same table remain available.
Metadata handling: with Delta, users can inspect the transaction log to see metadata about all the changes that were applied to the data.