Data Lake

Data lakes are centralized repositories for storing large amounts of raw data, including system data and data for reporting and advanced analytics. They may contain structured and semi-structured data as well as unstructured data such as images, audio and video.

 

Data lakes differ from data warehouses in how they organize what they store: a warehouse arranges its data in file-and-folder hierarchies, while a data lake’s flat architecture relies on metadata tags to help users search for and identify relevant information.
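
As a rough illustration of the flat, tag-based organization described above, the following minimal Python sketch keeps every object in a single namespace and locates items only by their metadata tags. The `FlatLake` and `LakeObject` names, the in-memory dictionary and the example tags are all hypothetical; a real data lake would sit on an object store or distributed file system rather than a Python dict.

```python
from dataclasses import dataclass, field


@dataclass
class LakeObject:
    """One raw item in the lake: an opaque payload plus descriptive tags."""
    key: str                      # unique identifier in the flat namespace
    payload: bytes                # raw data, kept in its native format
    tags: dict[str, str] = field(default_factory=dict)


class FlatLake:
    """Hypothetical in-memory stand-in for a flat, tag-indexed store."""

    def __init__(self) -> None:
        # A single flat mapping: no folders, no nesting.
        self._objects: dict[str, LakeObject] = {}

    def put(self, key: str, payload: bytes, **tags: str) -> None:
        """Store raw bytes under a key, together with its metadata tags."""
        self._objects[key] = LakeObject(key, payload, dict(tags))

    def find(self, **wanted: str) -> list[str]:
        """Return the sorted keys of objects matching every requested tag."""
        return sorted(
            key
            for key, obj in self._objects.items()
            if all(obj.tags.get(name) == value for name, value in wanted.items())
        )


lake = FlatLake()
lake.put("obj-001", b'{"id": 1}', source="crm", kind="json")
lake.put("obj-002", b"\x89PNG", source="scanner", kind="image")
lake.put("obj-003", b"id,name\n1,Ann", source="crm", kind="csv")

crm_keys = lake.find(source="crm")
image_keys = lake.find(kind="image")
```

Here `find(source="crm")` selects objects by what they are tagged with, not by where they sit in a directory tree, which is the point of the flat design.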

 

Many organizations operate both a data lake and a data warehouse, as the two serve different needs. This chart from AWS illustrates the ideal uses of the two storage strategies.

 

It’s critical to secure a data lake so it isn’t accessed without authorization or altered for malicious purposes.
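
The two risks named above, unauthorized access and malicious alteration, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the `READ_POLICY` table, the role names and the classification labels are hypothetical, and a checksum only reveals tampering if the recorded digest is itself kept somewhere an attacker cannot change.

```python
import hashlib
import hmac

# Hypothetical policy: roles allowed to read objects of each classification.
READ_POLICY = {
    "finance": {"analyst", "auditor"},
    "public": {"analyst", "auditor", "guest"},
}


def can_read(role: str, classification: str) -> bool:
    """Allow a read only if the role is listed for that classification."""
    return role in READ_POLICY.get(classification, set())


def checksum(payload: bytes) -> str:
    """Return the SHA-256 digest of the payload as a hex string."""
    return hashlib.sha256(payload).hexdigest()


def is_unaltered(payload: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded at ingestion."""
    return hmac.compare_digest(checksum(payload), recorded_digest)


original = b"id,amount\n1,100"
recorded = checksum(original)          # stored at ingestion time
tampered = b"id,amount\n1,900"

guest_can_read_finance = can_read("guest", "finance")
original_ok = is_unaltered(original, recorded)
tampered_ok = is_unaltered(tampered, recorded)
```

Unknown classifications fall back to an empty set of roles, so access is denied by default rather than granted.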

 
