
Azure Data Lake: Grow Fearless – Part 1

Rapid data growth – be aware and be prepared 

Continued growth is a central goal for most companies. When we talk about growth, we think of more employees, more branches, more cash flow, and so on. Thanks to today’s technology, we can record all of that information digitally and distill it into more valuable data. Data gives us better insight into the business, from demand and supply to business performance and customer relations. Nowadays data goes beyond record keeping: it also lets us find correlations and make forecasts, which supports both short-term planning and long-term strategy.

Rapid data growth shouldn’t be a problem

Besides all the opportunities and excitement that growth brings, managing fast-growing data becomes challenging. Where and how should we store data? Which part of the data is useful information, and which part is just eating storage resources? How can we extract value from data?

Awareness of data management has grown in recent years, as we can see from the rising number of data-related positions in the market: data analyst, data engineer, and data scientist. On the one hand, data professionals extract value from data as the business requires. On the other hand, every step of data transformation, transfer, processing, and analysis generates more data.

This rapid growth puts pressure on storage. If we think of storage as part of a business growth strategy, what would the ideal situation look like?

  • No bottleneck from storage. Business growth should never run into storage limits. Once the business starts to feel such a limit, it usually means that growth has to slow down while energy is diverted to solving the storage problem. Ideally, storage should be unlimited.
  • Centralized management. Data is hard to manage when it is stored in different software and in different ways. Instead of using separate tools for different data stores and ingestion methods, it is better to use a single storage solution that is compatible with various data types and ingestion methods.
  • Facilitate business processes. We prefer to put books on a shelf instead of throwing them into a big box, because a shelf lets us organize books into layers by category and makes them easy to find later. A storage solution should likewise let businesses organize data so that useful information is easy to retrieve.


Azure Data Lake – big data, easy thing 

Introduction to Azure Data Lake

“Azure Data Lake is a scalable data storage and analytics service”. It has two parts: Azure Data Lake Storage and Data Lake Analytics. 

Azure Data Lake Storage Gen2

The latest version of Azure Data Lake Storage is Azure Data Lake Storage Gen2 (ADLS Gen2), which builds on Azure Blob Storage and Azure Data Lake Storage Gen1. ADLS Gen2 offers virtually unlimited storage. It is more than a big hard drive that gives you a worry-free place to keep data: it also helps you store data strategically, so that you are fully prepared for data ingestion and consumption.
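To make "storing strategically" a bit more concrete, here is a minimal Python sketch of one possible folder convention for a data lake. The zone, source, and dataset names, the date, and the `dataset_path` helper are all hypothetical and only serve as an illustration; the `year=/month=/day=` layout is a common convention, not something ADLS Gen2 requires.

```python
from datetime import date
from pathlib import PurePosixPath


def dataset_path(zone: str, source: str, dataset: str, day: date) -> str:
    """Build a directory path of the form zone/source/dataset/year=YYYY/month=MM/day=DD.

    This layout is an assumed convention for illustration, not an ADLS requirement.
    """
    return str(
        PurePosixPath(
            zone,
            source,
            dataset,
            f"year={day:%Y}",
            f"month={day:%m}",
            f"day={day:%d}",
        )
    )


# Hypothetical example: raw order data from a webshop, loaded on 5 March 2021.
print(dataset_path("raw", "webshop", "orders", date(2021, 3, 5)))
# prints: raw/webshop/orders/year=2021/month=03/day=05
```

Like the shelf from the earlier analogy, a predictable path lets both people and tools find a specific slice of data without scanning everything.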

Apart from storage, Data Lake Analytics (DLA) is the other important part of Azure Data Lake. It is a Software-as-a-Service (SaaS) offering that provides an on-demand analytics job service. You can focus on processing the data itself without configuring a cluster up front. Development can be done not only in the Azure portal but also in the widely used Visual Studio. You query data with U-SQL, a newer language that is similar to T-SQL and adds the expressive power of C#.

ADLS and DLA work very well together, but that does not mean you have to use both. They are two separate components that can each be used on their own in combination with other software. For example, ADLS Gen2 also works very well with Azure Synapse Analytics and Azure Databricks.
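Spark-based tools such as Azure Synapse Analytics and Azure Databricks typically address data in ADLS Gen2 through an `abfss://` URI. The sketch below only builds such a URI as a string. The account name `examplelake`, the container `raw`, and the `abfss_uri` helper are hypothetical, and authentication is deliberately left out.

```python
def abfss_uri(account: str, container: str, path: str) -> str:
    """Return an ABFS (secure) URI for a path in an ADLS Gen2 container.

    Format: abfss://<container>@<account>.dfs.core.windows.net/<path>
    """
    # Strip leading slashes so the URI never contains a double slash after the host.
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Hypothetical storage account and container names, for illustration only.
print(abfss_uri("examplelake", "raw", "/webshop/orders"))
# prints: abfss://raw@examplelake.dfs.core.windows.net/webshop/orders
```

A URI like this is what you would usually hand to a Spark reader in one of those services, once access to the storage account has been configured.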

In my next blog post, I’ll explain why you should consider Azure Data Lake, focusing mainly on the benefits of Azure Data Lake Storage Gen2.

Do you want to learn more about Azure Data Lake? Click on the button below and watch my recorded webinar “Data Lake – Grow Fearless”. Please contact me if you have any questions.