In this exercise you will explore the concept of a lake database and you will learn how to use readily available database templates for lake databases.
The lake database in Azure Synapse Analytics lets you bring together database design, metadata about the stored data, and a description of how and where the data should be stored. Lake databases address a common challenge of today's data lakes, where it is hard to understand how data is structured.
The tasks you will perform in this exercise are:
- Exercise 4 - Lake Databases and Database templates
  - Task 1 - Create and configure a lake database
  - Task 2 - Create a lake database table from data lake storage
  - Task 3 - Create a custom lake database table and map data into it
  - Task 4 - Create a complex lake database using database templates
In this task you will create a new lake database.
- In Synapse Studio, navigate to the `Data` hub, select the `Workspace` section and then select `+` followed by `Lake database (preview)` to trigger the creation of a new lake database.
- Configure the properties of the lake database as follows:
  - Name: `Database1`
  - Input folder: `database1/`
  - Data format: `Parquet`

  Select `Publish` to publish the new lake database.
In this task you will create a new lake database table using files from the data lake storage account.
- In Synapse Studio, navigate to the `Data` hub and select the data lake account under `Linked`, `Azure Data Lake Storage Gen2`. Select the `database1` file system, and then select the `fact-sale` folder, followed by the `Day=20191201` folder. In this folder, locate the `sale-small-20191201-snappy.parquet` file.
- In Synapse Studio, navigate to the `Data` hub, and select the `Workspace` section followed by `Lake database`. In the context menu associated with the `Database1` database, select `Open` to edit the lake database.
- In the database editor, select `+ Table` followed by `From data lake`.
- Configure the properties of the new table as follows, then select `Continue`:
  - External table name: `FactSale`
  - Linked service: `asadatalake01`
  - Input file or folder: `database1/fact-sale`
- Select `Preview Data`.
- Observe the data preview, then select `Create` to finalize the process.
- In the table designer, select `Columns`, followed by `+ Column` and `Partition column`.
- Use `Day` as the name of the partition column and `integer` as data type. Select `Publish` to publish the new table.
- In Synapse Studio, navigate to the `Develop` hub and create a new SQL script. Make sure the `Built-in` serverless SQL pool is selected as well as the `Database1` database.

  Set the content of the script to the statement below and run the script.

      SELECT COUNT(*) FROM FactSale
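
The `Day=20191201` folder name uses a key=value naming pattern, which is what allows `Day` to be declared as an `integer` partition column in the table designer. The Python sketch below only illustrates that idea and is not how Synapse implements it: `partition_values` is a hypothetical helper, and the sample path is assembled from the folder and file names used in this task.

```python
from pathlib import PurePosixPath


def partition_values(path: str) -> dict[str, str]:
    """Collect key=value folder segments from a data lake path (hypothetical helper)."""
    values = {}
    # Skip the final segment, which is the file name rather than a folder.
    for segment in PurePosixPath(path).parts[:-1]:
        key, sep, value = segment.partition("=")
        if sep and key and value:
            values[key] = value
    return values


# Path assembled from the folder and file names used in this task.
sample = "database1/fact-sale/Day=20191201/sale-small-20191201-snappy.parquet"
parts = partition_values(sample)
day = int(parts["Day"])  # the partition column was declared as integer
print(day)
```

Because the value is taken from the folder name rather than from the file content, a filter on `Day` can in principle be decided per folder, before any Parquet file is opened.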
In this task you will manually create a new lake database table and map data into it from the data lake storage account.
- In Synapse Studio, navigate to the `Data` hub, and select the `Workspace` section followed by `Lake database`. In the context menu associated with the `Database1` database, select `Open` to edit the lake database.
- In the database editor, select `+ Table` followed by `Custom`. Set the name of the table to `Customer`.
- In the table editor, select the `Columns` tab, add the following standard columns and then select `Publish`:
  - `CustomerId` (`PK`, type `integer`)
  - `FirstName` (type string)
  - `LastName` (type string)

  IMPORTANT: The table must be published before advancing to the next step, otherwise the data flow debug session will not be able to start properly.
- In the table editor, select `Map data (Preview)` to start the Map Data tool. If this is the first time you are doing this, you might be prompted to turn on data flow debug. If this happens, leave the default selections and select `OK` to start the data flow debug session.
- In the `New data mapping` dialog, configure the following properties:
  - Source type: `Azure Data Lake Storage Gen2`
  - Source linked service: `asadatalake01`
  - Dataset type: `DelimitedText`
  - Folder path: `database1-staging1`
  - Sources: select the `customer.csv` file

  Select `Continue` to proceed.
- Configure the data mapping properties as follows:
  - Data mapping name: `Customer Mapping`
  - Target database: `Database1`

  Select `OK` to finalize the process.
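
The mapping configured in this task reads delimited text and lands it in the `Customer` table, where `CustomerId` is an integer. As a rough sketch of what such a mapping amounts to (not the Map Data tool's implementation), the Python below converts CSV rows into typed records. The header names and sample rows are assumptions made purely for illustration; the actual layout of `customer.csv` is not shown in this exercise.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Customer:
    # Mirrors the columns defined for the Customer table in this task.
    CustomerId: int
    FirstName: str
    LastName: str


def map_rows(text: str) -> list[Customer]:
    """Map delimited text with a header row onto Customer records."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        Customer(int(row["CustomerId"]), row["FirstName"], row["LastName"])
        for row in reader
    ]


# Made-up sample data; the real customer.csv may use different headers.
sample_csv = "CustomerId,FirstName,LastName\n1,Ada,Lovelace\n2,Alan,Turing\n"
customers = map_rows(sample_csv)
print(customers[0])
```

The explicit `int(...)` conversion stands in for the type change a mapping has to perform when a text source feeds an integer column; a row with a non-numeric identifier would raise a `ValueError` here instead of being loaded silently.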
In this task you will use a lake database template from the Synapse Knowledge Center to create a complex lake database.
- In Synapse Studio, navigate to the `Home` hub and then select `Knowledge center`.
- In the Knowledge center, select `Browse gallery`.
- In the Gallery, select the `Database templates` tab and then select the `Banking` category.
- Observe the set of tables and then select `Create database` to create a new lake database from the template.
- In Synapse Studio, open the newly created lake database in the editor and explore its content.