Tutorial: Join data from multiple tables
Applies to: ✅ Microsoft Fabric ✅ Azure Data Explorer ✅ Azure Monitor ✅ Microsoft Sentinel
Joining data from multiple tables allows for a more comprehensive analysis by combining information from different sources and creating new relationships between data points. In the Kusto Query Language (KQL), the join and lookup operators are used to combine data across tables.
In this tutorial, you'll learn how to:
The examples in this tutorial use the StormEvents
table, which is publicly available in the help cluster. To explore with your own data, create your own free cluster.
The examples in this tutorial use the StormEvents
table, which is publicly available in the Weather analytics sample data.
To run the following queries, you need a query environment with access to the sample data. You can use one of the following:
- A Microsoft account or Microsoft Entra user identity to sign in to the help cluster
- A Microsoft account or Microsoft Entra user identity
- A Fabric workspace with a Microsoft Fabric-enabled capacity
There are two tables in the Samples database related to storm events. One is called StormEvents
and the other is called PopulationData
. In this section, you'll join the tables to perform data analysis that wouldn't be possible with one table alone.
Use the take operator to see what data each table contains.
StormEvents
| take 5
The following table shows only 6 of the 22 returned columns.
StartTime | EndTime | EpisodeId | EventId | State | EventType | ... |
---|---|---|---|---|---|---|
2007-09-20T21:57:00Z | 2007-09-20T22:05:00Z | 11078 | 60913 | FLORIDA | Tornado | ... |
2007-12-20T07:50:00Z | 2007-12-20T07:53:00Z | 12554 | 68796 | MISSISSIPPI | Thunderstorm Wind | ... |
2007-12-30T16:00:00Z | 2007-12-30T16:05:00Z | 11749 | 64588 | GEORGIA | Thunderstorm Wind | ... |
2007-09-29T08:11:00Z | 2007-09-29T08:11:00Z | 11091 | 61032 | ATLANTIC SOUTH | Waterspout | ... |
2007-09-18T20:00:00Z | 2007-09-19T18:00:00Z | 11074 | 60904 | FLORIDA | Heavy Rain | ... |
PopulationData
| take 5
Output
State | Population |
---|---|
ALABAMA | 4918690 |
ALASKA | 727951 |
ARIZONA | 7399410 |
ARKANSAS | 3025880 |
CALIFORNIA | 39562900 |
Both tables contain a State
column. The StormEvents
table has many more columns, and the PopulationData
has only one other column that contains the population of the given state.
Join the PopulationData
table with StormEvents
on the common State
column to find the total property damage caused by storms per capita by state.
StormEvents
| summarize PropertyDamage = sum(DamageProperty) by State
| join kind=innerunique PopulationData on State
| project State, PropertyDamagePerCapita = PropertyDamage / Population
| sort by PropertyDamagePerCapita
Add | render columnchart
to the query to visualize the result.
Tip
There are many types of joins that you can perform with the join
operator. See a list of join flavors.
The lookup operator optimizes the performance of queries where a fact table is enriched with data from a dimension table. It extends the fact table with values that are looked up in a dimension table. For best performance, the system by default assumes that the left table is the larger fact table, and the right table is the smaller dimension table. This is exactly opposite to the assumption that's used by the join
operator.
In the help cluster, there's another database called ContosoSales
that contains sales data. The following query uses lookup
to merge the SalesFact
and Products
tables from this database to get the total sales by product category.
SalesFact
| lookup Products on ProductKey
| summarize TotalSales = count() by ProductCategoryName
| order by TotalSales desc
Output
ProductCategoryName | TotalSales |
---|---|
Games and Toys | 966782 |
TV and Video | 715024 |
Cameras and camcorders | 323003 |
Computers | 313487 |
Home Appliances | 237508 |
Audio | 192671 |
Cell phones | 50342 |
Music, Movies and Audio Books | 33376 |
Note
The lookup
operator only supports two join flavors: leftouter
and inner
.
Joins can also be done based on query results from the same table.
Say you want to create a list of states in which both lightning and avalanche events occurred. Use the join operator to merge the rows of two tables—one containing data on lightning events and the other containing data on avalanche events—based on the State
column.
StormEvents
| where EventType == "Lightning"
| distinct State
| join kind=inner (
StormEvents
| where EventType == "Avalanche"
| distinct State
)
on State
| project State
Output
State |
---|
OREGON |
UTAH |
WYOMING |
WASHINGTON |
COLORADO |
IDAHO |
NEVADA |
- Learn about different kinds of join-operator
- Learn how to perform cross-database and cross-cluster queries
- Follow the create geospatial visualizations tutorial