AzureDiagnostics 表的查询

microsoft.automation 查询

自动化作业中的错误

查找报告过去一天自动化作业中的错误的日志。

AzureDiagnostics 
| where ResourceProvider == "MICROSOFT.AUTOMATION"  
| where StreamType_s == "Error" 
| project TimeGenerated, Category, JobId_g, OperationName, RunbookName_s, ResultDescription

查找报告过去一天自动化作业中的错误的日志

列出自动化作业中的所有错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics 
| where ResourceProvider == "MICROSOFT.AUTOMATION" 
| where StreamType_s == "Error" 
| project TimeGenerated, Category, JobId_g, OperationName, RunbookName_s, ResultDescription, _ResourceId 

Azure 自动化失败、挂起或停止的作业

列出所有失败、挂起或停止的自动化作业。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics 
| where ResourceProvider == "MICROSOFT.AUTOMATION" and Category == "JobLogs" and (ResultType == "Failed" or ResultType == "Stopped" or ResultType == "Suspended") 
| project TimeGenerated , RunbookName_s , ResultType , _ResourceId , JobId_g

Runbook 已成功完成,但出现错误

列出已完成但出现错误的所有作业。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics 
| where ResourceProvider == "MICROSOFT.AUTOMATION" and Category == "JobStreams" and StreamType_s == "Error" 
| project TimeGenerated , RunbookName_s , StreamType_s , _ResourceId , ResultDescription , JobId_g 

查看历史作业状态

列出所有自动化作业。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.AUTOMATION" and Category == "JobLogs" and ResultType != "started"
| summarize AggregatedValue = count() by ResultType, bin(TimeGenerated, 1h) , RunbookName_s , JobId_g, _ResourceId

已完成Azure 自动化作业

列出已完成的所有自动化作业。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics 
| where ResourceProvider == "MICROSOFT.AUTOMATION" and Category == "JobLogs" and ResultType == "Completed" 
| project TimeGenerated , RunbookName_s , ResultType , _ResourceId , JobId_g 

microsoft.batch 查询

每个作业的成功任务数

提供每个作业的成功任务数。

AzureDiagnostics
| where OperationName=="TaskCompleteEvent"
| where executionInfo_exitCode_d==0 // Your application may use an exit code other than 0 to denote a successful operation
| summarize successfulTasks=count(id_s) by jobId=jobId_s

每个作业的失败任务数

按父作业Lists失败的任务。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where OperationName=="TaskFailEvent"
| summarize failedTaskList=make_list(id_s) by jobId=jobId_s, ResourceId

任务工期

提供任务的运行时间(以秒为单位),从任务开始到任务完成。

AzureDiagnostics
| where OperationName=="TaskCompleteEvent"
| extend taskId=id_s, ElapsedTime=datetime_diff('second', executionInfo_endTime_t, executionInfo_startTime_t) // For longer running tasks, consider changing 'second' to 'minute' or 'hour'
| summarize taskList=make_list(taskId) by ElapsedTime

调整池大小

按池和结果代码列出大小调整时间, (成功或失败) 。

AzureDiagnostics
| where OperationName=="PoolResizeCompleteEvent"
| summarize operationTimes=make_list(startTime_s) by poolName=id_s, resultCode=resultCode_s

池大小调整失败

按错误代码和时间列出池大小调整失败。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where OperationName=="PoolResizeCompleteEvent"
| where resultCode_s=="Failure" // Filter only on failed pool resizes
| summarize by poolName=id_s, resultCode=resultCode_s, resultMessage=resultMessage_s, operationTime=startTime_s, ResourceId

microsoft.cdn 查询

[Microsoft CDN (经典) ]每小时请求数

显示每小时请求总数的呈现折线图。

// Summarize number of requests per hour 
// Change bins resolution from 1hr to 5m to get real time results)
// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics 
| where OperationName == "Microsoft.Cdn/Profiles/AccessLog/Write" and Category == "AzureCdnAccessLog" 
| where isReceivedFromClient_b == "true"
| summarize RequestCount = count() by bin(TimeGenerated, 1h), Resource, _ResourceId
| render timechart

[Microsoft CDN (经典) ]按 URL 排序的流量

按 URL 显示 CDN 边缘的流出量。

// Change bins resolution from 1 hour to 5 minutes to get real time results)
// CDN edge response traffic by URL
AzureDiagnostics
| where OperationName == "Microsoft.Cdn/Profiles/AccessLog/Write" and Category == "AzureCdnAccessLog" 
| where isReceivedFromClient_b == true
| summarize ResponseBytes = sum(toint(responseBytes_s)) by requestUri_s

[Microsoft CDN (经典) ] 按 URL 列出的 4XX 错误率

按 URL 显示 4XX 错误率。

// Request errors rate by URL
// Count number of requests with error responses by URL. 
// Summarize number of requests by URL, and status codes are 4XX
// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where OperationName == "Microsoft.Cdn/Profiles/AccessLog/Write" and Category == "AzureCdnAccessLog" and isReceivedFromClient_b == true
| extend Is4XX = (toint(httpStatusCode_s ) >= 400 and toint(httpStatusCode_s ) < 500)
| summarize 4xxrate = (1.0 * countif(Is4XX)  / count()) * 100 by requestUri_s, bin(TimeGenerated, 1h), _ResourceId

[Microsoft CDN (经典) ]用户代理的请求错误

按用户代理计算具有错误响应的请求数。

// Summarize number of requests per user agent and status codes >= 400
// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where OperationName == "Microsoft.Cdn/Profiles/AccessLog/Write" and Category == "AzureCdnAccessLog" 
| where isReceivedFromClient_b == true
| where toint(httpStatusCode_s) >= 400
| summarize RequestCount = count() by UserAgent = userAgent_s, StatusCode = httpStatusCode_s , Resource, _ResourceId
| order by RequestCount desc

[Microsoft CDN (经典) ]前 10 个 URL 请求计数

按请求计数显示前 10 个 URL。

// top URLs by request count
// Render line chart showing total requests per hour . 
// Summarize number of requests per hour 
AzureDiagnostics
| where OperationName == "Microsoft.Cdn/Profiles/AccessLog/Write" and Category == "AzureCdnAccessLog" 
| where isReceivedFromClient_b == true
| summarize UserRequestCount = count() by requestUri_s
| order by UserRequestCount
| limit 10

[Microsoft CDN (经典) ]唯一 IP 请求计数

显示唯一 IP 请求计数。

AzureDiagnostics
| where OperationName == "Microsoft.Cdn/Profiles/AccessLog/Write"and Category == "AzureCdnAccessLog"
| where isReceivedFromClient_b == true
| summarize dcount(clientIp_s) by bin(TimeGenerated, 1h)
| render timechart 

[Microsoft CDN (经典) ]前 10 个客户端 IP 和 HTTP 版本

显示前 10 个客户端 IP 和 http 版本。

// Top 10 client IPs and http versions 
// Show top 10 client IPs and http versions. 
// Summarize top 10 client ips and http versions
AzureDiagnostics
| where OperationName == "Microsoft.Cdn/Profiles/AccessLog/Write" and Category == "AzureCdnAccessLog"
| where isReceivedFromClient_b == true
| summarize RequestCount = count() by ClientIP = clientIp_s, HttpVersion = httpVersion_s, Resource
| top 10 by RequestCount 
| order by RequestCount desc

[Azure Front Door 标准版/高级版]按 IP 和规则阻止的前 20 个客户端

按 IP 和规则名称显示阻止的前 20 个客户端。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.CDN" and Category == "FrontDoorWebApplicationFirewallLog"
| where action_s == "Block"
| summarize RequestCount = count() by ClientIP = clientIP_s, UserAgent = userAgent_s, RuleName = ruleName_s,Resource
| top 20 by RequestCount 
| order by RequestCount desc

[Azure Front Door 标准版/高级版]按路由对源的请求

对每分钟每个路由和源的请求数进行计数。 汇总每个路由和源每分钟的请求数。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.CDN" and Category == "FrontDoorAccessLog"
| summarize RequestCount = count() by bin(TimeGenerated, 1m), Resource, RouteName = routingRuleName_s, originName = originName_s, ResourceId

[Azure Front Door 标准版/高级版]用户代理的请求错误

按用户代理计算具有错误响应的请求数。 汇总每个用户代理的请求数,状态代码 >= 400。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.CDN" and Category == "FrontDoorAccessLog"
| where toint(httpStatusCode_s) >= 400
| summarize RequestCount = count() by UserAgent = userAgent_s, StatusCode = httpStatusCode_s , Resource, ResourceId
| order by RequestCount desc

[Azure Front Door 标准版/高级版]前 10 个客户端 IP 和 http 版本

按请求计数显示前 10 个客户端 IP 和 http 版本。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.CDN" and Category == "FrontDoorAccessLog"
| summarize RequestCount = count() by ClientIP = clientIp_s, HttpVersion = httpVersion_s, Resource
|top 10 by RequestCount 
| order by RequestCount desc

[Azure Front Door 标准版/高级版]按主机和路径排序的请求错误

按主机和路径统计具有错误响应的请求数。 按主机、路径和状态代码 >= 400 汇总请求数。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.CDN" and Category == "FrontDoorAccessLog"
| where toint(httpStatusCode_s) >= 400
| extend ParsedUrl = parseurl(requestUri_s)
| summarize RequestCount = count() by Host = tostring(ParsedUrl.Host), Path = tostring(ParsedUrl.Path), StatusCode = httpStatusCode_s, ResourceId
| order by RequestCount desc

[Azure Front Door 标准版/高级版]每小时防火墙阻止的请求计数

每小时计数防火墙阻止的请求数。 按策略汇总每小时防火墙阻止的请求数。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.CDN" and Category == "FrontDoorWebApplicationFirewallLog"
| where action_s == "Block"
| summarize RequestCount = count() by bin(TimeGenerated, 1h), Policy = policy_s, PolicyMode = policyMode_s, Resource, ResourceId
| order by RequestCount desc

[Azure Front Door 标准版/高级版]按主机、路径、规则和操作的防火墙请求计数

按主机、路径、规则和采取的操作对防火墙处理的请求进行计数。 按主机、路径、规则和操作汇总请求计数。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.CDN" and Category == "FrontDoorWebApplicationFirewallLog"
| extend ParsedUrl = parseurl(requestUri_s)
| summarize RequestCount = count() by Host = tostring(ParsedUrl.Host), Path = tostring(ParsedUrl.Path), RuleName = ruleName_s, Action = action_s, ResourceId
| order by RequestCount desc

[Azure Front Door 标准版/高级版]每小时请求数

呈现折线图,显示每个 FrontDoor 资源的每小时请求总数。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.CDN" and Category == "FrontDoorWebApplicationFirewallLog"
| summarize RequestCount = count() by bin(TimeGenerated, 1h), Resource, ResourceId
| render timechart 

[Azure Front Door 标准版/高级版]前 10 个 URL 请求计数

按请求计数显示前 10 个 URL。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.CDN" and Category == "FrontDoorAccessLog"
| summarize UserRequestCount = count() by requestUri_s
| order by UserRequestCount
| limit 10

[Azure Front Door 标准版/高级版]前 10 个 URL 请求计数

按 URL 显示 AFD 边缘的流出量。 将箱分辨率从 1 小时更改为 5 米,以获取实时结果。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.CDN" and Category == "FrontDoorAccessLog"
| summarize ResponseBytes = sum(toint(responseBytes_s)) by requestUri_s

[Azure Front Door 标准版/高级版]唯一 IP 请求计数

显示唯一的 IP 请求计数。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.CDN" and Category == "FrontDoorAccessLog"
| summarize dcount(clientIp_s) by bin(TimeGenerated, 1h)
| render timechart

microsoft.containerservice 查询

在 Azure 中查找Diagnostics

在 AzureDiagnostics 中查找以在 AzureDiagnostics 表中搜索特定值。/n请注意,此查询需要更新 <SeachValue> 参数才能生成结果

// This query requires a parameter to run. Enter value in SearchValue to find in table.
let SearchValue =  "<SearchValue>";//Please update term you would like to find in the table.
AzureDiagnostics
| where ResourceProvider == "Microsoft.ContainerService"
| where * contains tostring(SearchValue)
| take 1000

microsoft.dbformariadb 的查询

超过阈值的执行时间

确定其运行时间超过 10 秒的查询。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider =="MICROSOFT.DBFORMARIADB" 
| where Category == 'MySqlSlowLogs'
| project TimeGenerated, LogicalServerName_s, event_class_s, start_time_t , query_time_d, sql_text_s, ResourceId 
| where query_time_d > 10 // You may change the time threshold

显示最慢查询

确定前 5 个最慢的查询。

AzureDiagnostics
| where ResourceProvider =="MICROSOFT.DBFORMARIADB" 
| where Category == 'MySqlSlowLogs'
| project TimeGenerated, LogicalServerName_s, event_class_s, start_time_t , query_time_d, sql_text_s 
| top 5 by query_time_d desc

显示查询的统计信息

按查询构造汇总统计信息表。

AzureDiagnostics
| where ResourceProvider =="MICROSOFT.DBFORMARIADB" 
| where Category == 'MySqlSlowLogs'
| project TimeGenerated, LogicalServerName_s, event_class_s, start_time_t , query_time_d, sql_text_s 
| summarize count(), min(query_time_d), max(query_time_d), avg(query_time_d), stdev(query_time_d), percentile(query_time_d, 95) by LogicalServerName_s ,sql_text_s 
|  top 50  by percentile_query_time_d_95 desc

查看 GENERAL 类中的审核日志事件

确定服务器的常规类事件。

AzureDiagnostics
| where ResourceProvider =="MICROSOFT.DBFORMARIADB" 
| where Category == 'MySqlAuditLogs' and event_class_s == "general_log"
| project TimeGenerated, LogicalServerName_s, event_class_s, event_subclass_s, event_time_t, user_s , ip_s , sql_text_s 
| order by TimeGenerated asc

查看 CONNECTION 类中的审核日志事件

确定服务器的连接相关事件。

AzureDiagnostics
| where ResourceProvider =="MICROSOFT.DBFORMARIADB" 
| where Category == 'MySqlAuditLogs' and event_class_s == "connection_log"
| project TimeGenerated, LogicalServerName_s, event_class_s, event_subclass_s, event_time_t, user_s , ip_s , sql_text_s 
| order by TimeGenerated asc 

microsoft.dbformysql 查询

超过阈值的执行时间

确定其运行时间超过 10 秒的查询。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DBFORMYSQL"
| where Category == 'MySqlSlowLogs'
| project TimeGenerated, LogicalServerName_s, event_class_s, start_time_t , query_time_d, sql_text_s, ResourceId 
| where query_time_d > 10 //You may change the time threshold 

显示最慢查询

确定前 5 个最慢的查询。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DBFORMYSQL" 
| where Category == 'MySqlSlowLogs'
| project TimeGenerated, LogicalServerName_s, event_class_s, start_time_t , query_time_d, sql_text_s 
| top 5 by query_time_d desc

显示查询的统计信息

按查询构造汇总统计信息表。

AzureDiagnostics
| where ResourceProvider ==  "MICROSOFT.DBFORMYSQL"
| where Category == 'MySqlSlowLogs'
| project TimeGenerated, LogicalServerName_s, event_class_s, start_time_t , query_time_d, sql_text_s 
| summarize count(), min(query_time_d), max(query_time_d), avg(query_time_d), stdev(query_time_d), percentile(query_time_d, 95) by LogicalServerName_s ,sql_text_s 
|  top 50  by percentile_query_time_d_95 desc

查看 GENERAL 类中的审核日志事件

确定服务器的常规类事件。

AzureDiagnostics
| where ResourceProvider =="MICROSOFT.DBFORMYSQL"
| where Category == 'MySqlAuditLogs' and event_class_s == "general_log"
| project TimeGenerated, LogicalServerName_s, event_class_s, event_subclass_s, event_time_t, user_s , ip_s , sql_text_s 
| order by TimeGenerated asc

查看 CONNECTION 类中的审核日志事件

确定服务器的连接相关事件。

AzureDiagnostics
| where ResourceProvider =="MICROSOFT.DBFORMYSQL"
| where Category == 'MySqlAuditLogs' and event_class_s == "connection_log"
| project TimeGenerated, LogicalServerName_s, event_class_s, event_subclass_s, event_time_t, user_s , ip_s , sql_text_s 
| order by TimeGenerated asc 

microsoft.dbforpostgresql 的查询

Autovacuum 事件

搜索过去 24 小时内的 autovacuum 事件。 它需要启用参数“log_autovacuum_min_duration”。

AzureDiagnostics
| where TimeGenerated > ago(1d) 
| where ResourceProvider =="MICROSOFT.DBFORPOSTGRESQL" 
| where Category == "PostgreSQLLogs"
| where Message contains "automatic vacuum"

服务器重启

搜索服务器关闭和服务器就绪事件。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where TimeGenerated > ago(7d)
| where ResourceProvider =="MICROSOFT.DBFORPOSTGRESQL" 
| where Category == "PostgreSQLLogs"
| where Message contains "database system was shut down at" or Message contains "database system is ready to accept" 

查找错误

搜索过去 6 小时内的错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where TimeGenerated > ago(6h)
| where Category == "PostgreSQLLogs"
| where  errorLevel_s contains "error" 

未经授权的连接

搜索未经授权的 (拒绝) 连接尝试。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider =="MICROSOFT.DBFORPOSTGRESQL" 
| where Category == "PostgreSQLLogs"
| where Message contains "password authentication failed" or Message contains "no pg_hba.conf entry for host"

死锁数

搜索死锁事件。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider =="MICROSOFT.DBFORPOSTGRESQL" 
| where Category == "PostgreSQLLogs"
| where Message contains "deadlock detected"

锁争用

搜索锁争用。 它需要 log_lock_waits=ON,具体取决于deadlock_timeout设置。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider =="MICROSOFT.DBFORPOSTGRESQL" 
| where Message contains "still waiting for ShareLock on transaction" 

审核日志

获取所有审核日志。 它要求启用审核日志 [https://docs.microsoft.com/azure/postgresql/concepts-audit]。

AzureDiagnostics
| where ResourceProvider =="MICROSOFT.DBFORPOSTGRESQL" 
| where Category == "PostgreSQLLogs"
| where Message contains "AUDIT:"

表 () 和事件类型的审核日志 ()

搜索特定表和事件类型 DDL 的审核日志。 其他事件类型包括 READ、WRITE、FUNCTION、MISC。 它要求启用审核日志。 [https://docs.microsoft.com/azure/postgresql/concepts-audit].

AzureDiagnostics
| where ResourceProvider =="MICROSOFT.DBFORPOSTGRESQL" 
| where Category == "PostgreSQLLogs"
| where Message contains "AUDIT:" 
| where Message contains "table name" and Message contains "DDL"

执行时间超过阈值的查询

识别耗时超过 10 秒的查询。 查询存储将实际查询规范化,以聚合类似的查询。 默认情况下,每 15 分钟聚合一次条目。 查询使用平均执行时间每 15 分钟一次,其他查询统计信息(如 max、min)可根据需要使用。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| where Category == "QueryStoreRuntimeStatistics"
| where user_id_s != "10" //exclude azure system user
| project TimeGenerated, LogicalServerName_s, event_type_s , mean_time_s , db_id_s , start_time_s , query_id_s, _ResourceId
| where todouble(mean_time_s) > 0 // You may change the time threshold

最慢的查询

确定前 5 个最慢的查询。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| where Category == "QueryStoreRuntimeStatistics"
| where user_id_s != "10" //exclude azure system user
| summarize avg(todouble(mean_time_s)) by event_class_s , db_id_s ,query_id_s
| top 5 by avg_mean_time_s desc

查询统计信息

按查询构造汇总统计信息表。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| where Category == "QueryStoreRuntimeStatistics"
| where user_id_s != "10" //exclude azure system user
| summarize sum(toint(calls_s)), min(todouble(min_time_s)),max(todouble(max_time_s)),avg(todouble(mean_time_s)),percentile(todouble(mean_time_s),95) by  db_id_s ,query_id_s
| order by percentile_mean_time_s_95 desc nulls last 

按 15 分钟间隔聚合的查询的执行趋势。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| where Category == "QueryStoreRuntimeStatistics"
| where user_id_s != "10" //exclude azure system user
| summarize sum(toint(calls_s)) by  tostring(query_id_s), bin(TimeGenerated, 15m), ResourceId
| render timechart 

排名靠前的等待事件

按查询确定前 5 个等待事件。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| where Category == "QueryStoreWaitStatistics"
| where user_id_s != "10" //exclude azure system user
| where query_id_s != 0
| summarize sum(toint(calls_s)) by event_s, query_id_s, bin(TimeGenerated, 15m)
| top 5 by sum_calls_s desc nulls last

显示一段时间内的等待事件趋势。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DBFORPOSTGRESQL"
| where Category == "QueryStoreWaitStatistics"
| where user_id_s != "10" //exclude azure system user
| extend query_id_s = tostring(query_id_s)
| summarize sum(toint(calls_s)) by event_s, query_id_s, bin(TimeGenerated, 15m), ResourceId // You may change the time threshold 
| render timechart

microsoft.devices 查询

Connectvity 错误

识别设备连接错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DEVICES" and ResourceType == "IOTHUBS"
| where Category == "Connections" and Level == "Error"

限制错误最多的设备

识别发出最多请求导致限制错误的设备。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DEVICES" and ResourceType == "IOTHUBS"
| where ResultType == "429001"
| extend DeviceId = tostring(parse_json(properties_s).deviceId)
| summarize count() by DeviceId, Category , _ResourceId
| order by count_ desc

死终结点

通过报告问题的次数以及原因确定死终结点或不正常终结点。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DEVICES" and ResourceType == "IOTHUBS"
| where Category == "Routes" and OperationName in ("endpointDead", "endpointUnhealthy")
| extend parsed_json = parse_json(properties_s)
| extend Endpoint   = tostring(parsed_json.endpointName), Reason =tostring(parsed_json.details) 
| summarize count() by Endpoint, OperationName, Reason, _ResourceId
| order by count_ desc

错误摘要

所有操作中的错误计数(按类型)。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DEVICES" and ResourceType == "IOTHUBS"
| where Level == "Error"
| summarize count() by ResultType, ResultDescription, Category, _ResourceId

最近连接的设备

IoT 中心在指定时间段内看到连接的设备列表。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DEVICES" and ResourceType == "IOTHUBS"
| where Category == "Connections" and OperationName == "deviceConnect"
| extend DeviceId = tostring(parse_json(properties_s).deviceId)
| summarize max(TimeGenerated) by DeviceId, _ResourceId

设备的 SDK 版本

设备及其 SDK 版本的列表。

// this query works on device connection or when your device uses device to cloud twin operations
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DEVICES" and ResourceType == "IOTHUBS"
| where Category == "Connections" or  Category == "D2CTwinOperations"
| extend parsed_json = parse_json(properties_s) 
| extend SDKVersion = tostring(parsed_json.sdkVersion) , DeviceId = tostring(parsed_json.deviceId)
| distinct DeviceId, SDKVersion, TimeGenerated, _ResourceId

microsoft.documentdb 查询

过去 24 小时内消耗的 RU/秒

识别 Cosmos 数据库和集合上使用的 RU/秒。

// To create an alert for this query, click '+ New alert rule'
//You can compare the RU/s consumption with your provisioned RU/s to determine if you should scale up or down RU/s based on your workload.
AzureDiagnostics
| where TimeGenerated >= ago(24hr)
| where Category == "DataPlaneRequests"
//| where collectionName_s == "CollectionToAnalyze" //Replace to target the query to a collection
| summarize ConsumedRUsPerMinute = sum(todouble(requestCharge_s)) by collectionName_s, _ResourceId, bin(TimeGenerated, 1m)
| project TimeGenerated , ConsumedRUsPerMinute , collectionName_s, _ResourceId
| render timechart

过去 24 小时内限制 (429) 的集合

确定接收了 429 个 (限制) 的集合和操作,这些限制在消耗的吞吐量 (RU/秒) 超过预配吞吐量时发生。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where TimeGenerated >= ago(24hr)
| where Category == "DataPlaneRequests"
| where statusCode_s == 429 
| summarize numberOfThrottles = count() by databaseName_s, collectionName_s, requestResourceType_s, _ResourceId, bin(TimeGenerated, 1hr)
| order by numberOfThrottles

过去 24 小时内消耗的请求单位数 (RU) 排名靠前的操作

按每个操作的计数和消耗的 RU 确定 Cosmos 资源上排名靠前的操作。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where TimeGenerated >= ago(24h)
| where Category == "DataPlaneRequests"
| summarize numberOfOperations = count(), totalConsumedRU = sum(todouble(requestCharge_s)) by databaseName_s, collectionName_s, OperationName, requestResourceType_s, requestResourceId_s, _ResourceId
| extend averageRUPerOperation = totalConsumedRU / numberOfOperations 
| order by numberOfOperations

按存储划分的顶级逻辑分区键

确定最大的逻辑分区键值。 PartitionKeyStatistics 将按存储发出顶级逻辑分区键的数据。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where Category == "PartitionKeyStatistics"
//| where collectionName_s == "CollectionToAnalyze" //Replace to target the query to a collection
| summarize arg_max(TimeGenerated, *) by databaseName_s, collectionName_s, partitionKey_s, _ResourceId //Get the latest storage size
| extend utilizationOf20GBLogicalPartition = sizeKb_d / 20000000 //20GB
| project TimeGenerated, databaseName_s , collectionName_s , partitionKey_s, sizeKb_d, utilizationOf20GBLogicalPartition, _ResourceId

microsoft.eventhub 查询

[经典]捕获失败的持续时间

总结捕获时失败的重复。

AzureDiagnostics
| where ResourceProvider == \"MICROSOFT.EVENTHUB\"
| where Category == \"ArchiveLogs\"
| summarize count() by \"failures\", \"durationInSeconds\", _ResourceId

[经典]客户端的加入请求

汇总了客户端的加入请求的状态。

AzureDiagnostics // Need to turn on the Capture for this 
| where ResourceProvider == \"MICROSOFT.EVENTHUB\"
|  project \"OperationName\"

[经典]访问 keyvault - 找不到密钥

总结未找到密钥时对 keyvault 的访问权限。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == \"MICROSOFT.EVENTHUB\"
| where Category == \"Error\" and OperationName == \"wrapkey\"
| project Message, _ResourceId

[经典]使用 keyvault 执行的操作

总结使用 keyvault 为禁用或还原密钥而执行的操作。

AzureDiagnostics
| where ResourceProvider == \"MICROSOFT.EVENTHUB\"
| where Category == \"info\" and OperationName == \"disable\" or OperationName == \"restore\"
| project Message

过去 7 天内的错误

这会列出过去 7 天的所有错误。

AzureDiagnostics
| where TimeGenerated > ago(7d)
| where ResourceProvider =="MICROSOFT.EVENTHUB"
| where Category == "OperationalLogs"
| summarize count() by "EventName", _ResourceId

捕获失败的持续时间

总结捕获时失败的重复。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.EVENTHUB"
| where Category == "ArchiveLogs"
| summarize count() by "failures", "durationInSeconds", _ResourceId

客户端的加入请求

汇总了客户端的加入请求的状态。

AzureDiagnostics // Need to turn on the Capture for this 
| where ResourceProvider == "MICROSOFT.EVENTHUB"
|  project "OperationName"

访问 keyvault - 找不到密钥

总结未找到密钥时对 keyvault 的访问权限。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.EVENTHUB" 
| where Category == "Error" and OperationName == "wrapkey"
| project Message, ResourceId

使用 keyvault 执行的操作

总结使用 keyvault 为禁用或还原密钥而执行的操作。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.EVENTHUB"
| where Category == "info" and OperationName == "disable" or OperationName == "restore"
| project Message

microsoft.keyvault 查询

[经典]此 KeyVault 的活跃程度如何?

[经典]显示一段时间内每个操作的 KeyVault 请求量趋势的折线图。

// KeyVault diagnostic currently stores logs in AzureDiagnostics table which stores logs for multiple services. 
// Filter on ResourceProvider for logs specific to a service.
AzureDiagnostics
| where ResourceProvider =="MICROSOFT.KEYVAULT" 
| summarize count() by bin(TimeGenerated, 1h), OperationName // Aggregate by hour
| render timechart

[经典]谁在调用此 KeyVault?

[经典]由其 IP 地址及其请求计数标识的调用方列表。

// KeyVault diagnostic currently stores logs in AzureDiagnostics table which stores logs for multiple services. 
// Filter on ResourceProvider for logs specific to a service.
AzureDiagnostics
| where ResourceProvider =="MICROSOFT.KEYVAULT"
| summarize count() by CallerIPAddress

[经典]请求速度是否缓慢?

[经典]耗时超过 1 秒的 KeyVault 请求列表。

// To create an alert for this query, click '+ New alert rule'
let threshold=1000; // let operator defines a constant that can be further used in the query
AzureDiagnostics
| where ResourceProvider =="MICROSOFT.KEYVAULT" 
| where DurationMs > threshold
| summarize count() by OperationName, _ResourceId

[经典]此 KeyVault 处理请求的速度有多快?

[经典]使用不同聚合显示请求持续时间随时间推移趋势的折线图。

AzureDiagnostics
| where ResourceProvider =="MICROSOFT.KEYVAULT" 
| summarize avg(DurationMs) by requestUri_s, bin(TimeGenerated, 1h) // requestUri_s contains the URI of the request
| render timechart

[经典]是否有任何故障?

[经典]按状态代码统计的失败 KeyVault 请求计数。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider =="MICROSOFT.KEYVAULT" 
| where httpStatusCode_d >= 300 and not(OperationName == "Authentication" and httpStatusCode_d == 401)
| summarize count() by requestUri_s, ResultSignature, _ResourceId
// ResultSignature contains HTTP status, e.g. "OK" or "Forbidden"
// httpStatusCode_d contains HTTP status code returned by the request (e.g.  200, 300 or 401)
// requestUri_s contains the URI of the request

[经典]上个月发生了哪些更改?

[经典]Lists过去 30 天内的所有更新和修补请求。

// KeyVault diagnostic currently stores logs in AzureDiagnostics table which stores logs for multiple services. 
// Filter on ResourceProvider for logs specific to a service.
AzureDiagnostics
| where TimeGenerated > ago(30d) // Time range specified in the query. Overrides time picker in portal.
| where ResourceProvider =="MICROSOFT.KEYVAULT" 
| where OperationName == "VaultPut" or OperationName == "VaultPatch"
| sort by TimeGenerated desc

[经典]列出所有输入反序列化错误

[经典]显示作业无法反序列化的格式不正确的事件导致的错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.KEYVAULT" and parse_json(properties_s).DataErrorType in ("InputDeserializerError.InvalidData", "InputDeserializerError.TypeConversionError", "InputDeserializerError.MissingColumns", "InputDeserializerError.InvalidHeader", "InputDeserializerError.InvalidCompressionType")
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

[经典]在 Azure 中查找Diagnostics

[经典]在 AzureDiagnostics 中查找以在 AzureDiagnostics 表中搜索特定值。/n请注意,此查询需要更新 <SeachValue> 参数才能生成结果

// This query requires a parameter to run. Enter value in SearchValue to find in table.
let SearchValue =  "<SearchValue>";//Please update term you would like to find in the table.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.KEYVAULT"
| where * contains tostring(SearchValue)
| take 1000

microsoft.logic 查询

计费的执行总数

按操作名称计费的执行总数。

// Total billable executions
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.LOGIC"
| where Category == "WorkflowRuntime" 
| where OperationName has "workflowTriggerStarted" or OperationName has "workflowActionStarted" 
| summarize dcount(resource_runId_s) by OperationName, resource_workflowName_s

逻辑应用执行分布(按工作流)

逻辑应用执行的每小时时间图,按工作流分布。

// Hourly Time chart for Logic App execution distribution by workflows
AzureDiagnostics 
| where ResourceProvider == "MICROSOFT.LOGIC"
| where Category == "WorkflowRuntime"
| where OperationName has "workflowRunStarted"
| summarize dcount(resource_runId_s) by bin(TimeGenerated, 1h), resource_workflowName_s
| render timechart 

逻辑应用执行分布(按状态)

按工作流、状态和错误完成的执行。

//logic app execution status summary
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.LOGIC"
| where OperationName has "workflowRunCompleted"
| summarize dcount(resource_runId_s) by resource_workflowName_s, status_s, error_code_s
| project LogicAppName = resource_workflowName_s , NumberOfExecutions = dcount_resource_runId_s , RunStatus = status_s , Error = error_code_s 

触发的失败计数

按资源名称显示所有逻辑应用执行的操作/触发器失败。

// To create an alert for this query, click '+ New alert rule'
//Action/Trigger failures for all Logic App executions
AzureDiagnostics
| where ResourceProvider  == "MICROSOFT.LOGIC"  
| where Category == "WorkflowRuntime" 
| where status_s == "Failed" 
| where OperationName has "workflowActionCompleted" or OperationName has "workflowTriggerCompleted" 
| extend ResourceName = coalesce(resource_actionName_s, resource_triggerName_s) 
| extend ResourceCategory = substring(OperationName, 34, strlen(OperationName) - 43) | summarize dcount(resource_runId_s) by code_s, ResourceName, resource_workflowName_s, ResourceCategory, _ResourceId
| project ResourceCategory, ResourceName , FailureCount = dcount_resource_runId_s , ErrorCode = code_s, LogicAppName = resource_workflowName_s, _ResourceId 
| order by FailureCount desc 

microsoft.network 查询

每小时请求数

应用程序网关上的传入请求计数。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess"
| summarize AggregatedValue = count() by bin(TimeGenerated, 1h), _ResourceId
| render timechart

每小时非 SSL 请求数

应用程序网关上的非 SSL 请求计数。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess" and sslEnabled_s == "off"
| summarize AggregatedValue = count() by bin(TimeGenerated, 1h), _ResourceId
| render timechart

每小时失败的请求数

应用程序网关响应时出错的请求计数。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess" and httpStatus_d > 399
| summarize AggregatedValue = count() by bin(TimeGenerated, 1h), _ResourceId
| render timechart

按用户代理分类的错误

用户代理的错误数。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess" and httpStatus_d > 399
| summarize AggregatedValue = count() by userAgent_s, _ResourceId
| sort by AggregatedValue desc

按 URI 排序的错误

按 URI 排序的错误数。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess" and httpStatus_d > 399
| summarize AggregatedValue = count() by requestUri_s, _ResourceId
| sort by AggregatedValue desc

前 10 个客户端 IP

每个客户端 IP 的请求计数。

AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess"
| summarize AggregatedValue = count() by clientIP_s
| top 10 by AggregatedValue

排名靠前的 HTTP 版本

每个 HTTP 版本的请求计数。

AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess"
| summarize AggregatedValue = count() by httpVersion_s
| top 10 by AggregatedValue

网络安全事件

查找报告阻止的传入流量的网络安全事件。

AzureDiagnostics 
| where ResourceProvider == "MICROSOFT.NETWORK"  
| where Category == "NetworkSecurityGroupEvent"  
| where direction_s == "In" and type_s == "block"

每小时请求数

呈现折线图,显示每个 FrontDoor 资源的每小时请求总数。

// Summarize number of requests per hour for each FrontDoor resource
// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics 
| where ResourceProvider == "MICROSOFT.NETWORK" and Category == "FrontdoorAccessLog"
| summarize RequestCount = count() by bin(TimeGenerated, 1h), Resource, ResourceId
| render timechart 

按路由规则转发的后端请求

对每个路由规则和后端主机每分钟的请求数进行计数。

// Summarize number of requests per minute for each routing rule and backend host
// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics 
| where ResourceProvider == "MICROSOFT.NETWORK" and Category == "FrontdoorAccessLog"
| summarize RequestCount = count() by bin(TimeGenerated, 1m), Resource, RoutingRuleName = routingRuleName_s, BackendHostname = backendHostname_s, ResourceId

按主机和路径排序的请求错误

按主机和路径统计具有错误响应的请求数。

// Summarize number of requests by host, path, and status codes >= 400
// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.NETWORK" and Category == "FrontdoorAccessLog"
| where toint(httpStatusCode_s) >= 400
| extend ParsedUrl = parseurl(requestUri_s)
| summarize RequestCount = count() by Host = tostring(ParsedUrl.Host), Path = tostring(ParsedUrl.Path), StatusCode = httpStatusCode_s, ResourceId
| order by RequestCount desc 

用户代理的请求错误

按用户代理计算具有错误响应的请求数。

// Summarize number of requests per user agent and status codes >= 400
// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.NETWORK" and Category == "FrontdoorAccessLog"
| where toint(httpStatusCode_s) >= 400
| summarize RequestCount = count() by UserAgent = userAgent_s, StatusCode = httpStatusCode_s , Resource, ResourceId
| order by RequestCount desc 

前 10 个客户端 IP 和 http 版本

显示前 10 个客户端 IP 和 http 版本。

// Summarize top 10 client ips and http versions
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.NETWORK" and Category == "FrontdoorAccessLog"
| summarize RequestCount = count() by ClientIP = clientIp_s, HttpVersion = httpVersion_s, Resource
| top 10 by RequestCount 
| order by RequestCount desc

每小时防火墙阻止的请求计数

每小时计数防火墙阻止的请求数。

// Summarize number of firewall blocked requests per hour by policy
// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.NETWORK" and Category == "FrontdoorWebApplicationFirewallLog"
| where action_s == "Block"
| summarize RequestCount = count() by bin(TimeGenerated, 1h), Policy = policy_s, PolicyMode = policyMode_s, Resource, ResourceId
| order by RequestCount desc

按 IP 和规则阻止的前 20 个客户端

按 IP 和规则名称显示阻止的前 20 个客户端。

// Summarize top 20 blocked clients by IP and rule
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.NETWORK" and Category == "FrontdoorWebApplicationFirewallLog"
| where action_s == "Block"
| summarize RequestCount = count() by ClientIP = clientIP_s, UserAgent = userAgent_s, RuleName = ruleName_s ,Resource
| top 20 by RequestCount 
| order by RequestCount desc

按主机、路径、规则和操作的防火墙请求计数

按主机、路径、规则和采取的操作对防火墙处理的请求进行计数。

// Summarize request count by host, path, rule, and action
// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.NETWORK" and Category == "FrontdoorWebApplicationFirewallLog"
| extend ParsedUrl = parseurl(requestUri_s)
| summarize RequestCount = count() by Host = tostring(ParsedUrl.Host), Path = tostring(ParsedUrl.Path), RuleName = ruleName_s, Action = action_s, ResourceId
| order by RequestCount desc

应用程序规则日志数据

分析应用程序规则日志数据。

AzureDiagnostics
| where Category == "AzureFirewallApplicationRule"
//this first parse statement is valid for all entries as they all start with this format
| parse msg_s with Protocol " request from " SourceIP ":" SourcePort:int * 
//Parse action as this is the same for all log lines 
| parse kind=regex flags=U msg_s with * ". Action\\: " Action "\\."
// case1: Action: A. Reason: R.
| parse kind=regex flags=U msg_s with "\\. Reason\\: " Reason "\\."
//case 2a: to FQDN:PORT Url: U. Action: A. Policy: P. Rule Collection Group: RCG. Rule Collection: RC. Rule: R.
| parse msg_s with * "to " FQDN ":" TargetPort:int * "." *
//Parse policy if present
| parse msg_s with * ". Policy: " Policy ". Rule Collection Group: " RuleCollectionGroup "." *
| parse msg_s with * " Rule Collection: " RuleCollection ". Rule: " Rule
//case 2.b: Web Category: WC.
| parse Rule with * ". Web Category: " WebCategory
//case 3: No rule matched. Proceeding with default action"
| extend DefaultRule = iff(msg_s contains "No rule matched. Proceeding with default action", true, false)
| extend 
SourcePort = tostring(SourcePort),
TargetPort = tostring(TargetPort)
| extend 
 Action = case(Action == "","N/A", case(DefaultRule, "Deny" ,Action)),
 FQDN = case(FQDN == "", "N/A", FQDN),
 TargetPort = case(TargetPort == "", "N/A", tostring(TargetPort)),
 Policy = case(RuleCollection contains ":", split(RuleCollection, ":")[0] ,case(Policy == "", "N/A", Policy)),
 RuleCollectionGroup = case(RuleCollection contains ":", split(RuleCollection, ":")[1], case(RuleCollectionGroup == "", "N/A", RuleCollectionGroup)),
 RuleCollection = case(RuleCollection contains ":", split(RuleCollection, ":")[2], case(RuleCollection == "", "N/A", RuleCollection)),
 WebCategory = case(WebCategory == "", "N/A", WebCategory),
 Rule = case(Rule == "" , "N/A", case(WebCategory == "N/A", Rule, split(Rule, '.')[0])),
 Reason = case(Reason == "", case(DefaultRule, "No rule matched - default action", "N/A"), Reason )
| project TimeGenerated, msg_s, Protocol, SourceIP, SourcePort, FQDN, TargetPort, Action, Policy, RuleCollectionGroup, RuleCollection, Rule, Reason ,WebCategory

网络规则日志数据

分析网络规则日志数据。

AzureDiagnostics
| where Category == "AzureFirewallNetworkRule"
| where OperationName == "AzureFirewallNatRuleLog" or OperationName == "AzureFirewallNetworkRuleLog"
//case 1: for records that look like this:
//PROTO request from IP:PORT to IP:PORT.
| parse msg_s with Protocol " request from " SourceIP ":" SourcePortInt:int " to " TargetIP ":" TargetPortInt:int *
//case 1a: for regular network rules
| parse kind=regex flags=U msg_s with * ". Action\\: " Action1a "\\."
//case 1b: for NAT rules
//TCP request from IP:PORT to IP:PORT was DNAT'ed to IP:PORT
| parse msg_s with * " was " Action1b:string " to " TranslatedDestination:string ":" TranslatedPort:int *
//Parse rule data if present
| parse msg_s with * ". Policy: " Policy ". Rule Collection Group: " RuleCollectionGroup "." *
| parse msg_s with * " Rule Collection: "  RuleCollection ". Rule: " Rule 
//case 2: for ICMP records
//ICMP request from 10.0.2.4 to 10.0.3.4. Action: Allow
| parse msg_s with Protocol2 " request from " SourceIP2 " to " TargetIP2 ". Action: " Action2
| extend
SourcePort = tostring(SourcePortInt),
TargetPort = tostring(TargetPortInt)
| extend 
    Action = case(Action1a == "", case(Action1b == "",Action2,Action1b), split(Action1a,".")[0]),
    Protocol = case(Protocol == "", Protocol2, Protocol),
    SourceIP = case(SourceIP == "", SourceIP2, SourceIP),
    TargetIP = case(TargetIP == "", TargetIP2, TargetIP),
    //ICMP records don't have port information
    SourcePort = case(SourcePort == "", "N/A", SourcePort),
    TargetPort = case(TargetPort == "", "N/A", TargetPort),
    //Regular network rules don't have a DNAT destination
    TranslatedDestination = case(TranslatedDestination == "", "N/A", TranslatedDestination), 
    TranslatedPort = case(isnull(TranslatedPort), "N/A", tostring(TranslatedPort)),
    //Rule information
    Policy = case(Policy == "", "N/A", Policy),
    RuleCollectionGroup = case(RuleCollectionGroup == "", "N/A", RuleCollectionGroup ),
    RuleCollection = case(RuleCollection == "", "N/A", RuleCollection ),
    Rule = case(Rule == "", "N/A", Rule)
| project TimeGenerated, msg_s, Protocol, SourceIP,SourcePort,TargetIP,TargetPort,Action, TranslatedDestination, TranslatedPort, Policy, RuleCollectionGroup, RuleCollection, Rule

威胁情报规则日志数据

分析威胁情报规则日志数据。

AzureDiagnostics
| where OperationName  == "AzureFirewallThreatIntelLog"
| parse msg_s with Protocol " request from " SourceIP ":" SourcePortInt:int " to " TargetIP ":" TargetPortInt:int *
| parse msg_s with * ". Action: " Action "." Message
| parse msg_s with Protocol2 " request from " SourceIP2 " to " TargetIP2 ". Action: " Action2
| extend SourcePort = tostring(SourcePortInt),TargetPort = tostring(TargetPortInt)
| extend Protocol = case(Protocol == "", Protocol2, Protocol),SourceIP = case(SourceIP == "", SourceIP2, SourceIP),TargetIP = case(TargetIP == "", TargetIP2, TargetIP),SourcePort = case(SourcePort == "", "N/A", SourcePort),TargetPort = case(TargetPort == "", "N/A", TargetPort)
| sort by TimeGenerated desc 
| project TimeGenerated, msg_s, Protocol, SourceIP,SourcePort,TargetIP,TargetPort,Action,Message

Azure 防火墙日志数据

如果要分析来自网络规则、应用程序规则、NAT 规则、IDS、威胁情报等的日志,以便了解允许或拒绝某些流量的原因,请从此查询开始。 此查询将显示最后 100 条日志记录,但通过在查询末尾添加简单的筛选语句,可以调整结果。

// Parses the azure firewall rule log data. 
// Includes network rules, application rules, threat intelligence, ips/ids, ...
AzureDiagnostics
| where Category == "AzureFirewallNetworkRule" or Category == "AzureFirewallApplicationRule"
//optionally apply filters to only look at a certain type of log data
//| where OperationName == "AzureFirewallNetworkRuleLog"
//| where OperationName == "AzureFirewallNatRuleLog"
//| where OperationName == "AzureFirewallApplicationRuleLog"
//| where OperationName == "AzureFirewallIDSLog"
//| where OperationName == "AzureFirewallThreatIntelLog"
| extend msg_original = msg_s
// normalize data so it's eassier to parse later
| extend msg_s = replace(@'. Action: Deny. Reason: SNI TLS extension was missing.', @' to no_data:no_data. Action: Deny. Rule Collection: default behavior. Rule: SNI TLS extension missing', msg_s)
| extend msg_s = replace(@'No rule matched. Proceeding with default action', @'Rule Collection: default behavior. Rule: no rule matched', msg_s)
// extract web category, then remove it from further parsing
| parse msg_s with * " Web Category: " WebCategory
| extend msg_s = replace(@'(. Web Category:).*','', msg_s)
// extract RuleCollection and Rule information, then remove it from further parsing
| parse msg_s with * ". Rule Collection: " RuleCollection ". Rule: " Rule
| extend msg_s = replace(@'(. Rule Collection:).*','', msg_s)
// extract Rule Collection Group information, then remove it from further parsing
| parse msg_s with * ". Rule Collection Group: " RuleCollectionGroup
| extend msg_s = replace(@'(. Rule Collection Group:).*','', msg_s)
// extract Policy information, then remove it from further parsing
| parse msg_s with * ". Policy: " Policy
| extend msg_s = replace(@'(. Policy:).*','', msg_s)
// extract IDS fields, for now it's always add the end, then remove it from further parsing
| parse msg_s with * ". Signature: " IDSSignatureIDInt ". IDS: " IDSSignatureDescription ". Priority: " IDSPriorityInt ". Classification: " IDSClassification
| extend msg_s = replace(@'(. Signature:).*','', msg_s)
// extra NAT info, then remove it from further parsing
| parse msg_s with * " was DNAT'ed to " NatDestination
| extend msg_s = replace(@"( was DNAT'ed to ).*",". Action: DNAT", msg_s)
// extract Threat Intellingence info, then remove it from further parsing
| parse msg_s with * ". ThreatIntel: " ThreatIntel
| extend msg_s = replace(@'(. ThreatIntel:).*','', msg_s)
// extract URL, then remove it from further parsing
| extend URL = extract(@"(Url: )(.*)(\. Action)",2,msg_s)
| extend msg_s=replace(@"(Url: .*)(Action)",@"\2",msg_s)
// parse remaining "simple" fields
| parse msg_s with Protocol " request from " SourceIP " to " Target ". Action: " Action
| extend 
    SourceIP = iif(SourceIP contains ":",strcat_array(split(SourceIP,":",0),""),SourceIP),
    SourcePort = iif(SourceIP contains ":",strcat_array(split(SourceIP,":",1),""),""),
    Target = iif(Target contains ":",strcat_array(split(Target,":",0),""),Target),
    TargetPort = iif(SourceIP contains ":",strcat_array(split(Target,":",1),""),""),
    Action = iif(Action contains ".",strcat_array(split(Action,".",0),""),Action),
    Policy = case(RuleCollection contains ":", split(RuleCollection, ":")[0] ,Policy),
    RuleCollectionGroup = case(RuleCollection contains ":", split(RuleCollection, ":")[1], RuleCollectionGroup),
    RuleCollection = case(RuleCollection contains ":", split(RuleCollection, ":")[2], RuleCollection),
    IDSSignatureID = tostring(IDSSignatureIDInt),
    IDSPriority = tostring(IDSPriorityInt)
| project msg_original,TimeGenerated,Protocol,SourceIP,SourcePort,Target,TargetPort,URL,Action, NatDestination, OperationName,ThreatIntel,IDSSignatureID,IDSSignatureDescription,IDSPriority,IDSClassification,Policy,RuleCollectionGroup,RuleCollection,Rule,WebCategory
| order by TimeGenerated
| limit 100

Azure 防火墙 DNS 代理日志数据

如果想要了解防火墙 DNS 代理日志数据,请从此查询开始。 此查询将显示最后 100 条日志记录,但通过在查询末尾添加简单的筛选语句,可以调整结果。

// DNS proxy log data 
// Parses the DNS proxy log data. 
AzureDiagnostics
| where Category == "AzureFirewallDnsProxy"
| parse msg_s with "DNS Request: " SourceIP ":" SourcePortInt:int " - " QueryID:int " " RequestType " " RequestClass " " hostname ". " protocol " " details
| extend
    ResponseDuration = extract("[0-9]*.?[0-9]+s$", 0, msg_s),
    SourcePort = tostring(SourcePortInt),
    QueryID =  tostring(QueryID)
| project TimeGenerated,SourceIP,hostname,RequestType,ResponseDuration,details,msg_s
| order by TimeGenerated
| limit 100

BGP 路由表

过去 12 小时内学习的 BPG 路由表。

AzureDiagnostics
| where TimeGenerated > ago(12h)
| where ResourceType == "EXPRESSROUTECIRCUITS"
| project TimeGenerated , ResourceType , network_s , path_s , OperationName

BGP 信息性消息

按级别、资源类型和网络提供的 BGP 信息性消息。

AzureDiagnostics
| where Level == "Informational"
| project TimeGenerated , ResourceId, Level, ResourceType , network_s , path_s

监视状态关闭的终结点

查找 Azure 流量管理器终结点的监视状态关闭的原因。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceType == "TRAFFICMANAGERPROFILES"  and Category  == "ProbeHealthStatusEvents"
| where Status_s == "Down"
| project TimeGenerated, EndpointName_s, Status_s, ResultDescription, SubscriptionId , _ResourceId

成功的 P2S 连接

在过去 12 小时内成功建立 P2S 连接。

AzureDiagnostics 
| where TimeGenerated > ago(12h)
| where Category == "P2SDiagnosticLog" and Message has "Connection successful"
| project TimeGenerated, Resource ,Message

失败的 P2S 连接

过去 12 小时内 P2S 连接失败。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics 
| where TimeGenerated > ago(12h)
| where Category == "P2SDiagnosticLog" and Message has "Connection failed"
| project TimeGenerated, Resource ,Message

网关配置更改

管理员在过去 24 小时内成功进行了网关配置更改。

AzureDiagnostics
| where TimeGenerated > ago(24h)
| where Category == "GatewayDiagnosticLog" and operationStatus_s == "Success" and configuration_ConnectionTrafficType_s == "Internet"
| project TimeGenerated, Resource, OperationName, Message, operationStatus_s

S2S 隧道连接/断开连接事件

过去 24 小时内的 S2S 隧道连接/断开连接事件。

AzureDiagnostics 
| where TimeGenerated > ago(24h)
| where Category == "TunnelDiagnosticLog" and (status_s == "Connected" or status_s == "Disconnected")
| project TimeGenerated, Resource , status_s, remoteIP_s, stateChangeReason_s

BGP 路由更新

过去 24 小时内的 BGP 路由更新。

AzureDiagnostics
| where TimeGenerated > ago(24h)
| where Category == "RouteDiagnosticLog" and OperationName == "BgpRouteUpdate"

显示 AzureDiagnostics 表中的日志

Lists AzureDiagnostics 表中的最新日志,按时间 (最新的第一个) 排序。

AzureDiagnostics
| top 10 by TimeGenerated

microsoft.recoveryservices 查询

失败的备份作业

查找从最后一天报告失败的备份作业的日志。

AzureDiagnostics  
| where ResourceProvider == "MICROSOFT.RECOVERYSERVICES" and Category == "AzureBackupReport"  
| where OperationName == "Job" and JobOperation_s == "Backup" and JobStatus_s == "Failed" 
| project TimeGenerated, JobUniqueId_g, JobStartDateTime_s, JobOperation_s, JobOperationSubType_s, JobStatus_s , JobFailureCode_s, JobDurationInSecs_s , AdHocOrScheduledJob_s

microsoft.servicebus 查询

[经典]列表管理操作

这会列出所有管理调用。

AzureDiagnostics
| where ResourceProvider ==\"MICROSOFT.SERVICEBUS\"
| where Category == \"OperationalLogs\"
| summarize count() by EventName_s, _ResourceId

[经典]错误摘要

汇总遇到的所有错误。

AzureDiagnostics
| where ResourceProvider ==\"MICROSOFT.SERVICEBUS\"
| where Category == \"Error\"
| summarize count() by EventName_s, _ResourceId

[经典]Keyvault 访问尝试 - 找不到密钥

总结未找到密钥时对 keyvault 的访问权限。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == \"MICROSOFT.SERVICEBUS\"
| where Category == \"Error\" and OperationName == \"wrapkey\"
| project Message, _ResourceId

[经典]AutoDeleted 实体

已自动删除的所有实体的摘要。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == \"MICROSOFT.SERVICEBUS\"
| where Category == \"OperationalLogs\"
| where EventName_s startswith \"AutoDelete\"
| summarize count() by EventName_s, _ResourceId

[经典]Keyvault 已执行操作

总结使用 keyvault 为禁用或还原密钥而执行的操作。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == \"MICROSOFT.SERVICEBUS\"
| where (Category == \"info\" and (OperationName == \"disable\" or OperationName == \"restore\"))
| project Message, _ResourceId

过去 7 天内的管理操作

这会列出过去 7 天的所有管理调用。

AzureDiagnostics
| where TimeGenerated > ago(7d)
| where ResourceProvider =="MICROSOFT.SERVICEBUS"
| where Category == "OperationalLogs"
| summarize count() by EventName_s, _ResourceId

错误摘要

汇总过去 7 天内看到的所有错误。

AzureDiagnostics
| where TimeGenerated > ago(7d)
| where ResourceProvider =="MICROSOFT.SERVICEBUS"
| where Category == "Error" 
| summarize count() by EventName_s, _ResourceId

Keyvault 访问尝试 - 找不到密钥

总结未找到密钥时对 keyvault 的访问权限。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.SERVICEBUS" 
| where Category == "Error" and OperationName == "wrapkey"
| project Message, _ResourceId

AutoDeleted 实体

已自动删除的所有实体的摘要。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.SERVICEBUS"
| where Category == "OperationalLogs"
| where EventName_s startswith "AutoDelete"
| summarize count() by EventName_s, _ResourceId

Keyvault 已执行操作

总结使用 keyvault 为禁用或还原密钥而执行的操作。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.SERVICEBUS"
| where (Category == "info" and (OperationName == "disable" or OperationName == "restore"))
| project Message, _ResourceId

查询microsoft.sql

托管实例上的存储超过 90%

显示存储利用率高于 90% 的所有托管实例。

// To create an alert for this query, click '+ New alert rule'
let storage_percentage_threshold = 90;
AzureDiagnostics
| where Category =="ResourceUsageStats"
| summarize (TimeGenerated, calculated_storage_percentage) = arg_max(TimeGenerated, todouble(storage_space_used_mb_s) *100 / todouble (reserved_storage_mb_s))
   by _ResourceId
| where calculated_storage_percentage > storage_percentage_threshold

托管实例上的 CPU 利用率控制在 95% 以上

显示 CPU 占用量超过 95% 的所有托管实例。

// To create an alert for this query, click '+ New alert rule'
let cpu_percentage_threshold = 95;
let time_threshold = ago(1h);
AzureDiagnostics
| where Category == "ResourceUsageStats" and TimeGenerated > time_threshold
| summarize avg_cpu = max(todouble(avg_cpu_percent_s)) by _ResourceId
| where avg_cpu > cpu_percentage_threshold

显示所有活动的智能见解

显示智能见解检测到的所有活动性能问题。 请注意,需要为每个受监视的数据库启用 SQLInsights 日志。

AzureDiagnostics
| where Category == "SQLInsights" and status_s == "Active"
| distinct rootCauseAnalysis_s

等待统计信息

等待过去一小时的统计信息,按逻辑服务器和数据库。

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.SQL"
| where TimeGenerated >= ago(60min)
| parse _ResourceId with * "/microsoft.sql/servers/" LogicalServerName "/databases/" DatabaseName
| summarize Total_count_60mins = sum(delta_waiting_tasks_count_d) by LogicalServerName, DatabaseName, wait_type_s

microsoft.streamanalytics 查询

列出所有输入数据错误

显示处理来自输入的数据时发生的所有错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics 
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and parse_json(properties_s).Type == "DataError" 
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

列出所有输入反序列化错误

显示作业无法反序列化的格式不正确的事件导致的错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and parse_json(properties_s).DataErrorType in ("InputDeserializerError.InvalidData", "InputDeserializerError.TypeConversionError", "InputDeserializerError.MissingColumns", "InputDeserializerError.InvalidHeader", "InputDeserializerError.InvalidCompressionType")
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

列出所有 InvalidInputTimeStamp 错误

显示由于无法将 TIMESTAMP BY 表达式的值转换为 datetime 的事件导致的错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and  parse_json(properties_s).DataErrorType == "InvalidInputTimeStamp"
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

列出所有 InvalidInputTimeStampKey 错误

显示由 TIMESTAMP BY OVER timestampColumn 值为 NULL 的事件导致的错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and  parse_json(properties_s).DataErrorType == "InvalidInputTimeStampKey"
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

延迟到达的事件

显示应用程序时间和到达时间差大于延迟到达策略的事件导致的错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and  parse_json(properties_s).DataErrorType == "LateInputEvent"
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

提前到达的事件

显示应用程序时间与到达时间差大于 5 分钟的事件导致的错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and parse_json(properties_s).DataErrorType == "EarlyInputEvent"
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

无序到达的事件

根据无序策略显示由于事件无序到达而导致的错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and parse_json(properties_s).DataErrorType == "OutOfOrderEvent"
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

所有输出数据错误

显示将查询结果写入作业中的输出时发生的所有错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and parse_json(properties_s).DataErrorType in ("OutputDataConversionError.RequiredColumnMissing", "OutputDataConversionError.ColumnNameInvalid", "OutputDataConversionError.TypeConversionError", "OutputDataConversionError.RecordExceededSizeLimit", "OutputDataConversionError.DuplicateKey")
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

列出所有 RequiredColumnMissing 错误

显示作业生成的输出记录缺少列的所有错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and parse_json(properties_s).DataErrorType == "OutputDataConversionError.RequiredColumnMissing"
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

列出所有 ColumnNameInvalid 错误

显示错误:作业生成的输出记录的列名称未映射到输出中的列。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and parse_json(properties_s).DataErrorType == "OutputDataConversionError.ColumnNameInvalid"
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

列出所有 TypeConversionError 错误

显示作业生成的输出记录的列无法转换为输出中的有效类型的错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and parse_json(properties_s).DataErrorType == "OutputDataConversionError.TypeConversionError"
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

列出所有 RecordExceededSizeLimit 错误

显示作业生成的输出记录大小大于支持的输出大小的错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and  parse_json(properties_s).DataErrorType == "OutputDataConversionError.RecordExceededSizeLimit"
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

列出所有 DuplicateKey 错误

显示作业生成的输出记录包含与系统列同名的列的错误。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and parse_json(properties_s).DataErrorType == "OutputDataConversionError.DuplicateKey"
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

级别为“Error”的所有日志

显示可能对作业产生负面影响的所有日志。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and Level == "Error" 
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

已“失败”的操作

显示作业上导致失败的所有操作。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and status_s == "Failed" 
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

(Cosmos DB、Power BI、事件中心) 的输出限制日志

显示目标服务限制写入其中一个输出的所有实例。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and parse_json(properties_s).Type in ("DocumentDbOutputAdapterWriteThrottlingError", "EventHubOutputAdapterEventHubThrottlingError", "PowerBIServiceThrottlingError", "PowerBIServiceThrottlingError")
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

暂时性输入和输出错误

显示与输入和输出相关的所有错误,这些错误本质上是间歇性的。

// To create an alert for this query, click '+ New alert rule'
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and parse_json(properties_s).Type in ("AzureFunctionOutputAdapterTransientError", "BlobInputAdapterTransientError", "DataLakeOutputAdapterTransientError", "DocumentDbOutputAdapterTransientError", "EdgeHubOutputAdapterEdgeHubTransientError", "EventHubBasedInputInvalidOperationTransientError", "EventHubBasedInputOperationCanceledTransientError", "EventHubBasedInputTimeoutTransientError", "EventHubBasedInputTransientError", "EventHubOutputAdapterEventHubTransientError", "InputProcessorTransientFailure", "OutputProcessorTransientError", "ReferenceDataInputAdapterTransientError", "ServiceBusOutputAdapterTransientError", "TableOutputAdapterTransientError")
| project TimeGenerated, Resource, Region_s, OperationName, properties_s, Level, _ResourceId

过去 7 天内所有数据错误的摘要

过去 7 天内所有数据错误的摘要。

AzureDiagnostics
| where TimeGenerated > ago(7d) //last 7 days
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and parse_json(properties_s).Type == "DataError"
| extend DataErrorType = tostring(parse_json(properties_s).DataErrorType)
| summarize Count=count(), sampleEvent=any(properties_s)  by DataErrorType, JobName=Resource

过去 7 天内所有错误的摘要

过去 7 天内所有错误的摘要。

AzureDiagnostics
| where TimeGenerated > ago(7d) //last 7 days
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS"
| extend ErrorType = tostring(parse_json(properties_s).Type)
| summarize Count=count(), sampleEvent=any(properties_s)  by ErrorType, JobName=Resource

过去 7 天内的“失败”操作摘要

过去 7 天内的“失败”操作摘要。

AzureDiagnostics
| where TimeGenerated > ago(7d) //last 7 days
| where ResourceProvider == "MICROSOFT.STREAMANALYTICS" and status_s == "Failed" 
| summarize Count=count(), sampleEvent=any(properties_s) by JobName=Resource