InsightsMetrics 테이블에 대한 쿼리

IoT Edge: 디바이스가 오프라인 상태이거나 예상 속도로 업스트림 메시지를 보내지 않음

지난 2일 동안 D2C 메시지를 30분 동안 예상 속도로 IoT Hub 보내지 않는 IoT Edge 디바이스를 식별합니다.

// To create an alert for this query, click '+ New alert rule'
let targetReceiver = "upstream";
InsightsMetrics
| where Origin == "iot.azm.ms" and Namespace == "metricsmodule"
| where Name == "edgehub_messages_sent_total"
| extend dimensions=parse_json(Tags)
| extend device = tostring(dimensions.edge_device)
| extend target = trim_start(@"[^/]+/", extractjson("$.to", 
tostring(dimensions), typeof(string)))
| where target contains targetReceiver
| extend source = strcat(device, "::", trim_start(@"[^/]+/", 
tostring(dimensions.from)))
| extend messages = toint(Val)
| extend timeUtc = TimeGenerated
| extend sourceTarget = strcat(source, "::", target)
| project timeUtc, source, sourceTarget, messages, device, _ResourceId
| order by device, sourceTarget, timeUtc
| serialize
| extend nextCount = next(messages, 1)
| extend nextSourceTarget= next(sourceTarget, 1)
| extend diff = iff((messages - nextCount) >= 0, messages - nextCount, 0)
| where sourceTarget == nextSourceTarget and diff >= 0
| project TimeGenerated = timeUtc, source, sourceTarget, messages, diff, 
device, _ResourceId
| make-series sum(diff) default=0 on TimeGenerated from ago(2d) to now() 
step 30m by device, _ResourceId
| mv-expand sum_diff, TimeGenerated
| project TimeGenerated=todatetime(TimeGenerated), device, 
AggregatedValue=toint(sum_diff), _ResourceId

IoT Edge: 에지 허브 큐 크기 임계값 초과

평가 기간 동안 디바이스의 Edge Hub 큐 크기(합계)가 구성된 임계값을 초과한 횟수입니다.

// To create an alert for this query, click '+ New alert' 
let qlenThreshold = 100;
InsightsMetrics
| where Origin == "iot.azm.ms" and Namespace == "metricsmodule"
| where Name == "edgehub_queue_length"
| extend dimensions=parse_json(Tags)
| extend device = tostring(dimensions.edge_device)
| extend ep = tostring(dimensions.endpoint)
| extend qlen = toint(Val)
| project device, qlen, ep, TimeGenerated, _ResourceId
| summarize sum(qlen) by TimeGenerated, device, _ResourceId
| where sum_qlen >= qlenThreshold
| project-away sum_qlen

최대 노드 디스크

최대 노드 디스크 사용량은 평균 30분 간격을 초과했습니다.

// To create an alert for this query, click '+ New alert rule'
//InsightMetrics contains all the custom metrics for Container Insights solution
InsightsMetrics // Replace Name with your custom metric
| where Name == "used_percent" and Namespace == "container.azm.ms/disk" 
| summarize val= max(Val) by bin(TimeGenerated, 15m), _ResourceId
| render timechart

노드당 초당 Prometheus 디스크 읽기

기본 kubernetes 네임스페이스의 Prometheus 디스크 읽기 메트릭을 시간 차트로 봅니다.

// To create an alert for this query, click '+ New alert rule'
// Update TimeGenerated field for custom time range
InsightsMetrics
| where Namespace == 'container.azm.ms/diskio'
| where TimeGenerated > ago(1h)
| where Name == 'reads'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = Tags.name
| extend NodeDisk = strcat(Device, "/", HostName)
| order by NodeDisk asc, TimeGenerated asc
| serialize //calculating the PreVal, PrevTimeGenerated to render the chart.
| extend PrevVal = iif(prev(NodeDisk) != NodeDisk, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(NodeDisk) != NodeDisk, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
//Calculating the rate for disk using PreVal
| extend Rate = iif(PrevVal > Val, Val / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1), iif(PrevVal == Val, 0.0, (Val - PrevVal) / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1)))
| where isnotnull(Rate)
| project TimeGenerated, NodeDisk, Rate, _ResourceId
| render timechart

InsightsMetrics에서 찾기

InsightsMetrics 테이블에서 특정 값을 검색하려면 InsightsMetrics에서 찾습니다./nNote 이 쿼리를 수행하려면 결과를 생성하기 위해 SeachValue> 매개 변수를 업데이트<해야 합니다.

// This query requires a parameter to run. Enter value in SearchValue to find in table.
let SearchValue =  "<SearchValue>";//Please update term you would like to find in the table.
InsightsMetrics
| where * contains tostring(SearchValue)
| take 1000

어떤 데이터가 수집되나요?

수집된 성능 카운터 및 개체 형식을 나열합니다.

InsightsMetrics
| where Origin == "vm.azm.ms"
| summarize by Namespace, Name

Virtual Machine 사용 가능한 메모리

Virtual Machine 사용 가능한 메모리.

InsightsMetrics
| where TimeGenerated > ago(1h)
| where Origin == "vm.azm.ms"
| where Namespace == "Memory"
| where Name == "AvailableMB"
| summarize avg(Val) by bin(TimeGenerated, 5m), Computer
| render timechart 

지난 1시간 동안의 CPU 사용량 패턴을 백분위수별로 차트로 계산합니다.

InsightsMetrics
| where TimeGenerated > ago(1h)
| where Origin == "vm.azm.ms"
| where Namespace == "Processor"
| where Name == "UtilizationPercentage"
| summarize avg(Val) by bin(TimeGenerated, 5m), Computer //split up by computer
| render timechart

Virtual Machine 사용 가능한 디스크 공간

instance당 사용 가능한 디스크 공간의 최신 보고서를 표시합니다.

InsightsMetrics
| where TimeGenerated > ago(1h)
| where Origin == "vm.azm.ms"
| where Namespace == "LogicalDisk"
| where Name == "FreeSpaceMB"
| extend t=parse_json(Tags)
| summarize arg_max(TimeGenerated, *) by tostring(t["vm.azm.ms/mountId"]), Computer // arg_max over TimeGenerated returns the latest record
| project Computer, TimeGenerated, t["vm.azm.ms/mountId"], Val

하트비트를 사용하여 VM 가용성 추적

지난 1시간 동안 VM의 보고된 가용성을 표시합니다.

InsightsMetrics
| where TimeGenerated > ago(1h)
| where Origin == "vm.azm.ms"
| where Namespace == "Computer"
| where Name == "Heartbeat"
| summarize heartbeat_count = count() by bin(TimeGenerated, 5m), Computer
| extend alive=iff(heartbeat_count > 2, 1.0, 0.0) //computer considered "down" if it has 2 or fewer heartbeats in 5 min interval
| project TimeGenerated, alive, Computer
| render timechart with (ymin = 0, ymax = 1) 

CPU 사용률별 상위 10개 Virtual Machines

CPU 사용률별 상위 10개 Virtual Machines.

InsightsMetrics
| where TimeGenerated > ago(1h)
| where Origin == "vm.azm.ms"
| where Namespace == "Processor" and Name == "UtilizationPercentage"
| summarize P90 = percentile(Val, 90) by Computer
| top 10 by P90

아래쪽 10 사용 가능한 디스크 공간 %

하위 10 컴퓨터별 사용 가능한 디스크 공간 %

InsightsMetrics
| where TimeGenerated > ago(24h)
| where Origin == "vm.azm.ms"
| where Namespace == "LogicalDisk" and Name == "FreeSpacePercentage"
| summarize P90 = percentile(Val, 90) by Computer
| top 10 by P90 asc