Useful Operational Insights Search Query Collection
[Edited October 27th 2014 - System Center Advisor is now a part of the new Microsoft Azure Operational Insights - Click to learn more]
This is a living document that will be periodically updated to collect useful, well-known, or sample queries to use in the Search experience in Microsoft Azure Operational Insights .
I dump new useful searches here as I come up with or stumble into them. Will keep this post periodically updated, so check it from time to time or subscribe to it.
These are some of the queries I use in my own workspace’s dashboard
I hope this will provide useful examples to learn from… but reminder the full query language reference is published here: https://go.microsoft.com/fwlink/?LinkId=394544 when you don’t understand why a given search magically works (or doesn’t) in your environment
They are grouped by broad categories that generally map to the Intelligence Pack that produces a specific ‘Type’ of data. We also have documentation on the Types we use in various intelligence packs and the meaning of their fields here https://msdn.microsoft.com/en-us/library/azure/dn884648.aspx
General Exploration Queries
Which Management Group is generating the most data points?
* | Measure count() by ManagementGroupName
Distribution of data Types
* | Measure count() by Type
List all Computers
ObjectName!="Advisor Metrics" ObjectName!=ManagedSpace | measure max(SourceSystem) by Computer | Sort Computer
List all Computers with their most recent data's timestamp
ObjectName!="Advisor Metrics" ObjectName!=ManagedSpace | measure max(TimeGenerated) by Computer | Sort Computer
List all Computers whose last reported data is older than 4 hours
ObjectName!="Advisor Metrics" ObjectName!=ManagedSpace ObjectName!="Advisor Metrics" ObjectName!=ManagedSpace | Measure Max(TimeGenerated) as LastData by Computer | Where LastData<NOW-4HOURS | Sort Computer
Note – the ObjectName!= filters in the three queries above is just a workaround to filter out some performance data whose target object in SCOM is NOT a ‘Computer’, hence will have a improper value in that field.
Note#2 – if you see ‘duplicate’ computer names (the NETBIOS name and the FQDN for the same machine listed as distinct computer), this might be due to IIS Logs – see post here where I describe the issue with the ‘Computer’ field https://blogs.technet.com/b/momteam/archive/2014/09/19/iis-log-format-requirements-in-system-center-advisor.aspx . If you know you have *other* data for that computer for sure – not just IIS logs - you can then easily filter those out (another workaround) and the last query above now becomes
Type!=W3CIISLog ObjectName!="Advisor Metrics" ObjectName!=ManagedSpace ObjectName!="Advisor Metrics" ObjectName!=ManagedSpace | Measure Max(TimeGenerated) as LastData by Computer | Where LastData<NOW-4HOURS | Sort Computer
Alert Management
Note – if you have been using System Center Advisor Preview, Type=Alert used to be wired to the alerts generated by Advisor Configuration Assessment scenario. With the introduction of ‘Alert Management’ Intelligence pack, which pulls up your Operations Manager Alerts into Operational Insights search, we have re-used Type=Alert for these ‘real’ and ‘reactive’ alerts, and have renamed the Advisor ‘legacy’ alerts (Configuration Assessment Alerts) to Type=ConfigurationAlert. You might need no update your saved searches.
Alerts raised during the past 1 day grouped by their severity
Type=Alert TimeRaised>NOW-1DAY | measure count() as Count by AlertSeverity
Alerts raised during the past 1 day sorted by their repeat count value
Type=Alert TimeRaised>NOW-1DAY | sort RepeatCount desc
Alerts raised during the past 24 hours which are now closed
Type=Alert TimeRaised>NOW-24HOUR AlertState=closed
Last Modified time for Alerts raised during the past 24 hours which are now closed
Type=Alert TimeRaised>NOW-24HOUR AlertState=closed | measure Max(TimeLastModified) by AlertName
Critical alerts raised during the past 24 hours
Type=Alert AlertSeverity=error TimeRaised>NOW-24HOUR
Critical alerts raised during the past 24 hours which are still active
Type=Alert AlertSeverity=error TimeRaised>NOW-24HOUR AlertState!=closed
Sources with active alerts raised during the past 24 hours
Type=Alert AlertState!=closed TimeRaised>NOW-24HOUR | measure count() as Count by SourceDisplayName
Warning alerts raised during the past 24 hours
Type=Alert AlertSeverity=warning TimeRaised>NOW-24HOUR
Capacity (Aggregated Performance Data)
All performance data
Type=PerfHourly
Average CPU utilization by Top 5 machines
* Type=PerfHourly CounterName="% Processor Time" InstanceName="_Total" | Measure avg(SampleValue) as AVGCPU by Computer | Sort AVGCPU desc | Top 5
Max CPU time used by HyperV by machine
Type=PerfHourly CounterName="% Total Run Time" InstanceName="_Total" ObjectName="Hyper-V Hypervisor Logical Processor" | Measure max(Max) as MAXCPU by Computer | Where MAXCPU>0
CPU Utilization by VM/Virtual Core
Type=PerfHourly ObjectName="Hyper-V Hypervisor Virtual Processor" CounterName="% Guest Run Time" NOT(InstanceName="_Total") | Measure Avg(SampleValue) by InstanceName
Memory Utilization by VM/Virtual Core
Type=PerfHourly ObjectName="Hyper-V Dynamic Memory VM" CounterName="Average Pressure" | Measure Avg(SampleValue) by InstanceName
Top Hosts with Highest Core Utilization
CounterName="% Core Utilization" Type=PerfHourly | Measure Avg(SampleValue) by Computer
Top Hosts with Highest Memory Utilization
CounterName="% Memory Utilization" Type=PerfHourly | Measure Avg(SampleValue) by Computer
Top Hosts with Inefficient VMs
CounterName="NumberVMOverUtilized" or CounterName="NumberVMIdle" or CounterName="NumberVMPoweredOff" Type=PerfHourly | Measure Avg(SampleValue) by Computer
Top Hosts by Utilization (mathematical average of CPU and Memory usage counters)
CounterName="% Core Utilization" or CounterName="% Memory Utilization" Type=PerfHourly | Measure Avg(SampleValue) as CombinedCPUMemAvg by Computer
Log Management (Windows Events)
This section contains a mix up of query scenarios. Each Windows log has its own flavor and adds some different, unique perspective about what the system is doing. In order for some of the queries below to work, you would have to first collect the necessary event log. Notice that, since the log management pipeline is one of the simplest and easiest from a functionality perspective, many of the examples around the ‘Operations Manager’ event log are actually useful to troubleshoot Advisor-related discovery and connectivity issue that might be preventing some of the other intelligence packs and scenarios from working. Have fun searching logs!
All Events
Type=Event
Count of Events containing the word "started" grouped by EventID
Type=Event "started" | Measure count() by EventID
Count of Events grouped by Event Log
Type=Event | Measure count() by EventLog
Count of Events grouped by Event Source
Type=Event | Measure count() by Source
Count of Events grouped by Event ID
Type=Event | Measure count() by EventID
All Events with level "Warning"
Type=Event EventLevelName=warning
Count of Events with level "Warning" grouped by Event ID
Type=Event EventLevelName=warning | Measure count() by EventID
How many connections to Operations Manager's SDK service by day
Type=Event EventID=26328 EventLog="Operations Manager" | Measure count() interval 1DAY
Events in the Operations Manager Event Log whose Event ID is in the range between 2000 and 3000
Type=Event EventLog="Operations Manager" EventID:[2000..3000]
Operations Manager Event Log’s Health Service Modules events around connectivity with Advisor
Type=Event EventLog="Operations Manager" EventID:[2100..2199]
Operations Manager Event Log’s Health Service Modules errors around Type Space (=Configuration Data) Subscription Module (if these errors are frequent, Predictions in Capacity Intelligence Pack might be affected/unavailable)
Type=Event EventID=4502 "Microsoft.EnterpriseManagement.Mom.Modules.SubscriptionDataSource.TypeSpaceSubscriptionDataSource"
When did my servers initiate restart?
shutdown Type=Event EventLog=System Source=User32 EventID=1074 | Select TimeGenerated,Computer
Did my servers shutdown unexpectedly?
Type=Event EventID=6008 Source=EventLog
SQL Server was waiting on a I/O request for longer than 15 seconds
EventID=833 EventLog=Application Source=MSSQLSERVER
Windows Firewall Policy settings have changed
Type=Event EventLog="Microsoft-Windows-Windows Firewall With Advanced Security/Firewall" EventID=2008
On which machines and how many times have Windows Firewall Policy settings changed
Type=Event EventLog="Microsoft-Windows-Windows Firewall With Advanced Security/Firewall" EventID=2008 | measure count() by Computer
The following are substitutes you can use to do get some of the same information as the ‘Antimalware Intelligence Pack’ by using Log Management (i.e. for logs you are pulling from WAD for Azure machines that don’t have the MMA agent installed)
Computers with Microsoft Antimalware (SCEP/Defender/Essentials) installed
Type=Event Source="Microsoft Antimalware" | measure count() as Count by Computer
Malware detections
Type=Event Source="Microsoft Antimalware" EventID=1116
Computers with no signature update in the last 24 hours
Type=Event Source="Microsoft Antimalware" EventID=2000 | measure max(TimeGenerated) as lastdata by Computer | where lastdata < NOW-24HOURS
Log Management (IIS Logs)
All IIS Log Entries
Type=W3CIISLog
Count of IIS Log Entries by HTTP Request Method
Type=W3CIISLog | Measure count() by csMethod
Count of IIS Log Entries by Client IP Address
Type=W3CIISLog | Measure count() by cIP
IIS Log Entries for a specific client IP Address (replace with your own)
Type=W3CIISLog cIP="192.168.0.1" | Select csUriStem,scBytes,csBytes,TimeTaken,scStatus
Count of IIS Log Entries by URL requested by client (without query strings)
Type=W3CIISLog | Measure count() by csUriStem
Count of IIS Log Entries by Host requested by client
Type=W3CIISLog | Measure count() by csHost
Count of IIS Log Entries by URL for the host "www.contoso.com" (replace with your own)
Type=W3CIISLog csHost="www.contoso.com" | Measure count() by csUriStem
Count of IIS Log Entries by HTTP User Agent
Type=W3CIISLog | Measure count() by csUserAgent
Total Bytes sent by Client IP Address
Type=W3CIISLog | Measure Sum(csBytes) by cIP
Total Bytes received by each Azure Role InstanceType=W3CIISLog | Measure Sum(csBytes) by RoleInstance
Total Bytes received by each IIS Computer
Type=W3CIISLog | Measure Sum(csBytes) by Computer
Total Bytes responded back to clients by each IIS Server IP Address
Type=W3CIISLog | Measure Sum(scBytes) by sIP
Total Bytes responded back to clients by Client IP Address
Type=W3CIISLog | Measure Sum(scBytes) by cIP
Average HTTP Request time by Client IP Address
Type=W3CIISLog | Measure Avg(TimeTaken) by cIP
Average HTTP Request time by HTTP Method
Type=W3CIISLog | Measure Avg(TimeTaken) by csMethod
[For more W3CIISLog search examples, also read the blog post I published earlier.]
Change Tracking
All Configuration Changes
Type=ConfigurationChange
All Software Changes
Type=ConfigurationChange ConfigChangeType=Software
All Windows Services Changes
Type=ConfigurationChange ConfigChangeType=WindowsServices
Change Type<Software> by Computer
Type=ConfigurationChange ConfigChangeType=Software | Measure count() by Computer
Total Changes by Computer
Type=ConfigurationChange | measure count() by Computer
Changes by Change Type
Type=ConfigurationChange | measure count() by ConfigChangeType
When was the most recent Change by Type?
Type=ConfigurationChange | measure Max(TimeGenerated) by ConfigChangeType
When was the most recent Change for each Computer?
Type=ConfigurationChange | measure Max(TimeGenerated) by Computer
List when Windows Services have been stopped
Type=ConfigurationChange ConfigChangeType=WindowsServices SvcState=Stopped
List of all Windows Services that have been stopped, by frequency
Type=ConfigurationChange ConfigChangeType=WindowsServices SvcState=Stopped | measure count() by SvcDisplayName
Count of different Software change types
Type=ConfigurationChange ConfigChangeType=Software | measure count() by ChangeCategory
SQL Assessment
Did the agent pass the prerequisite check (if results are present, SQL Assessment data might not be complete, you might want to check the RunAs account configuration https://technet.microsoft.com/en-us/library/dn818161.aspx )
Type=SQLAssessmentRecommendation IsRollup=false FocusArea="Prerequisites"
List of Focus Areas the Recommendations are categorized into
Type=SQLAssessmentRecommendation IsRollup=true | measure count() by FocusArea
SQL Recommendation by Computer
Type=SQLAssessmentRecommendation IsRollup=false RecommendationResult=Failed | measure count() by Computer
How much gain can I get by computer if I fix all its SQL recommendations?
Type=SQLAssessmentRecommendation IsRollup=false RecommendationResult=Failed | measure Sum(RecommendationWeight) by Computer
SQL Recommendation by Instance
Type=SQLAssessmentRecommendation IsRollup=false RecommendationResult=Failed | measure count() by SqlInstanceName
SQL Recommendation by Database
Type=SQLAssessmentRecommendation IsRollup=false RecommendationResult=Failed| measure count() by DatabaseName
How many SQL Recommendation are affecting a Computer a SQL Instance or a Database?
Type=SQLAssessmentRecommendation IsRollup=false RecommendationResult=Failed | measure count() by AffectedObjectType
How many times did each unique SQL Recommendation trigger?
Type=SQLAssessmentRecommendation IsRollup=false RecommendationResult=Failed | measure count() by Recommendation
Prioritized Detail Recommendations for a monthly 'RecommendationPeriod' (replace with YYYY-MM as appropriate). Great recipe for Excel export.
Type:SQLAssessmentRecommendation IsRollup=false RecommendationPeriod=2014-10 RecommendationResult=Failed | sort RecommendationWeight desc | Select RecommendationId,Recommendation,RecommendationResult,FocusArea,RecommendationWeight,Computer,AffectedObjectType,SqlInstanceName,DatabaseName
System Update Assessment
Missing Required Updates
Type=RequiredUpdate | Select UpdateTitle,KBID,UpdateClassification,UpdateSeverity,PublishDate,Computer
Missing Required Updates for server "SERVER1.contoso.com"
Type=RequiredUpdate (UpdateSeverity=Critical and UpdateClassification="Security Updates" and Server="SERVER1.contoso.com") | Select Computer,UpdateTitle,KBID,Product,UpdateSeverity,PublishDate
Missing Critical Security Updates
Type=RequiredUpdate (UpdateSeverity=Critical and UpdateClassification="Security Updates") | Select Computer,UpdateTitle,KBID,Product,UpdateSeverity,PublishDate
Missing Security Updates
Type=RequiredUpdate UpdateClassification="Security Updates" | Select Computer,UpdateTitle,KBID,Product,UpdateSeverity,PublishDate
Missing Update Rollups
Type=RequiredUpdate UpdateClassification="Update Rollups" | Select UpdateTitle,KBID,UpdateClassification,UpdateSeverity,PublishDate,Computer
Missing Updates by Product
Type=RequiredUpdate | Measure count() by Product
Missing Updates for a specific product ("Windows Server 2012" in the example)
Type=RequiredUpdate Product="Windows Server 2012"
Malware Assessment
Devices with Signatures out of date
Type=ProtectionStatus | measure max(ProtectionStatusRank) as Rank by DeviceName | where Rank=250
Protection Status updates per day
Type=ProtectionStatus | Measure count(ScanDate) interval 1DAY | Sort TimeGenerated desc
Malware detected grouped by 'threat'
Type=ProtectionStatus NOT (ThreatStatus="No threats detected") | Measure count() by Threat
Configuration Assessment (Legacy Advisor Scenario)
NOTE: For the legacy Advisor Configuration Assessment scenario, in addition to the old Silverlight screens, some data is also indexed in the new Search feature for exploration purposes. Records of Type=ConfigurationObject are indexed and updated every time an object is discovered (or re-discovered) by Advisor Configuration Assessment. There are also records of Type=ConfigurationObjectProperties that represent the properties of those objects. These are only inserted in the index when their VALUE has CHANGED since the previous known value Advisor had discovered till the previous discovery. This is somewhat similar to ‘Change Tracking’ Intelligence Pack, but less sophisticated. Also records of Type=ConfigrationAlert are indexed once those Configuration Assessment Alerts are fired (each time, even if it is a ‘repeat’ i.e. because the HealthService has restarted) on Advisor agents by Advisor Configuration Assessment Alert Rules you are not ignoring.
All 'Advisor Managed' Computers that have reported Configuration Assessment data
Type=ConfigurationObject ObjectType="Microsoft.Windows.Computer" | Measure count() by Computer
All 'Advisor Managed' Computers that have reported Configuration Assessment data (alternate version)
Type=ConfigurationObject ObjectType="Microsoft.Windows.Computer" | Measure Max(TimeGenerated) by Computer
Count of machines by Operating System
Type=ConfigurationObject ObjectType="Microsoft.Windows.OperatingSystem" | Measure count() by ObjectDisplayName
All Property changes tracked by Advisor Configuration Assessment for Computer "OM54.contoso.com" (replace with your own computer name)
Type="ConfigurationObjectProperty" RootObjectName="OM054.contoso.com"
IP Address changes tracked by Advisor Configuration Assessment for Computer "OM54.contoso.com" (replace with your own computer name)
Type="ConfigurationObjectProperty" Name="Microsoft.Windows.Computer.IPAddress" RootObjectName="OM054.contoso.com"
Check SQL Collation settings for each database called "tempdb" on each SQL instance on each SQL server
Type="ConfigurationObjectProperty" Name="Microsoft.SQLServer.Database.Collation" ObjectDisplayName="tempdb" | Select ObjectDisplayName, ParentObjectName, RootObjectName, Value
Machines grouped by Organizational Unit
Type="ConfigurationObjectProperty" Name="Microsoft.Windows.Computer.OrganizationalUnit" | Measure count() by Value | Where AggregatedValue>0
All Alerts generated by Advisor Configuration Assessment
Type=ConfigurationAlert
Worst Severity of Configuration Assessment Alerts by Computer
Type=ConfigurationAlert | measure Max(Severity) by Computer
Configuration Assessment Alerts grouped by Rule/Monitor that generated them
Type=ConfigurationAlert | measure count() by WorkflowName
Configuration Assessment Alerts for ‘SQL Server’ workload
Type=ConfigurationAlert Workload=“SQL Server”
Active Machine-Generated Recommendations for 'Windows' (or 'SQL Server') Workloads
Type=Recommendation RecommendationStatus=Active AdvisorWorkload=Windows
Type=Recommendation RecommendationStatus=Active AdvisorWorkload="SQL Server"
Active Machine-Generated Recommendations grouped by Computer
Type=Recommendation RecommendationStatus=Active | Measure count() by RootObjectName
List Active Directory Sites (based on computers that had that changed)
Type=ConfigurationObjectProperty Name="Microsoft.Windows.Computer.ActiveDirectorySite" | Measure count() by Value
Which machines have the most memory assigned (and that has changed - probably you will only have data for VMs with dynamic memory most of the times with this query)
Type=ConfigurationObjectProperty Name="Microsoft.Windows.OperatingSystem.PhysicalMemory" | Measure Max(Value) by RootObjectName
Other searches on blogs
Stan has some useful ones mainly around System Update and Malware Assessments in this post https://cloudadministrator.wordpress.com/2014/10/19/system-center-advisor-restarted-time-matters-in-dashboard-part-6/ and about SQL Assessment in this other one https://cloudadministrator.wordpress.com/2014/10/23/microsoft-azure-operational-insights-preview-series-sql-assessment-part-7/
For more W3CIISLog search examples, I also posted another blog post here.
For more Windows Event searches around IIS errors (cloned from the IIS MP) in another blog post here.
I am also putting out a series of posts that guide you to take your first steps with the search syntax:
- https://blogs.msdn.com/b/dmuscett/archive/2014/10/19/advisor-search-first-steps-how-to-filter-data-part-i.aspx
- https://blogs.msdn.com/b/dmuscett/archive/2014/10/19/advisor-search-how-to-part-ii-more-on-filtering-using-boolean-operators-and-the-time-dimension.aspx
- https://blogs.msdn.com/b/dmuscett/archive/2014/10/19/advisor-search-how-to-part-iii-manipulating-results-the-pipeline-and-search-commands.aspx
- https://blogs.msdn.com/b/dmuscett/archive/2014/10/29/operational-insights-search-how-to-part-iv-introducing-the-measure-command.aspx
- https://blogs.msdn.com/b/dmuscett/archive/2014/10/29/azure-operational-insights-search-howto-part-v-max-and-min-statistical-functions-with-measure-command.aspx
- https://blogs.msdn.com/b/dmuscett/archive/2014/10/31/azure-operational-insights-search-how-to-part-vi-measure-avg-and-an-exploration-of-type-perfhourly.aspx
- https://blogs.msdn.com/b/dmuscett/archive/2014/11/10/azure-operational-insights-search-hot-to-part-vii-measure-sum-and-where-command.aspx
Comments
Anonymous
October 28, 2014
You might also find useful the series of blog posts on the syntax Part I - blogs.msdn.com/.../operational-insights-search-how-to-part-iv-introducing-the-measure-command.aspx Part II - blogs.msdn.com/.../advisor-search-how-to-part-ii-more-on-filtering-using-boolean-operators-and-the-time-dimension.aspx Part III - blogs.msdn.com/.../advisor-search-how-to-part-iii-manipulating-results-the-pipeline-and-search-commands.aspx Part IV - blogs.msdn.com/.../operational-insights-search-how-to-part-iv-introducing-the-measure-command.aspx (more to follow)Anonymous
July 22, 2015
I am trying to run the Query to find Distinct Computer Name and I am explicitly Excluding IIS Logs Type=Event Type!=W3CIISLog | Distinct Computer Still I am getting server with NetBIOS Name for the below Event , could you let me know how can I remove NetBIOS Name in the Query. EventLevelName : Information SourceSystem : OpsManager Source : Microsoft-Windows-EventSystem EventLog : Application EventCategory : 0 EventLevel : 4 UserName : N/A Message :Anonymous
July 22, 2015
@Jason - The bug for W3CIISLog and short computer name was fixed - feedback.azure.com/.../6772198-i-have-multiple-directly-connected-servers-listed if you are seeing NetBIOS name in the Computer field of some other type, to figure out which one it is try including one 'Type' at the time, rather than excluding them. It's possible that some other solutions produce the NetBIOS name too. Or, if ALL data (including Type=Event) has the short name, then it means that's what the agent knows about the machine name...Anonymous
August 19, 2015
I would like to build additional logic to extract my EventLog data in the field of ParameterXML. Here is my query: Type=Event EventLog=abcd EventID=17 | Select Computer, EventLog,ParameterXml What would be the additional syntax to accomplish it? Thanks,Anonymous
August 20, 2015
@Tuan, we don't support that at query time. But you can now define a field extraction - the feature just shipped a couple days ago - and then use that new field you have defined: blogs.technet.com/.../create-your-own-fields-in-oms-with-custom-fields.aspx