This article provides an overview of the agentic CLI for Azure Kubernetes Service (AKS), an AI-powered troubleshooting and insights tool that brings advanced diagnostics directly to your terminal. This feature is designed to help AKS administrators or developers quickly diagnose, understand, and resolve complex issues without needing deep Kubernetes expertise or memorizing command syntax.
Agentic CLI for AKS overview
The agentic CLI for AKS provides the `az aks agent` command group. You can use it to ask natural language questions about your cluster's health, configuration, and issues.
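As a rough illustration, a script could pass a natural language question to the command group. The sketch below only builds the argument list and doesn't run anything. It assumes the question is a positional argument and that the cluster is selected with `--name` and `--resource-group`, following the usual `az aks` conventions; confirm this with `az aks agent --help`. The `build_agent_command` helper, the cluster name, and the resource group name are hypothetical.

```python
import shlex


def build_agent_command(question, cluster_name, resource_group):
    """Return the argv list for a single `az aks agent` question.

    Assumption: the command accepts the question as a positional argument
    and the cluster through --name / --resource-group. Verify with
    `az aks agent --help` before relying on this.
    """
    if not question.strip():
        raise ValueError("question must not be empty")
    return [
        "az", "aks", "agent", question,
        "--name", cluster_name,
        "--resource-group", resource_group,
    ]


argv = build_agent_command(
    "Why are my pods stuck in Pending?",
    cluster_name="myAKSCluster",        # hypothetical cluster name
    resource_group="myResourceGroup",   # hypothetical resource group
)
# shlex.join quotes the question so the line can be pasted into a shell.
print(shlex.join(argv))
```

To run the command from Python instead of printing it, the same list could be handed to `subprocess.run(argv, check=True)`.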
Get cluster information, configurations, and insights
You can use the agentic CLI for AKS to quickly gather detailed information about your AKS clusters, including:
- Comprehensive cluster status and configuration details.
- Real-time cluster metrics and health information.
- Intelligent analysis of cluster state and potential issues.
- Proactive recommendations based on cluster configuration and workload patterns.
Troubleshoot advanced AKS, Kubernetes, and health issues
The agentic CLI for AKS uses AI to help you troubleshoot complex issues by providing:
- AI-powered diagnostics that analyze complex cluster problems.
- Intelligent issue detection across the AKS control plane, node pools, and workloads.
- Automated root cause analysis for networking, storage, and security issues.
- Guided troubleshooting workflows with step-by-step remediation suggestions.
- Integration with Microsoft's extensive Kubernetes troubleshooting knowledge base.
Deployment modes
The agentic CLI for AKS supports two deployment modes to accommodate different operational requirements and security models:
Client mode
Client mode runs the agentic CLI locally on your machine using Docker containers. This mode is ideal for:
- Development and testing environments where you want quick, local access to cluster diagnostics.
- Individual developer workflows where you prefer to run tools locally with your existing credentials.
- Environments with strict cluster security policies that limit in-cluster deployments.
Key characteristics:
- Uses your local Azure CLI credentials and kubectl configuration
- Requires Docker to be installed and running locally
- Provides the same diagnostic capabilities as cluster mode
- Ideal for ad-hoc troubleshooting and development scenarios
Cluster mode
Cluster mode deploys the agentic CLI as a pod within your AKS cluster using Kubernetes service accounts and workload identity. This mode is recommended for:
- Production environments where you want the agent running closer to cluster resources.
- Shared team environments where multiple users need consistent access to cluster diagnostics.
- Automated workflows that require persistent agent availability.
- Enhanced security scenarios with workload identity and Azure RBAC integration.
Key characteristics:
- Uses Kubernetes service accounts with workload identity for secure authentication
- Runs directly within the AKS cluster for optimal performance and network access
- Supports optional Azure RBAC integration for enhanced security
- Ideal for production monitoring and shared operational workflows
Choosing the right mode
| Consideration | Client Mode | Cluster Mode |
|---|---|---|
| Deployment location | Local machine | AKS cluster |
| Authentication | Local Azure credentials | Service account + workload identity |
| Prerequisites | Docker | Service account |
| Use case | Development, testing | Production, shared environments |
| Performance | Network dependent | Optimized cluster access |
| Security | Local credential management | Azure RBAC integration |
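The comparison above can be read as a simple decision rule. The following sketch encodes one possible reading of it; the `Environment` fields and the `suggest_mode` helper are hypothetical and not part of the agentic CLI, and real decisions will likely weigh more factors than these three.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    docker_available: bool           # Docker is installed and running locally
    in_cluster_deploy_allowed: bool  # policy permits deploying the agent pod
    shared_or_production: bool       # shared team or production usage


def suggest_mode(env: Environment) -> str:
    """Suggest "client" or "cluster", loosely mirroring the comparison table."""
    # Production or shared usage favors cluster mode when it is permitted.
    if env.shared_or_production and env.in_cluster_deploy_allowed:
        return "cluster"
    # Otherwise prefer client mode if its Docker prerequisite is met.
    if env.docker_available:
        return "client"
    # Without Docker, cluster mode is the only remaining option.
    if env.in_cluster_deploy_allowed:
        return "cluster"
    raise ValueError("neither mode's prerequisites are met")


# A developer laptop with Docker, on a cluster that restricts in-cluster deployments.
print(suggest_mode(Environment(True, False, False)))  # client
# A shared production cluster that allows the agent pod.
print(suggest_mode(Environment(False, True, True)))   # cluster
```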
Best practices for using the agentic CLI for AKS
To maximize the effectiveness of the agentic CLI for AKS, consider the following best practices:
- Start with broad diagnostic queries: Begin with general questions, like "What's wrong with my cluster?" Let the AI guide you to specific issues.
- Use descriptive problem statements: Provide context about symptoms that you observe for better AI analysis.
- Review AI recommendations carefully: Always understand the suggested solutions before you implement them.
- Use historical analysis: Ask about patterns and trends in cluster behavior over time.
- Provide feedback: Help improve the AI by providing feedback on the accuracy and usefulness of diagnostic responses.
- Use alongside traditional monitoring: Complement AI insights with Azure Monitor and other observability tools.
Security considerations
Keep the following security considerations in mind when you use the agentic CLI for AKS:
General security practices
- Follow the principle of least privilege when configuring access permissions.
- Review AI recommendations carefully before implementing suggested solutions.
- Audit command usage through Azure activity logs and cluster audit logs.
- Ensure your LLM API keys are stored securely and rotated regularly.
Client mode security
- Ensure your local Azure CLI credentials are properly secured and up to date.
- Use secure Docker configurations and keep Docker images updated.
- Be mindful of local credential storage and access permissions.
Cluster mode security
- Configure proper Kubernetes RBAC permissions for the service account.
- Enable workload identity for secure Azure resource access.
- Consider implementing Azure RBAC integration for enhanced security controls.
- Use network policies to control agent pod communication if required.
- Regularly review and rotate workload identity credentials.
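To make the least-privilege guidance for the service account more concrete, the sketch below generates a read-only Kubernetes `Role` manifest. The role name, the namespace, and the resource list are illustrative assumptions; the permissions that the agent actually needs might differ, so treat this as a starting point rather than a recommended configuration.

```python
import json

# Read-only verbs: no create, update, patch, or delete.
READ_ONLY_VERBS = ["get", "list", "watch"]


def read_only_role(namespace, name="aks-agent-reader"):
    """Build a namespaced Role manifest that grants read-only access.

    The role name and resource list are illustrative assumptions; the
    permissions the agent actually needs may differ.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],  # core API group
                "resources": ["pods", "pods/log", "events", "services"],
                "verbs": READ_ONLY_VERBS,
            },
            {
                "apiGroups": ["apps"],
                "resources": ["deployments", "replicasets"],
                "verbs": READ_ONLY_VERBS,
            },
        ],
    }


manifest = read_only_role("default")
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed manifest could be saved to a file and applied directly. A separate `RoleBinding` would still be needed to attach the role to the agent's service account.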
Get started with the agentic CLI for AKS
To start using the agentic CLI for AKS or to learn more, refer to the following resources: