Malware scanning in Defender for Storage
Malware Scanning in Defender for Storage helps protect your storage accounts from malicious content by performing a full malware scan on uploaded content in near real time, using Microsoft Defender Antivirus capabilities. It's designed to help fulfill security and compliance requirements for handling untrusted content.
The Malware Scanning capability is an agentless SaaS solution that allows simple setup at scale, with zero maintenance, and supports automating response at scale.
Malware upload is a top threat on cloud storage
Content uploaded to cloud storage could be malware. Storage accounts can be a malware entry point into the organization and a malware distribution point. To protect organizations from this threat, content in cloud storage must be scanned for malware before it's accessed.
Malware scanning in Defender for Storage helps protect storage accounts from malicious content
- A built-in SaaS solution that allows simple enabling at scale with zero maintenance.
- Comprehensive antimalware capabilities using Microsoft Defender Antivirus (MDAV), catching polymorphic and metamorphic malware.
- Every file type is scanned (including archives like zip files) and a result is returned for every scan. The file size limit is 2 GB.
- Supports response at scale – deleting or quarantining suspicious files, based on the blobs’ index tags or Event Grid events.
- When the malware scan identifies a malicious file, detailed Microsoft Defender for Cloud security alerts are generated.
- Designed to help fulfill security and compliance requirements to scan untrusted content uploaded to storage, including an option to log every scan result.
Common use-cases and scenarios
Some common use-cases and scenarios for malware scanning in Defender for Storage include:
Web applications: many cloud web applications allow users to upload content to storage. This allows low maintenance and scalable storage for applications like tax apps, HR sites that accept CV uploads, and receipt uploads.
Content protection: assets like videos and photos are commonly shared and distributed at scale both internally and to external parties. CDNs (Content Delivery Networks) and content hubs are a classic malware distribution opportunity.
Compliance requirements: resources that adhere to compliance standards like NIST, SWIFT, GDPR, and others require robust security practices, which include malware scanning. It's critical for organizations operating in regulated industries or regions.
Third-party integration: third-party data can come from a wide variety of sources, such as business partners, developers, and contractors, and not all of them may have robust security practices. Scanning for malware helps to ensure that this data doesn't introduce security risks to your system.
Collaborative platforms: similar to file sharing, teams use cloud storage for continuously sharing content and collaborating across teams and organizations. Scanning for malware ensures safe collaboration.
Data pipelines: data moving through ETL (Extract, Transform, Load) processes can come from multiple sources and may include malware. Scanning for malware can help to ensure the integrity of these pipelines.
Machine learning training data: the quality and security of the training data are critical for effective machine learning models. It's important to ensure these data sets are clean and safe, especially if they include user-generated content or data from external sources.
To enable and configure Malware Scanning, you must have Owner roles (such as Subscription Owner or Storage Account Owner) or specific roles with the necessary data actions. Learn more about the required permissions.
You can enable and configure Malware Scanning at scale for your subscriptions while maintaining granular control over configuring the feature for individual storage accounts. There are several ways to enable and configure Malware Scanning: Azure built-in policy (the recommended method), programmatically using Infrastructure as Code templates, including Terraform, Bicep, and ARM templates, using the Azure portal, or directly with the REST API.
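As a rough illustration of the REST API option, the sketch below only builds the JSON body that such a request might carry for one storage account. It doesn't send anything. The field names (isEnabled, malwareScanning.onUpload, capGBPerMonth, overrideSubscriptionLevelSettings) are assumptions that don't come from this article; check them against the current REST API reference before use. The helper name build_settings_body is hypothetical.

```python
import json


def build_settings_body(on_upload_enabled: bool = True,
                        cap_gb_per_month: int = 5000,
                        override_subscription_settings: bool = False) -> dict:
    """Build a request body for enabling malware scanning on one storage account.

    All property names below are assumptions to verify against the REST API
    reference; this function only assembles a dictionary.
    """
    return {
        "properties": {
            "isEnabled": True,  # Defender for Storage itself
            "malwareScanning": {
                "onUpload": {
                    "isEnabled": on_upload_enabled,
                    "capGBPerMonth": cap_gb_per_month,
                }
            },
            # Assumed flag that lets an account-level setting differ from
            # the subscription-level configuration.
            "overrideSubscriptionLevelSettings": override_subscription_settings,
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_settings_body(cap_gb_per_month=1000), indent=2))
```

Keeping the body in a small function makes it easy to reuse the same settings across many storage accounts, which is the same idea the policy and Infrastructure as Code options apply at scale.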
How does malware scanning work
On-upload malware scanning
When a blob is uploaded to a protected storage account, a malware scan is triggered. All upload methods trigger the scan. Modifying a blob is an upload operation, so the modified content is scanned after the update.
Scan regions and data retention
The malware scanning service, which uses Microsoft Defender Antivirus technologies, reads the blob. Malware Scanning scans the content "in-memory" and deletes scanned files immediately after scanning. The content isn't retained. The scanning occurs in the same region as the storage account. In some cases, when a file is suspicious and more data is required, Malware Scanning may share file metadata outside the scanning region, including metadata classified as customer data (for example, SHA-256 hash), with Microsoft Defender for Endpoint.
Access customer data
The Malware Scanning service requires access to your data to scan your data for malware. During service enablement, a new Data Scanner resource called StorageDataScanner is created in your Azure subscription. This resource is granted with a Storage Blob Data Owner role assignment to access and change your data for malware scanning and sensitive data discovery.
Private Endpoint is supported out-of-the-box
Malware scanning in Defender for Storage is supported in storage accounts that use private endpoints while maintaining data privacy.
Private endpoints provide secure connectivity to your Azure storage services, eliminating public internet exposure, and are considered a best practice.
Setup of malware scanning
When malware scanning is enabled, the following actions automatically take place in your environment:
For each storage account you enable malware scanning on, an Event Grid System Topic resource is created in the same resource group as the storage account. The malware scanning service uses it to listen for blob upload triggers. Removing this resource breaks the malware scanning functionality.
To scan your data, the Malware Scanning service requires access to your data. During service enablement, a new Data Scanner resource called StorageDataScanner is created in your Azure subscription and assigned a system-assigned managed identity. This resource is granted the Storage Blob Data Owner role assignment, permitting it to access your data for purposes of Malware Scanning and Sensitive Data Discovery.
If your storage account Networking configuration is set to Enable Public network access from selected virtual networks and IP addresses, the StorageDataScanner resource is added to the Resource instances section under storage account Networking configuration to allow access to scan your data.
If you're enabling malware scanning on the subscription level, a new Security Operator resource called StorageAccounts/securityOperators/DefenderForStorageSecurityOperator is created in your Azure subscription and assigned a system-managed identity. This resource is used to enable and repair Defender for Storage and Malware Scanning configuration on existing storage accounts and check for new storage accounts created in the subscription to be enabled. This resource has role assignments that include the specific permissions needed to enable malware scanning.
Malware scanning depends on certain resources, identities, and networking settings to function properly. If you modify or delete any of these, malware scanning will stop working. To restore its normal operation, you can turn it off and on again.
Providing scan results
Malware scanning scan results are available through four methods. After setup, you'll see scan results as blob index tags for every uploaded and scanned file in the storage account, and as Microsoft Defender for Cloud security alerts when a file is identified as malicious.
You may choose to configure extra scan result methods, such as Event Grid and Log Analytics; these methods require extra configuration. In the next section, you'll learn about the different scan result methods.
Blob index tags
Blob index tags are metadata fields on a blob. They categorize data in your storage account using key-value tag attributes. These tags are automatically indexed and exposed as a searchable multi-dimensional index to easily find data. The scan results are concise: the blob metadata shows only the Malware Scanning scan result and the Malware Scanning scan time UTC. Other result types (alerts, events, logs) provide more information on the malware type and file upload operation.
Blob index tags can be used by applications to automate workflows, but aren't tamper-resistant. Read more on setting up response.
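A minimal sketch of how an application might turn a blob's index tags into a workflow decision is shown below. It works on a plain dictionary of tags, so it makes no calls to any storage SDK. The tag key and the result values ("No threats found", "Malicious") are assumptions for illustration; confirm them against the tags on your own scanned blobs. The function name decide_action is hypothetical.

```python
from typing import Mapping

# Assumed tag key and values; verify against tags on real scanned blobs.
SCAN_RESULT_TAG = "Malware Scanning scan result"
RESULT_CLEAN = "No threats found"
RESULT_MALICIOUS = "Malicious"


def decide_action(tags: Mapping[str, str]) -> str:
    """Map a blob's index tags to one of: wait, release, quarantine, review."""
    result = tags.get(SCAN_RESULT_TAG)
    if result is None:
        # No scan result tag yet: the blob hasn't been scanned, or index
        # tag results aren't available for this account.
        return "wait"
    if result == RESULT_CLEAN:
        return "release"
    if result == RESULT_MALICIOUS:
        return "quarantine"
    # Any other value (for example, a scan that couldn't complete)
    # goes to manual review rather than being treated as clean.
    return "review"


if __name__ == "__main__":
    print(decide_action({SCAN_RESULT_TAG: "Malicious"}))
```

Treating a missing or unrecognized result as "not clean" is a deliberate choice here: because index tags aren't tamper-resistant, a workflow that fails closed is the safer default.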
Defender for Cloud security alerts
When a malicious file is detected, Microsoft Defender for Cloud generates a Microsoft Defender for Cloud security alert. To see the alert, go to Microsoft Defender for Cloud security alerts. The security alert contains details and context on the file, the malware type, and recommended investigation and remediation steps. To use these alerts for remediation, you can:
- View security alerts in the Azure portal by navigating to Microsoft Defender for Cloud > Security alerts.
- Configure automations based on these alerts.
- Export security alerts to a SIEM. You can continuously export security alerts to Microsoft Sentinel (Microsoft’s SIEM) using the Microsoft Sentinel connector, or to another SIEM of your choice.
Learn more about responding to security alerts.
Event Grid event
Event Grid is useful for event-driven automation. It's the fastest method to get results, delivering them with minimum latency in the form of events that you can use for automating response.
Events from Event Grid custom topics can be consumed by multiple endpoint types. The most useful for malware scanning scenarios are:
- Function App (previously called Azure Function) – use a serverless function to run code for automated response like move, delete or quarantine.
- Webhook – to connect an application.
- Event Hubs & Service Bus Queue – to notify downstream consumers.
Learn how to configure Malware Scanning so that every scan result is sent automatically to an Event Grid topic for automation purposes.
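The automated response described above can be sketched as a small handler that inspects one scan result event and returns the blob to quarantine, if any. The event type string and the data field names (scanResultType, blobUri) are assumptions for illustration and should be checked against the event schema you actually receive. The function name blob_to_quarantine is hypothetical, and the move or delete step itself is left out.

```python
from typing import Any, Mapping, Optional

# Assumed event type and field names; verify against real events.
SCAN_EVENT_TYPE = "Microsoft.Security.MalwareScanningResult"


def blob_to_quarantine(event: Mapping[str, Any]) -> Optional[str]:
    """Return the blob URI if the event reports a malicious blob, else None."""
    if event.get("eventType") != SCAN_EVENT_TYPE:
        return None  # not a malware scanning result
    data = event.get("data") or {}
    if data.get("scanResultType") == "Malicious":
        return data.get("blobUri")
    return None


if __name__ == "__main__":
    sample = {
        "eventType": SCAN_EVENT_TYPE,
        "data": {
            "scanResultType": "Malicious",
            "blobUri": "https://example.blob.core.windows.net/uploads/file.bin",
        },
    }
    print(blob_to_quarantine(sample))
```

In a Function App or webhook, a function like this would sit between event delivery and the code that moves or deletes the blob, which keeps the decision logic easy to test on its own.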
You may want to log your scan results for compliance evidence or investigating scan results. By setting up a Log Analytics Workspace destination, you can store every scan result in a centralized log repository that is easy to query. You can view the results by navigating to the Log Analytics destination workspace and looking for the
Learn more about setting up logging for malware scanning.
We invite you to explore the Malware Scanning feature in Defender for Storage through our hands-on lab. Follow the Ninja training instructions for a detailed, step-by-step guide on how to set up and test Malware Scanning end-to-end, including configuring responses to scanning results. This is part of the 'labs' project that helps customers get ramped up with Microsoft Defender for Cloud and provides hands-on practical experience with its capabilities.
Malware scanning is billed per GB scanned. To provide cost predictability, Malware Scanning supports setting a cap on the amount of GB scanned in a single month per storage account.
The "capping" mechanism sets a monthly scanning limit, measured in gigabytes (GB), for each storage account, serving as an effective cost control. If a scanning limit is set for a storage account, scanning automatically halts once the limit is reached within a single calendar month (with up to a 20-GB deviation), and files are no longer scanned for malware. Updating the cap typically takes up to an hour to take effect.
By default, a limit of 5 TB (5,000 GB) is established if no specific capping mechanism is defined.
You can set the capping mechanism on either individual storage accounts or across an entire subscription (every storage account on the subscription will be allocated the limit defined on the subscription level).
Follow these steps to configure the capping mechanism.
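To get a feel for how a cap interacts with upload volume, the following sketch projects one month of uploads against a cap. It's a simplification under stated assumptions: uploads arrive at a constant daily rate, the month has 30 days by default, and the up-to-20-GB deviation is ignored. The function name cap_projection is hypothetical.

```python
import math
from typing import Optional

DEFAULT_CAP_GB = 5000  # default monthly cap when none is configured


def cap_projection(daily_upload_gb: float,
                   days_in_month: int = 30,
                   cap_gb: float = DEFAULT_CAP_GB) -> dict:
    """Estimate scanned and unscanned volume for one calendar month."""
    total_gb = daily_upload_gb * days_in_month
    scanned_gb = min(total_gb, cap_gb)
    unscanned_gb = total_gb - scanned_gb
    # First day on which cumulative uploads reach the cap, if they do.
    day_cap_reached: Optional[int] = None
    if daily_upload_gb > 0 and total_gb >= cap_gb:
        day_cap_reached = math.ceil(cap_gb / daily_upload_gb)
    return {
        "total_gb": total_gb,
        "scanned_gb": scanned_gb,
        "unscanned_gb": unscanned_gb,
        "day_cap_reached": day_cap_reached,
    }


if __name__ == "__main__":
    print(cap_projection(daily_upload_gb=200))
```

With 200 GB uploaded per day and the default cap, this projection reaches the cap on day 25 and leaves about 1,000 GB unscanned, which suggests either raising the cap or accepting that late-month uploads go unscanned.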
Handling possible false positives
If you have a file that you suspect might be malware or is being incorrectly detected, you can submit it to us for analysis through the sample submission portal. Select “Microsoft Defender for Storage” as the source.
Malware Scanning doesn't block access or change permissions to the uploaded blob, even if it's malicious.
Unsupported features and services
- Unsupported storage accounts: Legacy v1 storage accounts aren't supported by malware scanning.
- Unsupported service: Azure Files isn't supported by malware scanning.
- Unsupported blob types: Append and Page blobs aren't supported for Malware Scanning.
- Unsupported encryption: Client-side encrypted blobs aren't supported as they can't be decrypted before scanning by the service. However, data encrypted at rest by Customer Managed Key (CMK) is supported.
- Unsupported index tag results: Index tag scan result isn't supported in storage accounts with Hierarchical namespace enabled (Azure Data Lake Storage Gen2).
- Event Grid: Event Grid topics that don't have public network access enabled (i.e. private endpoint connections) are not supported by malware scanning in Defender for Storage.
Throughput capacity and blob size limit
Scan throughput rate limit: Malware Scanning can process up to 2 GB per minute for each storage account. If the rate of file upload momentarily exceeds this threshold for a storage account, the system attempts to scan the files in excess of the rate limit. If the rate of file upload consistently exceeds this threshold, some blobs won't be scanned.
Blob scan limit: Malware Scanning can process up to 2,000 files per minute for each storage account. If the rate of file upload momentarily exceeds this threshold for a storage account, the system attempts to scan the files in excess of the rate limit. If the rate of file upload consistently exceeds this threshold, some blobs won't be scanned.
Blob size limit: The maximum size limit for a single blob to be scanned is 2 GB. Blobs that are larger than the limit won't be scanned.
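The three limits above can be checked against an expected upload pattern with a short sketch like the one below. It takes the sizes of blobs uploaded within one minute and reports which limits that minute would exceed. Two assumptions are made for illustration: blobs over the size limit don't count toward the throughput volume, and 1 GB is treated as a plain unit without distinguishing decimal from binary gigabytes. The function name check_upload_minute is hypothetical.

```python
from typing import List, Sequence

MAX_GB_PER_MINUTE = 2.0      # scan throughput limit per storage account
MAX_BLOBS_PER_MINUTE = 2000  # blob count limit per storage account
MAX_BLOB_SIZE_GB = 2.0       # largest single blob that is scanned


def check_upload_minute(blob_sizes_gb: Sequence[float]) -> List[str]:
    """Return the limits exceeded by one minute of uploads (empty if none)."""
    exceeded: List[str] = []
    scannable = [s for s in blob_sizes_gb if s <= MAX_BLOB_SIZE_GB]
    if len(scannable) < len(blob_sizes_gb):
        exceeded.append("blob_size")   # at least one blob is too large
    if sum(scannable) > MAX_GB_PER_MINUTE:
        exceeded.append("throughput")  # more than 2 GB in the minute
    if len(blob_sizes_gb) > MAX_BLOBS_PER_MINUTE:
        exceeded.append("blob_count")  # more than 2,000 blobs in the minute
    return exceeded


if __name__ == "__main__":
    print(check_upload_minute([1.5, 1.5, 3.0]))
```

A momentary result of "throughput" or "blob_count" isn't necessarily a problem, since the service attempts to scan the excess. The check matters most when the same limit is exceeded minute after minute.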
Blob uploads and index tag updates
When a blob is uploaded to the storage account, malware scanning initiates an extra read operation and updates the index tag. In most cases, these operations don't generate significant load.
Impact on access and storage IOPS
Despite the scanning process, access to uploaded data remains unaffected, and the impact on storage Input/Output Operations Per Second (IOPS) is minimal.
Learn more on how to set up response for malware scanning results.