Malware scanning in Defender for Storage

Malware Scanning in Defender for Storage helps protect your Azure Blob Storage from malicious content by performing a full malware scan on uploaded content in near real time, using Microsoft Defender Antivirus capabilities. It's designed to help fulfill security and compliance requirements for handling untrusted content.

The Malware Scanning capability is an agentless SaaS solution that allows simple setup at scale, with zero maintenance, and supports automating response at scale.

Diagram showing how malware scanning protects your data from malicious code.

Malware upload is a top threat on cloud storage

Content uploaded to cloud storage could be malware. Storage accounts can be a malware entry point into the organization and a malware distribution point. To protect organizations from this threat, content in cloud storage must be scanned for malware before it's accessed.

Malware scanning in Defender for Storage helps protect storage accounts from malicious content

  • A built-in SaaS solution that allows simple enabling at scale with zero maintenance.
  • Comprehensive antimalware capabilities using Microsoft Defender Antivirus (MDAV), catching polymorphic and metamorphic malware.
  • Every file type is scanned (including archives like zip files) and a result is returned for every scan. The file size limit is 2 GB.
  • Supports response at scale – deleting or quarantining suspicious files, based on the blobs’ index tags or Event Grid events.
  • When the malware scan identifies a malicious file, detailed Microsoft Defender for Cloud security alerts are generated.
  • Designed to help fulfill security and compliance requirements to scan untrusted content uploaded to storage, including an option to log every scan result.

Common use-cases and scenarios

Some common use-cases and scenarios for malware scanning in Defender for Storage include:

  • Web applications: many cloud web applications allow users to upload content to storage. This allows low maintenance and scalable storage for applications like tax apps, CV upload HR sites, and receipts upload.

  • Content protection: assets like videos and photos are commonly shared and distributed at scale both internally and to external parties. Content delivery networks (CDNs) and content hubs are a classic malware distribution opportunity.

  • Compliance requirements: resources that adhere to compliance standards like NIST, SWIFT, GDPR, and others require robust security practices, which include malware scanning. It's critical for organizations operating in regulated industries or regions.

  • Third-party integration: third-party data can come from a wide variety of sources, and not all of them might have robust security practices, such as business partners, developers, and contractors. Scanning for malware helps to ensure that this data doesn't introduce security risks to your system.

  • Collaborative platforms: similar to file sharing, teams use cloud storage for continuously sharing content and collaborating across teams and organizations. Scanning for malware ensures safe collaboration.

  • Data pipelines: data moving through ETL (Extract, Transform, Load) processes can come from multiple sources and might include malware. Scanning for malware can help to ensure the integrity of these pipelines.

  • Machine learning training data: the quality and security of the training data are critical for effective machine learning models. It's important to ensure these data sets are clean and safe, especially if they include user-generated content or data from external sources.

    animated GIF showing user-generated-content and data from external sources.

Note

Malware scanning is a near real-time service. Scan times can vary depending on the size and type of the scanned file, as well as on the load on the service or on the storage account. Microsoft is constantly working on reducing the overall scan time; however, you should take this variability into consideration when designing a user experience based on the service.

Prerequisites

To enable and configure Malware Scanning, you must have Owner roles (such as Subscription Owner or Storage Account Owner) or specific roles with the necessary data actions. Learn more about the required permissions.

You can enable and configure Malware Scanning at scale for your subscriptions while maintaining granular control over configuring the feature for individual storage accounts. There are several ways to enable and configure Malware Scanning: using an Azure built-in policy (the recommended method), programmatically using Infrastructure as Code templates (including Terraform, Bicep, and ARM templates), using the Azure portal, or directly with the REST API.

How does malware scanning work

On-upload malware scanning

On-upload triggers

Malware scans are triggered in a protected storage account by any operation that results in a BlobCreated event, as specified in the Azure Blob Storage as an Event Grid source page. These operations include the initial uploading of new blobs, overwriting existing blobs, and finalizing changes to blobs through specific operations. Finalizing operations might involve PutBlockList, which assembles block blobs from multiple blocks, or FlushWithClose, which commits data appended to a blob in Azure Data Lake Storage Gen2.

Note

Incremental operations such as AppendFile in Azure Data Lake Storage Gen2 and PutBlock in Azure BlockBlob, which allow data to be added without immediate finalization, do not trigger a malware scan on their own. A malware scan is initiated only when these additions are officially committed: FlushWithClose commits and finalizes AppendFile operations, triggering a scan, and PutBlockList commits blocks in BlockBlob, initiating a scan. Understanding this distinction is critical for managing scanning costs effectively, as each commit can lead to a new scan and potentially increase expenses due to multiple scans of incrementally updated data.
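The distinction between incremental and committing operations, and its effect on scanned volume, can be sketched as follows. This is a minimal illustration, not the service's logic: the operation names come from the text above, the function names are hypothetical, and the cost model assumes that every commit causes the whole blob, at its current size, to be scanned.

```python
# Operation names are taken from the text above; the sets are not exhaustive.
COMMITTING_OPERATIONS = {"PutBlockList", "FlushWithClose"}
INCREMENTAL_OPERATIONS = {"PutBlock", "AppendFile"}


def triggers_scan(operation: str) -> bool:
    """Return True for committing operations, False for incremental ones."""
    if operation in COMMITTING_OPERATIONS:
        return True
    if operation in INCREMENTAL_OPERATIONS:
        return False
    raise ValueError(f"operation not covered by this sketch: {operation}")


def scanned_megabytes(chunk_mb: int, chunks: int, commit_every: int) -> int:
    """Total MB scanned when a commit follows every `commit_every` chunks.

    Assumption: each commit rescans the full blob at its current size.
    The final chunk is always committed.
    """
    total = 0
    for written in range(1, chunks + 1):
        if written % commit_every == 0 or written == chunks:
            total += written * chunk_mb
    return total


# Ten 100 MB chunks: committing after every chunk versus once at the end.
print(scanned_megabytes(100, 10, commit_every=1))   # 5500
print(scanned_megabytes(100, 10, commit_every=10))  # 1000
```

Under this assumption, committing after every chunk scans 5.5 times as much data as a single final commit for the same 1,000 MB blob.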

Scan regions and data retention

The malware scanning service that uses Microsoft Defender Antivirus technologies reads the blob. Malware Scanning scans the content "in-memory" and deletes scanned files immediately after scanning. The content isn't retained. The scanning occurs within the same region of the storage account. In some cases, when a file is suspicious, and more data is required, Malware Scanning might share file metadata outside the scanning region, including metadata classified as customer data (for example, SHA-256 hash), with Microsoft Defender for Endpoint.

Access customer data

The Malware Scanning service requires access to your data to scan it for malware. During service enablement, a new Data Scanner resource called StorageDataScanner is created in your Azure subscription. This resource is granted a Storage Blob Data Owner role assignment to access and change your data for malware scanning and sensitive data discovery.

Private Endpoint is supported out-of-the-box

Malware scanning in Defender for Storage is supported in storage accounts that use private endpoints while maintaining data privacy.

Private endpoints provide secure connectivity to your Azure storage services, eliminating public internet exposure, and are considered a best practice.

Setup of malware scanning

When malware scanning is enabled, the following actions automatically take place in your environment:

  • For each storage account you enable malware scanning on, an Event Grid System Topic resource is created in the same resource group of the storage account - used by the malware scanning service to listen on blob upload triggers. Removing this resource breaks the malware scanning functionality.

  • To scan your data, the Malware Scanning service requires access to your data. During service enablement, a new Data Scanner resource called StorageDataScanner is created in your Azure subscription and assigned a system-assigned managed identity. This resource is granted the Storage Blob Data Owner role assignment, permitting it to access your data for purposes of Malware Scanning and Sensitive Data Discovery.

If your storage account Networking configuration is set to Enable Public network access from selected virtual networks and IP addresses, the StorageDataScanner resource is added to the Resource instances section under storage account Networking configuration to allow access to scan your data.

If you're enabling malware scanning on the subscription level, a new Security Operator resource called StorageAccounts/securityOperators/DefenderForStorageSecurityOperator is created in your Azure subscription and assigned a system-assigned managed identity. This resource is used to enable and repair Defender for Storage and Malware Scanning configuration on existing storage accounts, and to check for new storage accounts created in the subscription that need to be enabled. This resource has role assignments that include the specific permissions needed to enable malware scanning.

Note

Malware scanning depends on certain resources, identities, and networking settings to function properly. If you modify or delete any of these, malware scanning will stop working. To restore its normal operation, you can turn it off and on again.

Providing scan results

Malware scanning results are available through four methods. After setup, you'll see scan results as blob index tags for every uploaded and scanned file in the storage account, and as Microsoft Defender for Cloud security alerts when a file is identified as malicious.

You might choose to configure extra scan result methods, such as Event Grid and Log Analytics; these methods require extra configuration. In the next section, you'll learn about the different scan result methods.

Diagram showing flow of viewing and consuming malware scanning results.

Scan results

Blob index tags

Blob index tags are metadata fields on a blob. They categorize data in your storage account using key-value tag attributes. These tags are automatically indexed and exposed as a searchable multi-dimensional index to easily find data. The scan results are concise: the tags show the Malware Scanning scan result and the Malware Scanning scan time (UTC) on the blob. Other result types (alerts, events, logs) provide more information on the malware type and file upload operation.

Screenshot that shows an example of a blob index tag.

Blob index tags can be used by applications to automate workflows, but aren't tamper-resistant. Read more on setting up response.

Note

Access to index tags requires permissions. For more information, see Get, set, and update blob index tags.
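An application that gates access on the scan result tag might look roughly like the sketch below. The tag keys and the result values "No threats found" and "Malicious" are assumptions here, as are the function and action names; verify the exact strings against a scanned blob in your own account before relying on them. Because index tags aren't tamper-resistant, treat this as a workflow convenience rather than a security boundary.

```python
# Assumed tag keys and values; confirm them against a scanned blob.
RESULT_TAG = "Malware Scanning scan result"
TIME_TAG = "Malware Scanning scan time UTC"
CLEAN = "No threats found"
MALICIOUS = "Malicious"


def decide_action(tags: dict[str, str]) -> str:
    """Map a blob's index tags to a hypothetical workflow action."""
    result = tags.get(RESULT_TAG)
    if result is None:
        return "wait"        # no scan result yet, keep the blob unavailable
    if result == CLEAN:
        return "release"     # safe to hand to downstream consumers
    if result == MALICIOUS:
        return "quarantine"  # move or delete according to your policy
    return "review"          # any other value needs a human decision


print(decide_action({}))
print(decide_action({RESULT_TAG: CLEAN, TIME_TAG: "2024-01-01 00:00:00Z"}))
print(decide_action({RESULT_TAG: MALICIOUS}))
```

Returning "wait" when the tag is absent keeps unscanned blobs out of circulation, which matters because scan times vary.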

Defender for Cloud security alerts

When a malicious file is detected, Microsoft Defender for Cloud generates a security alert. To see the alert, go to Microsoft Defender for Cloud security alerts. The security alert contains details and context on the file, the malware type, and recommended investigation and remediation steps. To use these alerts for remediation, you can:

  1. View security alerts in the Azure portal by navigating to Microsoft Defender for Cloud > Security alerts.
  2. Configure automations based on these alerts.
  3. Export security alerts to a SIEM. You can continuously export security alerts to Microsoft Sentinel (Microsoft’s SIEM) using the Microsoft Sentinel connector, or to another SIEM of your choice.

Learn more about responding to security alerts.

Event Grid event

Event Grid is useful for event-driven automation. It's the fastest method to get results, delivering them with minimum latency in the form of events that you can use for automating response.

Events from Event Grid custom topics can be consumed by multiple endpoint types. The most useful for malware scanning scenarios are:

  • Function App (previously called Azure Function) – use a serverless function to run code for automated response like move, delete or quarantine.
  • Webhook – to connect an application.
  • Event Hubs & Service Bus Queue – to notify downstream consumers.

Learn how to configure Malware Scanning so that every scan result is sent automatically to an Event Grid topic for automation purposes.
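A handler behind a Function App or webhook could turn a scan result event into a response plan along these lines. This is a sketch under stated assumptions: the event type string, the `blobUri` and `scanResultType` field names, the result values, and the sample payload are assumptions that should be checked against the event schema you actually receive, and `plan_response` is a hypothetical helper.

```python
import json

# Hypothetical sample payload; field names and values are assumptions.
SAMPLE_EVENT = json.dumps({
    "eventType": "Microsoft.Security.MalwareScanningResult",
    "data": {
        "blobUri": "https://example.blob.core.windows.net/uploads/report.pdf",
        "scanResultType": "Malicious",
    },
})


def plan_response(raw_event: str) -> dict:
    """Parse one event and return the blob URI with a suggested action."""
    event = json.loads(raw_event)
    data = event.get("data", {})
    verdict = data.get("scanResultType")
    if verdict == "Malicious":
        action = "quarantine"
    elif verdict == "No threats found":
        action = "none"
    else:
        action = "review"  # missing or unrecognized verdict
    return {"blob_uri": data.get("blobUri"), "action": action}


print(plan_response(SAMPLE_EVENT))
```

Keeping the parsing separate from the move or delete call makes the decision logic easy to test without touching storage.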

Log Analytics

You might want to log your scan results for compliance evidence or to investigate scan results. By setting up a Log Analytics Workspace destination, you can store every scan result in a centralized log repository that is easy to query. You can view the results by navigating to the Log Analytics destination workspace and looking for the StorageMalwareScanningResults table.

Learn more about setting up logging for malware scanning.

Tip

We invite you to explore the Malware Scanning feature in Defender for Storage through our hands-on lab. Follow the Ninja training instructions for a detailed, step-by-step guide on how to set up and test Malware Scanning end-to-end, including configuring responses to scanning results. This is part of the 'labs' project that helps customers get ramped up with Microsoft Defender for Cloud and provides hands-on practical experience with its capabilities.

Cost control

Malware scanning is billed per GB scanned. To provide cost predictability, Malware Scanning supports setting a cap on the amount of GB scanned in a single month per storage account.

Important

Malware scanning in Defender for Storage is not included for free in the first 30-day trial and will be charged from the first day in accordance with the pricing scheme available on the Defender for Cloud pricing page.

The "capping" mechanism is designed to set a monthly scanning limit, measured in gigabytes (GB), for each storage account, serving as an effective cost control. If a predefined scanning limit is established for a storage account in a single calendar month, the scanning operation would automatically halt once this threshold is reached (with up to a 20-GB deviation), and files wouldn't be scanned for malware. The cap is reset at the end of every month at midnight UTC. Updating the cap typically takes up to an hour to take effect.

By default, a limit of 5 TB (5,000 GB) is established if no specific capping mechanism is defined.
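The capping behavior described above can be modeled with a small counter. This is a simplified sketch, not the service's accounting: the class and method names are hypothetical, and it assumes a scan proceeds whenever the running total is still below the cap, so the total can overshoot by the size of the last blob.

```python
DEFAULT_CAP_GB = 5000  # default limit stated above


class MonthlyScanCap:
    """Toy model of a per-storage-account monthly scanning cap."""

    def __init__(self, cap_gb: float = DEFAULT_CAP_GB) -> None:
        self.cap_gb = cap_gb
        self.scanned_gb = 0.0

    def try_scan(self, blob_gb: float) -> bool:
        """Record a scan if the cap is not yet reached; return whether it ran."""
        if self.scanned_gb >= self.cap_gb:
            return False  # cap reached: the blob is left unscanned
        self.scanned_gb += blob_gb
        return True

    def reset(self) -> None:
        """Start a new calendar month."""
        self.scanned_gb = 0.0


cap = MonthlyScanCap(cap_gb=10)
print(cap.try_scan(6))  # True
print(cap.try_scan(6))  # True, total is now 12 GB, slightly over the cap
print(cap.try_scan(1))  # False until the monthly reset
```

The overshoot in the second call is the kind of deviation to plan for when you choose a cap value.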

Tip

You can set the capping mechanism on either individual storage accounts or across an entire subscription (every storage account on the subscription will be allocated the limit defined on the subscription level).

Follow these steps to configure the capping mechanism.

Additional costs of malware scanning

Malware scanning uses other Azure services as its foundation. This means that when you enable Malware scanning, you will also be charged for the Azure services that it requires. These services include Azure Storage read operations, Azure Storage blob indexing and Azure Event Grid notifications.

Handling possible false positives and false negatives

If you have a file that you suspect might be malware but isn't being detected (false negative) or is being incorrectly detected (false positive), you can submit it to us for analysis through the sample submission portal. Select “Microsoft Defender for Storage” as the source.

Defender for Cloud allows you to suppress false positive alerts. Make sure to limit the suppression rule by using the malware name or file hash.

Malware Scanning doesn't automatically block access or change permissions to the uploaded blob, even if it's malicious.

Limitations

Unsupported features and services

  • Unsupported storage accounts: Legacy v1 storage accounts aren't supported by malware scanning.

  • Unsupported service: Azure Files isn't supported by malware scanning.

  • Unsupported client: Blobs uploaded with Network File System (NFS) 3.0 protocol will not be scanned for malware upon upload.

  • Unsupported regions: Jio India West, Korea South, South Africa West.

  • Regions that are supported by Defender for Storage but not by malware scanning. Learn more about availability for Defender for Storage.

  • Unsupported blob types: Append and Page blobs aren't supported for Malware Scanning.

  • Unsupported encryption: Client-side encrypted blobs aren't supported as they can't be decrypted before scanning by the service. However, data encrypted at rest by Customer Managed Key (CMK) is supported.

  • Unsupported index tag results: Index tag scan result isn't supported in storage accounts with Hierarchical namespace enabled (Azure Data Lake Storage Gen2).

  • Event Grid: Event Grid topics that don't have public network access enabled (i.e. private endpoint connections) are not supported by malware scanning in Defender for Storage.
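When planning coverage, it can help to encode some of the limitations above as an explicit check. The sketch below is illustrative only: `UploadContext`, its field names, and `coverage_gaps` are hypothetical, and the checks cover a subset of the listed limitations.

```python
from dataclasses import dataclass


@dataclass
class UploadContext:
    """Hypothetical description of one upload; all field names are illustrative."""
    blob_type: str = "BlockBlob"          # "BlockBlob", "AppendBlob", or "PageBlob"
    client_side_encrypted: bool = False
    uploaded_via_nfs3: bool = False
    legacy_v1_account: bool = False
    hierarchical_namespace: bool = False  # Azure Data Lake Storage Gen2


def coverage_gaps(ctx: UploadContext) -> list[str]:
    """List the limitations from this section that apply to the upload."""
    gaps = []
    if ctx.legacy_v1_account:
        gaps.append("legacy v1 storage accounts aren't supported")
    if ctx.blob_type != "BlockBlob":
        gaps.append("append and page blobs aren't scanned")
    if ctx.uploaded_via_nfs3:
        gaps.append("NFS 3.0 uploads aren't scanned on upload")
    if ctx.client_side_encrypted:
        gaps.append("client-side encrypted blobs can't be decrypted for scanning")
    if ctx.hierarchical_namespace:
        gaps.append("index tag results aren't available with hierarchical namespace")
    return gaps


print(coverage_gaps(UploadContext()))
print(coverage_gaps(UploadContext(blob_type="PageBlob", uploaded_via_nfs3=True)))
```

An empty list means none of the modeled limitations apply; it doesn't guarantee that a blob will be scanned.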

Scenarios where malware scanning is ineffective

While malware scanning provides comprehensive detection capabilities, there are specific scenarios where it becomes ineffective due to inherent limitations. It is important to evaluate these scenarios carefully before deciding to enable malware scanning on a storage account:

  • Chunked data: Malware scanning does not effectively detect malware in blobs that contain chunked data, for example files that have been split into smaller parts. This issue is particularly common in backup services that upload backup data in chunks to storage accounts. The scanning process might miss malicious content or incorrectly flag clean content, leading to false negatives and false positives. To mitigate this risk, consider implementing additional security measures, such as scanning data before it is chunked or after it has been fully reassembled.

  • Encrypted data: Malware scanning does not support client-side encrypted data. This data cannot be decrypted by the service, meaning any malware within these encrypted blobs will go undetected. If encryption is necessary, ensure that scanning occurs before the encryption process, or use supported encryption methods like Customer Managed Keys (CMK) for encryption at rest.

When evaluating whether to enable malware scanning in these scenarios, the primary consideration should be whether other files, which are supported by the scanning process, are being uploaded to the storage account, and whether attackers could exploit this upload stream to introduce malware.

Throughput capacity and blob size limit

  • Scan throughput rate limit: Malware Scanning can process up to 2 GB per minute for each storage account. If the rate of file upload momentarily exceeds this threshold for a storage account, the system attempts to scan the files in excess of the rate limit. If the rate of file upload consistently exceeds this threshold, some blobs won't be scanned.
  • Blob scan limit: Malware Scanning can process up to 2,000 files per minute for each storage account. If the rate of file upload momentarily exceeds this threshold for a storage account, the system attempts to scan the files in excess of the rate limit. If the rate of file upload consistently exceeds this threshold, some blobs won't be scanned.
  • Blob size limit: The maximum size limit for a single blob to be scanned is 2 GB. Blobs that are larger than the limit won't be scanned.
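To estimate whether an upload pattern stays within these limits, you could check one minute of uploads against the three thresholds. This is a rough sketch: `check_minute` is a hypothetical helper, a gigabyte is assumed to be 2^30 bytes, and blobs over the size limit are assumed not to count toward the throughput totals.

```python
GB = 1024 ** 3                 # assumption: binary gigabytes
MAX_BLOB_BYTES = 2 * GB        # blob size limit
MAX_BYTES_PER_MINUTE = 2 * GB  # scan throughput rate limit
MAX_BLOBS_PER_MINUTE = 2000    # blob scan limit


def check_minute(sizes: list[int]) -> dict:
    """Compare the blob sizes (in bytes) uploaded in one minute with the limits."""
    scannable = [size for size in sizes if size <= MAX_BLOB_BYTES]
    return {
        "too_large": len(sizes) - len(scannable),
        "over_throughput": sum(scannable) > MAX_BYTES_PER_MINUTE,
        "over_blob_count": len(scannable) > MAX_BLOBS_PER_MINUTE,
    }


print(check_minute([GB, GB // 2]))  # within every limit
print(check_minute([GB, GB, GB]))   # 3 GB in one minute exceeds the throughput limit
print(check_minute([3 * GB]))       # one blob over the size limit
```

A single minute over the throughput or count limit isn't necessarily a problem, because the service attempts to catch up; sustained excess is what leads to unscanned blobs.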

Blob uploads and index tag updates

Upon uploading a blob to the storage account, the malware scanning initiates an extra read operation and updates the index tag. In most cases, these operations don't generate significant load.

Impact on access and storage IOPS

Despite the scanning process, access to uploaded data remains unaffected, and the impact on storage Input/Output Operations Per Second (IOPS) is minimal.

Limitations compared to Microsoft Defender for Endpoint

Defender for Storage utilizes the same antimalware engine and up-to-date signatures as Defender for Endpoint to scan for malware. However, when files are uploaded to Azure Storage, they lack certain metadata that the antimalware engine depends on. This lack of metadata can lead to a higher rate of missed detections, known as 'false negatives', in Azure Storage compared to those detected by Defender for Endpoint.

The following are some examples of missing metadata:

  • Mark of the Web (MOTW): MOTW is a Windows security feature that tracks files downloaded from the internet. However, when files are uploaded to Azure Storage, this metadata is not preserved.

  • File path context: On standard operating systems, the file path can provide additional context for threat detection. For example, a file trying to modify system locations (e.g., C:\Windows\System32) would be flagged as suspicious, and be subject to further analysis. In Azure Storage, the context of specific file paths within the blob cannot be utilized in the same way.

  • Behavioral data: Defender for Storage analyzes the contents of files without running them. It inspects the files and may emulate their execution to check for malware. However, this approach may not detect certain types of malware that reveal their malicious nature only during execution.

Next steps

Learn more on how to set up response for malware scanning results.