Set up a connector to archive webpage data

Use a Veritas connector to import and archive data from webpages to user mailboxes in your Microsoft 365 organization. Veritas provides a Webpage Capture connector that captures specific webpages (and any links on those pages) in a specific website or an entire domain. The connector converts the webpage content to a PDF, PNG, or custom file format, attaches the converted files to an email message, and then imports those email items to user mailboxes in Microsoft 365.

After webpage content is stored in user mailboxes, you can apply Microsoft Purview features such as Litigation Hold, eDiscovery, and retention policies and retention labels. Using a Webpage Capture connector to import and archive data in Microsoft 365 can help your organization stay compliant with government and regulatory policies.

Tip

If you're not an E5 customer, use the 90-day Microsoft Purview solutions trial to explore how additional Purview capabilities can help your organization manage data security and compliance needs. Start now at the Microsoft Purview compliance portal trials hub. Learn details about signing up and trial terms.

Overview of archiving webpage data

The following overview explains the process of using a connector to archive webpage content in Microsoft 365.

Archiving workflow for webpage data.

  1. Your organization works with the webpage source to set up and configure a Webpage Capture site.

  2. Once every 24 hours, items from the webpage source are copied to the Veritas Merge1 site. The connector also converts the content of each webpage and attaches it to an email message.

  3. The Webpage Capture connector that you create in the Microsoft Purview portal or the Microsoft Purview compliance portal connects to the Veritas Merge1 site every day and transfers the webpage items to a secure Azure Storage location in the Microsoft cloud.

  4. The connector imports the converted webpage items to the mailboxes of specific users by using the value of the Email property and the automatic user mapping described in Step 3. Every webpage item contains this property, which is populated with the email addresses provided when you configure the Webpage Capture connector in Step 2. A subfolder named Webpage Capture is created in the Inbox folder of each user mailbox, and the webpage items are imported to that folder.
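The conversion and attachment part of this workflow can be sketched in a few lines of Python. This is only an illustration using the standard library's email package, not the connector's actual implementation. The function name build_archive_message, the subject line, the file name capture.pdf, and the sample address are all hypothetical.

```python
from email.message import EmailMessage


def build_archive_message(recipient, page_url, pdf_bytes):
    """Wrap an already-converted webpage (PDF bytes here) in an email message.

    Hypothetical helper: the real connector's message layout is not documented here.
    """
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = f"Webpage capture: {page_url}"  # assumed subject format
    msg.set_content(f"An archived copy of {page_url} is attached.")
    # Attach the converted file; PDF is one of the formats the article names.
    msg.add_attachment(
        pdf_bytes,
        maintype="application",
        subtype="pdf",
        filename="capture.pdf",  # assumed file name
    )
    return msg


# Placeholder bytes stand in for a real converted page.
msg = build_archive_message(
    "user@contoso.com", "https://example.com/", b"%PDF-1.4 placeholder"
)
attachments = list(msg.iter_attachments())
```

The resulting message carries the converted page as a single attachment, which is the shape of item that the later import step places in the Webpage Capture folder.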

Before you begin

  • Create a Veritas Merge1 account for Microsoft connectors. To create this account, contact Veritas Customer Support. You sign in to this account when you create the connector in Step 1.

  • Work with Veritas support to set up the custom file format that webpage items are converted to. For more information, see the Merge1 Third-Party Connectors User Guide.

  • The user who creates the Webpage Capture connector in Step 1 (and completes it in Step 3) must be assigned the Data Connector Admin role. This role is required to add connectors on the Data connectors page in the Microsoft Purview portal or the compliance portal. This role is added by default to multiple role groups. For a list of these role groups, see Roles in Microsoft Defender for Office 365 and Microsoft Purview compliance. Alternatively, an admin in your organization can create a custom role group, assign the Data Connector Admin role, and then add the appropriate users as members. For instructions, see:

  • This Veritas data connector is in public preview in GCC environments in the Microsoft 365 US Government cloud. Third-party applications and services might involve storing, transmitting, and processing your organization's customer data on third-party systems that are outside of the Microsoft 365 infrastructure and therefore aren't covered by the Microsoft Purview and data protection commitments. Microsoft makes no representation that use of this product to connect to third-party applications implies that those third-party applications are FedRAMP compliant.

Step 1: Set up the Webpage Capture connector

The first step is to create a connector for webpage source data in the Microsoft Purview portal or the compliance portal.

The following steps use the Microsoft Purview portal. To learn more about the Microsoft Purview portal, see Microsoft Purview portal. To learn more about the compliance portal, see Microsoft Purview compliance portal.

  1. Sign in to the Microsoft Purview portal.
  2. Select Settings > Data connectors.
  3. Select My connectors, then select Add connector.
  4. From the list, choose Webpage Capture.
  5. On the Terms of service page, select Accept.
  6. Enter a unique name that identifies the connector, and then select Next.
  7. Sign in to your Merge1 account to configure the connector.

Step 2: Configure the Webpage Capture connector on the Veritas Merge1 site

The second step is to configure the Webpage Capture connector on the Veritas Merge1 site. For information about how to configure the Webpage Capture connector, see Merge1 Third-Party Connectors User Guide.

After you select Save & Finish, the User mapping page in the connector wizard in the Microsoft Purview portal or the compliance portal is displayed.

Step 3: Map users and complete the connector setup

To map users and complete the connector setup in the Microsoft Purview portal or the compliance portal, follow the steps below:

  1. On the Map Webpage Capture users to Microsoft 365 users page, enable automatic user mapping. The Webpage Capture items include a property called Email, which contains email addresses for users in your organization. If the connector can associate this address with a Microsoft 365 user, the items are imported to that user's mailbox.

  2. Select Next, review your settings, and go to the Data connectors page to see the progress of the import process for the new connector.
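The matching that automatic user mapping performs can be pictured with a short Python sketch. It is an illustration under stated assumptions, not the connector's code. The function name map_items_to_mailboxes and the dictionary layout of an item are hypothetical, and case-insensitive matching is assumed. The article doesn't say what happens to items whose address matches no user, so the sketch only collects them.

```python
def map_items_to_mailboxes(items, m365_users):
    """Group webpage items by the Microsoft 365 user named in their Email property.

    items: list of dicts that each carry an "Email" key (assumed layout).
    m365_users: iterable of email addresses for users in the organization.
    Returns (mapped, unmapped), where mapped is {address: [items]}.
    """
    known = {address.lower() for address in m365_users}
    mapped, unmapped = {}, []
    for item in items:
        email = item.get("Email", "").strip().lower()
        if email in known:
            mapped.setdefault(email, []).append(item)
        else:
            # Handling of unmatched items is not described in the article.
            unmapped.append(item)
    return mapped, unmapped


sample_items = [
    {"Email": "Alex@contoso.com", "url": "https://example.com/a"},
    {"Email": "alex@contoso.com", "url": "https://example.com/b"},
    {"Email": "nobody@fabrikam.com", "url": "https://example.com/c"},
]
mapped, unmapped = map_items_to_mailboxes(sample_items, ["alex@contoso.com"])
```

In this sample, the two items addressed to the known user are grouped under one mailbox and the third is left unmatched.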

Step 4: Monitor the Webpage Capture connector

After you create the Webpage Capture connector, you can view the connector status in the Microsoft Purview portal or the compliance portal.

The following steps use the Microsoft Purview portal. To learn more about the Microsoft Purview portal, see Microsoft Purview portal. To learn more about the compliance portal, see Microsoft Purview compliance portal.

  1. Sign in to the Microsoft Purview portal.
  2. Select Settings > Data connectors.
  3. Select My connectors, then select the Webpage Capture connector that you created to display the flyout page. This page contains the properties and information about the connector.
  4. Under Connector status with source, select the Download log link to open (or save) the status log for the connector. This log contains information about the data that's been imported to the Microsoft cloud. For more information, see View admin logs for data connectors.

Known issues

At this time, we don't support importing attachments or items that are larger than 10 MB. Support for larger items will be available at a later date.
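
To see which captured items would be affected by this limit before an import, a pre-check along the following lines might help. It is a minimal sketch: the size_bytes field and the function name split_by_size are hypothetical, and 10 MB is assumed to mean 10 × 1024 × 1024 bytes, which the article doesn't specify.

```python
# Assumption: 10 MB is interpreted as 10 * 1024 * 1024 bytes.
MAX_ITEM_BYTES = 10 * 1024 * 1024


def split_by_size(items, limit=MAX_ITEM_BYTES):
    """Separate items at or under the size limit from items that exceed it.

    items: list of dicts with a "size_bytes" key (assumed layout).
    """
    importable = [item for item in items if item["size_bytes"] <= limit]
    oversized = [item for item in items if item["size_bytes"] > limit]
    return importable, oversized


captures = [
    {"name": "home.pdf", "size_bytes": 2 * 1024 * 1024},
    {"name": "catalog.pdf", "size_bytes": 25 * 1024 * 1024},
]
importable, oversized = split_by_size(captures)
```

Items in the oversized list are the ones the current limit would exclude.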