Enable Education Data Lake Export

Important

If you're still using SDS (Classic) to provision SIS / SMS data to Microsoft 365 and Microsoft Entra ID for managing users and classes, you will ONLY be able to set up Insights & Analytics scenarios.

This article provides information about the Education Data Lake Export feature, and technical resources on setting up your Azure subscription to copy the data into a Microsoft Azure Data Lake (Data Lake).

Overview

Education Data Lake Export allows a customer to get a copy of the Data Lake with Activity data for their own custom analytics using Data Services.

  • Export uses Microsoft Azure Data Share (Data Share) to copy blobs from the Microsoft subscription into the customer Azure subscription.
  • Data Share generates a snapshot of the Insights data that can be exported to an organization’s Data Lake.
  • Data Share helps simplify this process to securely export data. Organizations can then use Microsoft Synapse, Azure Machine Learning, and/or Power BI to create their own customized analytics and reports.

With Education Insights data in their Azure Data Lake, organizations can then combine it with other data sources such as SIS, LMS, or assessment data.

Important

Activity data is only stored in the Data Lake if Education Data Lake Export is enabled.

For tenants that wish to set up only Education Data Lake Export, there are three steps to using Education Insights data for custom analytics:

  • Set up School Data Sync (SDS) for Microsoft 365 Education
  • An Azure subscription. If you don't have an Azure subscription, you can set up a free subscription, or check the current list of Azure offers.
    • Role assignment of "Owner" on the preferred Azure subscription you're using.
    • The preferred subscription is selected as default.
  • Set up Education Data Lake Export to copy Roster, Microsoft Entra users and Groups, and Activity data to Azure Data Lake.

Set up SDS for Microsoft 365 Education Tenant

Enabling Education Data Lake Export

Important

By setting up Education Data Lake Export to make your Institution data available, you represent that:

You're authorized to use this data.

You commit to adhering to your organization’s data governance standards.

  1. Navigate to Sync | Configuration and select the Insights & analytics tab.

    Screenshot that shows Insights and analytics configuration tab.

  2. Select Edit data lake export. You see a fly-out window.

    Screenshot showing opened fly-out.

  3. From the list of available global tenant admins, select the admin who will receive the Data Share invitation.

    Screenshot showing list of available global tenant admins.

  4. Select Send invitation to create your data share.

    Screenshot showing button for send invitation.

  5. Once the Education Data Lake Export enablement request is acknowledged, you see a confirmation. The selected tenant admin receives a Data Share invitation in the tenant's Azure subscription.

    Screenshot showing invitation being generated.

    Screenshot showing invitation sent.

    After the invitation is sent, the selected tenant admin can sign in to their Azure subscription to see the Data Share invitation. To open the invitation directly from the Azure subscription, visit Data Share invitations in the Azure subscription. This action takes you to the list of Data Share invitations.

  6. Select the ‘X’ on the top of the fly-out to close and navigate back to the home dashboard.

    Close flyout

    A new card titled Education Data Lake Export is available. The card shows one of the following statuses:

    • Pending invite: The invitation request hasn't been accepted by an admin in the default Azure subscription under Data Share invitations in Azure.
    • Enabled: The invitation is accepted and the data is available for access to copy into the default Azure subscription.
    • Collect Activity Data disabled: Activity data collection is turned off. The Collect activity data for insights setting must be enabled to include activity data in the export. To re-enable activity data collection, see Teams Admin settings for Education Insights.

    Note

    The Education Data Lake Export status remains Pending invite until the recipient has accepted the invitation and configured the destination storage to receive the snapshot.

    After the recipient has accepted the invitation and configured the destination storage to receive the snapshots, the status on the card will be updated to Enabled.

    Important

    Enabling Education Data Lake Export also starts a process that backfills user activity data into the data lake, going back 90 days (about 3 months). Depending on the amount of user activity data, the backfill may take time to complete. Until it finishes, data snapshots include only the backfill data captured up to that point.

Receiving the Education Data Lake Export invitation and configuring your Azure subscription

For steps on receiving the invitation and setting up your Azure subscription for your tenant, see Tutorial: Accept and receive data using ADS, starting with Prerequisites.

Frequently asked questions

What is the scope of Education Insights activity log output?

The output is for all users/teams.

How far back does the Education Insights activity log include data from when Education Data Lake Export is set up?

Ninety days of data will be stored in the Education Data Lake from the date when the Export is first enabled. It collects data daily going forward.
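As a rough illustration of the 90-day window, the following Python sketch computes the earliest calendar day that backfilled activity data could cover for a given enablement date. The function name and the assumption that the window is counted in whole calendar days are illustrative only; they aren't part of the product.

```python
from datetime import date, timedelta

# From the answer above: the backfill goes back 90 days.
BACKFILL_DAYS = 90


def earliest_backfill_day(enabled_on: date) -> date:
    """Return the earliest day the backfill could cover.

    Assumption: the 90-day window is counted in whole calendar days
    back from the day the export was first enabled.
    """
    return enabled_on - timedelta(days=BACKFILL_DAYS)


if __name__ == "__main__":
    # Hypothetical enablement date, used only for illustration.
    print(earliest_backfill_day(date(2024, 6, 1)))
```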

What is the structure of the activity files produced with the Education Data Lake Export?

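The schemas are described in the articles listed under Relevant Articles: Data Lake Schema - Rostering, Data Lake Schema - Microsoft Entra ID, and Data Lake Schema - Activity.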

When the invitation is accepted, a Microsoft 365 folder is created directly under the destination storage / folder. Underneath this folder is an Activity folder. Each day, a folder named in YYYY-MM-DD format is created there. It contains that day's activity data as 1-GB files in CSV format, named according to the following pattern:

  • ApplicationUsage.Part001.csv
  • ApplicationUsage.Part002.csv
  • ApplicationUsage.Part003.csv
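The layout described above can be recognized programmatically. The following Python sketch parses the trailing `Activity/YYYY-MM-DD/ApplicationUsage.PartNNN.csv` portion of a path into a date and a part number. The helper names (`parse_activity_path`, `ActivityFile`) and the sample paths are hypothetical, and the sketch deliberately doesn't assume a particular name for the parent folders.

```python
import re
from datetime import date
from typing import NamedTuple, Optional

# Matches the trailing ".../Activity/YYYY-MM-DD/ApplicationUsage.PartNNN.csv"
# portion of a path. Parent folders are ignored on purpose.
_ACTIVITY_FILE = re.compile(
    r"(?:^|/)Activity/(\d{4})-(\d{2})-(\d{2})/ApplicationUsage\.Part(\d+)\.csv$"
)


class ActivityFile(NamedTuple):
    day: date   # the day folder the file belongs to
    part: int   # the numeric part suffix (Part001 -> 1)


def parse_activity_path(path: str) -> Optional[ActivityFile]:
    """Return the day and part number for an activity file path, or None."""
    match = _ACTIVITY_FILE.search(path)
    if match is None:
        return None
    year, month, day, part = (int(group) for group in match.groups())
    try:
        return ActivityFile(date(year, month, day), part)
    except ValueError:
        # The folder name looked like a date but isn't a valid calendar day.
        return None


if __name__ == "__main__":
    # Hypothetical path, used only for illustration.
    print(parse_activity_path("export/Activity/2024-05-01/ApplicationUsage.Part002.csv"))
```

Sorting the parsed results by `(day, part)` gives a stable processing order for the files of each day.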

Updated roster data (users, classes, and so on, as captured and linked by SDS) is also exported. Additionally, a copy of Microsoft Entra ID data is placed in the same destination storage / folder.

The activity log outputs user and team IDs as Microsoft Entra Object IDs and Office Group IDs, which can be linked to the managed SDS User and Group IDs.
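One way to picture this linkage is as a lookup keyed on the Microsoft Entra Object ID. The following Python sketch joins activity rows to roster rows with the standard `csv` module. The column names (`AadUserId`, `SdsUserId`, `AppName`), the presence of header rows, and the inline sample data are all assumptions made for illustration; the real columns are defined in the schema articles and will differ.

```python
import csv
import io

# Hypothetical sample data. Real files follow the published schemas,
# and their column names are not the ones used here.
ACTIVITY_CSV = "AadUserId,AppName\nu-1,AppA\nu-2,AppB\n"
ROSTER_CSV = "SdsUserId,AadUserId\nsds-100,u-1\n"


def link_activity_to_roster(activity_text: str, roster_text: str) -> list:
    """Attach the SDS user ID to each activity row by matching on the Entra Object ID."""
    # Build a lookup from Entra Object ID to SDS user ID.
    roster = {
        row["AadUserId"]: row["SdsUserId"]
        for row in csv.DictReader(io.StringIO(roster_text))
    }
    linked = []
    for row in csv.DictReader(io.StringIO(activity_text)):
        # Rows with no roster match keep a None SDS ID instead of being dropped.
        linked.append({**row, "SdsUserId": roster.get(row["AadUserId"])})
    return linked


if __name__ == "__main__":
    for record in link_activity_to_roster(ACTIVITY_CSV, ROSTER_CSV):
        print(record)
```

Keeping unmatched rows (rather than dropping them) makes it easier to spot users who appear in activity data but aren't managed by SDS.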

What is the timing of the activity files produced with the Education Data Lake Export?

A daily activity log is generated within 48 to 72 hours after the corresponding day ends. The data for each day covers 00:00:00 to 23:59:59 UTC. The activity data is included in a subsequent snapshot after generation for that day is complete.
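To make the timing concrete, the following Python sketch computes the UTC window that a day's log covers and the window in which that log could be expected to be generated. It assumes that "completion of the corresponding day" means midnight UTC at the end of that day; the function names are hypothetical, and the sketch says nothing about when the next Data Share snapshot actually runs.

```python
from datetime import date, datetime, time, timedelta, timezone


def activity_day_window(day: date):
    """Return the (start, end) UTC timestamps covered by one daily log."""
    start = datetime.combine(day, time(0, 0, 0), tzinfo=timezone.utc)
    end = datetime.combine(day, time(23, 59, 59), tzinfo=timezone.utc)
    return start, end


def expected_generation_window(day: date):
    """Return the (earliest, latest) UTC time the log for `day` should be generated.

    Assumption: the day is complete at 00:00:00 UTC of the following day,
    and generation happens 48 to 72 hours after that moment.
    """
    day_complete = datetime.combine(
        day + timedelta(days=1), time(0, 0, 0), tzinfo=timezone.utc
    )
    return day_complete + timedelta(hours=48), day_complete + timedelta(hours=72)


if __name__ == "__main__":
    # Hypothetical activity day, used only for illustration.
    earliest, latest = expected_generation_window(date(2024, 5, 1))
    print(earliest.isoformat(), latest.isoformat())
```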

Can I change the Data Share schedule for the Export?

No. This isn't possible.

By what unit (team) is the activity log output?

There's no such unit. A folder named YYYY-MM-DD is created for each day and contains that day's activity data as 1-GB files in CSV format.

Is it possible to get the file name of the operation target in the activity log?

No. Use the audit log to obtain file names.

Can I get messages from Teams chats in the activity log?

No. Use the Graph API to retrieve chat messages.

Is it possible to output the Export data to an on-premises environment?

No.

Does the Education Data Lake Export destination storage / folder in the Azure subscription need to be in the same tenant as Microsoft 365?

Yes, they must be the same.

Does communication between the Microsoft 365 tenant and the Azure subscription destination storage / folder for Education Data Lake Export go through the public internet?

No, communication doesn't go through the public internet.

Disabling Education Data Lake Export

  1. Navigate to Sync | Configuration and select the Insights & analytics tab.

    Screenshot showing Insights and analytics tab.

  2. Select Edit data lake export. You see a fly-out window.

    Screenshot that shows opened fly-out.

  3. Select the Delete share button.

    Screenshot that shows delete share button.

  4. Review the dialog that's presented: Are you sure you want to delete Education Data Lake Export?

    Note

    Confirming the deletion of Education Data Lake Export stops activity data collection and deletes the activity data that has already been collected in the Data Lake. Activity data collected more than 90 days (about 3 months) ago won't be added back if the export is turned back on later. If the invitation was accepted, new data (snapshots) also stops being written to your Azure tenant storage. Deleting the export does NOT delete any data already present in the associated Azure tenant storage; you need to delete that data separately.

    To confirm that you want to delete, select Confirm. If you don't want to delete it, select Cancel.

    Screenshot that shows dialog to confirm delete share.

  5. Once the Education Data Lake Export disable request is acknowledged, you see a confirmation.

    Screenshot that shows processing of deletion request.

    Screenshot that shows confirmation of process to delete received.

  6. Select the X on the top of the fly-out to close and return to the home page.

    The card titled Education Data Lake Export under the Microsoft 365 Services section is no longer shown.

Relevant Articles

Data Lake Schema - Rostering

Data Lake Schema - Microsoft Entra ID

Data Lake Schema - Activity