Data sources overview
Dynamics 365 Customer Insights provides connections to bring in data from a broad set of sources. Connecting to a data source is often called data ingestion. After ingesting the data, you can unify it, generate insights, and activate it to build personalized experiences.
Add or edit data sources
You can attach or import data sources into Customer Insights. The following sections describe the options for adding and editing data sources.
Attach a data source
If you have data prepared in one of Microsoft's Azure data services, Customer Insights can easily connect to the data source without having to re-ingest the data. Select one of the following options:
- Azure Data Lake Storage (csv or parquet files in a Common Data Model folder)
- Azure Synapse Analytics (Lake databases)
- Microsoft Dataverse data lake
Import and transform
If you use on-premises data sources, Microsoft data, or third-party data, import and transform the data using Power Query connectors.
Review data sources
If your environment was configured to use Customer Insights storage and you use on-premises data sources, you use Power Platform dataflows. With Power Platform dataflows, you can view shared data sources and data sources managed by others. The Data Sources page lists the data sources in three sections:
- Shared: Data sources that can be managed by all Customer Insights admins. Power Platform dataflows, your own storage account, and attaching to a Dataverse-managed data lake are examples of shared data sources.
- Managed by me: Power Platform dataflows created and managed only by you. Other Customer Insights admins can view these dataflows but can't edit, refresh, or delete them.
- Managed by others: Power Platform dataflows created by other admins. You can only view them. The list shows the owner of each dataflow so that you can contact them for assistance.
All entities can be viewed and used by other users. While data sources are owned by the user who created them, the resulting entities from the data ingestion can be used by every user of Customer Insights.
If your environment does not use Power Platform dataflows, the Data Sources page contains only a list of all data sources. No sections display.
Manage existing data sources
Go to Data > Data sources to view the name of each ingested data source, its status, and the last time the data was refreshed for that source. You can sort the list of data sources by any column or use the search box to find the data source you want to manage.
Select a data source to view available actions.
- Edit the data source to change its properties.
- Refresh the data source to include the latest data.
- Enrich the data source before unification.
- Delete the data source. A data source can be deleted only if the data is not used in any processing such as unification, insights, activations, or exports.
Refresh data sources
Data sources can be refreshed on an automatic schedule or manually on demand. On-premises data sources refresh on their own schedules, which are set up during data ingestion.
For attached data sources, data ingestion consumes the latest data available from that data source.
Go to Admin > System > Schedule to configure system-scheduled refreshes of your ingested data sources.
To refresh a data source on demand:
Go to Data > Data sources.
Select the data source you want to refresh and select Refresh. This triggers a manual refresh of the data source. Refreshing a data source updates both the entity schema and the data for all entities specified in the data source.
Select the status to open the Progress details pane and view the progress. To cancel the job, select Cancel job at the bottom of the pane.
Corrupt data sources
Data being ingested may have corrupt records which can cause the data ingestion process to complete with errors or warnings.
If data ingestion completes with errors, subsequent processing (such as unification or activity creation) that uses this data source is skipped. If ingestion completes with warnings, subsequent processing continues but some of the records may not be included.
These errors can be seen in the task details.
Corrupt records are shown in system-created entities.
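The difference between the two outcomes can be summarized in a minimal sketch. The `IngestionStatus` enum, the `downstream_action` function, and the returned labels are hypothetical names chosen for illustration; they are not part of any Customer Insights API.

```python
from enum import Enum


class IngestionStatus(Enum):
    """Hypothetical labels for how a data ingestion run finished."""
    SUCCESS = "success"
    WARNINGS = "warnings"
    ERRORS = "errors"


def downstream_action(status: IngestionStatus) -> str:
    """Describe what happens to processing that depends on the data source.

    Illustrative only: mirrors the behavior described in the text above.
    """
    if status is IngestionStatus.ERRORS:
        # Unification, activity creation, and similar steps are skipped.
        return "skip"
    if status is IngestionStatus.WARNINGS:
        # Processing continues, but some records may be missing.
        return "run_with_possible_missing_records"
    return "run"
```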
Fix corrupt data
To view the corrupt data, go to Data > Entities and look for the corrupted entities in the System section. Corrupted entities are named using the pattern 'DataSourceName_EntityName_corrupt'.
Select a corrupt entity and then the Data tab.
Identify the corrupt fields in a record and the reason.
Data > Entities shows only a portion of the corrupt records. To view all the corrupt records, export the files to a container in the storage account using the Customer Insights export process. If you used your own storage account, you can also look at the Customer Insights folder in your storage account.
Fix the corrupted data. For example, for Azure Data Lake data sources, fix the data in the Data Lake Storage or update the data types in the manifest/model.json file. For Power Query data sources, fix the data in the source file and correct the data type in the transformation step on the Power Query - Edit queries page.
After the next refresh of the data source, the corrected records are ingested to Customer Insights and passed on to downstream processes.
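If you work with exported entity lists in a script, the 'DataSourceName_EntityName_corrupt' naming convention from the steps above makes corrupted entities easy to spot. The following is a minimal sketch; the helper names and the sample entity names are hypothetical and only illustrate the string pattern.

```python
CORRUPT_SUFFIX = "_corrupt"


def corrupt_entity_name(data_source_name: str, entity_name: str) -> str:
    """Build the expected name of the corrupted entity for a source entity."""
    return f"{data_source_name}_{entity_name}{CORRUPT_SUFFIX}"


def is_corrupt_entity(name: str) -> bool:
    """Return True if an entity name follows the corrupted-entity pattern."""
    return name.endswith(CORRUPT_SUFFIX)


# Hypothetical entity names, for illustration only.
entities = ["Loyalty_Customers", "Loyalty_Customers_corrupt", "Web_Visits"]
corrupt_only = [name for name in entities if is_corrupt_entity(name)]
```

Here `corrupt_only` contains only `Loyalty_Customers_corrupt`.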
For example, a 'birthday' column has its data type set to 'date'. A customer record has the birthday entered as '01/01/19777', which contains a five-digit year. The system flags this record as corrupt. Correct the year in the source system to '1977'. After the next automated refresh of the data sources, the field has a valid format and the record is removed from the corrupted entity.
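To catch values like this before ingestion, you could validate date fields in the source data. The sketch below assumes the dates use a month/day/four-digit-year layout (`%m/%d/%Y`); that format and the `is_valid_date` helper are assumptions for illustration, not the validation logic Customer Insights itself applies.

```python
from datetime import datetime


def is_valid_date(value: str, fmt: str = "%m/%d/%Y") -> bool:
    """Return True if value parses as a date in the given format.

    The default format is an assumption about the source data.
    """
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        # Raised for malformed values, such as a five-digit year.
        return False


print(is_valid_date("01/01/19777"))  # False: extra digit in the year
print(is_valid_date("01/01/1977"))   # True
```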