Reading cdm format (model.json) to create External Tables in Azure Synapse Serverlees Pool from ADLS containing CRM tables. How can it be resolved?

Elshan Chalabiyev 1 Reputation point
2022-06-27T10:12:49.197+00:00

My ADLS Gen2 receives tables from CRM's Dataverse in CDM format (with model.json) using Azure Synapse Link which is cofigured to link just to Storage Account (without Synapse Anlytics).
Now I want to create External Tables in my Synapse Analytics Serverless SQL Pool out of the CRM tables in the Storage Account.

The CSV files of the tables are without column names, the metadata of the tables are in model.json.

How can I tell Synapse to use that model.json for creation of the External Tables.

Azure Storage
Azure Storage
Globally unique resources that provide access to data management services and serve as the parent namespace for the services.
3,529 questions
Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
5,374 questions
{count} votes

1 answer

Sort by: Most helpful
  1. MartinJaffer-MSFT 26,236 Reputation points
    2022-06-28T20:58:20.887+00:00

    For ease of rapid prototyping, I downloaded the model.json and wrote some Python on my local machine.
    Below is not a final version, but it seems to work. Some changes would need to be made if we want to run it on Synapse. Namely, changing model.json location from local file to blob address. Or could mount it.

    Take a look and tell me what you think. @Elshan Chalabiyev

    import json  
      
    model_location = 'C:\\Users\\XXXX\\Downloads\\model.json'  
      
    #CREATE EXTERNAL FILE FORMAT and CREAT EXTERNAL DATA SOURCE are separate things, not covered here.  
    #however the file format and external data source are used in the SQL expression, so I need names for them  
    file_format = "myFileFormat"  
    data_source = "myDataSource"  
          
    # if data types in model.json are not good types in SQL, make substitutions with below  
    typemapping = { 'string':'varchar(256)', 'int64':'int', 'decimal':'decimal'}  
      
    #magic sql writer  
    def writeSQL( tablename, filelocation, columns, datasource, fileformat):  
        statement = 'CREATE EXTERNAL TABLE ' + tablename + " (\n\t"  
          
        for col in columns:  
            statement += col['name'] + ' '  
            if col['dataType'] not in typemapping:  
                statement += col['dataType']  
            else:  
                statement += typemapping[col['dataType']]  
            statement += ", \n\t"  
        statement = statement.rstrip(", \n\t")  
        statement += ")\nWITH (\n Location = '" + filelocation + "',\n DATA_SOURCE = " + datasource + ",\n FILE_FORMAT = " + fileformat + ");\n"  
        print(statement)  
        return statement  
      
    #do the stuff  
    with open(model_location, 'r') as raw_model:  
        model = json.load(raw_model)  
        for table in model['entities']:  
            writeSQL(table['name'], table['partitions'][0]['name'], table['attributes'], data_source, file_format)  
    

    The model.json took the form of (see below). Is yours vaguely similar?
    215876-image.png

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.