Hi Vaibhav,
Thanks for reaching out to Microsoft Q&A.
How can I link my Delta Lake created through ADF and do further transformations from an HDInsight Spark cluster?
To link your Delta Lake created through ADF and perform further transformations from an HDInsight Spark cluster, follow these steps:
- Mount your Delta Lake storage to your HDInsight Spark cluster.
- Try the 'azure-datalake-store' Python library to mount ADLS to your HDInsight cluster. This lets Spark jobs running on the cluster access the Delta Lake files directly.
- Once the Delta Lake storage is mounted, use Spark to read the Delta Lake files as DataFrames and perform further transformations as required. The typical approach is to write Spark jobs in PySpark that read the Delta Lake data, apply your transformations, and write the results back to Delta Lake.
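As a minimal sketch of the read-transform-write step above (the storage account, container, path, and column names here are placeholders you would replace with your own, and the cluster must already have the Delta Lake libraries and ADLS access configured):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On an HDInsight Spark cluster a SparkSession is usually already available
# as `spark`; building one explicitly keeps the sketch self-contained.
spark = SparkSession.builder.appName("delta-transform").getOrCreate()

# Hypothetical ADLS Gen2 path to the Delta table created by ADF.
delta_path = "abfss://<container>@<storage-account>.dfs.core.windows.net/delta/sales"

# Read the Delta table as a DataFrame.
df = spark.read.format("delta").load(delta_path)

# Example transformation: aggregate on assumed columns, then
# write the result back to Delta Lake.
result = df.groupBy("region").agg(F.sum("amount").alias("total_amount"))
result.write.format("delta").mode("overwrite").save(delta_path + "_by_region")
```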
Is there any way to do select * and view the data of the Delta Lake?
Yes - once the Delta Lake files are mounted, you can query them with Spark SQL and view the results. For example, to read the data from the Delta table:
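A short sketch of running a select * with Spark SQL (the path and view name are placeholders for illustration, and this assumes the same cluster setup as above):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-query").getOrCreate()

# Hypothetical ADLS Gen2 path to the Delta table written by ADF.
delta_path = "abfss://<container>@<storage-account>.dfs.core.windows.net/delta/sales"

# Expose the Delta table to Spark SQL as a temporary view.
spark.read.format("delta").load(delta_path).createOrReplaceTempView("sales_delta")

# select * and display the first rows.
spark.sql("SELECT * FROM sales_delta LIMIT 10").show()
```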
Please 'Upvote' (Thumbs-up) and 'Accept as answer' if the reply was helpful. This will benefit other community members who face the same issue.