Create shortcuts to on-premises data
With OneLake shortcuts, you can create virtual references that bring together data from a variety of sources across clouds, regions, systems, and domains, all with no data movement or duplication. By using a Fabric on-premises data gateway (OPDG), you can also create shortcuts to on-premises data sources, such as S3 compatible storage hosted on-premises. The same feature lets you create shortcuts to other network-restricted data sources, such as Amazon S3 or Google Cloud Storage buckets configured behind a firewall or Virtual Private Cloud (VPC).
On-premises data gateways are software agents that you install on a Windows machine and configure to connect to your data endpoints. By selecting an OPDG when creating a shortcut, you can establish network connectivity between OneLake and your data source.
This feature is available for Amazon S3, Google Cloud Storage, and S3 compatible shortcuts. You can use this feature in any Fabric-enabled workspace.
In this document, we show you how to install and use these on-premises data gateways to create shortcuts to on-premises or network-restricted data.
Important
This feature is in preview.
Prerequisites
- Create or identify a Fabric lakehouse that will contain your shortcut(s).
- Identify the endpoint URL associated with your Amazon S3, Google Cloud Storage, or S3 compatible location.
  - For S3 compatible, the endpoint is the URL for the service, not a specific bucket. For example:
    https://mys3api.contoso.com
    http://10.0.1.4:9000
  - For Amazon S3, the endpoint is the URL for a specific bucket. For example:
    https://BucketName.s3.us-east-1.amazonaws.com
  - For Google Cloud Storage, the endpoint is the URL for either the service or a specific bucket. For example:
    https://storage.googleapis.com
    https://bucketname.storage.googleapis.com
- Identify the user or identity credentials that meet the necessary access and authorization requirements for your data source. Your credentials generally need to be able to list buckets, list objects, and read data.
- Identify a physical or virtual machine that:
  - Has network connectivity to your storage endpoint. This article explains how you can confirm this connectivity before creating your shortcut.
  - Allows you to install software.
- Follow the instructions to install a standard on-premises data gateway on the machine you identified. Be sure to install the latest version.
- If your storage endpoint uses a self-signed certificate for HTTPS connections, be sure to trust this certificate on the machine hosting your gateway.
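The endpoint shapes listed above can be sanity-checked before you go further. The following is a minimal Python sketch using only the standard library; it is not part of Fabric or the gateway, and the function name `describe_endpoint` and the fields it returns are hypothetical. It only inspects the URL text and does not contact the endpoint or tell a bucket URL from a service URL.

```python
from urllib.parse import urlsplit


def describe_endpoint(url: str) -> dict:
    """Split an endpoint URL into the parts that matter for connectivity.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an http(s) endpoint URL: {url!r}")
    # Fall back to the scheme's default port when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,  # urlsplit lowercases the host name
        "port": port,
        "uses_tls": parts.scheme == "https",
        # The example endpoints in this article carry no path component.
        "has_path": parts.path not in ("", "/"),
    }
```

For instance, `describe_endpoint("http://10.0.1.4:9000")` reports port 9000 without TLS, which tells you which port must be reachable from the gateway machine.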
Check connectivity from gateway host
Before setting up your shortcut, follow these steps to confirm that your gateway can connect to your storage endpoint.
- Log in to the machine hosting the gateway.
- Install a client application that can query S3 compatible data sources, such as the Amazon Web Services Command Line Interface, WinSCP, or another tool of choice.
- Connect to your endpoint URL and provide the credentials you identified in the prerequisite steps.
- Ensure you can explore and read data from your storage location.
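Before installing a full S3 client, a quick TCP check from the gateway host can rule out basic network problems such as DNS failures or blocked ports. The following is a minimal Python sketch using only the standard library; `can_reach` is a hypothetical helper name. It confirms only that a TCP connection to the endpoint's host and port can be opened. It does not test credentials or bucket access, so it complements the steps above rather than replacing them.

```python
import socket
from urllib.parse import urlsplit


def can_reach(endpoint_url: str, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(endpoint_url)
    if not parts.hostname:
        raise ValueError(f"no host in endpoint URL: {endpoint_url!r}")
    # Use the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        # Open a plain TCP connection and close it immediately.
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS resolution failures, refused connections, and timeouts.
        return False
```

Run on the gateway machine, a call such as `can_reach("http://10.0.1.4:9000")` returning `False` suggests a network or firewall issue to resolve before you test credentials with an S3 client.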
Create a shortcut
Review the instructions for creating an Amazon S3, Google Cloud Storage, or S3 compatible shortcut.
During shortcut creation, select your on-premises data gateway (OPDG) in the Data gateway dropdown field.
Note
If you do not see your OPDG in the Data gateway dropdown field and someone else created the gateway, ask them to share the gateway with you from the Manage connections and gateways interface.
Troubleshooting
If you encounter any connectivity issues during shortcut creation, try the following troubleshooting steps.
- Confirm that the machine hosting your gateway can connect to your storage endpoint by following the steps in Check connectivity from gateway host.
- If you're using HTTPS and need to use a self-signed certificate, ensure the machine hosting your gateway trusts the certificate. You may need to install the self-signed certificate on the machine.
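To see whether a certificate problem is the likely cause, you can attempt a verified TLS handshake from the gateway machine. The following is a minimal Python sketch using only the standard library; `check_tls_trust` and its result strings are hypothetical. It relies on the CA certificates that Python's `ssl` module loads by default, which on Windows generally come from the system certificate stores, so treat the result as an approximation of what the gateway itself sees rather than a guarantee.

```python
import socket
import ssl
from urllib.parse import urlsplit


def check_tls_trust(endpoint_url: str, timeout: float = 5.0) -> str:
    """Classify the outcome of a verified TLS handshake with an HTTPS endpoint.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(endpoint_url)
    if parts.scheme != "https" or not parts.hostname:
        return "not-https"  # nothing to verify for plain HTTP
    port = parts.port or 443
    # The default context verifies the certificate chain and the host name.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=parts.hostname):
                return "trusted"
    except ssl.SSLCertVerificationError as err:
        # Typical result for a self-signed certificate that is not yet trusted.
        return f"untrusted: {err.verify_message}"
    except ssl.SSLError as err:
        return f"tls-error: {err.reason}"
    except OSError as err:
        # Connection refused, DNS failure, or timeout.
        return f"unreachable: {err}"
```

A result that starts with `untrusted` points to the certificate step above, while `unreachable` points back to the network connectivity check.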