Connect to Spotfire Analyst

This article describes how to use Spotfire Analyst with an Azure Databricks cluster or an Azure Databricks SQL warehouse.

Requirements

  • A cluster or SQL warehouse in your Azure Databricks workspace.

  • The connection details for your cluster or SQL warehouse, specifically the Server Hostname, Port, and HTTP Path values.

  • An Azure Databricks personal access token or a Microsoft Entra ID (formerly Azure Active Directory) token. To create a personal access token, do the following:

    1. In your Azure Databricks workspace, click your Azure Databricks username in the top bar, and then select Settings from the drop-down menu.
    2. Click Developer.
    3. Next to Access tokens, click Manage.
    4. Click Generate new token.
    5. (Optional) Enter a comment that helps you identify this token in the future, and change the token’s default lifetime of 90 days. To create a token with no lifetime (not recommended), leave the Lifetime (days) box empty.
    6. Click Generate.
    7. Copy the displayed token to a secure location, and then click Done.

    Note

    Be sure to save the copied token in a secure location, and do not share it with others. If you lose the token, you cannot regenerate that exact same token; you must repeat this procedure to create a new one. If you lose the token, or you believe that it has been compromised, Databricks strongly recommends that you immediately delete it from your workspace by clicking the trash can (Revoke) icon next to the token on the Access tokens page.

    If you are not able to create or use tokens in your workspace, this might be because your workspace administrator has disabled tokens or has not given you permission to create or use them. Contact your workspace administrator.

    Note

    As a security best practice, when you authenticate with automated tools, systems, scripts, and apps, Databricks recommends that you use personal access tokens belonging to service principals instead of workspace users. To create tokens for service principals, see Manage tokens for a service principal.
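
    If you prefer to script token creation instead of clicking through the UI, you can call the Databricks Token API (POST /api/2.0/token/create). The following is a minimal Python sketch, assuming you already hold a credential that can authenticate the request; the workspace URL and existing token values are placeholders.

        # Minimal sketch: create a personal access token with the Databricks
        # Token API (POST /api/2.0/token/create). All values are placeholders.
        import requests

        WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
        EXISTING_TOKEN = "dapi..."  # placeholder: an existing credential

        resp = requests.post(
            f"{WORKSPACE_URL}/api/2.0/token/create",
            headers={"Authorization": f"Bearer {EXISTING_TOKEN}"},
            json={
                "comment": "Spotfire Analyst connection",
                "lifetime_seconds": 90 * 24 * 60 * 60,  # 90 days, the default lifetime
            },
        )
        resp.raise_for_status()
        new_token = resp.json()["token_value"]  # shown only once; store it securely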

Steps to connect

  1. In Spotfire Analyst, on the navigation bar, click the plus (Files and data) icon and click Connect to.
  2. Select Databricks and click New connection.
  3. In the Apache Spark SQL dialog, on the General tab, for Server, enter the Server Hostname and Port values from the Requirements, separated by a colon.
  4. For Authentication method, select Username and password.
  5. For Username, enter the word token.
  6. For Password, enter your personal access token from the Requirements.
  7. On the Advanced tab, for Thrift transport mode, select HTTP.
  8. For HTTP Path, enter the HTTP Path value from the Requirements.
  9. On the General tab, click Connect.
  10. After a successful connection, in the Database list, select the database you want to use, and then click OK.
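
If you want to sanity-check the Server Hostname, HTTP Path, and token values outside Spotfire, you can test them with the Databricks SQL Connector for Python. This is an optional sketch, not part of the Spotfire setup; it assumes the databricks-sql-connector package is installed, and all connection values are placeholders.

    # Verify that the endpoint and token accept queries
    # (pip install databricks-sql-connector)
    from databricks import sql

    with sql.connect(
        server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # Server Hostname
        http_path="/sql/1.0/warehouses/abcdef1234567890",              # HTTP Path
        access_token="dapi...",                                        # personal access token
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            print(cursor.fetchone())  # (1,) means the connection details are valid

If this query succeeds but the Spotfire connection fails, the problem is likely in the dialog settings (for example, the Thrift transport mode) rather than in the connection details themselves.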

Select the Azure Databricks data to analyze

You select data in the Views in Connection dialog.

  1. Browse the available tables in Azure Databricks.
  2. Add the tables you want as views, which will be the data tables you analyze in Spotfire.
  3. For each view, you can decide which columns you want to include. If you want to create a very specific and flexible data selection, you have access to a range of powerful tools in this dialog, such as:
    • Custom queries. With custom queries, you can select the data you want to analyze by typing a custom SQL query (see the sketch after this list).
    • Prompting. Leave the data selection to the user of your analysis file. You configure prompts based on columns of your choice. Then, the end user who opens the analysis can choose to limit and view data for relevant values only. For example, the user can select data within a certain span of time or for a specific geographic region.
  4. Click OK.
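
If you use a custom query, you can test it outside Spotfire before pasting it into the dialog. The sketch below runs a hypothetical query against the samples.tpch.orders table that ships with the Databricks sample data; swap in your own catalog, schema, and columns, and replace the placeholder connection values.

    # Preview a hypothetical custom query before using it in Spotfire
    from databricks import sql

    CUSTOM_QUERY = """
        SELECT o_orderkey, o_custkey, o_totalprice, o_orderdate
        FROM samples.tpch.orders
        WHERE o_orderdate >= DATE '1995-01-01'
    """

    with sql.connect(
        server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # placeholder
        http_path="/sql/1.0/warehouses/abcdef1234567890",              # placeholder
        access_token="dapi...",                                        # placeholder
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute(CUSTOM_QUERY)
            print(cursor.fetchmany(5))  # preview a few rows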

Push down queries to Azure Databricks or import data

When you have selected the data that you want to analyze, the final step is to choose how you want to retrieve the data from Azure Databricks. A summary of the data tables you are adding to your analysis is displayed, and you can click each table to change the data loading method.

The default option for Azure Databricks is External. This means the data table is kept in-database in Azure Databricks, and Spotfire pushes different queries to the database for the relevant slices of data, based on your actions in the analysis.

You can also select Imported, in which case Spotfire extracts the entire data table up front, enabling local in-memory analysis. When you import data tables, you can also use the analytical functions in the embedded in-memory data engine of TIBCO Spotfire.

The third option is On-demand (corresponding to a dynamic WHERE clause), which means that slices of data are extracted based on user actions in the analysis. You define the criteria, which can be actions such as marking or filtering data, or changing document properties. On-demand data loading can also be combined with External data tables.
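
To make the three methods concrete, the following illustrative Python snippet sketches the shape of SQL each method implies. Spotfire generates its own SQL at run time, so these are examples of the pattern only, again using the samples.tpch.orders sample table as a stand-in.

    # Illustrative query shapes for the three data loading methods

    # External: the table stays in Azure Databricks, and Spotfire pushes
    # queries for just the slices it needs, such as an aggregation for a chart.
    external_shape = """
        SELECT o_orderstatus, SUM(o_totalprice)
        FROM samples.tpch.orders
        GROUP BY o_orderstatus
    """

    # Imported: the entire table is extracted once for in-memory analysis.
    imported_shape = "SELECT * FROM samples.tpch.orders"

    # On-demand: a dynamic WHERE clause whose values are driven by user
    # actions in the analysis, such as marking, filtering, or a document property.
    on_demand_shape = """
        SELECT * FROM samples.tpch.orders
        WHERE o_orderdate BETWEEN ? AND ?
    """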
