Intro
Databricks is a cloud-based collaborative data science, data engineering, and data analytics platform that combines the best of data warehouses and data lakes into a lakehouse architecture.
With Databricks, you can access all your data, analytics, and AI on one lakehouse platform. The simple, open, and collaborative environment helps reduce infrastructure complexity, helps you keep control of your data, and makes it easy for your teams to partner across the entire data workflow. For more information about the Databricks API, see the Databricks documentation (https://docs.databricks.com/dev-tools/api/index.html).
The Databricks connector is a "Database" connector, meaning it retrieves data from a database using a query. In the Data Center, you can access the connector page for this and other Database connectors by clicking Database in the toolbar at the top of the window.
You connect to your Databricks database in the Data Center. This topic discusses the fields and menus that are specific to the Databricks connector user interface. General information for adding DataSets, setting update schedules, and editing DataSet information is discussed in Adding a DataSet Using a Data Connector.
Prerequisites
To connect to a Databricks database and create a DataSet, you must have the following:
- The username and password you use to log into your Databricks host
- The host name for the database
- The port number for the database
- The database name or schema name
- The HTTP Path
Connecting to Your Databricks Database
This section describes the options in the Credentials and Details panes in the Databricks Connector page. The components of the other panes in this page, Scheduling and Name & Describe Your DataSet, are universal across most connector types and are discussed at greater length in Adding a DataSet Using a Data Connector.
Credentials Pane
This pane contains fields for entering credentials to connect to your database. The following table describes what is needed for each field:
Field | Description |
---|---|
Host | Enter the host name for the Databricks database. Example: db.company.com |
Port | Enter the port number for the Databricks database. |
Database Name | Enter the name of the Databricks database. |
Username | Enter your Databricks username. |
Password | Enter your Databricks password. |
HTTP Path | Enter the HTTP path. |
Once you have entered valid Databricks credentials, you can use the same account any time you go to create a new Databricks DataSet. You can manage connector accounts in the Accounts tab in the Data Center. For more information about this tab, see Managing User Accounts for Connectors.
Details Pane
In this pane you create an SQL query to pull data from your database, with or without a parameter.
Menu | Description |
---|---|
Query Type | Select the desired query type. |
Query | Enter the Structured Query Language (SQL) query that selects the data you want. Example: select * from Employee. You can use the Query Helper to help you write a usable SQL query. To do so, select a table in Database Tables and columns in Table Columns, then copy the SQL statement from the Query Helper field and paste it into this field. |
Database Tables | Select the database table you want to import into Domo. |
Table Columns | Select the table columns you want to import into Domo. |
Query Helper | Copy and paste the SQL statement in this field into the Query field. For more information, see Query, above. |
Fetch Size | Enter the fetch size for memory performance. The default value is used if no fetch size is specified. If an "out of memory" error occurs, decrease the fetch size and retry. |
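
As a rough illustration of how a table and column selection turns into a statement for the Query field, here is a minimal Python sketch. The `build_select` function is hypothetical and is not part of the connector; the statement the actual Query Helper produces may be formatted differently.

```python
def build_select(table, columns):
    """Build a SELECT statement for one table and a list of columns.

    Hypothetical helper for illustration only; the connector's own
    Query Helper may generate a differently formatted statement.
    """
    if not table:
        raise ValueError("a table name is required")
    # Fall back to every column when none are selected.
    column_list = ", ".join(columns) if columns else "*"
    return f"SELECT {column_list} FROM {table}"


# Statement for two selected columns of the Employee table.
two_columns = build_select("Employee", ["id", "name"])
# Statement when no columns are selected.
all_columns = build_select("Employee", [])
print(two_columns)
print(all_columns)
```

The second call produces the same kind of statement as the `select * from Employee` example in the Query row.
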
Other Panes
For information about the remaining sections of the connector interface, including how to configure scheduling, retry, and update options, see Adding a DataSet Using a Data Connector.
FAQs
What kind of credentials do I need to power up this connector?
You need the username, password, host name, port number, and database name of your Databricks database. You also need to provide the HTTP path.
Where can I find the values that I need to enter for my credentials?
You can find the hostname, database, port number, and HTTP path by going to your cluster in Databricks and viewing the JDBC/ODBC tab in the Advanced section of the cluster details.
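If the JDBC/ODBC tab shows the values as a single JDBC URL, the individual fields can be read out of it. The sketch below assumes the URL has the shape `jdbc:<subprotocol>://<host>:<port>/<database>;key=value;...` with an `httpPath` property. The `parse_jdbc_url` function and the example URL are made up for illustration, so compare the result against what your own cluster shows.

```python
def parse_jdbc_url(url):
    """Split a JDBC URL into the fields the Credentials pane asks for.

    Assumed shape: jdbc:<subprotocol>://<host>:<port>/<database>;key=value;...
    """
    prefix, sep, rest = url.partition("://")
    if not sep or not prefix.startswith("jdbc:"):
        raise ValueError("not a JDBC URL")
    # Everything before the first ';' is the location; the rest are properties.
    location, *pairs = rest.split(";")
    authority, _, database = location.partition("/")
    host, _, port = authority.partition(":")
    # Property names are matched case-insensitively (httpPath, HTTPPath, ...).
    props = {}
    for pair in pairs:
        key, eq, value = pair.partition("=")
        if eq:
            props[key.strip().lower()] = value.strip()
    return {
        "host": host,
        "port": int(port) if port else None,
        "database": database or None,
        "http_path": props.get("httppath"),
    }


# Made-up example URL; substitute the one shown for your cluster.
example = (
    "jdbc:spark://dbc-example.cloud.databricks.com:443/default;"
    "transportMode=http;ssl=1;"
    "httpPath=sql/protocolv1/o/0/0000-000000-example;AuthMech=3"
)
fields = parse_jdbc_url(example)
print(fields)
```
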
How frequently will my data update?
As often as needed.
Are there any API limits that I need to be aware of?
Limits depend on your server configuration.
What do I need to be aware of while writing a query?
Make sure that all keywords, table names, and field names are spelled correctly. Refer to the Query Helper field for query help.
What's the Fetch Size?
The fetch size is for memory performance. The default value is used if no fetch size is specified. If an "out of memory" error occurs, decrease the fetch size and retry.
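To show why a smaller fetch size can reduce memory use, the following sketch reads query results in batches rather than all at once. It assumes the setting works like a typical driver fetch size, meaning the number of rows retrieved at a time; how the connector applies the value internally is not described here. An in-memory SQLite database stands in for Databricks, and `read_in_batches` is a hypothetical helper.

```python
import sqlite3


def read_in_batches(cursor, fetch_size):
    """Yield rows from an executed cursor, at most fetch_size at a time."""
    while True:
        batch = cursor.fetchmany(fetch_size)
        if not batch:
            break
        yield batch


# In-memory SQLite is only a stand-in for the Databricks database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(i, f"employee-{i}") for i in range(10)],
)

cursor = conn.execute("SELECT * FROM Employee")
# Only one batch of rows is held in memory at any moment.
batch_sizes = [len(batch) for batch in read_in_batches(cursor, fetch_size=4)]
print(batch_sizes)  # [4, 4, 2]
conn.close()
```

With a fetch size of 4, the ten rows arrive as batches of 4, 4, and 2. A smaller value lowers peak memory at the cost of more round trips.
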