Apache Impala is an open source massively parallel processing (MPP) SQL query engine for data stored in a computer cluster running Apache Hadoop. It consists of several processes that run on specific hosts within your CDH cluster. The Domo Apache Impala SSH connector brings data from your Apache Impala server into Domo securely through an SSH tunnel.
The Apache Impala SSH Connector is a "Database" connector, meaning it retrieves data from a database using a query. In the Data Center, you can access the connector page for this and other Database connectors by clicking Database in the toolbar at the top of the window.
This topic discusses the fields and menus that are specific to the Apache Impala SSH connector user interface. General information for adding DataSets, setting update schedules, and editing DataSet information is discussed in Adding a DataSet Using a Data Connector.
To connect to your Apache Impala database and create a DataSet, you must have the following:
The username and password you use to log in to your SSH host
The SSH host you wish to tunnel through
The port number of your SSH host
The SSH private key
The username and password you use to log into your Apache Impala database
The host name or IP address for the database server (e.g. db.company.com)
The port number for the database
The database name
Before you can connect to an Apache Impala database, you must also whitelist a number of IP addresses on your database server on the port you want to connect to. For the full list of IP addresses, see Whitelisting IP Addresses for Connectors.
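Before entering these values in Domo, it can help to see how they fit together in an ordinary SSH tunnel. The following is a minimal sketch, not Domo's implementation: it only assembles the arguments for a standard OpenSSH local port forward (`ssh -i ... -p ... -N -L ...`). The host names, user name, key path, and port 21050 are hypothetical placeholders; substitute your own values. Running the printed command from your own machine checks the SSH credentials only; it does not confirm that Domo's IP addresses have been whitelisted.

```python
import shlex


def build_tunnel_command(ssh_host, ssh_port, ssh_user, key_path,
                         db_host, db_port, local_port):
    """Return the argument list for an OpenSSH local port forward."""
    return [
        "ssh",
        "-i", key_path,        # SSH private key file
        "-p", str(ssh_port),   # port number of the SSH host
        "-N",                  # forward ports only, run no remote command
        "-L", f"{local_port}:{db_host}:{db_port}",  # local -> database
        f"{ssh_user}@{ssh_host}",
    ]


# All values below are placeholders for illustration only.
argv = build_tunnel_command(
    ssh_host="bastion.company.com",
    ssh_port=22,
    ssh_user="tunnel_user",
    key_path="keys/impala_tunnel_key",
    db_host="db.company.com",
    db_port=21050,
    local_port=21050,
)
print(shlex.join(argv))
```

While such a tunnel is open, a SQL client pointed at the local port reaches the database through the SSH host, which is the same path the connector is described as using.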
Connecting to Your Apache Impala Database
This section enumerates the options in the Credentials and Details panes on the Apache Impala SSH Connector page. The components of the other panes on this page, Scheduling and Name & Describe Your DataSet, are universal across most connector types and are discussed in greater detail in Adding a DataSet Using a Data Connector.
The Credentials pane contains fields for entering the credentials used to connect to your Apache Impala database through SSH. The following table describes what is needed for each field:
|Field||Description|
|SSH Server Host Name||Enter the SSH host name you wish to tunnel through.|
|SSH Port||Enter the port number of your SSH host.|
|SSH Username||Enter the username you use to log in to your SSH host.|
|SSH Password||Enter the password you use to log in to your SSH host.|
|SSH Private Key||Enter the SSH private key.|
|Host||Enter the host name or IP address of your database server. Example: db.company.com|
|Database Port||Enter your Apache Impala port number.|
|Database Name||Enter your Apache Impala database/schema name.|
|Username||Enter your Apache Impala username.|
|Password||Enter your Apache Impala password.|
|Database Connection String Parameter(s)||Enter the parameter(s) you want to include in the database connection string. Separate multiple parameters with a semicolon. Example: AuthMech=3;SSL=1;AllowSelfSignedCerts=1|
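Because the connection string parameters are entered as a single semicolon-separated string, a typo such as a missing `=` is easy to overlook. The sketch below is one way to check the string before pasting it into the field. It assumes only the `Key=Value;Key=Value` shape shown in the example above; the function names are hypothetical, and it does not validate whether a given key is actually supported by the driver.

```python
def parse_connection_params(raw):
    """Split 'Key=Value;Key=Value' into a dict, rejecting malformed entries."""
    params = {}
    for part in raw.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing or doubled semicolon
        key, sep, value = part.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed parameter: {part!r}")
        params[key.strip()] = value.strip()
    return params


def format_connection_params(params):
    """Join a dict back into the semicolon-separated form."""
    return ";".join(f"{key}={value}" for key, value in params.items())


example = "AuthMech=3;SSL=1;AllowSelfSignedCerts=1"
parsed = parse_connection_params(example)
print(parsed)
print(format_connection_params(parsed))
```

Parsing and then re-formatting the string should return the original text, which makes a quick round-trip check before saving the account.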
Once you have entered valid Apache Impala credentials, you can use the same account any time you go to create a new Apache Impala SSH DataSet. You can manage connector accounts in the Accounts tab in the Data Center. For more information about this tab, see Managing User Accounts for Connectors.
The Details pane contains a primary query type menu, along with various other menus and fields which may or may not appear depending on the query type you select.
|Menu||Description|
|Query Type||Select a query type.|
|Query||Enter the SQL query to execute. The query runs on the Apache Impala server and fetches the data from it.|
|Query Parameter||Enter the query parameter value. This is the initial value for the query parameter. The last run date is optional; if it is not provided, it defaults to '02/01/1700'.|
|Database Table||Select the database table.|
|Table Columns||Select the table columns.|
|Query Helper||This query is automatically generated when you select a table and columns in the Database Table and Table Columns fields, respectively. Copy and paste this query into the Query field if you need help building a query.|
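To make the relationship between the selected table, the selected columns, and the generated query concrete, here is a minimal sketch of how such a SELECT statement could be assembled. This is an illustration, not the connector's own logic: the exact text the connector generates is not shown in this topic, and the table name, column names, and the `build_select` function are hypothetical. Identifiers are wrapped in backticks, which is how Impala quotes identifiers.

```python
def build_select(table, columns):
    """Build a simple SELECT statement from a table name and column names."""
    if not columns:
        raise ValueError("select at least one column")

    def quote(name):
        # Refuse names that would break out of the backtick quoting.
        if "`" in name:
            raise ValueError(f"invalid identifier: {name!r}")
        return f"`{name}`"

    column_list = ", ".join(quote(column) for column in columns)
    return f"SELECT {column_list} FROM {quote(table)}"


# Hypothetical table and columns, for illustration only.
query = build_select("orders", ["order_id", "order_date", "total"])
print(query)
```

A statement of this shape is a reasonable starting point to paste into the Query field and then extend with filters or joins.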
For information about the remaining sections of the connector interface, including how to configure scheduling, retry, and update options, see Adding a DataSet Using a Data Connector.
What kind of credentials do I need to power up this connector?
You need your Apache Impala SSH credentials (server hostname, port number, private key, username, and password) as well as your database credentials (hostname, port number, database name, username, and password). You may also provide the parameter(s) you want to include in the database connection string. Multiple parameters are separated by a semicolon. Example: AuthMech=3;SSL=1;AllowSelfSignedCerts=1.
How frequently will my data update?
As often as needed.
Are there any API limits that I need to be aware of?
Limits depend on your server configuration.
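Can I use the same Apache Impala account to create multiple DataSets?
Yes. Once you have entered valid Apache Impala credentials, you can use the same account any time you create a new Apache Impala SSH DataSet.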
What do I need to be aware of while writing a query?
Make sure that all the words, table names, and field names are correctly spelled. Refer to the Query Helper field if you need help building a query.
Why can't I connect to my Apache Impala database? Do I need to whitelist any IP addresses?
Yes. Before you can connect to an Apache Impala database in Domo, you must whitelist a number of IP addresses on your database server on the port you want to connect to. For the full list of IP addresses, see Whitelisting IP Addresses for Connectors.