LinuxCommandLibrary

impala

Execute SQL queries on Impala

TLDR

Launch impala in station mode

$ impala

Launch impala in Access Point mode
$ impala [-m|--mode] ap

Switch between different sections
$ [<Tab>|<Shift Tab>]

Select a network to connect to
$ <Space>

Display hotkeys
$ <?>

SYNOPSIS

impala-shell [-i host[:port]] [-q query] [-f file] [--kerberos] [--ssl] [--ldap] [options]

PARAMETERS

-i, --impalad=
    Impala daemon to connect to, as host[:port] (default: localhost:21050 for hs2; the Beeswax port is 21000). There are no separate host and port options.

-p, --show_profiles
    Print the query plan and execution profile after each query

-q, --query=
    Execute single SQL query string and exit

-f, --query_file=
    Execute SQL from file and exit

-k, --kerberos
    Enable Kerberos authentication

-s, --kerberos_service_name=
    Kerberos service name of the Impala daemons (default: impala)

-u, --user=
    Username for LDAP authentication (combine with -l, --ldap)

--ssl
    Enable SSL for secure connection

--ca_cert=
    CA certificate file used to verify the server certificate when --ssl is set

--protocol=
    Protocol: hs2, hs2-http, beeswax (default: hs2)

-B, --delimited
    Print results as delimited text instead of a formatted table (tab-separated by default; change with --output_delimiter)

-d, --database=
    Database to use at startup

-V, --verbose
    Enable verbose output

-v, --version
    Show version info

-h, --help
    Display help
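
The options above can be combined for a one-shot, non-interactive query. The following is a minimal sketch rather than a canonical invocation: the function name run_query and the IMPALA_SHELL variable are conveniences introduced here, and the host, database, and query values in the usage note are placeholders.

```shell
# Path or name of the client binary; override it to point at a specific install.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell}"

run_query() {
  # $1 = host:port of an impalad, $2 = database, $3 = SQL text
  # --delimited with --output_delimiter=',' yields comma-separated rows;
  # --print_header adds the column names as the first line.
  "$IMPALA_SHELL" --impalad="$1" --database="$2" \
    --delimited --output_delimiter=',' --print_header \
    --query="$3"
}
```

A call such as run_query impala.example.com:21050 mydb "SELECT COUNT(*) FROM mytable" > result.csv would then write the result set to a CSV file.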

DESCRIPTION

impala-shell is the command-line interface for Apache Impala, an open source, massively parallel processing (MPP) SQL query engine for Hadoop ecosystems. It provides an interactive shell for connecting to Impala daemons, running SQL queries and DDL/DML statements, and managing sessions over large-scale data stored in HDFS, HBase, or other compatible storage.

Users can run ad-hoc queries with low latency by leveraging Impala's distributed architecture. Impala supports a SQL-92-style dialect with HiveQL-compatible extensions and integrates with Kerberos, LDAP, and SSL/TLS for security. It suits data analysts and engineers who need fast analytics without moving data. Install it from Cloudera or Apache packages; it requires a reachable impalad coordinator.

Common workflow: connect to the cluster, select a database, issue SELECT/INSERT/UPDATE statements, and view query profiles. Results are printed as a formatted table or as delimited text. Long-running queries can show live progress while they execute.
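
The workflow described here can also be scripted. The sketch below writes the statements to a file and runs them with --query_file, adding --show_profiles so that a profile is printed after each query. The function names and the IMPALA_SHELL variable are introduced only for illustration, and mydb and mytable are placeholder names.

```shell
# Path or name of the client binary; override it to point at a specific install.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell}"

write_workflow_sql() {
  # $1 = path of the SQL file to create
  cat > "$1" <<'SQL'
USE mydb;
SELECT COUNT(*) FROM mytable;
SQL
}

run_workflow() {
  # $1 = host:port of a coordinator, $2 = path to the SQL file
  "$IMPALA_SHELL" --impalad="$1" --query_file="$2" --show_profiles
}
```

Typical use would be write_workflow_sql /tmp/workflow.sql followed by run_workflow impala.example.com:21050 /tmp/workflow.sql.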

CAVEATS

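Requires a running Impala cluster (impalad, statestored, catalogd). Not included in base Linux distributions; install from Apache or Cloudera repositories. Large queries may need resource tuning. The port must match the protocol: by default the daemons serve hs2 on 21050, hs2-http on 28000, and beeswax on 21000. --strict_hs2_protocol is intended for connecting to other HiveServer2-compatible servers such as Hive and is not needed for ordinary Impala connections.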
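
A frequent cause of connection failures is a port that does not match the chosen --protocol. The helper below is a small sketch that maps each protocol to the port Impala daemons listen on with stock settings (21050 for hs2, 28000 for hs2-http, 21000 for beeswax). The function names are hypothetical, and a cluster may be configured with different ports.

```shell
# Default listening port for each impala-shell protocol (stock daemon settings).
default_port_for_protocol() {
  case "$1" in
    hs2)      echo 21050 ;;
    hs2-http) echo 28000 ;;
    beeswax)  echo 21000 ;;
    *)        echo "unknown protocol: $1" >&2; return 1 ;;
  esac
}

# Build the host:port value for --impalad; $1 = host, $2 = protocol.
impalad_address() {
  port=$(default_port_for_protocol "$2") || return 1
  printf '%s:%s\n' "$1" "$port"
}
```

The result can be passed straight to the client, for example impala-shell --protocol=beeswax --impalad="$(impalad_address impala.example.com beeswax)".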

BASIC USAGE EXAMPLE

impala-shell --impalad=impala.example.com:21050
USE mydb;
SELECT * FROM mytable LIMIT 10;

SECURE CONNECTION

impala-shell --impalad=host --kerberos --ssl --ca_cert=/path/to/ca.crt
SET live_progress=true;
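
For LDAP rather than Kerberos, a non-interactive run needs the password supplied without a prompt. The sketch below assumes the --ldap_password_cmd option, which takes a command whose output is used as the password. The function name ldap_query and the IMPALA_SHELL variable are introduced here for illustration, and all argument values are placeholders.

```shell
# Path or name of the client binary; override it to point at a specific install.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell}"

ldap_query() {
  # $1 = host:port, $2 = LDAP user, $3 = CA certificate path,
  # $4 = command that prints the password, $5 = SQL text
  "$IMPALA_SHELL" --impalad="$1" \
    --ssl --ca_cert="$3" \
    --ldap --user="$2" --ldap_password_cmd="$4" \
    --query="$5"
}
```

Keeping the password behind a command (for example one that reads a file readable only by the calling user) avoids placing it on the command line or in shell history.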

HISTORY

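Developed by Cloudera and released as an open source public beta in October 2012; version 1.0 reached general availability in May 2013. The project entered the Apache Incubator in December 2015 and became an Apache top-level project in November 2017. impala-shell shipped alongside it and has evolved with protocol updates (Beeswax to HS2) and security features.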

SEE ALSO

beeline(1), psql(1), mysql(1), hive(1)
