STS Magic
About
Jupyter enables you to get started quickly on developing and running interactive queries thru spark thrift server using ppmagics. You can visualize your results as graphs and charts and share your reports.
Getting Started
Querying sts
Opening Notebook: Open Jupyter Notebook, click New
--> Python3
kernel
Import ppextensions : Execute the code below to import ppmagics from ppextensions to your notebook
%load_ext ppextensions.ppmagics
Using Spark Thrift Sever magic
To see available options for sts Magic run %sts?
:
%sts [-c CLUSTER_NAME] [-h HOST] [-p PORT] [-a AUTH] [-tab TABLEAU]
[-pub PUBLISH] [-tde TDE_NAME] [-pname PROJECT_NAME]
optional arguments:
-c CLUSTER_NAME, --cluster_name CLUSTER_NAME
Cluster Name to connect to
-h HOST, --hive_server host
sts server host name or ip address.
-p PORT, --port PORT sts Server port
-a AUTH, --auth AUTH Authentication type
-tab TABLEAU, --tableau TABLEAU
True to download tableau data
-pub PUBLISH, --publish PUBLISH
Publish Data to Tableau Server
-tde TDE_NAME, --tde_name TDE_NAME
tde Name to be published
-pname PROJECT_NAME, --project_name PROJECT_NAME
project name to be published
Running sts query:
Establishing a sts server connection to read data from sts
%%sts -c <cluster_name>
<your sql code line1>
Update ~/.ppextensions/config.json
with a named cluster including sts url
, port number
and auth
to use -c
if a persistent cluster configuration is desired.
{
"sts":{
"cluster_name": {
"host": <hostname>,
"port": <port_number>,
"auth": "plain/gssapi",
}
"cluster_name_1": {
"host": "<hostname">,
"port": <port_number>,
"auth": "plain/gssapi",
}
}
}
*Updated config will be available after restarting the kernel
Optionally, it is also possible to connect without a config
%%sts --host sts.server.com --port 10000 --auth gssapi
<your one-line sql code>
On an established sts server connection further queries can be run as:
sts sql in one-line mode:
%sts <your one-line sql code>
sts sql in multi-line mode:
%%sts
<your sql code line1>
<your sql code line2>
<your sql code lineN>
Publish to tableau
%sts --tableau True --publish True --tde_name <tde> --project_name <pname>
select * from database.table_name limit 10
**For tableau configuration refer to Publish Magic****