Prerequisites:
Before working on this post, you should review the posts below:
- Installing/Configuring PrestoDB
- Working with PrestoDB Connectors
In this post, I'll be covering:
1- Installing and configuring Presto Admin
2- Installing Presto Cluster on a single node
3- Using Presto ODBC Driver
4- Installing and configuring Presto Cluster with one coordinator and three workers
Download Teradata Presto Distribution
You need to download two packages from Teradata for this post.
Server Package - contains the Presto server
http://www.teradata.com/products-and-services/Presto/Download-Get-Started
Unzip it and you will see the prestoadmin offline package installer, the Presto server RPM, and the Presto client JAR.
Client Package - contains client utilities, e.g., the ODBC drivers
http://teradata.github.io/presto/docs/127t/installation/odbc.html
Installing Presto Admin
Presto Admin is a tool for installing and managing the Presto query engine on a cluster. It provides easy-to-use commands to:
- Install and uninstall Presto across your cluster
- Configure your Presto cluster
- Start and stop the Presto servers
- Gather status and log information from your Presto cluster
Presto Admin requires:
- Java 8
- Python 2.6 or Python 2.7
If you are using the online installer, make sure you've installed the Python development package for your system. For RedHat/CentOS that package is python2-devel and for Debian/Ubuntu it is python-dev.
Presto Admin is packaged as an offline installer – prestoadmin-<version>-offline.tar.gz.
1- Copy the installer prestoadmin-<version>-offline.tar.gz to the location where you want presto-admin to run, e.g., /usr/hadoopsw. Note that presto-admin does not have to be on the same node(s) where Presto will run, though it does need SSH access to all of the nodes in the cluster.
2- Extract and run the installation script from within the prestoadmin directory.
[presto@dn04 hadoopsw]$ ll
total 648140
-rw-r--r-- 1 root root 9456552 Mar 29 21:57 prestoadmin-2.2-offline.el7.tar.gz
[root@dn04 hadoopsw]# tar -xvf prestoadmin-2.2-offline.el7.tar.gz
[root@dn04 hadoopsw]# cd prestoadmin/
[root@dn04 prestoadmin]# ll
total 100
-rwxr-xr-x 1 linuxadmin linuxadmin 2710 Mar 24 01:01 install-prestoadmin.sh
-rw-rw-r-- 1 linuxadmin linuxadmin 91414 Mar 24 01:01 prestoadmin-2.2-py2-none-any.whl
drwxrwxr-x 2 linuxadmin linuxadmin 4096 Jun 18 13:55 third-party
[root@dn04 hadoopsw]# cd prestoadmin/
[root@dn04 prestoadmin]# mkdir conf
[root@dn04 prestoadmin]# mkdir log
[root@dn04 prestoadmin]# ./install-prestoadmin.sh
The installation script will create a presto-admin-install directory and an executable presto-admin script. By default, the presto-admin config and log directory locations are
configured to be ~/.prestoadmin and ~/.prestoadmin/log, respectively. This can be changed by modifying the environment variables, PRESTO_ADMIN_CONFIG_DIR and PRESTO_ADMIN_LOG_DIR.
The installation script will also create the directories pointed to by PRESTO_ADMIN_CONFIG_DIR and PRESTO_ADMIN_LOG_DIR. If those directories already exist, the installation script will not erase their contents.
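As a concrete sketch of overriding those locations before running the installer (the /tmp path below is a throwaway example, not the directory used later in this post):

```shell
# Example only: point presto-admin at custom config/log directories
# (/tmp/prestoadmin-demo is an illustrative scratch path)
export PRESTO_ADMIN_CONFIG_DIR=/tmp/prestoadmin-demo/conf
export PRESTO_ADMIN_LOG_DIR=/tmp/prestoadmin-demo/log
mkdir -p "$PRESTO_ADMIN_CONFIG_DIR" "$PRESTO_ADMIN_LOG_DIR"
echo "config dir: $PRESTO_ADMIN_CONFIG_DIR"
```

Exporting the variables before running install-prestoadmin.sh is what makes the script create and use these directories instead of the ~/.prestoadmin defaults.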
3- Verify that presto-admin was installed properly by running the following command:
[root@dn04 prestoadmin]# cd presto-admin-install/bin
[root@dn04 bin]# ./presto-admin --help
Usage: presto-admin [options] <command> [arg]
Options:
--version show program's version number and exit
-h, --help show this help message and exit
-d, --display print detailed information about command
--extended-help print out all options, including advanced ones
-I, --initial-password-prompt
Force password prompt up-front
-p PASSWORD, --password=PASSWORD
password for use with authentication and/or sudo
Commands:
catalog add
catalog remove
collect logs
collect query_info
collect system_info
configuration deploy
configuration show
file copy
file run
package install
package uninstall
plugin add_jar
server install
server restart
server start
server status
server stop
server uninstall
server upgrade
topology show
Configuring Presto-Admin
In order to use presto-admin to manage software on a cluster of nodes, you must specify a configuration for presto-admin. This configuration indicates the nodes on which to install as well as other credentials.
Set the environment variables:
[root@dn04 bin]# vi ~/.bash_profile
[root@dn04 bin]# cat ~/.bash_profile
## Teradata Presto Distribution Variables
export PRESTO_ADMIN_HOME=/usr/hadoopsw/prestoadmin
export PRESTO_ADMIN_CONFIG_DIR=$PRESTO_ADMIN_HOME/conf
export PRESTO_ADMIN_LOG_DIR=$PRESTO_ADMIN_HOME/log
export PATH=$PATH:$PRESTO_ADMIN_HOME/presto-admin-install/bin
[root@dn04 bin]# source ~/.bash_profile
Create a file config.json in PRESTO_ADMIN_CONFIG_DIR with the following configuration:
{
"username": "<ssh_user_name>",
"port": "<ssh_port>",
"coordinator": "<host_name>",
"workers": ["<host_name_1>", "<host_name_2>", ... "<host_name_n>"],
"java8_home":"<path/to/java8/on/presto/nodes>"
}
All of the properties are optional, and if left out the following defaults will be used:
{
"username": "root",
"port": "22",
"coordinator": "localhost",
"workers": ["localhost"]
}
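Because presto-admin will fail on a malformed topology file, it is worth validating config.json before using it. A sketch (the hostnames match the cluster used later in this post; the /tmp path merely stands in for the real PRESTO_ADMIN_CONFIG_DIR):

```shell
# Write a topology file and sanity-check that it is valid JSON
CONF_DIR=/tmp/prestoadmin-demo/conf      # stand-in for $PRESTO_ADMIN_CONFIG_DIR
mkdir -p "$CONF_DIR"
cat > "$CONF_DIR/config.json" <<'EOF'
{
    "username": "presto",
    "port": "22",
    "coordinator": "en01",
    "workers": ["dn01", "dn02", "dn03"]
}
EOF
# json.tool exits non-zero on malformed JSON
python3 -m json.tool "$CONF_DIR/config.json" > /dev/null && echo "config.json is valid JSON"
```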
Installing the Presto Server (Single Node)
To install the Presto query engine on a cluster of nodes using presto-admin:
- Download presto-server-rpm-VERSION.ARCH.rpm
- Copy the RPM to a location accessible by presto-admin, e.g., /usr/hadoopsw/.
- Run the following command to install Presto:
[root@dn04 ~]# cd /usr/hadoopsw/
[root@dn04 hadoopsw]# ll
total 1060820
-rw-r--r-- 1 root root 422579061 Mar 31 05:19 presto-server-rpm-0.167-t.0.2.x86_64.rpm
[root@dn04 hadoopsw]# ssh-keygen -t rsa
[root@dn04 hadoopsw]# ssh-copy-id root@localhost
[root@dn04 hadoopsw]# presto-admin server install presto-server-rpm-0.167-t.0.2.x86_64.rpm
Using rpm_specifier as a local path
Fetching local presto rpm at path: presto-server-rpm-0.167-t.0.2.x86_64.rpm
Found existing rpm at: presto-server-rpm-0.167-t.0.2.x86_64.rpm
Deploying rpm on localhost...
Package deployed successfully on: localhost
Package installed successfully on: localhost
Deploying configuration on: localhost
Deploying tpch.properties catalog configurations on: localhost
Now start the Presto server:
[root@dn04 hadoopsw]# presto-admin server start
[localhost] out: Starting presto
[localhost] out: Started as 31018
[localhost] out:
Waiting to make sure we can connect to the Presto server on localhost, please wait. This check will time out after 2 minutes if the server does not respond.
Server started successfully on: localhost
Note: If you make any change in the config.properties file for the coordinator or a worker, you will need to redeploy the configuration as below:
presto-admin configuration deploy
Running Presto Queries
Copy the presto-cli to the location you want to run it from. This location may be any node that has network access to the coordinator. Rename the artifact to prestocli and make it executable.
[hdpsysuser@nn01 ~]$ mv presto-cli-0.167-t.0.2-executable.jar prestocli
[root@nn01 hadoopsw]# chmod +x prestocli
[root@nn01 hadoopsw]# ./prestocli --server localhost:8080
presto> select count(*) from tpch.tiny.lineitem;
_col0
-------
60175
(1 row)
Query 20170618_140237_00002_tasy2, FINISHED, 1 node
Splits: 33 total, 33 done (100.00%)
0:03 [60.2K rows, 0B] [20.5K rows/s, 0B/s]
presto> exit;
Adding a Catalog
To add a catalog for the Hive connector:
1- Create a file hive.properties in PRESTO_ADMIN_CONFIG_DIR/catalog with the following content:
connector.name=hive-hadoop2
hive.metastore.uri=thrift://192.168.49.134:9083
2- Add the catalog:
[root@dn04 ~]# presto-admin catalog add hive
Deploying hive.properties catalog configurations on: localhost
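Creating the catalog file from step 1 can be scripted; a minimal sketch (the metastore URI is the one used in this post, and the /tmp path merely stands in for the real catalog directory):

```shell
# Write the Hive catalog properties that `presto-admin catalog add hive` deploys
CATALOG_DIR=/tmp/prestoadmin-demo/catalog   # stand-in for PRESTO_ADMIN_CONFIG_DIR/catalog
mkdir -p "$CATALOG_DIR"
cat > "$CATALOG_DIR/hive.properties" <<'EOF'
connector.name=hive-hadoop2
hive.metastore.uri=thrift://192.168.49.134:9083
EOF
cat "$CATALOG_DIR/hive.properties"
```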
3- Restart Presto:
[root@dn04 ~]# presto-admin server restart
[localhost] out: Stopping presto
[localhost] out: Stopped 31018
[localhost] out:
[localhost] out: Starting presto
[localhost] out: Started as 1874
[localhost] out:
Waiting to make sure we can connect to the Presto server on localhost, please wait. This check will time out after 2 minutes if the server does not respond.
Server started successfully on: localhost
presto> select count(*) from hive.scott.wordcount;
_col0
-----------
295379535
(1 row)
Query 20170618_143118_00007_d68us, FINISHED, 1 node
Splits: 384 total, 384 done (100.00%)
2:34 [295M rows, 16.6GB] [1.92M rows/s, 111MB/s]
Please observe the query output statistics; we will compare them once we have a Presto cluster with 3 workers later in this post.
Configuring ODBC Driver for Teradata Presto
1- Download the drivers from the location below and unzip the file:
http://teradata.github.io/presto/docs/127t/installation/odbc.html
2- Install the ODBC driver as per your OS (I used Windows 64-bit) and configure an ODBC data source.
After this you can use this DSN with your desired tool, e.g., MS Power BI, Excel, Tableau, etc.
Scenario: Install the Presto server on a 4-node cluster, one node as coordinator and three nodes as workers, using a specific OS user, i.e., presto.
Coordinator node ==> en01
Worker Nodes ===> dn01,dn02,dn03
Create the OS user on all the nodes to be used for the Presto cluster and establish SSH between the coordinator and worker nodes. Put the presto user in the sudoers list as well.
# useradd presto
# passwd presto
Generate Keys
[presto@en01 ~]$ su - presto
[presto@en01 ~]$ ssh-keygen -t rsa
Copy Keys to worker nodes
[presto@en01 ~]$ ssh-copy-id presto@dn01
[presto@en01 ~]$ ssh-copy-id presto@dn02
[presto@en01 ~]$ ssh-copy-id presto@dn03
Test the ssh connectivity with worker nodes
[presto@en01 ~]$ ssh 'presto@dn01'
[presto@en01 ~]$ ssh 'presto@dn02'
[presto@en01 ~]$ ssh 'presto@dn03'
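The key distribution above can be looped over the worker list. A dry-run sketch (it only prints the commands; drop the echo to actually run them, and substitute your own hostnames):

```shell
# Print (not run) the ssh-copy-id command for each worker node in this post
for host in dn01 dn02 dn03; do
  echo ssh-copy-id "presto@$host"
done
```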
[presto@en01 hadoopsw]$ su -
[root@en01 ~]# usermod -aG wheel presto
[root@dn01 ~]# usermod -aG wheel presto
[root@dn02 ~]# usermod -aG wheel presto
[root@dn03 ~]# usermod -aG wheel presto
Run the visudo utility and add the user to the file so it can run any command without a password.
Search for the line below:
## Allows people in group wheel to run all commands
and then add the line below it:
presto ALL=(ALL) NOPASSWD: ALL
[root@en01 ~]# visudo
presto ALL=(ALL) NOPASSWD: ALL
Save the file (:wq) and check that it is working:
[presto@en01 hadoopsw]$ sudo whoami
root
[presto@dn01 ~]$ sudo whoami
root
[presto@dn02 ~]$ sudo whoami
root
[presto@dn03 ~]$ sudo whoami
root
Install and Configure Presto-Admin on Coordinator
Install and configure Presto-Admin on coordinator node as prescribed above in this post.
[presto@en01 hadoopsw]$ sudo tar -xvf prestoadmin-2.2-offline.el7.tar.gz
[presto@en01 hadoopsw]$ cd prestoadmin/
[presto@en01 prestoadmin]$ sudo mkdir conf
[presto@en01 prestoadmin]$ sudo mkdir log
[presto@en01 prestoadmin]$ cd presto-admin-install/bin
[presto@en01 bin]$ sudo ./presto-admin --help
Create a file $PRESTO_ADMIN_CONFIG_DIR/config.json using the following configuration:
[presto@en01 prestoadmin]$ sudo vi $PRESTO_ADMIN_CONFIG_DIR/config.json
{
"username": "presto",
"port": "22",
"coordinator": "en01",
"workers": ["dn01","dn02","dn03"]
}
Config Properties
The config properties file, etc/config.properties, contains the configuration for the Presto server. Every Presto server can function as both a coordinator and a worker, but dedicating a single machine to only perform coordination work provides the best performance on larger clusters.
The following is a minimal configuration for the coordinator:
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080
And this is a minimal configuration for the workers:
coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery.uri=http://example.net:8080
Alternatively, if you are setting up a single machine for testing that will function as both a coordinator and worker, use this configuration:
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080
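Since the coordinator and worker files share most settings, both can be rendered from one script. A sketch using the example values above (en01 as the discovery host and a /tmp scratch directory are assumptions for illustration):

```shell
# Render coordinator and worker config.properties from shared settings
DISCOVERY_URI=http://en01:8080   # example value; use your coordinator's address
OUT=/tmp/presto-config-demo
mkdir -p "$OUT"

# Properties common to both roles
common="http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB"

cat > "$OUT/coordinator.config.properties" <<EOF
coordinator=true
node-scheduler.include-coordinator=false
$common
discovery-server.enabled=true
discovery.uri=$DISCOVERY_URI
EOF

cat > "$OUT/worker.config.properties" <<EOF
coordinator=false
$common
discovery.uri=$DISCOVERY_URI
EOF

grep '^coordinator=' "$OUT"/*.properties
```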
Now install the Presto server. I opted to use an interactive password prompt (the -I command line option), which prompts you for the initial value of your password before running any commands:
[presto@en01 ~]$ sudo /usr/hadoopsw/prestoadmin/presto-admin server install /usr/hadoopsw/presto-server-rpm-0.167-t.0.2.x86_64.rpm -I
Initial value for env.password:
Using rpm_specifier as a local path
Fetching local presto rpm at path: /usr/hadoopsw/presto-server-rpm-0.167-t.0.2.x86_64.rpm
Found existing rpm at: /usr/hadoopsw/presto-server-rpm-0.167-t.0.2.x86_64.rpm
Deploying rpm on dn03...
Deploying rpm on dn02...
Deploying rpm on dn01...
Deploying rpm on en01...
Package deployed successfully on: dn03
Package deployed successfully on: dn02
Package deployed successfully on: dn01
Package deployed successfully on: en01
Package installed successfully on: dn03
Deploying configuration on: dn03
Deploying tpch.properties catalog configurations on: dn03
Package installed successfully on: dn02
Deploying configuration on: dn02
Package installed successfully on: en01
Deploying configuration on: en01
Deploying tpch.properties catalog configurations on: dn02
Deploying tpch.properties catalog configurations on: en01
Package installed successfully on: dn01
Deploying configuration on: dn01
Deploying tpch.properties catalog configurations on: dn01
Presto is now installed on the coordinator and workers.
The remaining steps are:
- Start the Presto server
- Distribute the configuration file
- Copy the presto-cli to the location you want to run it from; this may be any node that has network access to the coordinator. Rename the artifact to prestocli and make it executable.
- Run a query on the cluster
- Compare performance with Hive
[presto@en01 ~]$ sudo /usr/hadoopsw/prestoadmin/presto-admin server start -p presto
[dn03] out: Starting presto
[dn03] out: Started as 3251
[dn03] out:
Waiting to make sure we can connect to the Presto server on dn03, please wait. This check will time out after 2 minutes if the server does not respond.
[dn02] out: Starting presto
[en01] out: Starting presto
[dn02] out: Started as 15087
[dn02] out:
Waiting to make sure we can connect to the Presto server on dn02, please wait. This check will time out after 2 minutes if the server does not respond.
[en01] out: Started as 8452
[en01] out:
Waiting to make sure we can connect to the Presto server on en01, please wait. This check will time out after 2 minutes if the server does not respond.
[dn01] out: Starting presto
[dn01] out: Started as 11767
[dn01] out:
Waiting to make sure we can connect to the Presto server on dn01, please wait. This check will time out after 2 minutes if the server does not respond.
Server started successfully on: en01
Server started successfully on: dn02
Server started successfully on: dn03
Server started successfully on: dn01
Check Status (command line/web)
[presto@en01 ~]$ sudo /usr/hadoopsw/prestoadmin/presto-admin server status -p presto
Server Status:
en01(IP: 192.168.49.134, Roles: coordinator): Running
Node URI(http): http://192.168.49.134:8080
Presto Version: 0.167-t.0.2
Node status: active
Catalogs: system, tpch
Server Status:
dn01(IP: 192.168.49.135, Roles: worker): Running
Node URI(http): http://192.168.49.135:8080
Presto Version: 0.167-t.0.2
Node status: active
Catalogs: system, tpch
Server Status:
dn02(IP: 192.168.49.136, Roles: worker): Running
Node URI(http): http://192.168.49.136:8080
Presto Version: 0.167-t.0.2
Node status: active
Catalogs: system, tpch
Server Status:
dn03(IP: 192.168.49.137, Roles: worker): Running
Node URI(http): http://192.168.49.137:8080
Presto Version: 0.167-t.0.2
Node status: active
Catalogs: system, tpch
Show Topology
[presto@en01 ~]$ sudo /usr/hadoopsw/prestoadmin/presto-admin topology show
{'coordinator': u'en01',
'port': 22,
'username': u'presto',
'workers': [u'dn01',
u'dn02',
u'dn03']}
Adding a Catalog and running query
To add a catalog for the Hive connector:
1- Create a file hive.properties in /root/.prestoadmin/catalog with the following content:
connector.name=hive-hadoop2
hive.metastore.uri=thrift://192.168.49.134:9083
2- Add the catalog:
[presto@en01 ~]$ sudo /usr/hadoopsw/prestoadmin/presto-admin catalog add hive -p presto
Deploying hive.properties catalog configurations on: dn03
Deploying hive.properties catalog configurations on: dn02
Deploying hive.properties catalog configurations on: dn01
Deploying hive.properties catalog configurations on: en01
3- Restart Presto:
[presto@en01 ~]$ sudo /usr/hadoopsw/prestoadmin/presto-admin server restart -p presto
[presto@en01 hadoopsw]$ sudo mv /usr/hadoopsw/prestoadmin/presto-cli-0.167-t.0.2-executable.jar /usr/hadoopsw/prestoadmin/prestocli
[presto@en01 hadoopsw]$ sudo chmod +x /usr/hadoopsw/prestoadmin/prestocli
[presto@en01 hadoopsw]$ /usr/hadoopsw/prestoadmin/prestocli
presto> select count(*) from hive.scott.wordcount;
_col0
-----------
295379535
(1 row)
Query 20170619_133900_00026_4hzf9, FINISHED, 3 nodes
Splits: 384 total, 384 done (100.00%)
0:22 [295M rows, 16.6GB] [13.7M rows/s, 789MB/s]
The same query took 2:34 with a single worker; with 3 workers it took only 22 seconds.
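As a quick sanity check on that speedup (2:34 is 154 seconds):

```shell
# Speedup of the 3-worker run (22 s) over the single-worker run (154 s)
awk 'BEGIN { printf "speedup: %.1fx\n", 154 / 22 }'
```

A 7x gain from 3x the workers suggests the single-worker run was bottlenecked well below the cluster's aggregate capacity.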
I have an external table in Hive with 117 GB of data and approximately 2 billion rows. I ran the same simple count query both in Hive and in Presto (with 3 nodes/workers) and observed the difference; it is significant. I also observed that CPU usage was much higher when running the query in Hive.
External Table Location and size
[hdpsysuser@nn01 ~]$ hdfs dfs -du -h /userdata/wordcount
16.6 G /userdata/wordcount/wc.txt
99.6 G /userdata/wordcount/wc5-6-7-8-9-10.txt
1.7 M /userdata/wordcount/wcs.txt
hive (scott)> select count(*) from wordcount;
...
...
2017-06-21 11:44:02,692 Stage-1 map = 0%, reduce = 0%
2017-06-21 11:44:13,133 Stage-1 map = 1%, reduce = 0%, Cumulative CPU 30.19 sec
...
...
2017-06-21 11:47:40,280 Stage-1 map = 98%, reduce = 33%, Cumulative CPU 3769.99 sec
2017-06-21 11:47:41,306 Stage-1 map = 99%, reduce = 33%, Cumulative CPU 3794.59 sec
2017-06-21 11:47:47,463 Stage-1 map = 100%, reduce = 33%, Cumulative CPU 3841.08 sec
2017-06-21 11:47:48,488 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3843.48 sec
MapReduce Total cumulative CPU time: 0 days 1 hours 4 minutes 3 seconds 480 msec
Ended Job = job_1494154575009_0334
MapReduce Jobs Launched:
Stage-Stage-1: Map: 465 Reduce: 1 Cumulative CPU: 3843.48 sec HDFS Read: 124809831211 HDFS Write: 110 SUCCESS
Total MapReduce CPU Time Spent: 0 days 1 hours 4 minutes 3 seconds 480 msec
OK
c0
2067479535
Time taken: 238.152 seconds, Fetched: 1 row(s)
Hive query took about 4 minutes to execute.
presto:scott> select count(*) from wordcount;
_col0
------------
2067479535
(1 row)
Query 20170621_085324_00006_4hzf9, FINISHED, 3 nodes
Splits: 1,979 total, 1,979 done (100.00%)
1:57 [2.07B rows, 116GB] [17.6M rows/s, 1016MB/s]
The Presto query took 1 minute and 57 seconds.
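Dividing the row count by each elapsed time gives an approximate throughput comparison (117 s for Presto's 1:57; Hive's own output reported 238.152 s):

```shell
# Approximate scan throughput for the 2,067,479,535-row count query
awk 'BEGIN {
  rows = 2067479535
  printf "hive:   %.1fM rows/s\n", rows / 238.152 / 1e6
  printf "presto: %.1fM rows/s\n", rows / 117 / 1e6
}'
```

The Presto figure lands close to the 17.6M rows/s reported in its own query statistics above.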
Now test with a small table (only 16 rows) using an ORDER BY clause:
hive (scott)> select * from emp order by job desc;
...
...
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 8.64 sec HDFS Read: 10469 HDFS Write: 942 SUCCESS
Total MapReduce CPU Time Spent: 8 seconds 640 msec
OK
emp.empno emp.ename emp.job emp.mgr emp.hiredate emp.sal emp.comm emp.deptno
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7839 KING PRESIDENT 17-NOV-81 5000 10
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7566 JONES MANAGER 7839 02-APR-81 2975 20
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7934 MILLER CLERK 7782 23-JAN-82 1300 10
7900 JAMES CLERK 7698 03-DEC-81 950 30
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7369 SMITH CLERK 7902 17-DEC-80 800 20
7902 FORD ANALYST 7566 03-DEC-81 3000 20
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
2222 Tanvir NULL NULL NULL NULL NULL NULL
3 Tanvir1 NULL NULL NULL NULL NULL NULL
Time taken: 22.688 seconds, Fetched: 16 row(s)
presto:scott> select * from emp order by job desc;
empno | ename | job | mgr | hiredate | sal | comm | deptno
-------+---------+-----------+------+-----------+------+------+--------
7654 | MARTIN | SALESMAN | 7698 | 28-SEP-81 | 1250 | 1400 | 30
7499 | ALLEN | SALESMAN | 7698 | 20-FEB-81 | 1600 | 300 | 30
7521 | WARD | SALESMAN | 7698 | 22-FEB-81 | 1250 | 500 | 30
7844 | TURNER | SALESMAN | 7698 | 08-SEP-81 | 1500 | 0 | 30
7839 | KING | PRESIDENT | | 17-NOV-81 | 5000 | | 10
7566 | JONES | MANAGER | 7839 | 02-APR-81 | 2975 | | 20
7698 | BLAKE | MANAGER | 7839 | 01-MAY-81 | 2850 | | 30
7782 | CLARK | MANAGER | 7839 | 09-JUN-81 | 2450 | | 10
7934 | MILLER | CLERK | 7782 | 23-JAN-82 | 1300 | | 10
7369 | SMITH | CLERK | 7902 | 17-DEC-80 | 800 | | 20
7876 | ADAMS | CLERK | 7788 | 23-MAY-87 | 1100 | | 20
7900 | JAMES | CLERK | 7698 | 03-DEC-81 | 950 | | 30
7788 | SCOTT | ANALYST | 7566 | 19-APR-87 | 3000 | | 20
7902 | FORD | ANALYST | 7566 | 03-DEC-81 | 3000 | | 20
2222 | Tanvir | NULL | NULL | NULL | NULL | NULL | NULL
3 | Tanvir1 | NULL | NULL | NULL | NULL | NULL | NULL
(16 rows)
Query 20170621_090214_00007_4hzf9, FINISHED, 3 nodes
Splits: 20 total, 20 done (100.00%)
0:00 [16 rows, 687B] [124 rows/s, 5.2KB/s]
The same query on the same data using the same machines took 22 seconds in Hive and under a second in Presto.
Some Other commands
-- display catalogs
presto:test> show catalogs;
Catalog
------------
hive
postgresql
system
tpch
(4 rows)
presto:test> show schemas in tpch;
Schema
--------------------
information_schema
sf1
sf100
sf1000
sf10000
sf100000
sf300
sf3000
sf30000
tiny
(10 rows)
-- display tables in a schema
presto:test> show tables in tpch.sf1;
Table
----------
customer
lineitem
nation
orders
part
partsupp
region
supplier
(8 rows)
-- set default catalog and schema so you don't have to fully qualify the table name
presto:test> use tpch.sf1;
presto:sf1> select count(*) from lineitem;
_col0
---------
6001215
(1 row)
-- describe tables to figure out what to join
presto:sf1> describe nation;
Column | Type | Extra | Comment
-----------+--------------+-------+---------
nationkey | bigint | |
name | varchar(25) | |
regionkey | bigint | |
comment | varchar(152) | |
(4 rows)
Query 20171218_140909_00527_ab4e5, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [4 rows, 256B] [73 rows/s, 4.58KB/s]
presto:sf1> describe region;
Column | Type | Extra | Comment
-----------+--------------+-------+---------
regionkey | bigint | |
name | varchar(25) | |
comment | varchar(152) | |
(3 rows)
presto:sf1> select * from nation;
nationkey | name | regionkey | comment
-----------+----------------+-----------+--------------------------------------------------------------------------------------------------------
1 | ARGENTINA | 1 | al foxes promise slyly according to the regular accounts. bold requests alon
2 | BRAZIL | 1 | y alongside of the pending deposits. carefully special packages are about the ironic forges. slyly spec
3 | CANADA | 1 | eas hang ironic, silent packages. slyly regular packages are furiously over the tithes. fluffily bold
4 | EGYPT | 4 | y above the carefully unusual theodolites. final dugouts are quickly across the furiously regular d
10 | IRAN | 4 | efully alongside of the slyly final dependencies.
11 | IRAQ | 4 | nic deposits boost atop the quickly final requests? quickly regula
13 | JORDAN | 4 | ic deposits are blithely about the carefully regular pa
17 | PERU | 1 | platelets. blithely pending dependencies use fluffily across the even pinto beans. carefully silent acc
20 | SAUDI ARABIA | 4 | ts. silent requests haggle. closely express packages sleep across the blithely
24 | UNITED STATES | 1 | y final packages. slow foxes cajole quickly. quickly silent platelets breach ironic accounts. unusual p
0 | ALGERIA | 0 | haggle. carefully final deposits detect slyly agai
5 | ETHIOPIA | 0 | ven packages wake quickly. regu
14 | KENYA | 0 | pending excuses haggle furiously deposits. pending, express pinto beans wake fluffily past t
15 | MOROCCO | 0 | rns. blithely bold courts among the closely regular packages use furiously bold platelets?
16 | MOZAMBIQUE | 0 | s. ironic, unusual asymptotes wake blithely r
6 | FRANCE | 3 | refully final requests. regular, ironi
7 | GERMANY | 3 | l platelets. regular accounts x-ray: unusual, regular acco
8 | INDIA | 2 | ss excuses cajole slyly across the packages. deposits print aroun
9 | INDONESIA | 2 | slyly express asymptotes. regular deposits haggle slyly. carefully ironic hockey players sleep blithel
12 | JAPAN | 2 | ously. final, express gifts cajole a
18 | CHINA | 2 | c dependencies. furiously express notornis sleep slyly regular accounts. ideas sleep. depos
19 | ROMANIA | 3 | ular asymptotes are about the furious multipliers. express dependencies nag above the ironically ironic
21 | VIETNAM | 2 | hely enticingly express accounts. even, final
22 | RUSSIA | 3 | requests against the platelets use never according to the quickly regular pint
23 | UNITED KINGDOM | 3 | eans boost carefully special requests. accounts are. carefull
(25 rows)