Hive Spark
Follow instructions for your environment.
We will run the Hive shell and the Spark shell side by side, so open at least 2 login terminals.
$ hive
Inspect the tables and run a query on the 'clickstream' table.
hive>
show tables;
select * from clickstream limit 10;
select action, count(*) as total from clickstream group by action;
$ spark-shell
Type this in the Spark shell to reduce log output:
sc.setLogLevel("WARN")
Go to http://localhost:4040 in the browser to view the Spark UI.
Run the following in the Spark shell:
scala>
sqlContext.tableNames
val t = sqlContext.table("clickstream")
t.printSchema
t.show
sqlContext.sql("select * from clickstream limit 10").show
sqlContext.sql("select action, count(*) as total from clickstream group by action").show
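The two SQL queries above can probably also be written with the DataFrame API instead of SQL strings. This is a minimal sketch, assuming it is pasted into the same spark-shell session (so `sqlContext` is already defined) and that the 'clickstream' table has an 'action' column, as the queries above imply. The names `clicks`, `firstTen` and `byAction` are only illustrative.

```scala
// Assumes the spark-shell session from above, where `sqlContext` is predefined.
val clicks = sqlContext.table("clickstream")

// Roughly: select * from clickstream limit 10
val firstTen = clicks.limit(10)
firstTen.show()

// Roughly: select action, count(*) as total from clickstream group by action
// groupBy(...).count() produces a column named "count", renamed here to "total".
val byAction = clicks.groupBy("action").count().withColumnRenamed("count", "total")
byAction.show()
```

Both forms should return the same rows. Compare the jobs they trigger in the Spark UI.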