
Unable to access to Hive warehouse directory with Spark

Ask Time:2018-11-20T22:01:07         Author:Mamaf


I'm trying to use Spark from IntelliJ to access the Hive warehouse directory, which is located at the following path:

hdfs://localhost:9000/user/hive/warehouse

In order to do this, I'm using the following code :

import org.apache.spark.sql.SparkSession

// warehouseLocation points to the default location for managed databases and tables
val warehouseLocation = "hdfs://localhost:9000/user/hive/warehouse"

val spark = SparkSession
 .builder()
 .appName("Spark Hive Local Connector")
 .config("spark.sql.warehouse.dir", warehouseLocation)
 .config("spark.master", "local")
 .enableHiveSupport()
 .getOrCreate()

spark.catalog.listDatabases().show(false)
spark.catalog.listTables().show(false)
spark.conf.getAll.mkString("\n")

import spark.implicits._
import spark.sql

sql("USE test")
sql("SELECT * FROM test.employee").show()
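As a side note, the same settings can also be supplied outside the code via Spark's configuration file, which keeps the session builder minimal. The sketch below mirrors the `.config(...)` calls above; the paths and the `spark.sql.catalogImplementation` line (the config-file counterpart of `enableHiveSupport()`) are assumptions to adapt to your setup:

```
# conf/spark-defaults.conf -- equivalent of the .config(...) calls above
spark.master                      local
spark.sql.warehouse.dir           hdfs://localhost:9000/user/hive/warehouse
spark.sql.catalogImplementation   hive
```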

As one can see, I have created a database 'test' and a table 'employee' in that database using the Hive console. I want to get the result of the last query.

The 'spark.catalog.' and 'spark.conf.' calls are there to print the warehouse path and database settings.

spark.catalog.listDatabases().show(false) gives me :

  • name : default
  • description : Default Hive database
  • locationUri : hdfs://localhost:9000/user/hive/warehouse

spark.catalog.listTables().show(false) gives me an empty result, so something is already wrong at this step.

At the end of the job's execution, I obtain the following error:

> Exception in thread "main" org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'test' not found;

I have also configured the hive-site.xml file for the Hive warehouse location :

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://localhost:9000/user/hive/warehouse</value>
</property>
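For Spark to see this file at all, hive-site.xml has to be on the application's classpath (e.g. copied into `src/main/resources` in an IntelliJ project); otherwise Spark silently falls back to its own embedded Derby metastore, which knows nothing about databases created from the Hive console. If a standalone metastore service is running, Spark can also be pointed at it directly; the sketch below is an assumption (the default Thrift port 9083) to adjust to your setup:

```xml
<!-- hive-site.xml: point Spark at the running metastore service.
     thrift://localhost:9083 is an assumed address, not taken from my setup. -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://localhost:9083</value>
</property>
```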

I have already created the database 'test' using the Hive console.

Below are the versions of my components:

  • Spark : 2.2.0
  • Hive : 1.1.0
  • Hadoop : 2.7.3

Any ideas?

Author: Mamaf, reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/53394691/unable-to-access-to-hive-warehouse-directory-with-spark