Hadoop Hive: How to allow regular user continuously write data and create tables in warehouse directory?

Ask Time:2014-03-11T17:11:46         Author:Anton Ashanin

I am running Hadoop 2.2.0.2.0.6.0-101 on a single node. I am trying to run a Java MRD program that writes data to an existing Hive table, launched from Eclipse as a regular user. I get this exception:

org.apache.hadoop.security.AccessControlException: Permission denied: user=dev, access=WRITE, inode="/apps/hive/warehouse/testids":hdfs:hdfs:drwxr-xr-x

This happens because a regular user has no write permission on the warehouse directory; only the hdfs user does:

drwxr-xr-x   - hdfs hdfs          0 2014-03-06 16:08 /apps/hive/warehouse/testids
drwxr-xr-x   - hdfs hdfs          0 2014-03-05 12:07 /apps/hive/warehouse/test
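
For illustration, the denial can be reproduced with a minimal Java sketch of the owner/group/other check. This is a simplified model and not Hadoop's actual code: the class and method names are hypothetical, the superuser bypass is ignored, and it assumes that user dev belongs only to a group named dev.

```java
import java.util.Set;

// Simplified model of an HDFS-style permission check; not Hadoop's actual code.
final class HdfsPermissionSketch {

    // Returns true if `user` may write to an inode with the given mode string
    // (for example "drwxr-xr-x"). Exactly one class applies: owner first,
    // then group, then other. The superuser bypass is deliberately ignored.
    static boolean canWrite(String mode, String owner, String group,
                            String user, Set<String> userGroups) {
        if (mode.length() != 10) {
            throw new IllegalArgumentException("expected a 10-character mode: " + mode);
        }
        int offset; // index of the 'r' flag for the class that applies
        if (user.equals(owner)) {
            offset = 1;
        } else if (userGroups.contains(group)) {
            offset = 4;
        } else {
            offset = 7;
        }
        return mode.charAt(offset + 1) == 'w';
    }

    public static void main(String[] args) {
        Set<String> devGroups = Set.of("dev"); // hypothetical group membership
        // Mode from the exception: dev falls into "other", which has r-x only.
        System.out.println(canWrite("drwxr-xr-x", "hdfs", "hdfs", "dev", devGroups)); // false
        // Mode after chmod a+w: "other" now has the write bit.
        System.out.println(canWrite("drwxrwxrwx", "hdfs", "hdfs", "dev", devGroups)); // true
    }
}
```

Because dev is neither the owner nor (under the stated assumption) a member of group hdfs, only the last three mode characters matter, which is why access=WRITE fails on drwxr-xr-x.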

To work around this, I changed the permissions on the warehouse directory so that everybody now has write permission:

[hdfs@localhost wks]$ hadoop fs -chmod -R a+w /apps/hive/warehouse
[hdfs@localhost wks]$ hadoop fs -ls /apps/hive/warehouse
drwxrwxrwx   - hdfs hdfs          0 2014-03-06 16:08 /apps/hive/warehouse/testids
drwxrwxrwx   - hdfs hdfs          0 2014-03-05 12:07 /apps/hive/warehouse/test

This helps to some extent: the MRD program can now write to the warehouse directory as a regular user, but only once. When I try to write data into the same table a second time, I get:

ERROR security.UserGroupInformation: PriviledgedActionException as:dev (auth:SIMPLE) cause:org.apache.hcatalog.common.HCatException : 2003 : Non-partitioned table already contains data : default.testids

Now, if I delete the output table and create it anew in the Hive shell, I again get the default permissions, which do not allow a regular user to write data into this table:

[hdfs@localhost wks]$ hadoop fs -ls /apps/hive/warehouse
drwxr-xr-x   - hdfs hdfs          0 2014-03-11 12:19 /apps/hive/warehouse/testids
drwxrwxrwx   - hdfs hdfs          0 2014-03-05 12:07 /apps/hive/warehouse/test
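
One plausible explanation for the reset, sketched below in Java, is that a newly created directory gets the requested mode masked by the creating client's umask. The sketch assumes the Hadoop default umask of 022 (the fs.permissions.umask-mode setting) and a requested mode of 777; the actual values on this cluster are not shown in the question, and the class and method names are hypothetical.

```java
// Shows how a umask turns a requested mode into the mode seen in the listing.
final class UmaskSketch {

    // Clears every permission bit that is set in the umask.
    static int applyUmask(int requestedMode, int umask) {
        return requestedMode & ~umask & 0777;
    }

    // Renders the nine permission bits of a directory mode as "drwxr-xr-x" style text.
    static String toSymbolic(int mode) {
        String flags = "rwxrwxrwx";
        StringBuilder sb = new StringBuilder("d");
        for (int i = 0; i < 9; i++) {
            boolean set = (mode & (1 << (8 - i))) != 0;
            sb.append(set ? flags.charAt(i) : '-');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int umask = 0022; // assumed: Hadoop's default umask
        System.out.println(toSymbolic(applyUmask(0777, umask))); // drwxr-xr-x
    }
}
```

Under these assumptions the recreated table directory ends up as drwxr-xr-x regardless of the earlier recursive chmod, because the chmod changed only the directories that existed at the time.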

Please advise on the correct Hive configuration steps that will allow a program run as a regular user to do the following operations in the Hive warehouse:

  • Programmatically create / delete / rename Hive tables?
  • Programmatically read / write data from Hive tables?

Many thanks!

Author: Anton Ashanin. Reproduced under the CC 4.0 BY-SA copyright license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/22321084/hadoop-hive-how-to-allow-regular-user-continuously-write-data-and-create-tables
Remus Rusanu :

If you maintain the table from outside Hive, then declare the table as external:

"An EXTERNAL table points to any HDFS location for its storage, rather than being stored in a folder specified by the configuration property hive.metastore.warehouse.dir."

A Hive administrator can create the table and point it at an HDFS storage location owned by your own user; you then grant Hive permission to read from there.

As a general comment, there is no legitimate way for an unprivileged user to perform an unauthorized privileged action. Any such way is technically an exploit and you should never rely on it: even if it is possible today, it will likely be closed soon. Hive authorization (and HCatalog authorization) is orthogonal to HDFS authorization.

Your application is also incorrect, irrespective of the authorization issues. You are trying to write twice into the same table, which means your application does not handle partitions correctly. Start from An Introduction to Hive's Partitioning.
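
As a rough sketch of the two suggestions above (an external table in user-owned storage, plus one partition per write), the Java helper below builds the HiveQL statements an administrator might run. Everything specific in it is an assumption for illustration: the column id STRING, the partition column run_date, and the location /user/dev/testids do not come from the question. The helper only builds strings, does no escaping or validation, and leaves out how the statements are submitted to Hive.

```java
// Builds HiveQL for a partitioned EXTERNAL table; names and paths are hypothetical.
final class ExternalTableDdlSketch {

    // CREATE statement for an external table stored outside the warehouse directory.
    // Inputs are inserted verbatim, so callers must pass trusted values only.
    static String createExternalTable(String table, String columns,
                                      String partitionColumn, String location) {
        return String.format(
            "CREATE EXTERNAL TABLE IF NOT EXISTS %s (%s) PARTITIONED BY (%s STRING) LOCATION '%s'",
            table, columns, partitionColumn, location);
    }

    // ALTER statement registering one partition, e.g. one per job run,
    // so that repeated writes go to distinct partitions.
    static String addPartition(String table, String partitionColumn,
                               String value, String tableLocation) {
        return String.format(
            "ALTER TABLE %s ADD IF NOT EXISTS PARTITION (%s='%s') LOCATION '%s/%s=%s'",
            table, partitionColumn, value, tableLocation, partitionColumn, value);
    }

    public static void main(String[] args) {
        String location = "/user/dev/testids"; // hypothetical directory owned by user dev
        System.out.println(createExternalTable("testids", "id STRING", "run_date", location));
        System.out.println(addPartition("testids", "run_date", "2014-03-11", location));
    }
}
```

With a layout like this, each run of the job would write into a new run_date partition under a directory that dev already owns, instead of writing again into a non-partitioned table under the warehouse directory.
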
2014-10-27T10:04:45