I am trying to copy files from S3 to HDFS using the following command:
hadoop distcp s3n://bucketname/filename hdfs://namenodeip/directory
However, this is not working; I get the following error:
ERROR tools.DistCp: Exception encountered
java.lang.IllegalArgumentException: Invalid hostname in URI
I have tried adding the S3 keys to the Hadoop configuration file, but that does not work either. What is the correct step-by-step procedure to copy files from S3 to HDFS?
Thanks in advance.
scalauser :
The command should be like this:

hadoop distcp s3n://bucketname/directoryname/test.csv /user/myuser/mydirectory/

This will copy test.csv from S3 into the HDFS directory /user/myuser/mydirectory/. Here the S3 filesystem is used in native mode (s3n). More details can be found at http://wiki.apache.org/hadoop/AmazonS3
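One common cause of the "Invalid hostname in URI" error is embedding the AWS credentials directly in the s3n:// URI (s3n://KEY:SECRET@bucket/...), which breaks in particular when the secret key contains a `/`. A safer sketch, assuming the standard s3n credential properties and using placeholder key values and the paths from the question, is to pass the credentials as -D options instead:

```shell
# Supply S3 credentials via the standard s3n properties rather than
# embedding them in the URI. The key values below are placeholders.
hadoop distcp \
  -Dfs.s3n.awsAccessKeyId=YOUR_ACCESS_KEY \
  -Dfs.s3n.awsSecretAccessKey=YOUR_SECRET_KEY \
  s3n://bucketname/directoryname/test.csv \
  /user/myuser/mydirectory/
```

Alternatively, the same two properties can be set once in core-site.xml (the usual name for the Hadoop configuration file, rather than conf.xml) so they do not have to appear on the command line.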
2014-04-25T10:23:05