I have Scala and Java code running in Spark on the Cloudera platform whose simple task is to perform a word count on files in HDFS. My question is: what is the difference between reading the file with this code snippet:
sc.textFile("hdfs://quickstart.cloudera:8020/user/spark/InputFile/inputText.txt")
as opposed to reading it from the local drive on the Cloudera platform, like this?
sc.textFile("/home/cloudera/InputFile/inputText.txt")
Isn't the file stored in HDFS in both cases, so that it makes no difference which form I use for reading or writing? Both of these read from and write to HDFS, right? I looked at the thread below, but it did not clear this up for me.
Cloudera Quickstart VM illegalArguementException: Wrong FS: hdfs: expected: file:
Could you please give me at least one case where using the hdfs:// prefix behaves differently from using a path without it?
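To make the question concrete, here is a minimal plain-Java sketch of how I understand the two path strings might be interpreted. It does not call Spark or Hadoop at all. It assumes that a path without a scheme is resolved against whatever default filesystem is configured (the fs.defaultFS setting), and that on the quickstart VM this default is hdfs://quickstart.cloudera:8020, which I have not verified. The class PathSchemeDemo and the helper qualify are hypothetical names made up for this illustration.

```java
import java.net.URI;

final class PathSchemeDemo {

    // Hypothetical helper: a path that already carries a scheme is returned
    // unchanged; a scheme-less path is resolved against the default filesystem.
    static URI qualify(String path, URI defaultFs) {
        URI uri = URI.create(path);
        return uri.getScheme() != null ? uri : defaultFs.resolve(uri);
    }

    public static void main(String[] args) {
        // Assumed defaults, not read from any real configuration.
        URI hdfsDefault = URI.create("hdfs://quickstart.cloudera:8020/");
        URI localDefault = URI.create("file:///");

        String explicit = "hdfs://quickstart.cloudera:8020/user/spark/InputFile/inputText.txt";
        String bare = "/home/cloudera/InputFile/inputText.txt";

        // The explicit hdfs:// path is the same under either default.
        System.out.println(qualify(explicit, hdfsDefault));
        System.out.println(qualify(explicit, localDefault));

        // The bare path depends on the default: HDFS in one case, local disk in the other.
        System.out.println(qualify(bare, hdfsDefault));
        System.out.println(qualify(bare, localDefault).getScheme());
    }
}
```

If this sketch matches what actually happens, the second snippet would look for /home/cloudera/InputFile/inputText.txt inside HDFS rather than on the local drive, and only a file:// prefix would force a local read. Is that the right way to think about it?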
Thank You!