
Hadoop and Python: View Error

Ask Time: 2013-10-01T01:34:15         Author: Objc55


I'm using Hadoop streaming to run some Python code. I have noticed that if there is an error in my Python code (in mapper.py, for example), I won't be notified about the error. Instead, the mapper program will fail to run, and the job will be killed after a few seconds. Viewing the logs, the only error I see is that mapper.py failed to run or was not found, which is clearly not the case.

My question is: is there a specific log file I can check to see the actual errors in the mapper.py code? (For example, one that would tell me if an import statement failed.)

Thank you!
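For reference, one way to make such failures visible (a minimal sketch, not the original mapper; the nltk import is only an assumption based on the linked NLTK question) is to catch exceptions inside mapper.py itself and print the traceback to stderr, which Hadoop captures in the per-attempt stderr log:

    #!/usr/bin/env python
    # mapper.py -- hypothetical skeleton, not the original code
    import sys
    import traceback

    try:
        import nltk  # assumption: the import that fails on the task nodes
    except Exception:
        traceback.print_exc(file=sys.stderr)  # lands in the attempt's stderr log
        sys.exit(1)  # non-zero exit marks the attempt as failed

    try:
        for line in sys.stdin:
            # ... actual map logic goes here; identity map as a placeholder ...
            sys.stdout.write(line)
    except Exception:
        traceback.print_exc(file=sys.stderr)
        sys.exit(1)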

edit: The command used:

bin/hadoop jar contrib/streaming/hadoop-streaming.jar \
    -file /hadoop/mapper.py -mapper /hadoop/mapper.py \
    -file /hadoop/reducer.py -reducer /hadoop/reducer.py \
    -input /hadoop/input.txt -output /hadoop/output

and the post I'm referencing, for which I'd like to see the errors: Hadoop and NLTK: Fails with stopwords
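Before digging through cluster logs, a quick local smoke test often surfaces the underlying Python error directly, since nothing is hidden behind the streaming wrapper. A sketch (the paths are taken from the command above; the helper itself is hypothetical):

    #!/usr/bin/env python
    # run_mapper_locally.py -- hypothetical helper, not part of the job
    import subprocess
    import sys

    MAPPER = "/hadoop/mapper.py"   # path assumed from the streaming command
    INPUT = "/hadoop/input.txt"    # path assumed from the streaming command

    with open(INPUT, "rb") as stdin:
        proc = subprocess.Popen(
            [sys.executable, MAPPER],
            stdin=stdin,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        out, err = proc.communicate()

    sys.stdout.write(out.decode())
    if proc.returncode != 0:
        # Any traceback (e.g. a failed import) shows up here instead of
        # Hadoop's generic "failed to run" message.
        sys.stderr.write(err.decode())
        sys.exit(proc.returncode)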

Author: Objc55, reproduced under the CC 4.0 BY-SA license with a link to the original source and this disclaimer.
Link to original article: https://stackoverflow.com/questions/19100315/hadoop-and-python-view-error
vinaut:

About the log question, see if this helps:

MapReduce: Log file locations for stdout and stderr

I suppose that if the Python file fails to run, the interpreter should print the traceback to stderr, and you would see it in the stderr log of that node.
Answer Time: 2013-09-30T18:51:36
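As a concrete follow-up, assuming the Hadoop 1.x layout that matches the question's vintage: each task attempt writes separate stdout, stderr, and syslog files under $HADOOP_LOG_DIR/userlogs/<job-id>/<attempt-id>/, and the same files are reachable from the failed attempt's page in the JobTracker web UI (port 50030 by default). A small helper along these lines (hypothetical; the path layout is an assumption) dumps every attempt's stderr for a given job:

    #!/usr/bin/env python
    # dump_stderr_logs.py -- hypothetical helper; assumes a Hadoop 1.x log layout
    import os
    import sys

    log_dir = os.environ.get("HADOOP_LOG_DIR", "/var/log/hadoop")  # assumed default
    job_id = sys.argv[1]  # e.g. job_201309301200_0001

    job_logs = os.path.join(log_dir, "userlogs", job_id)
    for attempt in sorted(os.listdir(job_logs)):
        stderr_path = os.path.join(job_logs, attempt, "stderr")
        if os.path.isfile(stderr_path):
            print("==== %s ====" % attempt)
            with open(stderr_path) as f:
                sys.stdout.write(f.read())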