hadoop - Map function fails in MapReduce job run on EMR


I am running my own MapReduce tasks on Amazon EMR. I see the map tasks failing, but I am not able to find out why they fail.

import fileinput
import csv
mydict = {}
csvreader = csv.reader(fileinput.input(mode='rb'), delimiter=',')
for newline in csvreader:
    #newline =  line.split(',')
    if newline[6] not in mydict.keys():
        #print 'zipcode: ' + row[6] + ' hospital code: ' + row[1]
        mydict[newline[6]] = 1
    elif newline[6] in mydict.keys():
        #print 'value in row '+ str(mydict[row[6]])
        mydict[newline[6]] += 1

for key in mydict.keys():
    print '%s\t%s' % (str(key), str(mydict[key]))

The map task reads the CSV file given as input and creates key,value pairs using the data in 2 columns. The reduce task aggregates them and prints them.
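The reduce script is not shown in the question, so the following is only a minimal sketch of the aggregation step described above, assuming the reducer receives tab-separated `key<TAB>count` lines in the format the mapper prints. The function name `reduce_counts` and the sample data are hypothetical.

```python
from collections import defaultdict


def reduce_counts(lines):
    """Sum the integer counts of tab-separated 'key<TAB>count' lines per key."""
    totals = defaultdict(int)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        key, value = line.split("\t", 1)
        totals[key] += int(value)
    return dict(totals)


# Hypothetical sample input; in a streaming job the lines would come from sys.stdin.
sample = ["10001\t2\n", "94105\t1\n", "10001\t3\n"]
for key, total in sorted(reduce_counts(sample).items()):
    print("%s\t%d" % (key, total))
```

Keeping the aggregation in a function that accepts any iterable of lines makes it possible to exercise the logic locally before submitting the job.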

The following stderr was obtained from the map task when #!/usr/bin/env python was not added at the top of the script. If it is added, stderr is blank, yet the map task still fails:

/mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1417478068297_0008/container_1417478068297_0008_01_000005/./map_zip_hospi.py: line 1: import: command not found
/mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1417478068297_0008/container_1417478068297_0008_01_000005/./map_zip_hospi.py: line 2: import: command not found
/mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1417478068297_0008/container_1417478068297_0008_01_000005/./map_zip_hospi.py: line 3: mydict: command not found
/mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1417478068297_0008/container_1417478068297_0008_01_000005/./map_zip_hospi.py: line 4: syntax error near unexpected token `('
/mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1417478068297_0008/container_1417478068297_0008_01_000005/./map_zip_hospi.py: line 4: `csvreader = csv.reader(fileinput.input(), delimiter=',')'

I can see in the console that the map tasks are failing. Can anyone help me find the errors in the code?

I missed a trivial thing, but I see this mistake is committed by many others. The following should be the first line of the Python script:

#!/usr/bin/env python 
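The "import: command not found" messages in the stderr above are what a shell prints when it is handed a Python file directly, which is what happens when the script has no shebang line. As a rough local check before uploading a script, something like the sketch below could be used. The helper names `starts_with_shebang` and `write_temp_script` are hypothetical and only exist for this illustration.

```python
import os
import tempfile


def starts_with_shebang(path):
    """Return True if the first two bytes of the file are '#!'."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"#!"


def write_temp_script(text):
    """Write text to a temporary .py file and return its path (demo helper)."""
    fd, path = tempfile.mkstemp(suffix=".py")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    return path


# Two throwaway scripts: one with the shebang on line 1, one without.
good = write_temp_script("#!/usr/bin/env python\nimport csv\n")
bad = write_temp_script("import csv\n")
print(starts_with_shebang(good))  # True
print(starts_with_shebang(bad))   # False
os.remove(good)
os.remove(bad)
```

The check only looks at the first two bytes, so it also catches the case where a blank line or comment was accidentally placed above the shebang.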
