Hadoop 2.4.0 – Installation and standalone use
Created: 2014/04/26 ; Modified: 2015/05/23
This tutorial will install Hadoop 2.4.0 on Ubuntu Server.
Prerequisite:
- Ubuntu Server 14.04, 64-bit
Steps:
- Become root (00:38)
- Install software (01:00)
- apt-get update
- apt-get -y install python-software-properties openjdk-7-jdk
- Create hadoop user and group (01:40)
- Get the app and extract it (02:07)
- cd /usr/local
- wget http://apache.mirror.iweb.ca/hadoop/common/hadoop-2.4.0/hadoop-2.4.0.tar.gz
- tar -zxf hadoop-2.4.0.tar.gz
- ln -s hadoop-2.4.0 hadoop
- chown -R hadoop:hadoop hadoop-2.4.0 hadoop
- sudo -i -H -u hadoop
- Configure (04:45)
- cd /usr/local/hadoop
- vim etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_PREFIX=/usr/local/hadoop
# On 64-bit systems
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_PREFIX/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"
- Try the command line (06:32)
- Test (07:05)
- mkdir input
- cp etc/hadoop/*.xml input
- bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar grep input output 'dfs[a-z.]+'
- cat output/*
- rm -rf input output
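For a rough idea of what the grep example in the Test step computes, a similar count can be approximated with standard shell tools. This is only a sketch for comparison, not how the MapReduce job is implemented, and the function name `count_matches` is made up for illustration.

```shell
# count_matches PATTERN FILE...
# Prints each distinct string matching the extended regex PATTERN,
# prefixed by how often it occurs, most frequent first.
count_matches() {
    pattern=$1
    shift
    # -o: print only the matched text, -h: no file names, -E: extended regex
    grep -ohE "$pattern" "$@" | sort | uniq -c | sort -rn
}

# Example, run from /usr/local/hadoop before the input directory is removed:
#   count_matches 'dfs[a-z.]+' input/*.xml
```

The pattern needs plain single quotes. Curly quotes pasted from a web page are not shell quoting characters, so they end up as part of the pattern.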
Comments, questions and suggestions:
Hi!
Thank you so much for the video.
I was having the same problem as you, and those extra « exports » for the 64-bit version seem to have solved it, at least so far 🙂
How were you able to figure this out?
Regards,
Amin
Hi,
I found it somewhere on the net after some research. That really should be part of their manual.
Cheers
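To check whether the native libraries are picked up once those exports are in place, the `checknative` subcommand of `bin/hadoop` can be run. A minimal sketch: the wrapper function `check_native` is hypothetical, and the fallback to /usr/local/hadoop assumes the layout from the tutorial.

```shell
# check_native: report which native Hadoop and compression libraries load.
# Uses HADOOP_PREFIX if set, otherwise the tutorial's /usr/local/hadoop.
check_native() {
    prefix=${HADOOP_PREFIX:-/usr/local/hadoop}
    if [ -x "$prefix/bin/hadoop" ]; then
        # -a checks all libraries
        "$prefix/bin/hadoop" checknative -a
    else
        echo "no executable hadoop launcher under $prefix/bin" >&2
        return 1
    fi
}

# Example:
#   check_native
```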
Hi, thank you very much for the video. I built an Ubuntu machine with the express purpose of working with Hadoop. I was struggling until I found your video. Thank you again. I am a long-time technology guy, but with essentially no Linux experience.
Hi.. you are simply awesome and you made my day. Thank you so much for this wonderful video.
Hi. When I log in as the hadoop user, it points to /home/hadoop. It worked the first time I installed it, but when I checked later it was no longer working. Can you help, please?
Hi,
I have installed Hadoop as per your guidelines, but when I run bin/hadoop, it shows hadoop: command not found.
Please suggest how to rectify this issue. Also, I am new to Linux: this is the first time I have installed a Linux OS and tried to install Hadoop.
My System info:
LinuxMint 17 LTS 32 bit
Steps followed for Hadoop Installation:
Step1: sudo -i
Step2: apt-get update
Step3: apt-get -y install python-software-properties (I already have Java installed, in the path « /opt/java/jdk1.8.0_11 »)
Step4: adduser hadoop
Step5: cd /usr/local
Step6: wget http://apache.mirror.iweb.ca/hadoop/common/hadoop-2.4.0/hadoop-2.4.0.tar.gz
Step7: tar -zxf hadoop-2.4.0.tar.gz
Step8: ln -s hadoop-2.4.0 hadoop
Step9: chown -R hadoop:hadoop hadoop-2.4.0 hadoop
Step10: sudo -i -H -u hadoop
Step11: cd /usr/local/hadoop
Step12: gedit etc/hadoop/hadoop-env.sh
Error:
No protocol specified
(gedit:4788): Gtk-WARNING **: cannot open display: :0
Step13: Manually opened the hadoop-env.sh file and added the two lines below:
export JAVA_HOME=/opt/java/jdk1.8.0_11
export HADOOP_PREFIX=/usr/local/hadoop
Step14: bin/hadoop
This shows the following info in the terminal:
Usage: hadoop [--config confdir] COMMAND
       where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar                  run a jar file
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp               copy file or directories recursively
  archive -archiveName NAME -p *  create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
Step15: Could not run remaining steps from here
Thanks & Regards,
Bhaskar
Hi Bhaskar,
When you want to run hadoop, make sure you are in the right directory (i.e. cd /usr/local/hadoop).
For step 12, you cannot use gedit if you are not in a graphical environment. That's why I use vim.
Cheers
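If changing to /usr/local/hadoop before every command becomes tedious, the launcher directory can be added to the `PATH` of the hadoop user. This is an optional convenience that the tutorial itself does not use. Placing these lines in a startup file such as `~/.profile` is an assumption about the login shell setup.

```shell
# Make the Hadoop launchers reachable from any directory.
# /usr/local/hadoop is the symlink created in the tutorial.
HADOOP_PREFIX=/usr/local/hadoop
PATH="$PATH:$HADOOP_PREFIX/bin"
export HADOOP_PREFIX PATH
```

After this, `hadoop version` can be run from any directory. Relative paths such as share/hadoop/... in the test command still have to be resolved from /usr/local/hadoop.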
I have followed every step here to install Hadoop-2.5.0-src on a 64-bit Ubuntu system. The one difference is that I found the hadoop-env.sh file in « /usr/local/hadoop/hadoop-common-project/hadoop-common/src/main/conf ». I edited it with vim as shown in the tutorial, but bin/hadoop then returns « no such file or directory ». What should be done? [Beginner]
Hi, the problem is that you took the « src » version, which means that you have the source code and would need to compile it before using it.
Take this one: http://mirror.its.dal.ca/apache/hadoop/common/hadoop-2.5.0/hadoop-2.5.0.tar.gz
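A quick way to tell the two kinds of archive apart after extraction is to look for the files the tutorial relies on. A small sketch, with the function name `has_hadoop_launcher` made up for this example. Going by the comment above, the source tree keeps hadoop-env.sh in a different location and has no bin/hadoop at the top level.

```shell
# has_hadoop_launcher DIR
# Succeeds if DIR looks like an extracted binary release, i.e. it
# contains the bin/hadoop launcher and the etc/hadoop/hadoop-env.sh
# file that the tutorial edits.
has_hadoop_launcher() {
    [ -x "$1/bin/hadoop" ] && [ -f "$1/etc/hadoop/hadoop-env.sh" ]
}

# Example:
#   has_hadoop_launcher /usr/local/hadoop && echo "binary release"
```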
Hi Simon
I get this error every time I run hadoop:
/usr/local/hadoop-2.5.0/etc/hadoop/hadoop-env.sh: line 85: HADOOP_PREFIX: command not found
Hi Andrew,
This is hard to tell, since I have only tried this tutorial with 2.4.0, not 2.5.0, and I don't know all the steps you followed.
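One possible cause, offered as a guess since the file itself is not shown: the shell reports `NAME: command not found` when a line starts with a variable name that is not directly followed by `=`, for instance because of a space before the equals sign. The snippet below reproduces that kind of message in a throwaway bash process. The actual content of line 85 is an assumption.

```shell
# A space before "=" makes bash treat HADOOP_PREFIX as a command name.
# LC_ALL=C forces the untranslated error message.
broken=$(LC_ALL=C bash -c 'HADOOP_PREFIX =/usr/local/hadoop' 2>&1)
echo "$broken"

# The form used in the tutorial: no spaces around "=".
fixed=$(bash -c 'export HADOOP_PREFIX=/usr/local/hadoop; echo "$HADOOP_PREFIX"' 2>&1)
echo "$fixed"
```

If line 85 of hadoop-env.sh looks like the first form, removing the spaces around `=` should make the message go away.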
Hi,
This is a nice video, and everything was going fine as I followed along until I executed the command:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar grep input output 'dfs[a-z.]+'
It showed this error:
java.net.ConnectException: Call From arindam/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused
My working environment is Ubuntu 14.04.
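A likely explanation, stated as an assumption because the configuration is not shown: port 9000 on localhost is the address commonly given to the HDFS NameNode in etc/hadoop/core-site.xml for pseudo-distributed setups. If such a setting is present but no NameNode is running, the example tries to reach HDFS and the connection is refused. The standalone setup in this tutorial does not add any `hdfs://` address to that file. The sketch below only checks for one; the function name `mentions_hdfs` is made up.

```shell
# mentions_hdfs FILE
# Succeeds if FILE (e.g. etc/hadoop/core-site.xml) contains an hdfs:// URI,
# which would send the example job to HDFS instead of the local file system.
mentions_hdfs() {
    grep -q 'hdfs://' "$1"
}

# Example, run from /usr/local/hadoop:
#   mentions_hdfs etc/hadoop/core-site.xml && echo "configured for HDFS"
```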
Handy video (I am likely to point a number of people to it).
One small thing: ‘ll’, while a handy command alias, is not standard by any means.
I suspect a lot of people watching would get a bit confused by it, even though it has nothing to do with Hadoop.
Thanks a lot for this video. I just installed Hadoop on Ubuntu 32-bit LTS. Right now it's working fine.