Hadoop 2.4.0 – Installation and standalone use

Created: 2014/04/26 ; Modified: 2015/05/23

This tutorial shows how to install Hadoop 2.4.0 on Ubuntu Server and run it in standalone mode.

Prerequisite:

  • Ubuntu Server 14.04, 64-bit
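
Before starting, the prerequisite can be checked from a shell. The snippet below is only a minimal sketch: the function name is_supported is hypothetical, and it assumes lsb_release is available (it normally is on Ubuntu Server). It prints a warning rather than stopping, since other releases may work but were not covered by this tutorial.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Ubuntu): succeeds when the
# machine type ($1) and release number ($2) match the prerequisite above.
is_supported() {
    [ "$1" = "x86_64" ] && [ "$2" = "14.04" ]
}

# Read the real values from the running system.
arch="$(uname -m)"
release="$(lsb_release -rs 2>/dev/null || echo unknown)"

if is_supported "$arch" "$release"; then
    echo "OK: $arch, Ubuntu $release"
else
    echo "Warning: this tutorial targets Ubuntu Server 14.04 64-bit; found $arch, release $release"
fi
```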

Steps:

  • Become root (00:38)
    • sudo -i
  • Install software (01:00)
    • apt-get update
    • apt-get -y install python-software-properties openjdk-7-jdk
  • Create hadoop user and group (01:40)
    • adduser hadoop
  • Get the app and extract it (02:07)
    • cd /usr/local
    • wget http://apache.mirror.iweb.ca/hadoop/common/hadoop-2.4.0/hadoop-2.4.0.tar.gz
    • tar -zxf hadoop-2.4.0.tar.gz
    • ln -s hadoop-2.4.0 hadoop
    • chown -R hadoop:hadoop hadoop-2.4.0 hadoop
    • sudo -i -H -u hadoop
  • Configure (04:45)
    • cd /usr/local/hadoop
    • vim etc/hadoop/hadoop-env.sh
    • Add the following lines to that file:
      export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
      export HADOOP_PREFIX=/usr/local/hadoop
      # On 64-bit systems
      export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_PREFIX/lib/native
      export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"
  • Try the command line (06:32)
    • bin/hadoop
  • Test (07:05)
    • mkdir input
    • cp etc/hadoop/*.xml input
    • bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar grep input output 'dfs[a-z.]+'
    • cat output/*
    • rm -rf input output
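
The Test step can be wrapped in a small script so that it is easy to rerun. This matters because a MapReduce job refuses to start when its output directory already exists, so a leftover output directory from an earlier run makes the example fail. What follows is only a sketch under assumptions: the function names prepare_dirs and run_grep_example are hypothetical, the install path defaults to the /usr/local/hadoop location used above, and the script is meant to be run as the hadoop user.

```shell
#!/bin/sh
# Install location from the Configure step; override by exporting HADOOP_PREFIX.
HADOOP_PREFIX="${HADOOP_PREFIX:-/usr/local/hadoop}"
EXAMPLES_JAR="share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar"

# Hypothetical helper: give the job a clean slate inside directory $1 by
# removing any stale input/output directories and recreating an empty input.
prepare_dirs() {
    rm -rf "$1/input" "$1/output"
    mkdir -p "$1/input"
}

# Hypothetical helper: same commands as the Test step, stopping at the first failure.
run_grep_example() {
    cd "$HADOOP_PREFIX" || return 1
    prepare_dirs . || return 1
    cp etc/hadoop/*.xml input/ || return 1
    bin/hadoop jar "$EXAMPLES_JAR" grep input output 'dfs[a-z.]+' || return 1
    cat output/*
}

# Run the example only when Hadoop is actually installed at that path.
if [ -x "$HADOOP_PREFIX/bin/hadoop" ]; then
    # Subshell keeps the caller's working directory unchanged.
    ( run_grep_example )
else
    echo "Hadoop not found in $HADOOP_PREFIX; skipping the example"
fi
```

Unlike the manual steps, this version leaves the input and output directories in place afterwards so the results can be inspected; the next run cleans them up.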