Hadoop 2.4.0 – Use with a single-node cluster

Created: 2014/04/26 ; Modified: 2015/05/23

To follow this tutorial, you must already have completed the steps in the previous video.

  • Configure the filesystem (00:30)
    • vim etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
  • vim etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
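A quick word on these two properties: `fs.defaultFS` tells HDFS clients where the NameNode listens for RPC (here `hdfs://localhost:9000`), and `dfs.replication` is lowered to 1 because a single-node cluster has only one DataNode to hold each block. Once the files are saved, a sketch like the following can confirm Hadoop actually picked the values up (`hdfs getconf` is available in Hadoop 2.x; run it from the Hadoop install directory, as with the other commands in this tutorial):

```shell
# Sketch: print the effective configuration values after editing the XML files.
bin/hdfs getconf -confKey fs.defaultFS      # should print hdfs://localhost:9000
bin/hdfs getconf -confKey dfs.replication   # should print 1
```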
  • Prepare SSH password-less (01:50)
    • ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
    • cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
    • ssh localhost
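The empty passphrase (`-P ''`) is what lets the HDFS start/stop scripts ssh into localhost without prompting. If `ssh localhost` still asks for a password, overly permissive `~/.ssh` permissions are the usual culprit; a minimal check, assuming a standard OpenSSH setup, might look like:

```shell
# Sketch: tighten permissions (sshd refuses keys in group/world-writable files)
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
# BatchMode makes ssh fail instead of falling back to a password prompt
ssh -o BatchMode=yes localhost true && echo "password-less SSH OK"
</imports>
```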
  • Prepare HDFS (03:43)
    • bin/hdfs namenode -format
    • sbin/start-dfs.sh
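After `start-dfs.sh`, three daemons should be running. A quick hedged check uses `jps` (ships with the JDK); the NameNode web UI in Hadoop 2.x listens on port 50070 by default:

```shell
# Sketch: confirm the HDFS daemons came up after start-dfs.sh
jps   # expect NameNode, DataNode, SecondaryNameNode (plus Jps itself)
# The NameNode status page (default port in Hadoop 2.x):
#   http://localhost:50070
```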
  • Create some directories on HDFS (05:27)
    • bin/hdfs dfs -mkdir /user
    • bin/hdfs dfs -mkdir /user/hadoop
    • bin/hdfs dfs -ls /
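These directories matter because relative HDFS paths resolve under `/user/<username>`; that is why `/user/hadoop` is created before the test step uses the bare path `input`. As a variation, `hdfs dfs -mkdir` in Hadoop 2.x also accepts `-p` to create parent directories in one command:

```shell
# Sketch: equivalent one-liner using -p (creates /user and /user/hadoop together)
bin/hdfs dfs -mkdir -p /user/hadoop
# A relative path such as "input" will then resolve to /user/hadoop/input
bin/hdfs dfs -ls
```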
  • Test (06:47)
    • bin/hdfs dfs -put etc/hadoop input
    • bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar grep input output 'dfs[a-z.]+'
    • bin/hdfs dfs -get output output
    • cat output/*
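The example job runs a MapReduce `grep` over the copied configuration files, keeping every match of the regex `dfs[a-z.]+`. To get a feel for what that regex matches, it can be previewed locally with plain `grep -Eo` on a sample line (an illustration only, not the Hadoop job itself):

```shell
# Illustration: preview the regex on one line taken from hdfs-site.xml
echo '<name>dfs.replication</name>' | grep -Eo 'dfs[a-z.]+'
# prints: dfs.replication
```

Instead of copying the results back with `-get`, the output can also be read directly on HDFS with `bin/hdfs dfs -cat output/*`.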
  • Stop HDFS (09:11)
    • sbin/stop-dfs.sh