Setting Up Single-Node Hadoop (Hadoop 1.0.1 on Ubuntu 10.04)

  • 2012-03-27

Steps

  1. Install JDK 1.6
    sudo apt-get install sun-java6-jdk
    java -version   # confirm the installation succeeded

    If the command above cannot install the package, use the following repository instead:
    sudo add-apt-repository ppa:ferramroberto/java
    sudo apt-get update
    sudo apt-get install sun-java6-jdk
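
The version check in this step can be scripted. The following is a minimal sketch, not part of the original guide: the helper names java_version_of and is_java6 are made up for illustration. It relies on the fact that java -version writes its report to standard error, so the output has to be redirected before it can be parsed.

```shell
# Hypothetical helpers (names are illustrative, not from the guide).

# Reads the text printed by `java -version` on stdin and prints
# only the quoted version string from the first matching line.
java_version_of() {
  sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Succeeds when the given version string belongs to the 1.6 series.
is_java6() {
  case "$1" in
    1.6.*) return 0 ;;
    *)     return 1 ;;
  esac
}
```

Typical use, assuming java is on the PATH: `v=$(java -version 2>&1 | java_version_of); is_java6 "$v" || echo "unexpected Java version: $v"`.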

  2. Install openssh-server
    sudo apt-get install openssh-server

  3. Create a Linux account; this guide uses hadoop as the account name
    sudo addgroup hadoop
    sudo adduser --ingroup hadoop hadoop

  4. Set up passwordless SSH login
    su - hadoop
    ssh-keygen -t rsa -P ''
    cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
    ssh localhost   # confirm that login works without a password
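
If ssh localhost still asks for a password, two common causes are a key that was appended more than once or incorrectly, and permissions on ~/.ssh that are too open (sshd's StrictModes option, on by default, rejects such files). The sketch below is one way to make the key step repeatable; authorize_key is a hypothetical helper name, not something the guide or OpenSSH provides.

```shell
# Hypothetical helper: authorize a public key without duplicating it.
# $1 = public key file, $2 = authorized_keys file
authorize_key() {
  mkdir -p "$(dirname "$2")"
  touch "$2"
  # Append only if this exact key line is not already present.
  grep -qxF -- "$(cat "$1")" "$2" || cat "$1" >> "$2"
  # Restrict access to the owner so sshd accepts the files.
  chmod 700 "$(dirname "$2")"
  chmod 600 "$2"
}
```

Run as the hadoop user: `authorize_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"`. Running it a second time leaves the file unchanged.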

  5. Install Hadoop
    cd /usr/local
    sudo wget http://apache.stu.edu.tw/hadoop/core/hadoop-1.0.1/hadoop-1.0.1.tar.gz
    sudo tar -xvf hadoop-1.0.1.tar.gz
    sudo chown -R hadoop:hadoop hadoop-1.0.1
    sudo ln -s hadoop-1.0.1/ hadoop
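
The symlink in the last command gives the installation a version-independent path (/usr/local/hadoop), which the remaining steps use. Presumably this is so a later release can be swapped in by repointing the link; the guide does not say so explicitly. A minimal sketch of that idea follows, with point_link as a made-up helper name:

```shell
# Hypothetical helper: point a stable name at a versioned directory,
# replacing any link that already exists under that name.
# $1 = versioned directory, $2 = link name
point_link() {
  # -f replaces an existing link; -n keeps ln from descending into
  # the directory the old link points to.
  ln -sfn "$1" "$2"
}
```

For example, `cd /usr/local && sudo ln -sfn hadoop-1.0.1 hadoop` has the same effect as the command in this step but can be repeated safely.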

  6. Configure Hadoop
    vim /usr/local/hadoop/conf/hadoop-env.sh
    Set the following:
    export JAVA_HOME=/usr/lib/jvm/java-6-sun   # location of your JDK
    export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true   # prefer the IPv4 stack
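
If the JDK is not at /usr/lib/jvm/java-6-sun on your machine, its location can usually be derived from the resolved path of the java executable. This is a sketch under that assumption; java_home_from_binary is a hypothetical helper name, and the path in the usage line is only an example of a typical layout.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the fully
# resolved path of the java executable.
# $1 = path such as /some/jdk/jre/bin/java or /some/jdk/bin/java
java_home_from_binary() {
  p=${1%/bin/java}   # strip the trailing /bin/java
  p=${p%/jre}        # JDK layouts keep a second copy under jre/; strip that too
  printf '%s\n' "$p"
}
```

Typical use: `java_home_from_binary "$(readlink -f "$(which java)")"`, then compare the printed directory with the JAVA_HOME value set in hadoop-env.sh.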

    vim /usr/local/hadoop/conf/core-site.xml
    Add the following inside the <configuration> element:

    <property>
      <name>hadoop.tmp.dir</name>
      <value>/home/hadoop/db/hadoop-${user.name}</value>
      <description>A base for other temporary directories.</description>
    </property>

    <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:9000</value>
      <description>The name of the default file system.  A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.
      </description>
    </property>

    <property>
      <name>dfs.replication</name>
      <value>1</value>
      <description>Default block replication.
      The actual number of replications can be specified when the file is created.
      The default is used if replication is not specified in create time.
      </description>
    </property>
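
For reference, the three properties above must end up inside a single <configuration> root element. The sketch below writes such a complete file from the shell, with the descriptions omitted for brevity. write_core_site is a hypothetical helper name, and note that it overwrites the target file, so it is only appropriate for a fresh installation.

```shell
# Hypothetical helper: write a complete core-site.xml containing the
# three properties from this step. OVERWRITES the target file.
# $1 = target file, e.g. /usr/local/hadoop/conf/core-site.xml
write_core_site() {
  # The quoted 'EOF' stops the shell from trying to expand
  # ${user.name}, which Hadoop itself substitutes at run time.
  cat > "$1" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/db/hadoop-${user.name}</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
}
```

Run as the hadoop user: `write_core_site /usr/local/hadoop/conf/core-site.xml`.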

    vim /usr/local/hadoop/conf/mapred-site.xml
    Add the following inside the <configuration> element:

    <property>
      <name>mapred.job.tracker</name>
      <value>localhost:9001</value>
      <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map
      and reduce task.
      </description>
    </property>
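
Before formatting the NameNode it can help to read the values back from the files just edited. The following sketch prints the value of one property; get_prop is a hypothetical helper, and it assumes the layout used in this guide, where <name> and <value> sit on separate lines with the name first. It is not a general XML parser.

```shell
# Hypothetical helper: print the value of one property from a Hadoop
# configuration file laid out as in this guide.
# $1 = config file, $2 = property name
get_prop() {
  awk -v name="$2" '
    # Remember that the wanted <name> line has been seen.
    index($0, "<name>" name "</name>") { found = 1; next }
    # The next <value> line after it holds the setting.
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}
```

Typical use: `get_prop /usr/local/hadoop/conf/mapred-site.xml mapred.job.tracker`. An empty result means the property was not found.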

    Format the Hadoop NameNode
    /usr/local/hadoop/bin/hadoop namenode -format

    Start Hadoop
    /usr/local/hadoop/bin/start-all.sh
    /usr/local/hadoop/bin/hadoop dfsadmin -report   # check the cluster status
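
Besides dfsadmin -report, the jps tool that ships with the JDK lists running Java processes. After start-all.sh on Hadoop 1.x, five daemons should appear: NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. The sketch below reports which of them are absent; missing_daemons is a hypothetical helper name.

```shell
# Hypothetical helper: read `jps` output ("<pid> <ClassName>" per line)
# on stdin and print each expected Hadoop 1.x daemon that is absent.
missing_daemons() {
  out=$(cat)
  for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # Compare the whole second field so that NameNode does not
    # match SecondaryNameNode.
    printf '%s\n' "$out" \
      | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' \
      || printf '%s\n' "$d"
  done
}
```

Typical use: `jps | missing_daemons`. No output means all five daemons are running; for any name that is printed, the matching log file under /usr/local/hadoop/logs is the place to look.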

  7. Done.