How to Install a Hadoop Single Node Cluster (Pseudonode) on CentOS 7


Hadoop is an open-source framework that is widely used to deal with Big Data. Most Big Data/analytics projects are built on top of the Hadoop ecosystem. It consists of two layers: one for storing data and another for processing data.

Storage is taken care of by HDFS (Hadoop Distributed File System) and processing is taken care of by YARN (Yet Another Resource Negotiator). MapReduce is the default processing engine of the Hadoop ecosystem.

This article explains the procedure to install a pseudonode deployment of Hadoop, where all the daemons (JVMs) run as a single-node cluster on CentOS 7.

This setup is mainly useful for learning Hadoop. In production, Hadoop is installed as a multinode cluster, where the data is distributed among the servers as blocks and jobs are executed in parallel.

Prerequisites:

  • A minimal installation of a CentOS 7 server.
  • Java v1.8 release.
  • Hadoop 2.x stable release.

On this page

  • How to Install Java on CentOS 7
  • Set Up Passwordless SSH Login on CentOS 7
  • How to Install a Hadoop Single Node on CentOS 7
  • How to Configure Hadoop on CentOS 7
  • Formatting the HDFS File System via the NameNode

1. Hadoop is an ecosystem built on Java, so Java must be installed on the system before Hadoop can be installed.

# yum install java-1.8.0-openjdk

2. Next, verify the Java installation on the system.

# java -version

We need to have ssh configured on our machine, because Hadoop manages its nodes over SSH. The master node uses an SSH connection to reach its slave nodes and perform operations such as starting and stopping services.

We need to set up passwordless SSH so that the master can communicate with the slaves over ssh without a password. Otherwise, a password would have to be entered for every connection that is established.

In this single node, the master services (NameNode, Secondary NameNode & ResourceManager) and the slave services (DataNode & NodeManager) run as separate JVMs. Even though it is a single node, we need passwordless ssh so that the master can communicate with the slave without authentication.

3. Set up passwordless SSH login using the following commands on the server.

# ssh-keygen
# ssh-copy-id -i localhost

4. After you have configured passwordless SSH login, try logging in again; you will be connected without being asked for a password.

# ssh localhost

5. Go to the Apache Hadoop website and download the stable release of Hadoop using the following wget command.

# wget https://archive.apache.org/dist/hadoop/core/hadoop-2.10.1/hadoop-2.10.1.tar.gz
# tar xvpzf hadoop-2.10.1.tar.gz

6. Next, add the Hadoop environment variables to the ~/.bashrc file as shown.

JAVA_HOME=/usr/lib/jvm/java-1.8.0/jre
HADOOP_PREFIX=/root/hadoop-2.10.1
PATH=$PATH:$HADOOP_PREFIX/bin
export PATH JAVA_HOME HADOOP_PREFIX

7. After adding the environment variables to the ~/.bashrc file, source the file and verify Hadoop by running the following commands.

# source ~/.bashrc
# cd $HADOOP_PREFIX
# bin/hadoop version

We need to edit the Hadoop configuration files to fit our machine. In Hadoop, each service has its own port number and its own directory for storing its data.

  • Hadoop configuration files – core-site.xml, hdfs-site.xml, mapred-site.xml & yarn-site.xml
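Since each of these files uses the same `<name>`/`<value>` property layout, a small helper can pull a property's value back out for a quick sanity check after editing. This is only an illustrative sketch, not part of Hadoop; `get_hadoop_prop` is a hypothetical function name, and it assumes each `<value>` sits on the line directly after its `<name>`.

```shell
# Hypothetical helper: print the <value> that follows a given <name> in a
# Hadoop-style *-site.xml file. Assumes <value> is on the line after <name>.
get_hadoop_prop() {
    file="$1"; prop="$2"
    sed -n "/<name>${prop}<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}" "$file"
}

# Example: after editing core-site.xml, confirm the filesystem URI, e.g.
#   get_hadoop_prop $HADOOP_PREFIX/etc/hadoop/core-site.xml fs.defaultFS
```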

8. First, we need to update JAVA_HOME and the Hadoop path in the hadoop-env.sh file as shown.

# cd $HADOOP_PREFIX/etc/hadoop
# vi hadoop-env.sh

Enter the following lines at the beginning of the file.

export JAVA_HOME=/usr/lib/jvm/java-1.8.0/jre
export HADOOP_PREFIX=/root/hadoop-2.10.1

9. Next, modify the core-site.xml file.

# cd $HADOOP_PREFIX/etc/hadoop
# vi core-site.xml

Paste the following between the <configuration> </configuration> tags as shown.

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

10. Create the following directories under the tecmint user's home directory; they will be used for NN and DN storage.

# mkdir -p /home/tecmint/hdata/
# mkdir -p /home/tecmint/hdata/data
# mkdir -p /home/tecmint/hdata/name

11. Next, modify the hdfs-site.xml file.

# cd $HADOOP_PREFIX/etc/hadoop
# vi hdfs-site.xml

Paste the following between the <configuration> </configuration> tags as shown.

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/tecmint/hdata/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/tecmint/hdata/data</value>
    </property>
</configuration>

12. Again, modify the mapred-site.xml file.

# cd $HADOOP_PREFIX/etc/hadoop
# cp mapred-site.xml.template mapred-site.xml
# vi mapred-site.xml

Paste the following between the <configuration> </configuration> tags as shown.

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

13. Finally, modify the yarn-site.xml file.

# cd $HADOOP_PREFIX/etc/hadoop
# vi yarn-site.xml

Paste the following between the <configuration> </configuration> tags as shown.

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

14. Before starting the cluster, we need to format the Hadoop NN on the local system where it has been installed. Usually this is done once, at the initial stage, before starting the cluster for the first time.

Formatting the NN will cause loss of data in the NN metastore, so we have to be very cautious; we should never format the NN while the cluster is running unless it is required intentionally.
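Because formatting wipes the NameNode metastore, a simple pre-check before running the format command can help avoid accidents. The following is only an illustrative sketch, not a Hadoop feature; `check_name_dir` is a hypothetical helper, and the path assumes the dfs.namenode.name.dir directory configured earlier.

```shell
# Illustrative guard, not a Hadoop tool: succeed only if the NameNode
# metadata directory is absent or empty, i.e. formatting cannot destroy data.
check_name_dir() {
    [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

if check_name_dir /home/tecmint/hdata/name; then
    echo "Name directory is empty; safe to format."
else
    echo "Name directory is not empty; refusing to format." >&2
fi
```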

# cd $HADOOP_PREFIX
# bin/hadoop namenode -format

15. Start the NameNode daemon and the DataNode daemon (port 50070).

# cd $HADOOP_PREFIX
# sbin/start-dfs.sh

16. Start the ResourceManager daemon and the NodeManager daemon (port 8088).

# sbin/start-yarn.sh
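To confirm the daemons came up, you can probe their web UIs (the NameNode UI on port 50070 and the ResourceManager UI on port 8088 by default in Hadoop 2.x). The `ui_up` helper below is an illustrative sketch, assuming curl is installed.

```shell
# Illustrative check, assuming curl is available: succeed if an HTTP
# endpoint on the given local port answers.
ui_up() {
    curl -sf -o /dev/null "http://localhost:$1/"
}

for port in 50070 8088; do
    if ui_up "$port"; then
        echo "port $port: web UI is answering"
    else
        echo "port $port: no response (is the daemon running?)"
    fi
done
```

Alternatively, the `jps` command from the JDK lists the running JVMs, where you should see NameNode, DataNode, ResourceManager and NodeManager entries.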

17. To stop all the services.

# sbin/stop-dfs.sh
# sbin/stop-yarn.sh

Summary
In this article, we have gone through the step-by-step process of setting up a Hadoop pseudonode (single-node) cluster. If you have basic knowledge of Linux and follow these steps, the cluster will be up in about 40 minutes.

This can be very useful for a beginner who wants to start learning and practicing Hadoop, and this vanilla setup of Hadoop can also be used for development purposes. If we want a real-time cluster, we either need at least three physical servers in hand or have to provision cloud infrastructure to obtain multiple servers.