Hadoop

Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.

Hadoop is a whole ecosystem and deserves a wiki of its own, but here we will address several components:

  • HDFS (Hadoop Distributed File System)
  • HBase (Hadoop NoSQL database)
  • YARN (resource manager)

You can see the whole ecosystem below:

[Figure: overview of the Hadoop ecosystem]

In a nutshell, HDFS on its own stores data on DataNodes and follows a write-once, read-many model: a file can be read many times but is written only once. HBase, which keeps its data on HDFS, is the better fit for workloads with many read and write operations.

Start DFS

[oracle@edvmr1p0 ~]$ start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/hadoop/logs/hadoop-oracle-namenode-edvmr1p0.out
localhost: starting datanode, logging to /opt/hadoop/logs/hadoop-oracle-datanode-edvmr1p0.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop/logs/hadoop-oracle-secondarynamenode-edvmr1p0.out

Start YARN

[oracle@edvmr1p0 ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop/logs/yarn-oracle-resourcemanager-edvmr1p0.out
localhost: starting nodemanager, logging to /opt/hadoop/logs/yarn-oracle-nodemanager-edvmr1p0.out

Start HBase

[oracle@edvmr1p0 ~]$ start-hbase.sh
localhost: starting zookeeper, logging to /opt/hbase/logs/hbase-oracle-zookeeper-edvmr1p0.out

starting master, logging to /opt/hbase/logs/hbase-oracle-master-edvmr1p0.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
starting regionserver, logging to /opt/hbase/logs/hbase-oracle-1-regionserver-edvmr1p0.out
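
The three start scripts above are run in a fixed order: HDFS first, then YARN, then HBase, which keeps its data on HDFS. A minimal sketch of a wrapper that preserves that order and stops at the first failure could look like this. The function name start_stack is hypothetical, and the sketch assumes the three scripts are on PATH, as they are in the transcripts above.

```shell
# Hypothetical helper: run the start scripts in the order used on this page.
# Assumes start-dfs.sh, start-yarn.sh and start-hbase.sh are on PATH.
start_stack() {
    for script in start-dfs.sh start-yarn.sh start-hbase.sh; do
        # Fail early with a clear message if a script cannot be found.
        if ! command -v "$script" >/dev/null 2>&1; then
            echo "$script not found on PATH" >&2
            return 1
        fi
        # Stop at the first script that exits with a non-zero status.
        "$script" || return 1
    done
}
```

Calling start_stack from a shell then behaves like typing the three commands one after another, except that a failed step prevents the later ones from running.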

Check Processes

[oracle@edvmr1p0 ~]$ $JAVA_HOME/bin/jps
10498 Jps
8932 DataNode
9910 HQuorumPeer
8791 NameNode
9112 SecondaryNameNode
10158 HRegionServer
10030 HMaster
9391 NodeManager
9279 ResourceManager
[oracle@edvmr1p0 ~]$
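
Reading the jps listing by eye gets tedious once the check becomes routine. The following sketch, assuming the single-node setup shown above, reads jps output on standard input and prints every expected daemon that is missing. The function name missing_daemons is hypothetical, and the list of expected names is simply copied from the listing above; adjust it for a different setup.

```shell
# Hypothetical helper: list expected daemons that are absent from jps output.
# Reads "<pid> <name>" lines on stdin; prints one missing name per line.
# The body runs in a subshell, so its variables do not leak to the caller.
missing_daemons() (
    expected="NameNode DataNode SecondaryNameNode ResourceManager NodeManager HQuorumPeer HMaster HRegionServer"
    # Keep only the second column (the process name).
    running="$(awk '{print $2}')"
    for name in $expected; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$running" | grep -qxF "$name"; then
            printf '%s\n' "$name"
        fi
    done
)
```

It can be used as `"$JAVA_HOME/bin/jps" | missing_daemons`; empty output means every daemon in the list was found.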

Create / List / Delete HDFS Directories

[oracle@edvmr1p0 ~]$ hdfs dfs -mkdir /usr
[oracle@edvmr1p0 ~]$ hdfs dfs -mkdir /user
[oracle@edvmr1p0 ~]$ hdfs dfs -mkdir /user/oracle
[oracle@edvmr1p0 ~]$ hdfs dfs -ls /
Found 3 items
drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:16 /hbase
drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:18 /user
drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:18 /usr
[oracle@edvmr1p0 ~]$ hdfs dfs -rmdir /usr
[oracle@edvmr1p0 ~]$ hdfs dfs -ls /
Found 2 items
drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:16 /hbase
drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:18 /user
[oracle@edvmr1p0 ~]$ 
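
The directory commands above can be wrapped so that they are safe to run repeatedly. This is only a sketch: the helper names ensure_hdfs_dir and remove_empty_hdfs_dir are hypothetical, while -mkdir -p, -test -d and -rmdir are standard hdfs dfs options. With -p, missing parent directories are created as well, so /user and /user/oracle do not need separate commands.

```shell
# Hypothetical helper: create an HDFS directory together with any missing
# parents. With -p the command also succeeds if the directory already exists.
ensure_hdfs_dir() {
    hdfs dfs -mkdir -p "$1"
}

# Hypothetical helper: remove an HDFS directory only if it exists.
# -rmdir refuses to delete a directory that still has content.
remove_empty_hdfs_dir() {
    if hdfs dfs -test -d "$1"; then
        hdfs dfs -rmdir "$1"
    fi
}
```

For example, `ensure_hdfs_dir /user/oracle` followed by `hdfs dfs -ls /user` should list the new directory, and `remove_empty_hdfs_dir /usr` does nothing when /usr is absent.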

hadoop.txt, last modified 2020/11/11 11:19 by andonovj