Overview
Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.
Hadoop is a whole ecosystem and deserves a wiki of its own, but here we will address only a few components:
- HDFS (Hadoop Distributed File System)
- HBase (Hadoop NoSQL database)
- YARN (resource manager)
You can see the whole ecosystem below:
In a nutshell, HDFS stores data on DataNodes and follows a write-once, read-many model: a file can be read many times but is not modified in place once written. HBase, which itself stores its data in HDFS, is better suited to workloads with frequent reads and writes.
Management
Services
Start DFS
[oracle@edvmr1p0 ~]$ start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/hadoop/logs/hadoop-oracle-namenode-edvmr1p0.out
localhost: starting datanode, logging to /opt/hadoop/logs/hadoop-oracle-datanode-edvmr1p0.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop/logs/hadoop-oracle-secondarynamenode-edvmr1p0.out
Start YARN
[oracle@edvmr1p0 ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop/logs/yarn-oracle-resourcemanager-edvmr1p0.out
localhost: starting nodemanager, logging to /opt/hadoop/logs/yarn-oracle-nodemanager-edvmr1p0.out
Start HBase
[oracle@edvmr1p0 ~]$ start-hbase.sh
localhost: starting zookeeper, logging to /opt/hbase/logs/hbase-oracle-zookeeper-edvmr1p0.out
starting master, logging to /opt/hbase/logs/hbase-oracle-master-edvmr1p0.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
starting regionserver, logging to /opt/hbase/logs/hbase-oracle-1-regionserver-edvmr1p0.out
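The three start commands above can be wrapped in one helper that runs them in the same order (HDFS, then YARN, then HBase) and stops at the first failure. This is only a sketch: the function name start_stack and the DRY_RUN switch are made up for illustration, and it assumes the three scripts are on the PATH as in the sessions above.

```shell
#!/usr/bin/env bash
# Start HDFS, then YARN, then HBase, stopping at the first failure.
# DRY_RUN is a hypothetical switch introduced for this sketch:
# when it is set to 1, the commands are printed instead of executed.
start_stack() {
    local script
    for script in start-dfs.sh start-yarn.sh start-hbase.sh; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "would run: $script"
        else
            # Run the real start script; give up if it reports an error.
            "$script" || { echo "failed: $script" >&2; return 1; }
        fi
    done
}

# Preview the order without touching the cluster.
DRY_RUN=1 start_stack
```

Calling start_stack without DRY_RUN runs the real scripts. HBase goes last because it keeps its data in HDFS.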
Check Processes
[oracle@edvmr1p0 ~]$ $JAVA_HOME/bin/jps
10498 Jps
8932 DataNode
9910 HQuorumPeer
8791 NameNode
9112 SecondaryNameNode
10158 HRegionServer
10030 HMaster
9391 NodeManager
9279 ResourceManager
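Reading the jps listing by eye gets tedious, so the check can be scripted. The sketch below reads jps output on standard input and prints every expected daemon that is absent. The function name missing_daemons is hypothetical, and the list of expected names is taken from the listing above, so adjust it for a different setup.

```shell
#!/usr/bin/env bash
# Print the name of each expected daemon that is absent from the
# jps output supplied on stdin. Prints nothing when all are present.
missing_daemons() {
    local jps_output name
    jps_output=$(cat)
    for name in NameNode DataNode SecondaryNameNode ResourceManager \
                NodeManager HMaster HRegionServer HQuorumPeer; do
        # Compare against the whole second field so that "NameNode"
        # does not match "SecondaryNameNode".
        if ! printf '%s\n' "$jps_output" |
                awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "$name"
        fi
    done
}

# Typical use on a live node (not executed here):
#   "$JAVA_HOME/bin/jps" | missing_daemons
```

Empty output means every daemon from the list is running.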
HDFS
Create / List / Delete HDFS Directories
[oracle@edvmr1p0 ~]$ hdfs dfs -mkdir /usr
^C
[oracle@edvmr1p0 ~]$ hdfs dfs -mkdir /user
[oracle@edvmr1p0 ~]$ hdfs dfs -mkdir /user/oracle
[oracle@edvmr1p0 ~]$ hdfs dfs -ls /
Found 3 items
drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:16 /hbase
drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:18 /user
drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:18 /usr
[oracle@edvmr1p0 ~]$ hdfs dfs -rmdir /usr
[oracle@edvmr1p0 ~]$ hdfs dfs -ls /
Found 2 items
drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:16 /hbase
drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:18 /user
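The session above creates /user and /user/oracle in two steps. As a possible shortcut, hdfs dfs -mkdir accepts -p to create missing parent directories, and hdfs dfs -test -d checks whether a directory already exists. The helper below combines the two. The name ensure_hdfs_dir is hypothetical, and the sketch assumes the hdfs command is on the PATH.

```shell
#!/usr/bin/env bash
# Create an HDFS directory, including missing parents, unless it
# already exists. Reports which of the two cases applied.
ensure_hdfs_dir() {
    local dir=$1
    if hdfs dfs -test -d "$dir"; then
        echo "exists: $dir"
    else
        # -p creates intermediate directories such as /user as needed.
        hdfs dfs -mkdir -p "$dir" && echo "created: $dir"
    fi
}

# Typical use against a running cluster (not executed here):
#   ensure_hdfs_dir /user/oracle
```

Note that hdfs dfs -rmdir, used above to remove /usr, only removes empty directories. Removing a directory that still has content requires hdfs dfs -rm -r.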