Differences

This shows you the differences between two versions of the page.

hadoop [2020/02/12 13:24] – created andonovjhadoop [2020/11/11 11:19] (current) – [Overview] andonovj
Line 1: Line 1:
-TODO+=====Overview===== 
 +Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. 
 + 
 +Hadoop is an entire ecosystem and deserves a wiki of its own, but here we will cover only a few components: 
 + 
 +  * HDFS (Hadoop Distributed File System) 
 +  * HBase (Hadoop NoSQL database) 
 +  * YARN (resource manager) 
 + 
 +You can see the whole ecosystem below: 
 + 
 +{{ :hadoopecoarch.png?600 |}} 
 + 
 +In a nutshell, HDFS on its own stores data on DataNodes and follows a write-once, read-many model, whereas HBase, which keeps its own data in HDFS, is suited to workloads with frequent reads and writes. 
 + 
 + 
 + 
 +=====Management===== 
 + 
 +====Services==== 
 +<Code:bash|Start DFS> 
 +[oracle@edvmr1p0 ~]$ start-dfs.sh 
 +Starting namenodes on [localhost] 
 +localhost: starting namenode, logging to /opt/hadoop/logs/hadoop-oracle-namenode-edvmr1p0.out 
 +localhost: starting datanode, logging to /opt/hadoop/logs/hadoop-oracle-datanode-edvmr1p0.out 
 +Starting secondary namenodes [0.0.0.0] 
 +0.0.0.0: starting secondarynamenode, logging to /opt/hadoop/logs/hadoop-oracle-secondarynamenode-edvmr1p0.out 
 +</Code> 
 + 
 +<Code:bash|Start Yarn> 
 +[oracle@edvmr1p0 ~]$ start-yarn.sh 
 +starting yarn daemons 
 +starting resourcemanager, logging to /opt/hadoop/logs/yarn-oracle-resourcemanager-edvmr1p0.out 
 +localhost: starting nodemanager, logging to /opt/hadoop/logs/yarn-oracle-nodemanager-edvmr1p0.out 
 +</Code> 
 + 
 +<Code:bash|Start HBase> 
 +[oracle@edvmr1p0 ~]$ start-hbase.sh 
 +localhost: starting zookeeper, logging to /opt/hbase/logs/hbase-oracle-zookeeper-edvmr1p0.out 
 + 
 +starting master, logging to /opt/hbase/logs/hbase-oracle-master-edvmr1p0.out 
 +Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0 
 +Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0 
 +starting regionserver, logging to /opt/hbase/logs/hbase-oracle-1-regionserver-edvmr1p0.out 
 +</Code> 
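
The three start scripts above were run by hand in the order DFS, YARN, HBase. A minimal sketch of a wrapper that keeps that order and stops at the first failure could look like the following. The function name `start_stack` is hypothetical, and the sketch assumes the three scripts are on the `PATH`, as they are in the session above.

```shell
#!/usr/bin/env bash
# start_stack (hypothetical helper): run the start scripts in the same order
# as the manual session (DFS, then YARN, then HBase) and stop at the first
# script that returns a non-zero exit status.
start_stack() {
    local script
    for script in start-dfs.sh start-yarn.sh start-hbase.sh; do
        echo "running $script"
        if ! "$script"; then
            echo "failed: $script" >&2
            return 1
        fi
    done
}
```

Starting HBase last fits the fact that HBase keeps its data in HDFS; whether a failed step should also roll back the earlier ones is left open here.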
 + 
 +<Code:bash|Check Processes> 
 +[oracle@edvmr1p0 ~]$ $JAVA_HOME/bin/jps 
 +10498 Jps 
 +8932 DataNode 
 +9910 HQuorumPeer 
 +8791 NameNode 
 +9112 SecondaryNameNode 
 +10158 HRegionServer 
 +10030 HMaster 
 +9391 NodeManager 
 +9279 ResourceManager 
 +[oracle@edvmr1p0 ~]$ 
 +</Code> 
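
Reading the `jps` listing by eye works for one node, but the same check can be sketched as a small filter. The function name `missing_daemons` and the variable `EXPECTED_DAEMONS` are hypothetical; the daemon names are taken from the listing above and assume the same single-node setup.

```shell
#!/usr/bin/env bash
# Daemons expected on the single-node setup shown above (names copied from
# the jps listing; adjust for a different topology).
EXPECTED_DAEMONS="NameNode DataNode SecondaryNameNode ResourceManager NodeManager HQuorumPeer HMaster HRegionServer"

# missing_daemons (hypothetical helper): reads jps-style output
# ("<pid> <name>" per line) on stdin and prints every expected daemon
# that is not in it, one per line. Prints nothing when all are running.
missing_daemons() {
    local running d
    running="$(awk '{print $2}')"
    for d in $EXPECTED_DAEMONS; do
        # -x matches whole lines, so "NameNode" does not match "SecondaryNameNode"
        if ! printf '%s\n' "$running" | grep -qx "$d"; then
            printf '%s\n' "$d"
        fi
    done
}
```

Usage, assuming `JAVA_HOME` is set as in the session above: `"$JAVA_HOME/bin/jps" | missing_daemons`.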
 + 
 + 
 +====HDFS==== 
 +<Code:bash|Create / List / Delete HDFS Directories> 
 +[oracle@edvmr1p0 ~]$ hdfs dfs -mkdir /usr 
 +^C[oracle@edvmr1p0 ~]$  
 +[oracle@edvmr1p0 ~]$  
 +[oracle@edvmr1p0 ~]$ hdfs dfs -mkdir /user 
 +[oracle@edvmr1p0 ~]$ hdfs dfs -mkdir /user/oracle 
 +[oracle@edvmr1p0 ~]$ hdfs dfs -ls / 
 +Found 3 items 
 +drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:16 /hbase 
 +drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:18 /user 
 +drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:18 /usr 
 +[oracle@edvmr1p0 ~]$ hdfs dfs -rmdir /usr 
 +[oracle@edvmr1p0 ~]$ hdfs dfs -ls / 
 +Found 2 items 
 +drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:16 /hbase 
 +drwxr-xr-x   - oracle supergroup          0 2020-11-11 09:18 /user 
 +[oracle@edvmr1p0 ~]$  
 +</Code> 
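
The directory commands above can be wrapped so that a repeated run reports what it did instead of failing on an existing path. This is only a sketch: the function name `ensure_hdfs_dir` is hypothetical, and it relies on the `-test -d` and `-mkdir -p` options of `hdfs dfs`.

```shell
#!/usr/bin/env bash
# ensure_hdfs_dir (hypothetical helper): create an HDFS directory, including
# missing parents, only when it does not exist yet, and report which case applied.
ensure_hdfs_dir() {
    local dir="$1"
    if hdfs dfs -test -d "$dir"; then
        # -test -d exits 0 when the path exists and is a directory
        echo "exists: $dir"
    else
        # -mkdir -p also creates parent directories such as /user
        hdfs dfs -mkdir -p "$dir" && echo "created: $dir"
    fi
}
```

For example, `ensure_hdfs_dir /user/oracle` would cover both `-mkdir` calls from the session in one step.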
 + 
 + 
  • hadoop.1581513840.txt.gz
  • Last modified: 2020/02/12 21:24
  • (external edit)