Created February 13, 2018 02:54
This file contains the step-by-step procedure to install HDFS, YARN, Pig, Sqoop, Hive, Oozie, and Hue on a single-node CentOS system.
Cloudera Manual Installation - CentOS 6
CentOS 6, 64-bit
Hadoop Components
1. HDFS
2. YARN
3. Pig
4. Sqoop
5. Hive
6. Oozie
7. Hue
Prerequisite:
Switch to a sudo-capable user.
Steps to Install Java
1. Get Oracle JDK 8
Visit the Oracle JDK download page and look for the RPM version:
http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Copy the download link for jdk-8u102-linux-x64.rpm and fetch it with wget:
wget --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u102-b14/jdk-8u102-linux-x64.rpm
2. Install Oracle JDK 8
sudo yum localinstall jdk-8u102-linux-x64.rpm
3. Set JAVA_HOME Environment Variables
Note: the JDK path must match the version installed above (jdk-8u102 unpacks to /usr/java/jdk1.8.0_102).
export JAVA_HOME=/usr/java/jdk1.8.0_102
export JRE_HOME=$JAVA_HOME/jre
export JAVA_PATH=$JAVA_HOME
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
Create a java.sh file under /etc/profile.d/ and add the above export commands so they apply to every login shell.
4. Verification
cd /usr/java
ls -lsah
java -version
source ~/.bash_profile
$ echo $JRE_HOME
/usr/java/jdk1.8.0_102/jre
$ echo $JAVA_HOME
/usr/java/jdk1.8.0_102
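The java.sh file from step 3 can also be generated with a heredoc instead of typed by hand. A minimal sketch, assuming the JDK landed in /usr/java/jdk1.8.0_102 (the version the RPM above installs; adjust if `ls /usr/java` shows otherwise):

```shell
# Write the java.sh described in step 3 to the current directory,
# then move it into /etc/profile.d/ on the real host.
cat > java.sh <<'EOF'
export JAVA_HOME=/usr/java/jdk1.8.0_102
export JRE_HOME=$JAVA_HOME/jre
export JAVA_PATH=$JAVA_HOME
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
EOF
# sudo mv java.sh /etc/profile.d/java.sh   # uncomment on the real host
```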
Steps to Install CDH
Link: https://www.cloudera.com/documentation/enterprise/5-12-x/topics/cdh_ig_cdh5_install.html#topic_4_4_1__p_32
1. Add the CDH Repository
[cloudera-cdh5]
# Packages for Cloudera's Distribution for Hadoop, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera's Distribution for Hadoop, Version 5
baseurl=https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5/
gpgkey=https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
gpgcheck=1
Save the file as /etc/yum.repos.d/cloudera.repo.
2. Optionally Add the Repository Key
sudo rpm --import https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
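The repo definition from step 1 can likewise be scripted. A sketch that writes it to the working directory (copy it to /etc/yum.repos.d/ afterwards):

```shell
# Generate the cloudera.repo file from step 1.
cat > cloudera.repo <<'EOF'
[cloudera-cdh5]
# Packages for Cloudera's Distribution for Hadoop, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera's Distribution for Hadoop, Version 5
baseurl=https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5/
gpgkey=https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
gpgcheck=1
EOF
# sudo cp cloudera.repo /etc/yum.repos.d/   # on the real host
```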
3. Install CDH 5 with YARN
Master Daemons:
1. ResourceManager
sudo yum clean all
sudo yum install hadoop-yarn-resourcemanager
2. NameNode
sudo yum clean all
sudo yum install hadoop-hdfs-namenode
3. Secondary NameNode
sudo yum clean all
sudo yum install hadoop-hdfs-secondarynamenode
Slave Daemons:
4. NodeManager, DataNode, and MapReduce
sudo yum clean all
sudo yum install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce
5. History Server and Proxy Server
sudo yum clean all
sudo yum install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver
6. Hadoop Client
sudo yum clean all; sudo yum install hadoop-client
7. Customizing Configuration Files
core-site.xml:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:8020</value>
</property>
8. Configuring Local Storage Directories
Sample configuration:
hdfs-site.xml on the NameNode:
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///data/1/dfs/nn,file:///nfsmount/dfs/nn</value>
</property>
hdfs-site.xml on each DataNode:
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///data/1/dfs/dn,file:///data/2/dfs/dn,file:///data/3/dfs/dn,file:///data/4/dfs/dn</value>
</property>
On the NameNode host, create the dfs.namenode.name.dir local directories:
$ sudo mkdir -p /data/1/dfs/nn /nfsmount/dfs/nn
On each DataNode host, create the dfs.datanode.data.dir local directories:
$ sudo mkdir -p /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
Then hand ownership of all of them to the hdfs user:
sudo chown -R hdfs:hdfs /data/1/dfs/nn /nfsmount/dfs/nn /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
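The pattern in step 8 is: create every configured directory, then hand the whole tree to hdfs:hdfs. The layout can be rehearsed in a throwaway local sandbox first (no sudo needed; `./sandbox` is an arbitrary path chosen here for illustration):

```shell
# Rehearse the directory layout from step 8 under a local sandbox root.
ROOT=./sandbox
mkdir -p "$ROOT/data/1/dfs/nn" \
         "$ROOT/data/1/dfs/dn" "$ROOT/data/2/dfs/dn" \
         "$ROOT/data/3/dfs/dn" "$ROOT/data/4/dfs/dn"
ls -d "$ROOT"/data/*/dfs/*
# On the real host the paths start at / (plus /nfsmount/dfs/nn), followed
# by the chown -R hdfs:hdfs from step 8.
```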
9. Formatting the NameNode
sudo -u hdfs hdfs namenode -format
10. Start HDFS
for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done
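The start loop above simply enumerates every init script named hadoop-hdfs-* (on this single node: namenode, datanode, secondarynamenode) and starts each. A dry run against dummy file names shows what it expands to without touching /etc/init.d:

```shell
# Dry run of the hadoop-hdfs-* service loop: dummy files stand in for
# the real init scripts, and the commands are printed instead of run.
mkdir -p initd_demo
touch initd_demo/hadoop-hdfs-namenode initd_demo/hadoop-hdfs-datanode initd_demo/hadoop-hdfs-secondarynamenode
for x in `cd initd_demo ; ls hadoop-hdfs-*` ; do echo "sudo service $x start" ; done | tee start_cmds.txt
```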
11. Create the /tmp Directory
$ sudo -u hdfs hadoop fs -mkdir /tmp
$ sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
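The leading 1 in mode 1777 is the sticky bit: every user can create files in /tmp, but only a file's owner may delete it. The effect of the mode can be checked against any local directory:

```shell
# Demonstrate the 1777 (sticky, world-writable) mode used for HDFS /tmp.
mkdir -p demo_tmp
chmod 1777 demo_tmp
stat -c '%a' demo_tmp   # prints 1777
```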
12. YARN Configuration
Edit /etc/hadoop/conf/yarn-site.xml and add the lines below:
<property>
  <name>yarn.resourcemanager.address</name>
  <value>127.0.0.1:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>127.0.0.1:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>127.0.0.1:8031</value>
</property>
Start the YARN ResourceManager and NodeManager:
sudo /etc/init.d/hadoop-yarn-resourcemanager restart
sudo /etc/init.d/hadoop-yarn-nodemanager start
4. Create Users
sudo -u hdfs hdfs dfs -mkdir /user
sudo -u hdfs hdfs dfs -mkdir /user/hadoop
sudo -u hdfs hdfs dfs -chown hdfs /user
sudo -u hdfs hdfs dfs -chown hdfs /user/hadoop
5. Installing Pig
sudo yum install pig
6. Installing Hive
sudo yum install hive
sudo yum install hive-metastore
sudo yum install hive-server2
Configuring the Hive Metastore
sudo yum install mysql-server
sudo service mysqld start
sudo chkconfig mysqld on
sudo yum install mysql-connector-java
sudo ln -s /usr/share/java/mysql-connector-java.jar /usr/lib/hive/lib/mysql-connector-java.jar
Set Up MySQL
Configure MySQL to use a strong password and to start at boot. In the following procedure the current root password is blank, so press Enter when prompted for it.
$ sudo /usr/bin/mysql_secure_installation
[...]
Enter current password for root (enter for none):
OK, successfully used password, moving on...
[...]
Set root password? [Y/n] Y
New password:
Re-enter new password:
Remove anonymous users? [Y/n] Y
[...]
Disallow root login remotely? [Y/n] N
[...]
Remove test database and access to it? [Y/n] Y
[...]
Reload privilege tables now? [Y/n] Y
All done!
Create the Database and User
$ mysql -u root -p
Enter password:
mysql> CREATE DATABASE metastore;
mysql> USE metastore;
mysql> SOURCE /usr/lib/hive/scripts/metastore/upgrade/mysql/hive-schema-1.1.0.mysql.sql;
mysql> CREATE USER 'hive'@'localhost' IDENTIFIED BY 'mypassword';
...
mysql> REVOKE ALL PRIVILEGES, GRANT OPTION FROM 'hive'@'localhost';
mysql> GRANT SELECT,INSERT,UPDATE,DELETE,LOCK TABLES,EXECUTE ON metastore.* TO 'hive'@'localhost';
mysql> FLUSH PRIVILEGES;
mysql> quit;
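The interactive session above can also be captured in a script and fed to mysql non-interactively. A sketch ('mypassword' is the same placeholder as above and should be replaced):

```shell
# The metastore DDL from the interactive session, as a batch script.
cat > metastore-setup.sql <<'EOF'
CREATE DATABASE metastore;
USE metastore;
SOURCE /usr/lib/hive/scripts/metastore/upgrade/mysql/hive-schema-1.1.0.mysql.sql;
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'mypassword';
REVOKE ALL PRIVILEGES, GRANT OPTION FROM 'hive'@'localhost';
GRANT SELECT,INSERT,UPDATE,DELETE,LOCK TABLES,EXECUTE ON metastore.* TO 'hive'@'localhost';
FLUSH PRIVILEGES;
EOF
# mysql -u root -p < metastore-setup.sql   # run on the real host
```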
Configure the Metastore Service to Communicate with the MySQL Database
Add the following properties to /etc/hive/conf/hive-site.xml (the ZooKeeper hostnames below are sample values; substitute your own quorum):
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost/metastore</value>
  <description>the URL of the MySQL database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>mypassword</value>
</property>
<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>false</value>
</property>
<property>
  <name>datanucleus.fixedDatastore</name>
  <value>true</value>
</property>
<property>
  <name>datanucleus.autoStartMechanism</name>
  <value>SchemaTable</value>
</property>
<property>
  <name>hive.support.concurrency</name>
  <description>Enable Hive's Table Lock Manager Service</description>
  <value>true</value>
</property>
<property>
  <name>hive.zookeeper.quorum</name>
  <description>Zookeeper quorum used by Hive's Table Lock Manager</description>
  <value>zk1.myco.com,zk2.myco.com,zk3.myco.com</value>
</property>
<property>
  <name>hive.zookeeper.client.port</name>
  <value>2222</value>
  <description>The port at which the clients will connect.</description>
</property>
Start the services:
sudo service hive-metastore start
sudo service hive-server2 start
7. Installing Sqoop
sudo yum install sqoop
Installing the JDBC Drivers for Sqoop 1
sudo mkdir -p /var/lib/sqoop
sudo chown sqoop:sqoop /var/lib/sqoop
sudo chmod 755 /var/lib/sqoop
sudo cp /usr/share/java/mysql-connector-java.jar /var/lib/sqoop/
8. Creating Users
sudo -u hdfs hdfs dfs -mkdir /root
sudo -u hdfs hdfs dfs -chown -R root:hadoopusers /root
sudo -u hdfs hadoop fs -mkdir -p /user/root
sudo -u hdfs hadoop fs -chown root /user/root
9. Installing Oozie
sudo yum install oozie
sudo yum install oozie-client
Configuring Oozie to Use MySQL
$ mysql -u root -p
Enter password:
mysql> create database oozie default character set utf8;
Query OK, 1 row affected (0.00 sec)
mysql> grant all privileges on oozie.* to 'oozie'@'localhost' identified by 'oozie';
Query OK, 0 rows affected (0.00 sec)
mysql> grant all privileges on oozie.* to 'oozie'@'%' identified by 'oozie';
Query OK, 0 rows affected (0.00 sec)
mysql> exit
Bye
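As with the Hive metastore, the Oozie database setup can be scripted instead of typed interactively. A sketch (the literal 'oozie' password is the doc's placeholder; use a stronger one in practice):

```shell
# The Oozie database DDL from the interactive session, as a batch script.
cat > oozie-setup.sql <<'EOF'
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8;
GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'localhost' IDENTIFIED BY 'oozie';
GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
EOF
# mysql -u root -p < oozie-setup.sql   # run on the real host
```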
Configure Oozie to use MySQL.
Edit the properties in /etc/oozie/conf/oozie-site.xml as follows:
...
<property>
  <name>oozie.service.JPAService.jdbc.driver</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.url</name>
  <value>jdbc:mysql://localhost:3306/oozie</value>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.username</name>
  <value>oozie</value>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.password</name>
  <value>oozie</value>
</property>
sudo ln -s /usr/share/java/mysql-connector-java.jar /var/lib/oozie/mysql-connector-java.jar
Create the oozie user:
sudo groupadd cdh-hadoop
sudo useradd -g cdh-hadoop oozie
sudo -u hdfs hdfs dfs -mkdir /user/oozie
sudo -u hdfs hadoop fs -chown oozie /user/oozie
sudo -u oozie /usr/lib/oozie/bin/ooziedb.sh create -run
wget http://archive.cloudera.com/gplextras/misc/ext-2.2.zip
sudo unzip ext-2.2.zip -d /var/lib/oozie/
To install the Oozie shared library into the oozie user's home directory in HDFS:
sudo -u hdfs hadoop fs -chown oozie:oozie /user/oozie
sudo oozie-setup sharelib create -fs hdfs://localhost:8020 -locallib /usr/lib/oozie/oozie-sharelib-yarn
10. Installing Hue
sudo yum install hue
sudo yum install hue-plugins
Configure Hue
HDFS
Add the following to hdfs-site.xml:
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
Add the following to core-site.xml:
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.httpfs.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.httpfs.groups</name>
  <value>*</value>
</property>
Restart Hadoop so the proxy-user settings take effect:
for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x restart ; done
Finally, start Hue and open its web UI (port 8888 by default):
sudo service hue start