Hive on Spark Installation
1. Prerequisites
1. Install Hadoop (a high-availability setup is recommended).
If Hadoop is not installed yet, refer to 采集項目(HA)(五臺服務(wù)器)_ha數據采集-CSDN博客.
2. Installing Hive
1. Extract the archive
[atguigu@hadoop100 software]$ tar -zxvf /opt/software/apache-hive-3.1.3.tar.gz -C /opt/module/
[atguigu@hadoop100 software]$ mv /opt/module/apache-hive-3.1.3-bin/ /opt/module/hive
2.環(huán)境變量
[atguigu@hadoop100 software]$ sudo vim /etc/profile.d/my_env.sh
#HIVE_HOME
export HIVE_HOME=/opt/module/hive
export PATH=$PATH:$HIVE_HOME/bin
[atguigu@hadoop100 software]$ source /etc/profile.d/my_env.sh
Resolve the logging jar conflict: go into /opt/module/hive/lib and rename the conflicting jar.
[atguigu@hadoop100 lib]$ mv log4j-slf4j-impl-2.17.1.jar log4j-slf4j-impl-2.17.1.jar.bak
3.hive元數(shù)據(jù)配置到mysql 拷貝驅(qū)動
[atguigu@hadoop102 lib]$ cp /opt/software/mysql/mysql-connector-j-8.0.31.jar /opt/module/hive/lib/
Configure the metastore connection in hive-site.xml (a sketch of the relevant properties follows the command):
[atguigu@hadoop102 conf]$ vim hive-site.xml
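The post does not reproduce the file contents. Below is a minimal sketch of the metastore-related properties, assuming MySQL runs on hadoop100 with the root/000000 credentials used in the next step (adjust host, user, and password to your setup):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- JDBC URL of the metastore database (host and port are assumptions) -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadoop100:3306/metastore?useSSL=false</value>
    </property>
    <!-- driver class for the mysql-connector-j 8.x jar copied above -->
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <!-- matches the password used with mysql -uroot -p000000 below -->
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>000000</value>
    </property>
    <!-- host and port HiveServer2 listens on for client connections -->
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>hadoop100</value>
    </property>
    <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
    </property>
</configuration>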
4. Starting Hive
1. Log in to MySQL
[atguigu@hadoop100 conf]$ mysql -uroot -p000000
2. Create the Hive metastore database
mysql> create database metastore;
3. Initialize the Hive metastore schema
[atguigu@hadoop100 conf]$ schematool -initSchema -dbType mysql -verbose
4.修改元數(shù)據(jù)字符集
mysql>use metastore;
mysql> alter table COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;
mysql> alter table TABLE_PARAMS modify column PARAM_VALUE mediumtext character set utf8;
mysql> quit;
5. Start the Hive client
[atguigu@hadoop100 hive]$ bin/hive
6. To connect with an external client tool, start HiveServer2:
[atguigu@hadoop100 bin]$ hiveserver2
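From another terminal you can then connect with Beeline; the host and user below follow this post's setup:
[atguigu@hadoop100 bin]$ beeline -u jdbc:hive2://hadoop100:10000 -n atguigu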
3. Installing the Spark "without Hadoop" Build
1. Download the "without Hadoop" build from:
Downloads | Apache Spark
2. Extract
[atguigu@hadoop102 software]$ tar -zxvf spark-3.3.1-bin-without-hadoop.tgz -C /opt/module/
[atguigu@hadoop102 software]$ mv /opt/module/spark-3.3.1-bin-without-hadoop /opt/module/spark
3. Edit the Spark environment file
[atguigu@hadoop102 software]$ mv /opt/module/spark/conf/spark-env.sh.template /opt/module/spark/conf/spark-env.sh
[atguigu@hadoop102 software]$ vim /opt/module/spark/conf/spark-env.sh
添加內(nèi)容
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
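If you want to see what this expands to, hadoop classpath prints the classpath that Spark will inherit:
[atguigu@hadoop102 software]$ hadoop classpath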
4.環(huán)境變量
[atguigu@hadoop102 software]$ sudo vim /etc/profile.d/my_env.sh
# SPARK_HOME
export SPARK_HOME=/opt/module/spark
export PATH=$PATH:$SPARK_HOME/bin
[atguigu@hadoop102 software]$ source /etc/profile.d/my_env.sh
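A quick sanity check that the new PATH entry works:
[atguigu@hadoop102 software]$ spark-submit --version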
5.在hive中創(chuàng)建spark配置文件
[atguigu@hadoop102 software]$ vim /opt/module/hive/conf/spark-defaults.conf
spark.master yarn
spark.eventLog.enabled true
spark.eventLog.dir hdfs://mycluster:8020/spark-history
spark.executor.memory 1g
spark.driver.memory 1g
Note: hdfs://mycluster:8020/spark-history in this file points at the NameNode. mycluster is my HA nameservice; on a non-HA cluster, use the NameNode's hostname or IP instead. Then create the event log directory on HDFS:
[atguigu@hadoop102 software]$ hadoop fs -mkdir /spark-history
6. Upload the Spark "without Hadoop" jars to HDFS
Note 1: the "without Hadoop" Spark build contains no Hadoop or Hive dependencies, which avoids dependency conflicts.
Note 2: Hive jobs are ultimately executed by Spark, and Spark resources are scheduled by YARN, so a task may be assigned to any node in the cluster. The Spark dependencies therefore need to be uploaded to an HDFS path that every node can read.
[atguigu@hadoop102 software]$ hadoop fs -mkdir /spark-jars
[atguigu@hadoop102 software]$ hadoop fs -put /opt/module/spark/jars/* /spark-jars
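To confirm the upload, list the directory (the exact jar count depends on the Spark version):
[atguigu@hadoop102 software]$ hadoop fs -ls /spark-jars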
7. Modify hive-site.xml
[atguigu@hadoop102 ~]$ vim /opt/module/hive/conf/hive-site.xml
Note: hdfs://mycluster:8020/spark-jars/* in this file points at the NameNode. mycluster is my HA nameservice; on a non-HA cluster, use the NameNode's hostname or IP instead.
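The properties added here are not shown in the post. A sketch of the standard Hive-on-Spark additions, using the paths configured above (swap mycluster for your own nameservice or NameNode address):
<!-- Spark jars uploaded to HDFS in step 6 -->
<property>
    <name>spark.yarn.jars</name>
    <value>hdfs://mycluster:8020/spark-jars/*</value>
</property>
<!-- switch the Hive execution engine from MapReduce to Spark -->
<property>
    <name>hive.execution.engine</name>
    <value>spark</value>
</property>
<!-- optional: give the Spark client more time to start up (value is an assumption) -->
<property>
    <name>hive.spark.client.connect.timeout</name>
    <value>10000ms</value>
</property>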
4.Yarn環(huán)境配置
1. Edit the capacity scheduler configuration
vim /opt/module/hadoop/etc/hadoop/capacity-scheduler.xml
If the property already exists, modify it; if not, add it (a sketch follows).
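The post does not name the property being changed; in the usual Hive-on-Spark setup it is the cap on resources available to ApplicationMasters, raised so that the long-running Spark AM plus concurrent jobs are not starved. The 0.8 value below is a common choice, not a requirement:
<property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.8</value>
</property>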
2. Distribute the file to all nodes
[atguigu@hadoop102 hadoop]$ xsync capacity-scheduler.xml
3. Restart YARN
[atguigu@hadoop103 hadoop]$ stop-yarn.sh
[atguigu@hadoop103 hadoop]$ start-yarn.sh
5. Testing
1. Run a smoke test
[atguigu@hadoop102 hive]$ hive
hive (default)> create table student(id int, name string);
hive (default)> insert into table student values(1,'abc');
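If Hive on Spark is working, the insert is submitted as a Spark job on YARN (visible in the ResourceManager web UI); a quick check that the row was written:
hive (default)> select * from student;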
2. Remote connection
[atguigu@hadoop102 hive]$ hiveserver2
注意:如果是服務(wù)器,需要打開安全組