Hive Source Code Reading (1): Building from Source

Build Environment

OS: macOS

Java version: java version "1.8.0_221"

Maven version: Apache Maven 3.3.9

Hive version: Hive-3.1.0

My company's data warehouse currently runs Hive-3.1.0, so that is the version I read in this series.
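Before building, it can help to confirm that the tools listed above are actually on the PATH. The following is a minimal sketch, not part of the original setup; the helper name `require_cmd` is hypothetical and introduced only for illustration.

```shell
#!/bin/sh
# Report whether a build tool is on PATH; returns non-zero if it is missing.
# (require_cmd is a hypothetical helper name used only in this sketch.)
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check the two tools this post relies on, without aborting on a missing one.
for tool in java mvn; do
    require_cmd "$tool" || true
done

# Then print the versions and compare them with the ones listed above:
#   java -version
#   mvn -version
```

If either tool is reported as missing, install it before running the build.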

Hive Component Overview

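  • Three key components

    • serde: Hive's built-in serialization and deserialization classes. This component also lets users develop their own custom serializers and deserializers for parsing files.
    • MetaStore: Hive's metadata server, which stores information about all tables and partitions in the data warehouse. The SQL scripts that create and upgrade the Hive metadata schema also live in this directory.
    • ql: parses SQL and generates the execution plan. This is Hive's core package; reading it closely reveals the core of Hive's execution flow.
  • Secondary components

    • cli: the entry point of the hive command, which handles jobs submitted from the command line.
    • service: the server side of all externally exposed API interfaces (implemented with Thrift), used by other clients such as JDBC to interact with Hive.
    • common: Hive's base code library. Configuration passed between Hive components is also managed by the HiveConf class in this package.
    • ant: base code needed by some Ant tasks.
    • bin: all of Hive's scripts, including the Hive CLI script.
    • beeline: Beeline, a new command-line tool provided by HiveServer2.
    • hcatalog: an Apache open-source unified service platform for managing tables and the underlying data. HCatalog depends on the Hive Metastore underneath.
    • findbugs: FindBugs is a program that looks for bugs in Java programs. It searches for instances of bug patterns, that is, code that is likely to be wrong. Note that FindBugs inspects Java bytecode, that is, .class files.
    • hwi: the interface for Hive's web page.
    • shims: classes used for compatibility with different Hadoop and Hive versions.
    • llap: a near-real-time query solution based on Tez.
    • conf: Hive's configuration files, hive-default.xml and hive-site.xml.
    • data: data used by Hive's tests.
    • lib: the jars Hive depends on at runtime.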
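To see which of these components are real Maven modules in a given checkout, one option is to read the `<module>` entries from the root pom.xml. This is a minimal sketch under the assumption that each `<module>` element sits on its own line; `list_modules` is a hypothetical helper name, not something shipped with Hive.

```shell
#!/bin/sh
# Print the <module> entries declared in a Maven pom.xml, one per line.
# Assumes each <module>...</module> element is on a single line of its own.
list_modules() {
    sed -n 's:.*<module>\(.*\)</module>.*:\1:p' "$1"
}

# Usage, from the root of the Hive source tree:
#   list_modules pom.xml
```

Directories such as conf, data, and bin hold resources rather than code, so they may not appear in that list.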

Build Command

mvn clean install -DskipTests
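When only one module is being read, rebuilding everything each time is slow. Maven's standard `-pl` (project list) and `-am` (also make) options can restrict the build to one module plus the modules it depends on. The sketch below only prints the command so it can be inspected first; `hive_build_cmd` is a hypothetical helper, and using `ql` as the module name is an assumption taken from the component list above.

```shell
#!/bin/sh
# Print (not run) the Maven command for a full build or a single-module build.
# hive_build_cmd is a hypothetical helper used only in this sketch.
hive_build_cmd() {
    if [ -z "${1:-}" ]; then
        echo "mvn clean install -DskipTests"
    else
        # -pl selects one module; -am also builds the modules it depends on.
        echo "mvn clean install -DskipTests -pl $1 -am"
    fi
}

hive_build_cmd        # full build, as used in this post
hive_build_cmd ql     # only ql and the modules it depends on
```

Copy the printed command into the terminal to run it from the root of the source tree.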

Build Output

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Hive Storage API ................................... SUCCESS [ 4.698 s]
[INFO] Hive ............................................... SUCCESS [ 1.994 s]
[INFO] Hive Classifications ............................... SUCCESS [ 0.743 s]
[INFO] Hive Shims Common .................................. SUCCESS [ 4.581 s]
[INFO] Hive Shims 0.23 .................................... SUCCESS [ 4.022 s]
[INFO] Hive Shims Scheduler ............................... SUCCESS [ 1.551 s]
[INFO] Hive Shims ......................................... SUCCESS [ 1.479 s]
[INFO] Hive Standalone Metastore .......................... SUCCESS [ 1.593 s]
[INFO] Hive Standalone Metastore Common Code .............. SUCCESS [ 30.709 s]
[INFO] Hive Common ........................................ SUCCESS [ 6.632 s]
[INFO] Hive Service RPC ................................... SUCCESS [ 1.894 s]
[INFO] Hive Serde ......................................... SUCCESS [ 4.312 s]
[INFO] Hive Metastore ..................................... SUCCESS [ 2.487 s]
[INFO] Hive Vector-Code-Gen Utilities ..................... SUCCESS [ 0.245 s]
[INFO] Hive Parser ........................................ SUCCESS [ 8.911 s]
[INFO] Hive UDF ........................................... SUCCESS [ 1.446 s]
[INFO] Hive Llap Common ................................... SUCCESS [ 3.592 s]
[INFO] Hive Llap Client ................................... SUCCESS [ 2.153 s]
[INFO] Hive Llap Tez ...................................... SUCCESS [ 3.206 s]
[INFO] Hive Spark Remote Client ........................... SUCCESS [ 5.388 s]
[INFO] Hive Metastore Server .............................. SUCCESS [02:13 min]
[INFO] Hive Query Language ................................ SUCCESS [ 51.077 s]
[INFO] Hive Llap Server ................................... SUCCESS [ 6.714 s]
[INFO] Hive Service ....................................... SUCCESS [ 6.551 s]
[INFO] Hive Accumulo Handler .............................. SUCCESS [ 6.697 s]
[INFO] Hive JDBC .......................................... SUCCESS [ 18.083 s]
[INFO] Hive Beeline ....................................... SUCCESS [ 5.000 s]
[INFO] Hive CLI ........................................... SUCCESS [ 2.452 s]
[INFO] Hive Contrib ....................................... SUCCESS [ 1.504 s]
[INFO] Hive Druid Handler ................................. SUCCESS [ 26.299 s]
[INFO] Hive HBase Handler ................................. SUCCESS [ 3.704 s]
[INFO] Hive JDBC Handler .................................. SUCCESS [ 1.639 s]
[INFO] Hive HCatalog ...................................... SUCCESS [ 0.998 s]
[INFO] Hive HCatalog Core ................................. SUCCESS [ 3.993 s]
[INFO] Hive HCatalog Pig Adapter .......................... SUCCESS [ 2.538 s]
[INFO] Hive HCatalog Server Extensions .................... SUCCESS [ 1.996 s]
[INFO] Hive HCatalog Webhcat Java Client .................. SUCCESS [ 2.508 s]
[INFO] Hive HCatalog Webhcat .............................. SUCCESS [ 14.780 s]
[INFO] Hive HPL/SQL ....................................... SUCCESS [ 3.102 s]
[INFO] Hive Streaming ..................................... SUCCESS [ 2.523 s]
[INFO] Hive Llap External Client .......................... SUCCESS [ 1.911 s]
[INFO] Hive Shims Aggregator .............................. SUCCESS [ 0.125 s]
[INFO] Hive Kryo Registrator .............................. SUCCESS [ 3.944 s]
[INFO] Hive Kudu Handler .................................. SUCCESS [ 5.853 s]
[INFO] Hive TestUtils ..................................... SUCCESS [ 0.173 s]
[INFO] Hive Kafka Storage Handler ......................... SUCCESS [ 5.999 s]
[INFO] Hive Packaging ..................................... SUCCESS [ 2.760 s]
[INFO] Hive Metastore Tools ............................... SUCCESS [ 0.009 s]
[INFO] Hive Metastore Tools common libraries .............. SUCCESS [ 7.914 s]
[INFO] Hive metastore benchmarks .......................... SUCCESS [ 9.707 s]
[INFO] Hive Upgrade Acid .................................. SUCCESS [ 0.011 s]
[INFO] Hive Pre Upgrade Acid .............................. SUCCESS [ 2.831 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 07:09 min
[INFO] Finished at: 2020-07-09T17:49:36+08:00
[INFO] Final Memory: 527M/2052M
[INFO] ------------------------------------------------------------------------
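To find out which modules dominate the build time, the Reactor Summary can be parsed and sorted. The following is a minimal sketch that assumes the build output was saved to a file (the name build.log is hypothetical) and that durations appear only in the two forms seen above, `N.NNN s` and `MM:SS min`; `reactor_times` is a hypothetical helper name.

```shell
#!/bin/sh
# Read Maven output on stdin and print "seconds<TAB>module", slowest first.
# Only the "N.NNN s" and "MM:SS min" duration forms are handled.
reactor_times() {
    awk '
    /^\[INFO\] .* SUCCESS \[/ {
        # Module name: the text between "[INFO] " and the dot leader.
        name = $0
        sub(/^\[INFO\] /, "", name)
        sub(/ \.+ SUCCESS.*$/, "", name)
        # Duration: the text inside the trailing brackets.
        t = $0
        sub(/^.*SUCCESS \[ */, "", t)
        sub(/\] *$/, "", t)
        if (t ~ / min$/) {          # e.g. "02:13 min"
            sub(/ min$/, "", t)
            split(t, p, ":")
            secs = p[1] * 60 + p[2]
        } else {                    # e.g. "4.698 s"
            sub(/ s$/, "", t)
            secs = t + 0
        }
        printf "%.3f\t%s\n", secs, name
    }' | sort -rn
}

# Usage, assuming the build output was saved to build.log:
#   reactor_times < build.log | head -n 5
```

Applied to the summary above, this would put Hive Metastore Server (02:13 min) first, followed by Hive Query Language (51.077 s).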

Summary

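In this post I pulled down the Hive source, compiled it, and briefly introduced each module, which should make the source easier to read from here on.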