How Beeline Executes in Hive 3.1.2
Posted by 虎鲸不是鱼
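Preface
Alibaba Cloud's DataPhin platform does not recognize tables that were not created through DataPhin. As a workaround, I used Beeline as the SQL client to load data from an ordinary Hive table into a DataPhin-managed Hive table:
beeline -u "jdbc:hive2://<hive-host>:10000/default;principal=hive/<host>@<REALM>" -e "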
insert overwrite table db1.tb1
select
col1
from
db2.tb2
;
"
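Partitioned tables are supported as well. Because this command failed frequently, I dug into the source code to trace how Beeline executes a statement (the Beeline execution flow), hoping to find directions for optimization and, along the way, any tunable parameters.
Beeline usage is covered on the official Confluence: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients
Hive configuration parameters are also documented there in detail: https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-RestrictedListandWhitelist
For Hive on Tez, as used in CDP 7, the parameters are likewise documented in detail: https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Tez
The official documentation tells you which parameters to tune in which situations, so there is no need to go around begging for help like a shallow "SQL Boy".
Reading the source
This walkthrough uses Apache Hive 3.1.2. The IDE is IntelliJ IDEA; whether Maven resolves the dependencies matters little, since we will not look closely at how Calcite processes the AST.
Entry point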
package org.apache.hive.beeline;
/**
* A console SQL shell with command completion.
* <p>
* TODO:
* <ul>
* <li>User-friendly connection prompts</li>
* <li>Page results</li>
* <li>Handle binary data (blob fields)</li>
* <li>Implement command aliases</li>
* <li>Stored procedure execution</li>
* <li>Binding parameters to prepared statements</li>
* <li>Scripting language</li>
* <li>XA transactions</li>
* </ul>
*
*/
@SuppressWarnings("static-access")
public class BeeLine implements Closeable {
  // ... other members omitted ...

  /**
   * Starts the program.
   */
  public static void main(String[] args) throws IOException {
    mainWithInputRedirection(args, null);
  }
}
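The beeline module contains the BeeLine class, which is launched directly through its main method; simple and blunt. The main method in turn contains a single call: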
/**
* Starts the program with redirected input. For redirected output,
* setOutputStream() and setErrorStream can be used.
* Exits with 0 on success, 1 on invalid arguments, and 2 on any other error
*
* @param args
* same as main()
*
* @param inputStream
* redirected input, or null to use standard input
*/
public static void mainWithInputRedirection(String[] args, InputStream inputStream)
    throws IOException {
  BeeLine beeLine = new BeeLine();
  try {
    int status = beeLine.begin(args, inputStream);
    if (!Boolean.getBoolean(BeeLineOpts.PROPERTY_NAME_EXIT)) {
      System.exit(status);
    }
  } finally {
    beeLine.close();
  }
}
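This matches the usual convention of 0 for success and a non-zero code for failure; per the Javadoc, 1 means invalid arguments and 2 means any other error. Because main passes null as the input stream, Beeline reads from standard input, as the comment states.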
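The guard around System.exit reads a JVM system property, not a Beeline option: Boolean.getBoolean(name) returns true only when a system property with that name exists and equals "true" (ignoring case). Going by the constants quoted later in this article, PROPERTY_NAME_EXIT resolves to "beeline.system.exit". Below is a minimal sketch of that check; ExitGuardDemo and shouldCallSystemExit are names I made up for illustration and do not exist in Hive.

```java
// Hypothetical demo class; only Boolean.getBoolean and the property name come from the article.
final class ExitGuardDemo {
    // Same value as BeeLineOpts.PROPERTY_NAME_EXIT ("beeline." + "system.exit").
    static final String PROPERTY_NAME_EXIT = "beeline.system.exit";

    // Mirrors the condition in mainWithInputRedirection: exit unless the property is "true".
    static boolean shouldCallSystemExit() {
        return !Boolean.getBoolean(PROPERTY_NAME_EXIT);
    }

    public static void main(String[] args) {
        System.clearProperty(PROPERTY_NAME_EXIT);
        System.out.println("property unset -> exit? " + shouldCallSystemExit());
        System.setProperty(PROPERTY_NAME_EXIT, "true");
        System.out.println("property true  -> exit? " + shouldCallSystemExit());
    }
}
```

So a program that embeds Beeline and starts the JVM with -Dbeeline.system.exit=true should get control back after mainWithInputRedirection returns, instead of having the JVM terminated.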
Initializing the BeeLine object
public BeeLine() {
  this(true);
}
The no-argument constructor delegates to this one:
public BeeLine(boolean isBeeLine) {
  this.isBeeLine = isBeeLine;
  this.signalHandler = new SunSignalHandler(this);
  this.shutdownHook = new Runnable() {
    @Override
    public void run() {
      try {
        if (history != null) {
          history.setMaxSize(getOpts().getMaxHistoryRows());
          history.flush();
        }
      } catch (IOException e) {
        error(e);
      } finally {
        close();
      }
    }
  };
}
The BeeLine class holds these private fields:
import jline.console.history.FileHistory;
private FileHistory history;
// Indicates if this instance of beeline is running in compatibility mode, or beeline mode
private boolean isBeeLine = true;
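The Runnable built in the constructor is a shutdown hook rather than a worker thread: when it runs, it caps the command history (a jline FileHistory) at the configured maximum number of rows, flushes it to disk, and then closes the BeeLine instance. It has nothing to do with the execution flow itself, so it needs no further attention.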
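To make the pattern concrete, here is a small sketch of a flush-then-close shutdown hook. It uses the plain JDK call Runtime.addShutdownHook, which is my assumption for illustration and not necessarily how Hive registers its hook; the two lists are stand-ins for the history file, and ShutdownHookDemo is a made-up class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the history-flushing shutdown hook; not Hive code.
final class ShutdownHookDemo {
    // Stand-ins for the history: entries not yet written, and entries "on disk".
    final List<String> pending = new ArrayList<>();
    final List<String> flushed = new ArrayList<>();
    boolean closed = false;

    // Same shape as the Runnable in the BeeLine constructor:
    // flush the history, then always close, even if flushing fails.
    final Runnable shutdownHook = () -> {
        try {
            flushed.addAll(pending);
            pending.clear();
        } finally {
            closed = true;
        }
    };

    public static void main(String[] args) {
        ShutdownHookDemo demo = new ShutdownHookDemo();
        demo.pending.add("select 1;");
        // The JVM runs registered hooks on normal exit and on signals such as SIGINT.
        Runtime.getRuntime().addShutdownHook(new Thread(demo.shutdownHook));
        System.out.println("hook registered; it runs when the JVM exits");
    }
}
```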
Getting started
Step into the begin method:
/**
 * Start accepting input from stdin, and dispatch it
 * to the appropriate {@link CommandHandler} until the
 * global variable <code>exit</code> is true.
 */
public int begin(String[] args, InputStream inputStream) throws IOException {
  try {
    // load the options first, so we can override on the command line
    getOpts().load();
  } catch (Exception e) {
    // nothing
  }
  setupHistory();
  //add shutdown hook to cleanup the beeline for smooth exit
  addBeelineShutdownHook();
  //this method also initializes the consoleReader which is
  //needed by initArgs for certain execution paths
  ConsoleReader reader = initializeConsoleReader(inputStream);
  if (isBeeLine) {
    int code = initArgs(args);
    if (code != 0) {
      return code;
    }
  } else {
    int code = initArgsFromCliVars(args);
    if (code != 0 || exit) {
      return code;
    }
    defaultConnect(false);
  }
  if (getOpts().isHelpAsked()) {
    return 0;
  }
  if (getOpts().getScriptFile() != null) {
    return executeFile(getOpts().getScriptFile());
  }
  try {
    info(getApplicationTitle());
  } catch (Exception e) {
    // ignore
  }
  return execute(reader, false);
}
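begin is where execution really starts. In order, it loads the saved options, sets up history, registers the shutdown hook, initializes the console reader on the input stream, initializes the arguments, connects, and then either runs a script file or executes the commands read from the reader.
Loading the configuration: load
Step into BeeLineOpts.java:
public void load() throws IOException {
  try (InputStream in = new FileInputStream(rcFile)) {
    load(in);
  }
}
Step further:
public void load(InputStream fin) throws IOException {
  Properties p = new Properties();
  p.load(fin);
  loadProperties(p);
}
And further: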
public static final String PROPERTY_PREFIX = "beeline.";
public static final String PROPERTY_NAME_EXIT = PROPERTY_PREFIX + "system.exit";

public void loadProperties(Properties props) {
  for (Object element : props.keySet()) {
    String key = element.toString();
    if (key.equals(PROPERTY_NAME_EXIT)) {
      // fix for sf.net bug 879422
      continue;
    }
    if (key.startsWith(PROPERTY_PREFIX)) {
      set(key.substring(PROPERTY_PREFIX.length()),
          props.getProperty(key));
    }
  }
}
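This method skips the key "beeline.system.exit". For every other key that starts with "beeline.", it strips the prefix to form the new key, reads the value stored under the original key, and passes both to set; keys without the prefix are ignored: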
public void set(String key, String value) {
  set(key, value, false);
}
Step further:
public boolean set(String key, String value, boolean quiet) {
  try {
    beeLine.getReflector().invoke(this, "set" + key, new Object[] {value});
    return true;
  } catch (Exception e) {
    if (!quiet) {
      beeLine.error(beeLine.loc("error-setting", new Object[] {key, e}));
    }
    return false;
  }
}
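The key filtering done by loadProperties before it calls set can be tried out in isolation. The sketch below is my own illustration, not Hive code: the class PrefixedPropertiesDemo, the method stripPrefix, and the sample file content are all made up, and it returns the stripped pairs in a map where the real method hands each pair to set.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical sketch of the key filtering in BeeLineOpts.loadProperties; not Hive code.
final class PrefixedPropertiesDemo {
    static final String PROPERTY_PREFIX = "beeline.";
    static final String PROPERTY_NAME_EXIT = PROPERTY_PREFIX + "system.exit";

    // Returns the (stripped key -> value) pairs that would be handed to set(key, value).
    static Map<String, String> stripPrefix(Properties props) {
        Map<String, String> result = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.equals(PROPERTY_NAME_EXIT)) {
                continue; // never treated as an option
            }
            if (key.startsWith(PROPERTY_PREFIX)) {
                result.put(key.substring(PROPERTY_PREFIX.length()), props.getProperty(key));
            }
            // keys without the prefix are silently ignored
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Properties p = new Properties();
        // Made-up file content in java.util.Properties syntax.
        p.load(new StringReader(
            "beeline.showheader=false\nbeeline.system.exit=true\nother.key=x\n"));
        System.out.println(stripPrefix(p)); // prints {showheader=false}
    }
}
```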
The actual work is delegated to Reflector:
package org.apache.hive.beeline;

class Reflector {
  // ... fields, constructor, and convert(...) omitted ...

  public Object invoke(Object on, String method, Object[] args)
      throws InvocationTargetException, IllegalAccessException,
      ClassNotFoundException {
    return invoke(on, method, Arrays.asList(args));
  }

  public Object invoke(Object on, String method, List args)
      throws InvocationTargetException, IllegalAccessException,
      ClassNotFoundException {
    return invoke(on, on == null ? null : on.getClass(), method, args);
  }

  public Object invoke(Object on, Class defClass,
      String method, List args)
      throws InvocationTargetException, IllegalAccessException,
      ClassNotFoundException {
    Class c = defClass != null ? defClass : on.getClass();
    List<Method> candidateMethods = new LinkedList<Method>();
    Method[] m = c.getMethods();
    for (int i = 0; i < m.length; i++) {
      if (m[i].getName().equalsIgnoreCase(method)) {
        candidateMethods.add(m[i]);
      }
    }
    if (candidateMethods.size() == 0) {
      throw new IllegalArgumentException(beeLine.loc("no-method",
          new Object[] {method, c.getName()}));
    }
    for (Iterator<Method> i = candidateMethods.iterator(); i.hasNext();) {
      Method meth = i.next();
      Class[] ptypes = meth.getParameterTypes();
      if (!(ptypes.length == args.size())) {
        continue;
      }
      Object[] converted = convert(args, ptypes);
      if (converted == null) {
        continue;
      }
      if (!Modifier.isPublic(meth.getModifiers())) {
        continue;
      }
      return meth.invoke(on, converted);
    }
    return null;
  }
}
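Here reflection collects every public method of the target class whose name matches "set" + key, ignoring case, into a list. The first candidate whose parameter count matches and whose arguments can be converted is then invoked, which lands in Method.java: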
@CallerSensitive
public Object invoke(Object obj, Object... args)
    throws IllegalAccessException, IllegalArgumentException,
    InvocationTargetException {
  if (!override) {
    if (!Reflection.quickCheckMemberAccess(clazz, modifiers)) {
      Class<?> caller = Reflection.getCallerClass();
      checkAccess(caller, clazz, obj, modifiers);
    }
  }
  MethodAccessor ma = methodAccessor; // read volatile
  if (ma == null) {
    ma = acquireMethodAccessor();
  }
  return ma.invoke(obj, args);
}
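In effect, each option key from the properties file ends up invoking the matching public setter on the BeeLineOpts object that was passed in as the target.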
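The lookup is easier to follow with a toy target class. The sketch below is my own simplified illustration of case-insensitive setter dispatch, not Hive's Reflector: SetterDispatchDemo, Opts, and the convert helper are hypothetical, only single-argument setters are handled, and a missing setter simply yields false instead of an error message.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical sketch of case-insensitive setter lookup; not Hive's Reflector.
final class SetterDispatchDemo {
    // Stand-in for an options object with two public setters.
    public static final class Opts {
        private boolean showHeader = true;
        private int maxWidth = 80;
        public void setShowHeader(boolean v) { showHeader = v; }
        public void setMaxWidth(int v) { maxWidth = v; }
        public boolean getShowHeader() { return showHeader; }
        public int getMaxWidth() { return maxWidth; }
    }

    // Converts a String to the setter's parameter type; null means "cannot convert".
    static Object convert(String value, Class<?> type) {
        if (type == String.class) return value;
        if (type == boolean.class || type == Boolean.class) return Boolean.valueOf(value);
        if (type == int.class || type == Integer.class) return Integer.valueOf(value);
        return null;
    }

    // Finds a public one-argument method named "set" + key, ignoring case, and invokes it.
    static boolean set(Object target, String key, String value) {
        try {
            for (Method m : target.getClass().getMethods()) {
                if (!m.getName().equalsIgnoreCase("set" + key)) continue;
                if (m.getParameterCount() != 1) continue;
                if (!Modifier.isPublic(m.getModifiers())) continue;
                Object arg = convert(value, m.getParameterTypes()[0]);
                if (arg == null) continue;
                m.invoke(target, arg);
                return true;
            }
        } catch (ReflectiveOperationException | IllegalArgumentException e) {
            // Conversion or invocation failed: report failure to the caller.
        }
        return false;
    }

    public static void main(String[] args) {
        Opts opts = new Opts();
        System.out.println(set(opts, "showheader", "false"));
        System.out.println(set(opts, "MAXWIDTH", "120"));
        System.out.println(set(opts, "nosuchoption", "x"));
        System.out.println(opts.getShowHeader() + " " + opts.getMaxWidth());
    }
}
```

Because the match uses equalsIgnoreCase, a key written as showheader, showHeader, or SHOWHEADER reaches the same setter, which is presumably why option names in the properties file do not need exact casing.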
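The BeeLine class itself also calls some of these setters directly, for example in the following excerpts that handle command-line arguments: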
Properties confProps = commandLine.getOptionProperties("hiveconf");
for (String propKey : confProps.stringPropertyNames()) {
  setHiveConfVar(propKey, confProps.getProperty(propKey));
}
getOpts().setScriptFile(commandLine.getOptionValue("f"));
if (commandLine.getOptionValues("i") != null) {
  getOpts().setInitFiles(commandLine.getOptionValues("i"));
}
dbName = commandLine.getOptionValue("database");
getOpts().setVerbose(Boolean.parseBoolean(commandLine.getOptionValue("verbose")));
getOpts().setSilent(Boolean.parseBoolean(commandLine.getOptionValue("silent")));

int code = 0;
if (cl.getOptionValues('e') != null) {
  commands = Arrays.asList(cl.getOptionValues('e'));
  opts.setAllowMultiLineCommand(false); //When using -e, command is always a single line
}

if (cl.hasOption("help")) {
  usage();
  getOpts().setHelpAsked(true);
  return true;
}
Properties hiveConfs = cl.getOptionProperties("hiveconf");
for (String key : hiveConfs.stringPropertyNames()) {
  setHiveConfVar(key, hiveConfs.getProperty(key));
}
driver = cl.getOptionValue("d");
auth = cl.getOptionValue("a");
user = cl.getOptionValue("n");
getOpts().setAuthType(auth);
if (cl.hasOption("w")) {
  pass = obtainPasswordFromFile(cl.getOptionValue("w"));
} else {
  if (beelineParser.isPasswordOptionSet) {
    pass = cl.getOptionValue("p");
  }
}
url = cl.getOptionValue("u");
if ((url == null) && cl.hasOption("reconnect")) {
  // If url was not specified with -u, but -r was present, use that.
  url = getOpts().getLastConnectedUrl();
}
getOpts().setInitFiles(cl.getOptionValues("i"));
getOpts().setScriptFile(cl.getOptionValue("f"));

public void updateOptsForCli() {
  getOpts().updateBeeLineOptsFromConf();
  getOpts().setShowHeader(false);
  getOpts().setEscapeCRLF(false);
  getOpts().setOutputFormat("dsv");
  getOpts().setDelimiterForDSV(' ');
  getOpts().setNullEmptyString(true);
}

setupHistory();
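Reading these getOptionValue calls also shows why -u takes the connection URL, -n the user name, and -p the password: the option letters are hard-coded in the source, so there is nothing mysterious about them.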
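As a dependency-free illustration of what "hard-coded option letters" means, the sketch below parses "-x value" pairs into a map and then reads them back with literal keys, the way the excerpts above read "u", "n", and "p". It is my own toy parser, not the command-line library Hive uses; OptionLettersDemo, parse, and the sample URL and user name are all made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy parser; Hive reads these letters from an already parsed command line.
final class OptionLettersDemo {
    // Parses "-x value" pairs into a map keyed by the option name without the dash.
    static Map<String, String> parse(String[] args) {
        Map<String, String> values = new HashMap<>();
        for (int i = 0; i + 1 < args.length; i += 2) {
            if (!args[i].startsWith("-")) {
                throw new IllegalArgumentException("expected an option, got: " + args[i]);
            }
            values.put(args[i].substring(1), args[i + 1]);
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> cl = parse(new String[] {
            "-u", "jdbc:hive2://localhost:10000/default", // placeholder URL
            "-n", "someuser",                              // placeholder user name
            "-e", "select 1"
        });
        String url = cl.get("u");  // the letters are string literals in the code
        String user = cl.get("n");
        String pass = cl.get("p"); // null here because -p was not given
        System.out.println(url + " | " + user + " | " + pass);
    }
}
```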
Setting up history: setupHistory
private void setupHistory() throws IOExc