How can Hive UDFs, UDAFs, and UDTFs written in Java be debugged in an IDE like Eclipse?
【Posted】: 2016-05-09 09:42:44
【Question】: For debugging Pig UDFs, for example, this approach works: http://ben-tech.blogspot.ie/2011/08/how-to-debug-pig-udfs-in-eclipse.html I have a Hive script that uses a UDAF which is failing, so I would like to step through the UDF code.
【Comments】:
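Was my answer below useful? This is what I did: yibingshi1977.wordpress.com/2012/12/27/debug-hive-in-eclipse See also issues.apache.org/jira/browse/HIVE-2665 It also needs to be exported, because the other sh will be a child. OK. But we also adopted the JUnit approach, which we have tried and tested with Hive. If you like, I can provide more sample test cases for UDFs, including ones that extend GenericUDF.
【Answer 1】: A JUnit test can be debugged from the Eclipse IDE because it is an ordinary Java class, so you can set breakpoints in the UDF and step through it.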
Consider this UDF.
Example 1
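import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;
class SimpleHelloWorldUDFExample extends UDF {
    public Text evaluate(Text input) {
        if (input == null) return null; // propagate SQL NULL
        return new Text("Hello " + input.toString());
    }
}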
The JUnit test method would look like this:
@Test
public void testUDFNullCheck() {
    SimpleHelloWorldUDFExample example = new SimpleHelloWorldUDFExample();
    Assert.assertNull(example.evaluate(null));
}
Example 2
package com.hive.udftest;

import java.util.List;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;

class HiveUDFTest extends GenericUDF {
    ListObjectInspector listOI;
    StringObjectInspector elementOI;

    @Override
    public String getDisplayString(String[] arg0) {
        return "arrayContainsExample()"; // this should probably be better
    }

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
        if (arguments.length != 2) {
            throw new UDFArgumentLengthException("arrayContainsExample only takes 2 arguments: List<T>, T");
        }
        // 1. Check we received the right object types.
        ObjectInspector a = arguments[0];
        ObjectInspector b = arguments[1];
        if (!(a instanceof ListObjectInspector) || !(b instanceof StringObjectInspector)) {
            throw new UDFArgumentException("first argument must be a list / array, second argument must be a string");
        }
        this.listOI = (ListObjectInspector) a;
        this.elementOI = (StringObjectInspector) b;
        // 2. Check that the list contains strings
        if (!(listOI.getListElementObjectInspector() instanceof StringObjectInspector)) {
            throw new UDFArgumentException("first argument must be a list of strings");
        }
        // the return type of our function is a boolean, so we provide the correct object inspector
        return PrimitiveObjectInspectorFactory.javaBooleanObjectInspector;
    }

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        // get the list and string from the deferred objects using the object inspectors
        List<String> list = (List<String>) this.listOI.getList(arguments[0].get());
        String arg = elementOI.getPrimitiveJavaObject(arguments[1].get());
        // check for nulls
        if (list == null || arg == null) {
            return null;
        }
        // see if our list contains the value we need
        for (String s : list) {
            if (arg.equals(s)) return Boolean.TRUE;
        }
        return Boolean.FALSE;
    }
}
The JUnit test case is:
package com.hive.udftest;

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredJavaObject;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredObject;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaBooleanObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.junit.Assert;
import org.junit.Test;

public class HiveUDFTestTest {
    @Test
    public void testComplexUDFReturnsCorrectValues() throws HiveException {
        // set up the models we need
        HiveUDFTest example = new HiveUDFTest();
        ObjectInspector stringOI = PrimitiveObjectInspectorFactory.javaStringObjectInspector;
        ObjectInspector listOI = ObjectInspectorFactory.getStandardListObjectInspector(stringOI);
        JavaBooleanObjectInspector resultInspector = (JavaBooleanObjectInspector) example.initialize(new ObjectInspector[]{listOI, stringOI});
        // create the actual UDF arguments
        List<String> list = new ArrayList<String>();
        list.add("a");
        list.add("b");
        list.add("c");
        // test our results
        // the value exists
        Object result = example.evaluate(new DeferredObject[]{new DeferredJavaObject(list), new DeferredJavaObject("a")});
        Assert.assertEquals(true, resultInspector.get(result));
        // the value doesn't exist
        Object result2 = example.evaluate(new DeferredObject[]{new DeferredJavaObject(list), new DeferredJavaObject("d")});
        Assert.assertEquals(false, resultInspector.get(result2));
        // arguments are null
        Object result3 = example.evaluate(new DeferredObject[]{new DeferredJavaObject(null), new DeferredJavaObject(null)});
        Assert.assertNull(result3);
    }
}
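UDAFs and UDTFs can be tested and debugged in a similar way: drive their lifecycle methods directly from a JUnit test and set breakpoints inside them.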
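As a rough illustration of what driving the lifecycle by hand could look like for a UDAF, here is a minimal sketch in plain Java. It deliberately does not depend on Hive, so it runs anywhere: `UdafLifecycleSketch`, `MeanEvaluator`, `PartialMean`, and `meanViaTwoMappers` are hypothetical names invented for this example. The method names `init`, `iterate`, `terminatePartial`, `merge`, and `terminate` follow the convention of Hive's older UDAF evaluator API; in a real project the evaluator would implement Hive's evaluator interface and use Hive's types, which this sketch assumes rather than shows.

```java
import java.util.Arrays;
import java.util.List;

public class UdafLifecycleSketch {

    /** Partial aggregation state handed from the map side to the reduce side. */
    static class PartialMean {
        long count;
        double sum;
    }

    /** Hypothetical stand-in for a UDAF evaluator; it does NOT extend any Hive class. */
    static class MeanEvaluator {
        private PartialMean state;

        MeanEvaluator() {
            init();
        }

        /** Resets the aggregation state. */
        void init() {
            state = new PartialMean();
        }

        /** Consumes one input row; null values are skipped. */
        boolean iterate(Double value) {
            if (value != null) {
                state.count++;
                state.sum += value;
            }
            return true;
        }

        /** Returns the partial result, or null if no rows were seen. */
        PartialMean terminatePartial() {
            return state.count == 0 ? null : state;
        }

        /** Folds another evaluator's partial result into this one. */
        boolean merge(PartialMean other) {
            if (other != null) {
                state.count += other.count;
                state.sum += other.sum;
            }
            return true;
        }

        /** Produces the final result, or null if nothing was aggregated. */
        Double terminate() {
            return state.count == 0 ? null : state.sum / state.count;
        }
    }

    /** Simulates two map-side evaluators feeding one reduce-side evaluator. */
    static Double meanViaTwoMappers(List<Double> first, List<Double> second) {
        MeanEvaluator mapperA = new MeanEvaluator();
        for (Double v : first) {
            mapperA.iterate(v);
        }
        MeanEvaluator mapperB = new MeanEvaluator();
        for (Double v : second) {
            mapperB.iterate(v);
        }
        MeanEvaluator reducer = new MeanEvaluator();
        reducer.merge(mapperA.terminatePartial());
        reducer.merge(mapperB.terminatePartial());
        return reducer.terminate();
    }

    public static void main(String[] args) {
        Double mean = meanViaTwoMappers(Arrays.asList(1.0, 2.0, null), Arrays.asList(3.0));
        System.out.println("mean = " + mean); // prints "mean = 2.0"
    }
}
```

Calling `meanViaTwoMappers` from a JUnit test and placing breakpoints in `iterate` or `merge` gives the same step-through experience as the UDF examples above, without needing a running Hive session.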
【Comments】:
【Answer 2】: Here is a good blog post with a sample test case.
http://www.spryinc.com/blog/making-use-aspectj-test-hive-udtfs
【Comments】:
Link-only answers are not good, because the link can stop working at any time. Please add the important parts of the answer here. (From Review) The link cannot be opened. Please update the answer with a working link.