How to merge Schema With Rows when querying a full row from bigquery - java
[Posted]: 2016-12-17 21:29:29
[Question]: I uploaded the following record to BigQuery:
{
  insertId: "1234",
  executionId: "1111",
  jobs: [
    { name: "aaaa", version: "0.0.0" },
    { name: "bbbb", version: "0.0.0" },
    { name: "cccc", version: "0.0.0" }
  ]
}
This is my schema:
[
  { "name": "insertId", "type": "STRING" },
  { "name": "executionId", "type": "STRING" },
  {
    "name": "jobs",
    "type": "record",
    "mode": "repeated",
    "fields": [
      { "name": "name", "type": "STRING" },
      { "name": "version", "type": "STRING" }
    ]
  }
]
Now I run this query from Java:
"SELECT * FROM `myDataset.myTable` where executionId=\"1111\" ;"
This is the code I use, taken from here:
String projectId = "myProjectId";
String queryString = "SELECT * FROM `myDataset.myTable` where executionId=\"1111\" ;";
long waitTime = 10000;
boolean useLegacySql = false;
Iterator<GetQueryResultsResponse> pages = run(projectId, queryString, waitTime, useLegacySql);
List<TableRow> tableRow = pages.next().getRows();
for(TableRow row: tableRow)
System.out.println(row);
This is the output I get:
{
  "f": [
    { "v": "1234" },
    { "v": "1111" },
    { "v": [
      { "v": { "f": [ { "v": "aaaa" }, { "v": "0.0.0" } ] } },
      { "v": { "f": [ { "v": "bbbb" }, { "v": "0.0.0" } ] } },
      { "v": { "f": [ { "v": "cccc" }, { "v": "0.0.0" } ] } }
    ] }
  ]
}
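My schema is dynamic and may contain nested and repeated fields, some of which are empty. How can I merge the schema with the rows and dynamically recover my original data according to the schema?
(Something like mergeSchemaWithRows(schema, rows) in the google-cloud npm package.)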
[Answer 1]: This is what I did:
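// Imports needed by the two methods below (place both methods in a class of your own).
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import org.json.JSONArray;
import org.json.JSONObject;

/**
 * Merge a row returned from the API with a table schema.
 *
 * @param schemaArr the "fields" array of the table schema
 * @param row one row in the API's "f"/"v" form
 * @return the row's values keyed by the matching field names from the schema
 */
public static JSONObject mergeRowsWithSchema(JSONArray schemaArr, JSONObject row) {
    JSONObject convertedJson = new JSONObject();
    JSONArray jsonFields = row.getJSONArray("f");
    // Cells arrive in schema order, so cell i is described by schema field i.
    for (int i = 0; i < jsonFields.length(); i++) {
        JSONObject fieldObj = jsonFields.getJSONObject(i);
        if (fieldObj.isNull("v")) continue; // empty field: leave it out of the result
        Object value = fieldObj.get("v");
        JSONObject schemaField = schemaArr.getJSONObject(i);
        Object convertedValue;
        if (schemaField.has("mode") && schemaField.getString("mode").equalsIgnoreCase("REPEATED")) {
            // A repeated field is an array of {"v": ...} cells; convert each element.
            JSONArray convertedArray = new JSONArray();
            JSONArray values = (JSONArray) value;
            for (int j = 0; j < values.length(); j++) {
                convertedArray.put(convert(schemaField, values.getJSONObject(j).get("v")));
            }
            convertedValue = convertedArray;
        } else {
            convertedValue = convert(schemaField, value);
        }
        convertedJson.put(schemaField.getString("name"), convertedValue);
    }
    return convertedJson;
}

private static Object convert(JSONObject schemaField, Object value) {
    // org.json represents a JSON null as JSONObject.NULL rather than a Java null.
    if (value == null || JSONObject.NULL.equals(value)) return value;
    switch (schemaField.getString("type").toUpperCase()) {
        case "STRING":
            return value;
        case "BOOLEAN":
            return value.equals("true");
        case "FLOAT":
            // BigQuery FLOAT is double precision, so parse as double rather than float.
            return Double.parseDouble((String) value);
        case "INTEGER":
            // BigQuery INTEGER is 64-bit, so parse as long rather than int.
            return Long.parseLong((String) value);
        case "RECORD":
            // Nested record: recurse with the sub-schema.
            return mergeRowsWithSchema(schemaField.getJSONArray("fields"), (JSONObject) value);
        case "TIMESTAMP":
            // The value is seconds since the epoch as a decimal string; keep the milliseconds.
            long millis = (long) (Double.parseDouble((String) value) * 1000);
            DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
            // The pattern ends in a literal 'Z', so format in UTC.
            dateFormat.setTimeZone(TimeZone.getTimeZone("UTC"));
            return dateFormat.format(new Date(millis));
        default:
            return null; // type not handled here
    }
}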
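For anyone who wants to try the idea without the BigQuery client or org.json on the classpath, here is a minimal sketch of the same positional merge using only java.util collections. It is an illustration under assumptions, not the library's API: the class name SchemaMerge and the helpers field, cell, row, sampleSchema and sampleRow are hypothetical, the row and schema are hand-built Maps that mimic the "f"/"v" output from the question, and only a few types are converted (everything else is passed through unchanged).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class SchemaMerge {

    // Hypothetical helper: one schema field as a plain Map.
    static Map<String, Object> field(String name, String type, String mode,
                                     List<Map<String, Object>> fields) {
        Map<String, Object> f = new LinkedHashMap<>();
        f.put("name", name);
        f.put("type", type);
        if (mode != null) f.put("mode", mode);
        if (fields != null) f.put("fields", fields);
        return f;
    }

    // Hypothetical helper: a {"v": value} cell.
    static Map<String, Object> cell(Object value) {
        Map<String, Object> c = new LinkedHashMap<>();
        c.put("v", value);
        return c;
    }

    // Hypothetical helper: a {"f": [cells...]} row.
    static Map<String, Object> row(Object... values) {
        List<Object> cells = new ArrayList<>();
        for (Object v : values) cells.add(cell(v));
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("f", cells);
        return r;
    }

    // Cell i of the row is named and typed by schema field i.
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(List<Map<String, Object>> schema, Map<String, Object> row) {
        List<Map<String, Object>> cells = (List<Map<String, Object>>) row.get("f");
        Map<String, Object> out = new LinkedHashMap<>();
        for (int i = 0; i < cells.size(); i++) {
            Map<String, Object> schemaField = schema.get(i);
            Object value = cells.get(i).get("v");
            if (value == null) continue; // empty field: omit it
            String name = (String) schemaField.get("name");
            if ("REPEATED".equalsIgnoreCase((String) schemaField.get("mode"))) {
                // Repeated field: a list of {"v": ...} cells.
                List<Object> items = new ArrayList<>();
                for (Map<String, Object> item : (List<Map<String, Object>>) value) {
                    items.add(convert(schemaField, item.get("v")));
                }
                out.put(name, items);
            } else {
                out.put(name, convert(schemaField, value));
            }
        }
        return out;
    }

    @SuppressWarnings("unchecked")
    static Object convert(Map<String, Object> schemaField, Object value) {
        if (value == null) return null;
        String type = ((String) schemaField.get("type")).toUpperCase(Locale.ROOT);
        switch (type) {
            case "INTEGER": return Long.parseLong((String) value);
            case "FLOAT":   return Double.parseDouble((String) value);
            case "BOOLEAN": return Boolean.parseBoolean((String) value);
            case "RECORD":  // nested record: recurse with the sub-schema
                return merge((List<Map<String, Object>>) schemaField.get("fields"),
                             (Map<String, Object>) value);
            default:        return value; // STRING and anything not handled here
        }
    }

    // The schema from the question.
    static List<Map<String, Object>> sampleSchema() {
        List<Map<String, Object>> jobFields = Arrays.asList(
                field("name", "STRING", null, null),
                field("version", "STRING", null, null));
        return Arrays.asList(
                field("insertId", "STRING", null, null),
                field("executionId", "STRING", null, null),
                field("jobs", "record", "repeated", jobFields));
    }

    // The row from the question's output.
    static Map<String, Object> sampleRow() {
        List<Object> jobs = new ArrayList<>();
        jobs.add(cell(row("aaaa", "0.0.0")));
        jobs.add(cell(row("bbbb", "0.0.0")));
        jobs.add(cell(row("cccc", "0.0.0")));
        return row("1234", "1111", jobs);
    }

    public static void main(String[] args) {
        System.out.println(merge(sampleSchema(), sampleRow()));
    }
}
```

The point of the sketch is that the row carries no field names at all, so the only link between a value and its name is the position in the schema's field list, applied recursively for nested records.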