How to see an HTTP header with Pentaho Kettle
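[Title]: How to see an HTTP header with Pentaho Kettle
[Posted]: 2013-08-12 12:10:45
[Question]: Is there a way to see the response headers of an HTTP call? To be more specific: I need to find out when a resource (pointed to by a URL on the web) was last modified. Knowing the last-modified date, I can decide whether or not to download it. I think one way to do this is to look at the headers of the HTTP call. Any suggestions?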
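For reference, the last-modified date asked about here is normally exposed through the HTTP `Last-Modified` response header, which can be read without downloading the body by sending a HEAD request. The following is a minimal standalone sketch in plain Java, not a Kettle step; the class name `LastModifiedCheck`, the helpers `fetchLastModified` and `shouldDownload`, and the 5-second timeouts are illustrative assumptions. It also assumes an `http://` or `https://` URL and a server that actually sends the header.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Date;

// Hypothetical helper class, for illustration only.
class LastModifiedCheck {

    // Sends a HEAD request and returns the Last-Modified time in epoch
    // milliseconds, or 0 if the server did not send that header.
    static long fetchLastModified(String resourceUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(resourceUrl).openConnection();
        try {
            conn.setRequestMethod("HEAD");   // headers only, no body
            conn.setConnectTimeout(5000);    // arbitrary timeout values
            conn.setReadTimeout(5000);
            return conn.getLastModified();
        } finally {
            conn.disconnect();
        }
    }

    // Decides whether to download. An unknown remote time (0) is treated
    // as "download", which is an assumption of this sketch.
    static boolean shouldDownload(long remoteLastModified, long localLastModified) {
        if (remoteLastModified <= 0) {
            return true;
        }
        return remoteLastModified > localLastModified;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: java LastModifiedCheck <url>");
            return;
        }
        long remote = fetchLastModified(args[0]);
        System.out.println("Last-Modified: " + (remote > 0 ? new Date(remote).toString() : "not provided"));
    }
}
```

The decision is kept in a separate pure function so that the policy for a missing header can be changed without touching the network code.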
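[Comments]:
Not being a web developer, I don't know of any way other than using a JavaScript step and inspecting the headers in code. In any case, this is a common question for Kettle/ETL tools, and I would love to know what solution you find.
[Answer 1]: This is easy to do with a User Defined Java Class step. Below is an example class that expects an input row from the previous step with a field named picture (the URL of the picture). Add a User Defined Java Class step with the following code: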
import java.io.*;
import java.net.*;
import java.util.Calendar;
import java.util.Date;

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException, Exception
{
    // First, get a row from the default input hop.
    Object[] r = getRow();

    // If the row object is null, we are done processing.
    if (r == null)
    {
        setOutputDone();
        return false;
    }

    String filesSavePath = getParameter("filesSavePath") + "/tmp/pictures";
    // Strip "file://" from filesSavePath; otherwise java.io.File fails with "file not found".
    filesSavePath = filesSavePath.replace("file://", "");

    String picture = get(Fields.In, "picture").getString(r);
    // Use the last segment of the picture URL as the file name on disk.
    String filePictureName = picture.substring(picture.lastIndexOf('/') + 1);
    String fileFullPath = filesSavePath + "/" + filePictureName;

    try
    {
        boolean fileExists = new File(fileFullPath).isFile();

        if (!fileExists)
        {
            // The picture does not exist locally yet, so save it.
            saveImage(picture, fileFullPath);
            System.out.println("new picture saved = " + filePictureName);
            System.out.println("*******************************");
        }
        else
        {
            // The file exists: read the Last-Modified header and save the picture
            // again only if it was modified after yesterday.
            URL url = new URL(picture);
            URLConnection conn = url.openConnection();
            // getLastModified() returns 0 when the server sends no Last-Modified header;
            // in that case the date below is 1970-01-01 and nothing is downloaded.
            long lastModified = conn.getLastModified();
            Date lastModifiedDate = new Date(lastModified);

            // Yesterday at the current time of day.
            Calendar cal = Calendar.getInstance();
            cal.add(Calendar.DATE, -1);
            Date yesterdayDate = cal.getTime();

            boolean dateCompare = lastModifiedDate.after(yesterdayDate);

            if (dateCompare)
            {
                saveImage(picture, fileFullPath);
                System.out.println("new picture saved(last modified after yesterday) = " + filePictureName);
            }

            System.out.println("picture = " + picture);
            System.out.println("last modified after yesterday = " + dateCompare);
            System.out.println("last modified = " + lastModifiedDate);
            System.out.println("yesterday date = " + yesterdayDate);
            System.out.println("*******************************");
        }
    }
    catch (Exception e)
    {
        System.out.println("error: " + e);
        String fullStackTrace = org.apache.commons.lang.exception.ExceptionUtils.getFullStackTrace(e);
        System.out.println("fullStackTrace: " + fullStackTrace);
    }

    return true;
}

private static void saveImage(String imageUrl, String destinationFile) throws IOException
{
    URL url = new URL(imageUrl);
    InputStream is = url.openStream();
    try
    {
        OutputStream os = new FileOutputStream(destinationFile);
        try
        {
            // Copy the response body to disk in 2 KB chunks.
            byte[] b = new byte[2048];
            int length;
            while ((length = is.read(b)) != -1)
            {
                os.write(b, 0, length);
            }
        }
        finally
        {
            os.close();
        }
    }
    finally
    {
        // Close the streams even if the copy fails.
        is.close();
    }
}
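As a possible variation on the answer above, HTTP also lets the server make the date comparison: a request carrying an `If-Modified-Since` header is answered with status 304 and no body when the resource has not changed. The sketch below is standalone Java (7 or later) rather than a Kettle step, and it assumes the server honours conditional requests; the names `ConditionalDownload`, `Outcome`, `classify` and `fetchIfModified` are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, for illustration only.
class ConditionalDownload {

    enum Outcome { DOWNLOADED, NOT_MODIFIED, FAILED }

    // Maps an HTTP status code to what this sketch does with the response.
    static Outcome classify(int statusCode) {
        if (statusCode == HttpURLConnection.HTTP_NOT_MODIFIED) {
            return Outcome.NOT_MODIFIED;
        }
        if (statusCode == HttpURLConnection.HTTP_OK) {
            return Outcome.DOWNLOADED;
        }
        return Outcome.FAILED;
    }

    // Downloads resourceUrl into target unless the server reports that it
    // has not changed since the local copy was last modified.
    static Outcome fetchIfModified(String resourceUrl, File target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(resourceUrl).openConnection();
        try {
            if (target.isFile()) {
                // Adds the If-Modified-Since request header.
                conn.setIfModifiedSince(target.lastModified());
            }
            Outcome outcome = classify(conn.getResponseCode());
            if (outcome == Outcome.DOWNLOADED) {
                try (InputStream in = conn.getInputStream()) {
                    Files.copy(in, target.toPath(), StandardCopyOption.REPLACE_EXISTING);
                }
                // Stamp the local copy with the server's time so that the next
                // comparison does not depend on the local clock.
                long remote = conn.getLastModified();
                if (remote > 0) {
                    target.setLastModified(remote);
                }
            }
            return outcome;
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java ConditionalDownload <url> <target-file>");
            return;
        }
        System.out.println(fetchIfModified(args[0], new File(args[1])));
    }
}
```

Compared with comparing against "yesterday", this avoids re-downloading a file that changed recently but was already fetched, at the cost of relying on the server's support for conditional requests.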