Need to parse a value from another web page: first call the page, then parse the XML value from it

Posted 2023-02-22

【Title】: Need to parse a value from other webpage. First I need to call other webpage and parse the XML value from it
【Posted】: 2016-06-11 08:23:48
【Question】:

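I am working on a project that should display currency exchange rates, so I planned to call another web page and read the rate values from it. I tried AngularJS, but could not get a response from the page (in AngularJS we can only call JSON/REST URLs). I also tried XMLHttpRequest, but it will not call a page (URL) on another domain because of CORS.

Likewise, I tried Java and successfully called the web page and got the XML, but I could not parse the values (error: "unformatted XML").

Can someone guide me on how to get a value from any web page? Please let me know whether I can achieve this with an API call or a web service call. If I use an API or web service call, do I need to contact the IT vendor of the money exchange website so that the API/web service exposes the specific values?

Please help me (I am ready to implement this in any technology).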

Java code:

package webXMRead;

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class webPageXMLRead {

    public static void main(String args[]) throws URISyntaxException,
            ClientProtocolException, IOException, MalformedURLException {
        // For learning and example purposes I used the URL http://www.google.com.
        // I need to parse this site and am not using it for any profit purpose.
        String url = "http://www.google.com";
        System.out.println("Url is created****");
        URL url2 = new URL(url);
        HttpGet httpGet = new HttpGet(url);
        HttpClient httpClient = new DefaultHttpClient();

        // First request: fetch the page body as a string with Apache HttpClient.
        HttpResponse httpResponse = httpClient.execute(httpGet);
        HttpEntity entity = httpResponse.getEntity();
        System.out.println("Entity is*****" + entity);
        try {
            String xmlParseString = EntityUtils.toString(entity);
            System.out.println("This String ***" + xmlParseString);

            // Second request: open the same URL again and feed the stream to the XML parser.
            HttpURLConnection connection = (HttpURLConnection) url2
                    .openConnection();
            InputStream inputStream = connection.getInputStream();

            DocumentBuilderFactory builderFactory = DocumentBuilderFactory
                    .newInstance();
            DocumentBuilder documentBuilder = builderFactory
                    .newDocumentBuilder();
            Document document = documentBuilder.parse(inputStream);
            document.getDocumentElement().normalize();

            // Walk every <rss> element and print its attributes.
            NodeList nodeList = document.getElementsByTagName("rss");
            System.out.println("This is firstnode" + nodeList);
            for (int getChild = 0; getChild < nodeList.getLength(); getChild++) {
                Node Listnode = nodeList.item(getChild);
                System.out.println("Into the for loop"
                        + Listnode.getAttributes().getLength());
                Element firstnoderss = (Element) Listnode;
                System.out.println("ListNodes" + Listnode.getAttributes());
                System.out.println("This is node list length"
                        + nodeList.getLength());

                Node Subnode = nodeList.item(getChild);
                System.out.println("This is list node" + Subnode);
            }
        } catch (Exception exception) {
            System.out.println("Exception is" + exception);
        }
    }
}

AngularJS: (I only tried to check whether it returns any value, but had no success. When I tried a different domain from XMLHttpRequest (JavaScript), I ran into the CORS problem.)

AngularJS code:

<!DOCTYPE html>
<html>
<head>
    <title>test your webservice</title>
</head>
<body>


<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.2.23/angular.min.js"></script>
<article ng-app="webpage">
  <section ng-controller="booksCtrl">
  <h2>{{data}}</h2>
  </section>
</article>
<script type="text/javascript">
var app = angular.module('webpage', []);

app.controller('booksCtrl', function($scope, $http) {
    /* $httpProvider.defaults.useXDomain = true; */
    /* delete $http.defaults.headers.common['X-Requested-With']; */

    /* Just for study purposes, not for any profit; http://www.google.com is used as an example URL. */

    $http.get("http://www.google.com")
        .then(function(response) {
            $scope.data = response.data;
        },
        function(errresponse) {
            alert("err" + errresponse.status);
        });
});

</script>
</body>
</html>

【Comments】:

Does the other site you want to fetch data from support JSONP? See remysharp.com/2007/10/08/what-is-jsonp

@SteveJorgensen, thanks for the update. I have now got a solution using jsoup.

【Answer 1】:

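Basically you need to parse an HTML document. For this, use JSoup. It would be ideal for your use case. Once you have the Document object in Java, you can parse it and get the desired value from it.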

String html = "<html><head><title>First parse</title></head>"
  + "<body><p>Parsed HTML into a doc.</p></body></html>";
Document doc = Jsoup.parse(html);
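
For contrast, here is a minimal sketch of the XML side of the question, using only the JDK's DOM parser. The error in the question most likely arises because an ordinary HTML page is not well-formed XML, so `DocumentBuilder.parse` rejects it. If the rate provider offers a real XML feed, the same parser can work. The class name `RateFeedSketch`, the `<rate currency="...">` layout, and the sample values are all assumptions made for illustration, not the format of any actual exchange-rate site.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class RateFeedSketch {

    // Hypothetical feed layout; a real provider's element and attribute names will differ.
    static final String SAMPLE_FEED =
            "<rss><channel>"
            + "<rate currency=\"USD\">1.00</rate>"
            + "<rate currency=\"EUR\">0.89</rate>"
            + "</channel></rss>";

    // HTML-style markup with an unclosed <br>, which is valid HTML but not well-formed XML.
    static final String SAMPLE_HTML =
            "<html><body><p>Rates<br></p></body></html>";

    // Parses a markup string with the JDK's DOM parser.
    static Document parse(String markup) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(
                new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns the text of the <rate> element for the given currency, or null if absent.
    static String readRate(String xml, String currency) throws Exception {
        NodeList rates = parse(xml).getElementsByTagName("rate");
        for (int i = 0; i < rates.getLength(); i++) {
            Element rate = (Element) rates.item(i);
            if (currency.equals(rate.getAttribute("currency"))) {
                return rate.getTextContent().trim();
            }
        }
        return null;
    }

    // Reports whether the XML parser accepts the markup at all.
    static boolean isWellFormed(String markup) throws Exception {
        try {
            parse(markup);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("EUR rate: " + readRate(SAMPLE_FEED, "EUR"));
        System.out.println("Feed well-formed: " + isWellFormed(SAMPLE_FEED));
        System.out.println("HTML well-formed: " + isWellFormed(SAMPLE_HTML));
    }
}
```

When the source is an HTML page rather than an XML feed, a lenient HTML parser such as jsoup, as recommended above, is the better fit, because it tolerates unclosed tags that a strict XML parser rejects.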
