curl采集ebay乱码怎样解决?
Posted
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了curl采集ebay乱码怎样解决?相关的知识,希望对你有一定的参考价值。
各位朋友,我现在使用curl采集信息,发现采集ebay店铺信息时老是显示为乱码,比如:
$url="http://stores.ebay.com/sportingamerica/";
$caiji=curl_get_contents($url);
print_r($caiji);
哪位朋友能否解释下?谢谢!
function curl_get_contents($url)
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
//curl_setopt($ch,CURLOPT_HEADER,1);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_setopt($ch, CURLOPT_USERAGENT, _USERAGENT_);
curl_setopt($ch, CURLOPT_REFERER,_REFERER_);
curl_setopt($ch,CURLOPT_FOLLOWLOCATION,1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$r = curl_exec($ch);
curl_close($ch);
return $r;
curl_setopt($ch, CURLOPT_ENCODING, "gzip");
来处理。
更多关于curl应用的知识可以访问我博客《php cURL 应用》了解。
参考地址:http://www.zjmainstay.cn/php-curl 参考技术A 仔细检查一下你当前编辑的php文件的编码。追问
你好,我的php文件编码是utf-8哦,有问题吗?
追答那你使用iconv转码一下
追问ebay的页面编码也是utf-8,都是utf-8还需要转码么?另外麻烦您测试上面的地址能否获取到内容呢?谢谢了。
PHP CURL本地可以采集服务器上不能采集解决办法
PHP CURL本地可以采集服务器上不能采集解决办法,今天采集一个站,在本机上写好代码,发到网站服务器上确采集不到数据。这里分析,会不会是目标站对网站做了防采集。
网上搜了下解决办法,这里用PHP CURL伪造IP和来源测试看看。代码如下。
//随机IP function Rand_IP(){ $ip2id= round(rand(600000, 2550000) / 10000); //第一种方法,直接生成 $ip3id= round(rand(600000, 2550000) / 10000); $ip4id= round(rand(600000, 2550000) / 10000); //下面是第二种方法,在以下数据中随机抽取 $arr_1 = array("218","218","66","66","218","218","60","60","202","204","66","66","66","59","61","60","222","221","66","59","60","60","66","218","218","62","63","64","66","66","122","211"); $randarr= mt_rand(0,count($arr_1)-1); $ip1id = $arr_1[$randarr]; return $ip1id.".".$ip2id.".".$ip3id.".".$ip4id; } //抓取页面内容 function Curl($url){ $ch2 = curl_init(); $user_agent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36";//模拟windows用户正常访问 curl_setopt($ch2, CURLOPT_URL, $url); curl_setopt($ch2, CURLOPT_TIMEOUT, 10); curl_setopt($ch2, CURLOPT_HTTPHEADER, array(‘X-FORWARDED-FOR:‘.Rand_IP(), ‘CLIENT-IP:‘.Rand_IP())); //追踪返回302状态码,继续抓取 curl_setopt($ch2, CURLOPT_HEADER, true); curl_setopt($ch2, CURLOPT_RETURNTRANSFER, true); curl_setopt($ch2, CURLOPT_FOLLOWLOCATION, true); curl_setopt($ch2, CURLOPT_NOBODY, false); curl_setopt($ch2, CURLOPT_REFERER, ‘http://www.chinaobd2.com/‘);//模拟来路 curl_setopt($ch2, CURLOPT_USERAGENT, $user_agent); $temp = curl_exec($ch2); curl_close($ch2); return $temp; }
php curl伪造来源ip和来路refer实例代码2:
<?php $postData = array( "user" => "root", "pwd" => "123456" ); $headerIp = array( ‘CLIENT-IP:88.88.88.88‘, ‘X-FORWARDED-FOR:88.88.88.88‘, ); $refer = ‘http://www.chinaobd2.com‘; $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, ‘http://localhost/phpdemo/test.php‘); //伪造来源refer curl_setopt($ch, CURLOPT_REFERER, $refer); //伪造来源ip curl_setopt($ch, CURLOPT_HTTPHEADER, $headerIp); //提交post传参 curl_setopt($ch, CURLOPT_POSTFIELDS, $postData); //...各种curl属性参数设置 $out_put = curl_exec($ch); curl_close($ch); var_dump($out_put);
以上代码我在chinaobd2.com网站测试,可行。遇到类似问题,大家可以参考一下。
以上是关于curl采集ebay乱码怎样解决?的主要内容,如果未能解决你的问题,请参考以下文章
php 用curl_exec 采集页面内容,结果 302重定向
解决php无法通过file_get_contents或curl采集页面内容