Scraping QQ numbers, email addresses, and similar data from a web page with Java
Posted by zxwm
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class GetMail {
    public static void main(String[] args) throws Exception {
        //getMails();   // variant not shown in this listing
        getMails_url();
    }

    public static void getMails_url() throws Exception {
        // The web page to scrape
        URL url = new URL("https://wenku.baidu.com/view/ce81b0a1ddccda38366baf61.html");
        URLConnection conn = url.openConnection();

        // The rule to match goes here; swap in one of the alternatives below as needed
        String maileRes = "[\\u4E00-\\u9FA5]+";
        // Email addresses:    "\\w+@\\w+(\\.\\w+)+"
        // Chinese characters: "[\\u4E00-\\u9FA5]+"
        // QQ numbers:         "[1-9][0-9]{4,14}"
        // QQ mailboxes:       "[1-9][0-9]{4,14}@qq\\.com"  (assumed form; the pattern was garbled in the source)
        Pattern p = Pattern.compile(maileRes);

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader bufr = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            String line;
            // Scan the page line by line and print every match
            while ((line = bufr.readLine()) != null) {
                Matcher m = p.matcher(line);
                while (m.find()) {
                    System.out.println(m.group());
                }
            }
        }
    }
}
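To try the rules without fetching a page, a minimal sketch along the following lines may help. It is not part of the original program: the class `PatternDemo`, the helper `findAll`, and the sample line are made up for illustration. It also assumes the email rule was meant to read `\w+@\w+(\.\w+)+` and the Chinese-character rule `[\u4E00-\u9FA5]+`, written here with doubled backslashes as Java string literals require.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original program.
class PatternDemo {
    // Collects every non-overlapping match of regex in text, in order of appearance.
    static List<String> findAll(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample standing in for one line of a fetched page.
        // It starts with three Chinese characters written as Unicode escapes.
        String line = "\u8054\u7CFB\u4EBA zhang: QQ 12345678, mail zhang_san@example.com";

        // Chinese characters (assumed intended form of the rule)
        System.out.println(findAll("[\\u4E00-\\u9FA5]+", line));
        // QQ numbers: a non-zero digit followed by 4 to 14 more digits
        System.out.println(findAll("[1-9][0-9]{4,14}", line));
        // Email addresses (assumed intended form of the rule)
        System.out.println(findAll("\\w+@\\w+(\\.\\w+)+", line));
    }
}
```

On this sample line the three calls print the run of Chinese characters, `[12345678]`, and `[zhang_san@example.com]`. Note that the QQ rule matches any run of 5 to 15 digits that starts with a non-zero digit, so on a real page it will also pick up digit runs that are not QQ numbers, such as phone numbers.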