Preface
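How do you get the cookies and other cached data returned by the server in HttpClient 3.x and HttpClient 4.x? As websites become more security-conscious, many of them place important information, such as encryption keys, key page data, or links the client must follow next, in the cookies or session returned by the server. Pulling that data out and processing it is therefore an essential skill for anyone writing a Java crawler with Java HttpClient. What follows is a brief look at how to read the cookies and session returned by the server.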
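For orientation, a cookie reaches the client as a Set-Cookie response header, and the name=value pair at the front of that header is what a client collects and sends back. The sketch below uses the JDK's own java.net.HttpCookie rather than HttpClient, purely so that it runs without extra dependencies. The class name SetCookieDemo, the method toPair, and the header value are made up for illustration.

```java
import java.net.HttpCookie;
import java.util.List;

final class SetCookieDemo {

    // Parses one Set-Cookie header value and returns its leading "name=value" pair.
    static String toPair(String setCookieValue) {
        List<HttpCookie> parsed = HttpCookie.parse(setCookieValue);
        HttpCookie cookie = parsed.get(0);
        return cookie.getName() + "=" + cookie.getValue();
    }

    public static void main(String[] args) {
        // Hypothetical header value, not taken from any real site.
        String headerValue = "SESSIONID=abc123; Path=/; HttpOnly";
        System.out.println(toPair(headerValue));
    }
}
```

Attributes such as Path and HttpOnly are instructions for the client and are not part of the pair that gets sent back to the server in the Cookie request header.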
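How to get the cookies and cached data returned by the server in HttpClient 3.x

// httpClient is the HttpClient 3.x instance that has already executed the request;
// Cookie is org.apache.commons.httpclient.Cookie.
StringBuilder cookiesSessionStr = new StringBuilder();
Cookie[] cookiesSession = httpClient.getState().getCookies();
for (Cookie c : cookiesSession) {
    cookiesSessionStr.append(c.toString()).append(";");
}
System.out.println("cookiessession:" + cookiesSessionStr);

How to get the cookies and cached data returned by the server in HttpClient 4.x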
// Assumes httpClient is a 4.x DefaultHttpClient; Cookie is org.apache.http.cookie.Cookie.
String cookies = "";  // collects "name=value; " pairs for reuse as a Cookie request header
List<Cookie> cookiesget = httpClient.getCookieStore().getCookies();
for (Cookie co : cookiesget) {
    String newCookie = co.getName() + "=" + co.getValue() + "; ";
    cookies += newCookie;
    System.out.println(newCookie);
}

Here is an example of the request sent when logging in to Sina Weibo. The full code is too long, so only part of it is shown.
String securl = "https://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.19)" ;
g3 = new HttpPost(securl) ;
g3.setHeader("Accept", "application/json, text/javascript, */*; q=0.01") ;
g3.setHeader("X-Requested-With", "XMLHttpRequest") ;
g3.setHeader("Accept-Language", "zh-CN") ;
g3.setHeader("Pragma", "no-cache") ;
g3.setHeader("Connection", "Keep-Alive") ;
g3.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko") ;
g3.setHeader("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8") ;
g3.setHeader("Cookie", cookies) ;
String content = "entry=weibo&gateway=1&from=&savestate=7&qrcode_flag=false&useticket=1" +
"&pagerefer=https%3A%2F%2Fpassport.weibo.com%2Fvisitor%2Fvisitor%3F" +
"entry%3Dminiblog%26a%3Denter%26url%3Dhttps%253A%252F%252Fweibo.com%252F%26domain%3D." +
"weibo.com%26ua%3Dphp-sso_sdk_client-0.6.23%26_rand%3D1509026650.1567" +
"&pcid=" + pcid +
"&door=" + captcha +
"&vsnf=1" +
"&su=" + URLEncoder.encode(su, "UTF-8") +
"&service=miniblog" +
"&servertime=" + servertime +
"&nonce=" + nonce +
"&pwencode=rsa2" +
"&rsakv=" + rsakv +
"&sp=" + sp +
"&sr=1920*1080&encoding=UTF-8&prelt=200&url=" +
"https%3A%2F%2Fweibo.com%2Fajaxlogin.php%3Fframelogin%3D1%26callback%3D" +
"parent.sinaSSOController.feedBackUrlCallBack&returntype=META" ;
StringEntity reqEntity = new StringEntity(content);
g3.setEntity(reqEntity) ;
response2 = httpClient.execute(g3);
// The response body is decoded as GBK. StringRandomUtils is a helper class that is not shown in this excerpt.
sg1 = EntityUtils.toString(response2.getEntity(),"GBK") ;
sg1 = StringRandomUtils.unicodeToString(sg1) ;
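// On HTTP 200 with a crossdomain2.php link in the body, keep the new cookies and extract that link.
if (response2.getStatusLine().getStatusCode() == 200 &&
        sg1.indexOf("https://login.sina.com.cn/crossdomain2.php") != -1) {
    newCookies2 = StringRandomUtils.handleCookies(httpClient);
    cookies += newCookies2;
    // Trim the body so it starts at the link, then take everything up to the next double quote.
    sg1 = sg1.substring(sg1.indexOf("https://login.sina.com.cn/crossdomain2.php"));
    String crossdomainurl = sg1.substring(0, sg1.indexOf("\""));
}

Note: The websites in the code are examples for reference only and may not actually work, so please do not copy the code wholesale.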
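The link extraction at the end of the login excerpt can be isolated into a small helper that also copes with a missing link or a missing closing quote, cases in which a bare substring call would throw. This is only a sketch with plain JDK classes: the class name RedirectUrlExtractor, the method extract, and the sample response body are assumptions made for illustration, not part of the original code.

```java
import java.util.Optional;

final class RedirectUrlExtractor {

    // Returns the text from the first occurrence of prefix up to (not including)
    // the next double quote, or an empty Optional if either one is missing.
    static Optional<String> extract(String body, String prefix) {
        int start = body.indexOf(prefix);
        if (start < 0) {
            return Optional.empty();
        }
        int end = body.indexOf('"', start);
        if (end < 0) {
            return Optional.empty();
        }
        return Optional.of(body.substring(start, end));
    }

    public static void main(String[] args) {
        // Hypothetical response body; a real login response may look different.
        String body = "location.replace(\"https://login.sina.com.cn/crossdomain2.php?action=login\");";
        String prefix = "https://login.sina.com.cn/crossdomain2.php";
        System.out.println(extract(body, prefix).orElse("(no link found)"));
    }
}
```

Returning an Optional lets the caller decide what to do when the login response does not contain the expected link, for example when the login was rejected.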
【蝴蝶效应-虎】