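Summary: Garbled Chinese text in web page data fetched with Java
This happened while I was writing Java on my first computer.
The page data I fetched there was fine and correctly formatted,
but once the code went onto the production machine, the fetched page data came back garbled.
After reading this article:
http://blog.csdn.net/izard999/article/details/8213178
I found that the problem was most likely that no encoding was specified when the fetched stream was converted to a string,
so the Windows default encoding was probably used (an InputStreamReader created without a charset falls back to the platform default),
and the two machines happened to have different default encodings, which garbled the Chinese text.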
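To make the cause concrete, here is a minimal sketch of the mismatch. It does not fetch anything; it only encodes a short Chinese string as bytes and decodes those bytes with two different charsets. The class name CharsetMismatchDemo and the helper roundTrip are made up for illustration, and ISO-8859-1 simply stands in for "some other default" (the actual default on either machine is not known from this post).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Encode text with one charset and decode it with another, mimicking a page
    // served in one encoding but read with a different (default) encoding.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] raw = text.getBytes(encodeWith);
        return new String(raw, decodeWith);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the word for "Chinese", written with Unicode escapes
        // so the source file's own encoding does not matter.
        String original = "\u4e2d\u6587";

        // This is the value that can differ between two machines.
        System.out.println("Platform default charset: " + Charset.defaultCharset());

        String same = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mixed = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println("Matching charsets keep the text intact: " + same.equals(original));
        System.out.println("Mismatched charsets keep the text intact: " + mixed.equals(original));
    }
}
```

Printing Charset.defaultCharset() on both machines is a quick way to check whether their defaults really differ.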
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public String getContentByHttpURLConnection(String gateway_url, String encoding, String methodStr)
{
    String respond = "";
    HttpURLConnection conn = null;
    try
    {
        // Empty request body, encoded with the same charset declared in Content-Type
        byte[] postData = "".getBytes(encoding);
        URL url = new URL(gateway_url);
        conn = (HttpURLConnection) url.openConnection();
        // A request body is written below, so output must be enabled
        conn.setDoOutput(true);
        conn.setUseCaches(false);
        conn.setRequestMethod(methodStr);
        conn.setRequestProperty("Content-Type",
                "application/x-www-form-urlencoded;charset=" + encoding);
        conn.setRequestProperty("Content-Length",
                Integer.toString(postData.length));
        try (OutputStream out = conn.getOutputStream()) {
            out.write(postData);
        }
        // Log the HTTP status line
        System.out.println(conn.getResponseCode() + " " + conn.getResponseMessage());
        // Decode the response with an explicit charset, not the platform default
        try (InputStream ist = conn.getInputStream();
             BufferedReader in = new BufferedReader(new InputStreamReader(ist,encoding))) {
            StringBuilder sb = new StringBuilder();
            char[] c = new char[1024];
            int n;
            // Read in chunks until end of stream
            while ((n = in.read(c, 0, c.length)) != -1)
            {
                sb.append(c, 0, n);
            }
            respond = sb.toString();
        }
    } catch (Exception e) {
        // On failure, the exception message is returned in place of the page content
        respond = e.getMessage();
        e.printStackTrace();
    } finally {
        if (conn != null) {
            conn.disconnect();
        }
    }
    return respond;
}
The key is the encoding parameter:
BufferedReader in = new BufferedReader(new InputStreamReader(ist,encoding));
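As a small sketch of that line in isolation, the example below reads a stream into a string with an explicitly chosen charset. A ByteArrayInputStream stands in for the network stream so it runs without a server. The class name ExplicitCharsetReader, the helper readAll, and the sample text are assumptions for illustration and are not part of the original code.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetReader {

    // Read the whole stream as text, decoding with the given charset
    // instead of whatever the platform default happens to be.
    static String readAll(InputStream ist, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(ist, charset))) {
            char[] buf = new char[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Pretend these bytes arrived from a server that sends UTF-8.
        String expected = "\u4e2d\u6587 response";
        byte[] body = expected.getBytes(StandardCharsets.UTF_8);

        String text = readAll(new ByteArrayInputStream(body), StandardCharsets.UTF_8);
        System.out.println("Decoded correctly: " + text.equals(expected));
    }
}
```

The charset passed in has to match what the server actually sends. Passing the wrong one garbles the text just as reliably as relying on a mismatched default.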