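Apache HttpClient is the most widely used Java HTTP client library. It has gone through many revisions; at the time of writing, the stable release is 4.5 and 5.0 is still in testing. HttpClient is not a full browser: the biggest difference is that it has no UI. It only provides data transfer and interaction over HTTP/1.0 and HTTP/1.1. It is typically used on the server side when one server needs to exchange data with another server that exposes an HTTP interface, for example sending a search request to Google from a server and parsing the HTML that Google returns.
Besides the HTTP request and response messages themselves, every HTTP request is sent to the server using an HTTP method, most commonly GET or POST. An HTTP message carries multiple headers that describe its metadata. A response may carry cookie data that the client stores and sends back to the server with the next request. A session is a series of HTTP request/response exchanges, and the session ID is usually tracked with a cookie.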
HTTP Fundamentals
This is the most basic HttpGet request.
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = null;
try {
    response = httpclient.execute(httpget);
} catch (IOException e) {
    logger.error("error: ", e);
} finally {
    // response is still null when execute() throws, so guard against a NullPointerException
    if (response != null) {
        try {
            response.close();
        } catch (IOException e) {
            logger.error("error: ", e);
        }
    }
}
The HTTP methods are GET, HEAD, POST, PUT, DELETE, TRACE, and OPTIONS, and each has a dedicated class: HttpGet, HttpHead, HttpPost, HttpPut, HttpDelete, HttpTrace, and HttpOptions.
The request URI is a Uniform Resource Identifier that identifies the resource the request applies to. An HTTP request URI consists of a protocol scheme, host name, optional port, resource path, optional query, and optional fragment. Use URIBuilder to assemble a URI.
// uri=http://www.google.com/search?q=httpclient&btnG=Google+Search&aq=f&oq=
URI uri = new URIBuilder()
.setScheme("http")
.setHost("www.google.com")
.setPath("/search")
.setParameter("q", "httpclient")
.setParameter("btnG", "Google Search")
.setParameter("aq", "f")
.setParameter("oq", "")
.build();
HttpGet httpget = new HttpGet(uri);
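As a quick check on how those parts map onto a concrete address, the following minimal sketch takes the URI from the comment above apart again. It uses only java.net.URI from the JDK rather than HttpClient, and the class name UriPartsSketch is made up for this illustration.

```java
import java.net.URI;

final class UriPartsSketch {
    // Lists the components of a URI; an absent port is -1 and an absent fragment is null.
    static String describe(URI uri) {
        return "scheme=" + uri.getScheme()
                + ", host=" + uri.getHost()
                + ", port=" + uri.getPort()
                + ", path=" + uri.getPath()
                + ", query=" + uri.getQuery()
                + ", fragment=" + uri.getFragment();
    }

    public static void main(String[] args) {
        URI uri = URI.create(
                "http://www.google.com/search?q=httpclient&btnG=Google+Search&aq=f&oq=");
        System.out.println(describe(uri));
    }
}
```

The port and fragment are the optional parts that this particular URI leaves out.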
HTTP response
An HTTP response is the message the server sends back to the client.
HttpResponse httpResponse = new BasicHttpResponse(HttpVersion.HTTP_1_1,
HttpStatus.SC_OK, "OK");
System.out.println(httpResponse.getProtocolVersion());
System.out.println(httpResponse.getStatusLine().getStatusCode());
System.out.println(httpResponse.getStatusLine().getReasonPhrase());
System.out.println(httpResponse.getStatusLine().toString());
HttpResponse httpResponse2 = new BasicHttpResponse(HttpVersion.HTTP_1_1,
HttpStatus.SC_OK, "OK");
httpResponse2.addHeader("Set-Cookie",
"c1=a; path=/; domain=localhost");
httpResponse2.addHeader("Set-Cookie",
"c2=b; path=\"/\", c3=c; domain=\"localhost\"");
Header h1 = httpResponse2.getFirstHeader("Set-Cookie");
System.out.println(h1);
Header h2 = httpResponse2.getLastHeader("Set-Cookie");
System.out.println(h2);
Header[] hs = httpResponse2.getHeaders("Set-Cookie");
System.out.println(hs.length);
Output:
HTTP/1.1
200
OK
HTTP/1.1 200 OK
Set-Cookie: c1=a; path=/; domain=localhost
Set-Cookie: c2=b; path="/", c3=c; domain="localhost"
2
An HTTP message contains many headers. Use HeaderIterator to process them one by one, or BasicHeaderElementIterator to walk through every header element of a given header.
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
HttpStatus.SC_OK, "OK");
response.addHeader("Set-Cookie",
"c1=a; path=/; domain=localhost");
response.addHeader("Set-Cookie",
"c2=b; path=\"/\", c3=c; domain=\"localhost\"");
// HeaderIterator
HeaderIterator it = response.headerIterator("Set-Cookie");
while (it.hasNext()) {
System.out.println(it.next());
}
// HeaderElementIterator
HeaderElementIterator it2 = new BasicHeaderElementIterator(
response.headerIterator("Set-Cookie"));
while (it2.hasNext()) {
HeaderElement elem = it2.nextElement();
System.out.println(elem.getName() + " = " + elem.getValue());
NameValuePair[] params = elem.getParameters();
for (int i = 0; i < params.length; i++) {
System.out.println(" " + params[i]);
}
}
HTTP entity
An HTTP message can carry a content entity associated with the request or response. Entities are optional and appear only in some requests and responses. Requests that use entities are called entity enclosing requests; HTTP defines two such request methods: POST and PUT.
Most responses carry an entity. The exceptions are responses to the HEAD method and the 204 No Content, 304 Not Modified, and 205 Reset Content responses.
HttpClient distinguishes three kinds of entities: streamed, self-contained, and wrapping. Non-repeatable entities are usually treated as streamed, and repeatable entities as self-contained.
- streamed: the content is received from a stream. This kind is common in responses, and streamed entities are not repeatable.
- self-contained: the content is held in memory or obtained by means independent of the connection. This kind is repeatable and is mostly used for entity enclosing HTTP requests. A repeatable entity is one whose content can be read more than once, for example ByteArrayEntity or StringEntity.
- wrapping: the content is obtained from another entity.
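The difference between a streamed and a self-contained entity comes down to whether the content can be read a second time. The sketch below illustrates that with plain JDK streams instead of HttpClient entities; RepeatableSketch and drain are names invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class RepeatableSketch {
    // Reads a stream to its end; a stream that is already exhausted yields "".
    static String drain(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] content = "important message".getBytes(StandardCharsets.UTF_8);

        // "Streamed": a single stream is handed out, so a second read finds nothing left.
        InputStream streamed = new ByteArrayInputStream(content);
        System.out.println("[" + drain(streamed) + "]");
        System.out.println("[" + drain(streamed) + "]");

        // "Self-contained": the bytes are kept, so a fresh stream can be opened every time.
        System.out.println("[" + drain(new ByteArrayInputStream(content)) + "]");
        System.out.println("[" + drain(new ByteArrayInputStream(content)) + "]");
    }
}
```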
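Because an entity can hold both binary and character content, it supports character encodings.
HttpEntity#getContentType() and HttpEntity#getContentLength() return the Content-Type and Content-Length metadata; when the entity has a Content-Type, getContentType() returns it as a Header object. For text content the Content-Type value also carries the character encoding as its charset parameter. HttpEntity#getContentEncoding() returns the Content-Encoding header, if one is present.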
StringEntity myEntity = new StringEntity("important message",
ContentType.create("text/plain", "UTF-8"));
System.out.println(myEntity.getContentType());
System.out.println(myEntity.getContentLength());
System.out.println(EntityUtils.toString(myEntity));
System.out.println(EntityUtils.toByteArray(myEntity).length);
Result:
Content-Type: text/plain; charset=UTF-8
17
important message
17
Ensuring release of low level resources
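To make sure system resources are released, close either the content stream obtained from the entity or the response itself. Closing the content stream tries to keep the underlying connection alive, whereas closing the response immediately shuts down and discards the connection.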
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = httpclient.execute(httpget);
try {
HttpEntity entity = response.getEntity();
if (entity != null) {
InputStream instream = entity.getContent();
try {
// do something useful
} finally {
instream.close();
}
}
} finally {
response.close();
}
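HttpEntity#writeTo(OutputStream) also guarantees that resources are released once the entity has been fully written out. If you obtain a java.io.InputStream through HttpEntity#getContent(), you must close the stream yourself in a finally block. When working with streaming entities, EntityUtils#consume(HttpEntity) ensures that the entity content is fully consumed and the underlying stream is closed.
If only part of the response content is needed, call response.close() directly. The rest of the content does not have to be consumed, but the connection cannot be reused.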
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = httpclient.execute(httpget);
try {
HttpEntity entity = response.getEntity();
if (entity != null) {
InputStream instream = entity.getContent();
int byteOne = instream.read();
int byteTwo = instream.read();
// Do not need the rest
}
} finally {
response.close();
}
Consuming entity content
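The recommended way to consume content is HttpEntity#getContent() or HttpEntity#writeTo(OutputStream). HttpClient also provides the EntityUtils class, which has several static methods for reading content, but using EntityUtils is discouraged unless the response comes from a trusted HTTP server and is known to be of limited length.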
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = httpclient.execute(httpget);
try {
HttpEntity entity = response.getEntity();
if (entity != null) {
long len = entity.getContentLength();
if (len != -1 && len < 2048) {
System.out.println(EntityUtils.toString(entity));
} else {
// Stream content out
}
}
} finally {
response.close();
}
If the entire entity content must be read more than once, the simplest approach is to wrap the original entity in a BufferedHttpEntity, which reads the content into an in-memory buffer.
CloseableHttpResponse response = <...>
HttpEntity entity = response.getEntity();
if (entity != null) {
entity = new BufferedHttpEntity(entity);
}
Producing entity content
StringEntity, ByteArrayEntity, InputStreamEntity, and FileEntity can be used to stream data out over an HTTP connection. An InputStreamEntity can be used only once; its data cannot be re-read.
File file = new File("somefile.txt");
FileEntity entity = new FileEntity(file,
ContentType.create("text/plain", "UTF-8"));
HttpPost httppost = new HttpPost("http://localhost/action.do");
httppost.setEntity(entity);
HTML forms
UrlEncodedFormEntity simulates submitting an HTML form. The code below is equivalent to sending param1=value1&param2=value2 with the POST method.
List<NameValuePair> formparams = new ArrayList<NameValuePair>();
formparams.add(new BasicNameValuePair("param1", "value1"));
formparams.add(new BasicNameValuePair("param2", "value2"));
UrlEncodedFormEntity entity = new UrlEncodedFormEntity(formparams, Consts.UTF_8);
HttpPost httppost = new HttpPost("http://localhost/handler.do");
httppost.setEntity(entity);
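To show what such a form body looks like on the wire, here is a minimal sketch that builds the same name=value&name=value string with the JDK's URLEncoder. It is an illustration of the application/x-www-form-urlencoded format, not the code UrlEncodedFormEntity runs internally; FormEncodingSketch and encode are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FormEncodingSketch {
    // Joins the pairs as name=value separated by '&', URL-encoding each name and value.
    static String encode(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return body.toString();
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        Map<String, String> form = new LinkedHashMap<>();
        form.put("param1", "value1");
        form.put("param2", "value2");
        System.out.println(encode(form));
    }
}
```

Encoding each part separately matters as soon as a value contains a reserved character: with this sketch a value such as "a b&c" becomes a+b%26c, so the & inside the value is not mistaken for a pair separator.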
Content chunking
Call setChunked(true) on the entity to indicate that chunk coding is preferred. This is only a hint: the setting is ignored when the HTTP version in use, such as HTTP/1.0, does not support chunk coding.
StringEntity entity = new StringEntity("important message",
        ContentType.create("text/plain", Consts.UTF_8));
entity.setChunked(true);
HttpPost httppost = new HttpPost("http://localhost/action.do");
httppost.setEntity(entity);
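For readers who have not seen chunk coding before, the following sketch shows how a body is framed under HTTP/1.1 chunked transfer coding: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. It is a JDK-only illustration of the wire format, not HttpClient's implementation; ChunkedSketch, frame, and the chunk size of 8 are arbitrary choices for the example.

```java
import java.nio.charset.StandardCharsets;

final class ChunkedSketch {
    // Frames the payload as chunks of at most chunkSize bytes, followed by the final 0 chunk.
    static String frame(String payload, int chunkSize) {
        byte[] data = payload.getBytes(StandardCharsets.US_ASCII);
        StringBuilder wire = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            wire.append(Integer.toHexString(len)).append("\r\n");
            wire.append(new String(data, off, len, StandardCharsets.US_ASCII)).append("\r\n");
        }
        wire.append("0\r\n\r\n"); // last chunk, no trailer fields
        return wire.toString();
    }

    public static void main(String[] args) {
        // Make the CRLF pairs visible when printing.
        System.out.println(frame("important message", 8).replace("\r\n", "\\r\\n"));
    }
}
```

Because every chunk announces its own length, the sender does not need to know the total Content-Length before it starts writing.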
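Response handlers
Handling the response through the handleResponse(HttpResponse response) method of the ResponseHandler interface is the simplest approach. The programmer does not have to deal with connection management; HttpClient automatically ensures that the connection is released back to the connection manager.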
public static void main(String[] args) {
try {
CloseableHttpClient httpClient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://localhost/json");
ResponseHandler<MyJsonObject> rh = new ResponseHandler<MyJsonObject>() {
public MyJsonObject handleResponse(final HttpResponse response) throws IOException {
StatusLine statusLine = response.getStatusLine();
HttpEntity entity = response.getEntity();
if (statusLine.getStatusCode() >= 300) {
throw new HttpResponseException(statusLine.getStatusCode(), statusLine.getReasonPhrase());
}
if (entity == null) {
throw new ClientProtocolException("Response contains no content");
}
Gson gson = new GsonBuilder().create();
Reader reader = new InputStreamReader(entity.getContent(), ContentType.getOrDefault(entity)
.getCharset());
return gson.fromJson(reader, MyJsonObject.class);
}
};
MyJsonObject myjson = httpClient.execute(httpget, rh);
System.out.println(myjson.toString());
} catch (Exception e) {
e.printStackTrace();
}
}
public class MyJsonObject {
}
HttpClient interface
HttpClient is thread safe. It is built from a number of handler and strategy interface implementations, which can be replaced to customize its behavior.
ConnectionKeepAliveStrategy keepAliveStrat = new DefaultConnectionKeepAliveStrategy() {
@Override
public long getKeepAliveDuration(
HttpResponse response,
HttpContext context) {
long keepAlive = super.getKeepAliveDuration(response, context);
if (keepAlive == -1) {
// Keep connections alive 5 seconds if a keep-alive value
// has not be explicitly set by the server
keepAlive = 5000;
}
return keepAlive;
}
};
CloseableHttpClient httpclient = HttpClients.custom()
.setKeepAliveStrategy(keepAliveStrat)
.build();
When a CloseableHttpClient instance is no longer needed, call CloseableHttpClient#close() so that the connection manager associated with it is shut down.
CloseableHttpClient httpclient = HttpClients.createDefault();
try {
<...>
} finally {
httpclient.close();
}
HTTP execution context
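HTTP is a stateless, request-response protocol, but real applications often need to keep state information across several request-response exchanges. HttpContext serves this purpose and behaves much like a java.util.Map.
HttpClient 4.x can maintain an HTTP session: as long as the same HttpClient instance is used and has not been closed, the same session can be used to access other services that require a logged-in user.
If you use a pool of HttpClient instances and want one login session to be shared by several of them, you have to store the session information yourself. On the client side the session information lives in a cookie (JSESSIONID), so it is enough to copy the cookies returned by a successful login to each HttpClient.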
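The idea of keeping the session cookie yourself can be sketched without HttpClient at all. The example below parses Set-Cookie values with the JDK's java.net.HttpCookie and rebuilds the Cookie header for later requests. SessionCookieSketch is a hypothetical stand-in for a cookie store, and the cookie values are invented; it ignores path, domain, and expiry matching, which a real cookie store has to handle.

```java
import java.net.HttpCookie;
import java.util.LinkedHashMap;
import java.util.Map;

final class SessionCookieSketch {
    // Cookie name -> value for every cookie received so far.
    private final Map<String, String> cookies = new LinkedHashMap<>();

    // Records the cookie carried by one Set-Cookie header value.
    void store(String setCookieValue) {
        for (HttpCookie c : HttpCookie.parse(setCookieValue)) {
            cookies.put(c.getName(), c.getValue());
        }
    }

    // Builds the value of the Cookie request header for the next request.
    String cookieHeader() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : cookies.entrySet()) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SessionCookieSketch store = new SessionCookieSketch();
        store.store("JSESSIONID=abc123; Path=/"); // as returned by the login response
        // Any client that is handed this store can present the same session.
        System.out.println("Cookie: " + store.cookieHeader());
    }
}
```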
There are three ways to carry cookies across requests: reuse the same HttpClient, keep them in your own CookieStore, or maintain them through an HttpClientContext.
- Using the same CloseableHttpClient
public class TestHttpClient {
public static void main(String[] args) {
TestHttpClient test = new TestHttpClient();
try {
test.testTheSameHttpClient();
} catch (Exception e) {
e.printStackTrace();
}
}
String loginUrl = "http://192.168.1.24/admin/config.php";
String testUrl = "http://192.168.1.24/admin/ajax.php?module=core&command=getExtensionGrid";
public void testTheSameHttpClient() throws Exception {
System.out.println("----testTheSameHttpClient");
//// Build a CloseableHttpClient with HttpClientBuilder
// HttpClientBuilder httpClientBuilder = HttpClientBuilder.create();
// CloseableHttpClient client = httpClientBuilder.build();
//// Or create a CloseableHttpClient directly
CloseableHttpClient client = HttpClients.createDefault();
HttpPost httpPost = new HttpPost(loginUrl);
Map parameterMap = new HashMap();
parameterMap.put("username", "admin");
parameterMap.put("password", "password");
UrlEncodedFormEntity postEntity = new UrlEncodedFormEntity(
getParam(parameterMap), "UTF-8");
httpPost.setEntity(postEntity);
System.out.println("request line:" + httpPost.getRequestLine());
try {
// Execute the POST request
CloseableHttpResponse httpResponse = client.execute(httpPost);
boolean loginFailedFlag = false;
try {
String responseString = printResponse(httpResponse);
loginFailedFlag = responseString.contains("Please correct the following errors");
} finally {
httpResponse.close();
}
System.out.println("loginFailedFlag?:" + loginFailedFlag);
if( !loginFailedFlag ) {
// Execute the GET request
System.out.println("----the same client");
HttpGet httpGet = new HttpGet(testUrl);
System.out.println("request line:" + httpGet.getRequestLine());
CloseableHttpResponse httpResponse1 = client.execute(httpGet);
try {
printResponse(httpResponse1);
} finally {
httpResponse1.close();
}
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
// close HttpClient and release all system resources
client.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
private static String printResponse(HttpResponse httpResponse)
throws ParseException, IOException {
HttpEntity entity = httpResponse.getEntity();
// response status code
System.out.println("status:" + httpResponse.getStatusLine());
System.out.println("headers:");
HeaderIterator iterator = httpResponse.headerIterator();
while (iterator.hasNext()) {
System.out.println("\t" + iterator.next());
}
// Check whether the response entity is null
String responseString = null;
if (entity != null) {
responseString = EntityUtils.toString(entity);
System.out.println("response length:" + responseString.length());
System.out.println("response content:"
+ responseString.replace("\r\n", ""));
}
return responseString;
}
private static List<NameValuePair> getParam(Map parameterMap) {
List<NameValuePair> param = new ArrayList<NameValuePair>();
Iterator it = parameterMap.entrySet().iterator();
while (it.hasNext()) {
Entry parmEntry = (Entry) it.next();
param.add(new BasicNameValuePair((String) parmEntry.getKey(),
(String) parmEntry.getValue()));
}
return param;
}
}
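- Using HttpContext
HttpContext can hold arbitrary objects, so sharing one context between threads may not be safe. Each thread should maintain its own execution context.
While executing an HTTP request, HttpClient adds the following attributes to the context:
- HttpConnection instance: the actual connection to the target server.
- HttpHost instance: the target server of the connection.
- HttpRoute instance: the complete connection route.
- HttpRequest instance: the actual HTTP request. The final HttpRequest object in the context always represents the state of the message exactly as it was sent to the server. By default HTTP/1.0 and HTTP/1.1 use relative request URIs, but when the request is sent through a proxy in non-tunneling mode the URI is absolute.
- HttpResponse instance: the actual HTTP response.
- java.lang.Boolean object: a flag indicating whether the request has been fully transmitted to the connection target.
- RequestConfig object: the actual request configuration.
- java.util.List<URI> object: the collection of all redirect locations received while executing the request.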
public class TestHttpContext {
public static void main(String[] args) {
TestHttpContext test = new TestHttpContext();
try {
test.testHttpContext();
} catch (Exception e) {
e.printStackTrace();
}
}
String loginUrl = "http://192.168.1.24/admin/config.php";
String testUrl = "http://192.168.1.24/admin/ajax.php?module=core&command=getExtensionGrid&type=all&order=asc";
public void testHttpContext() throws Exception {
System.out.println("----testHttpContext");
//// Build a CloseableHttpClient with HttpClientBuilder
// HttpClientBuilder httpClientBuilder = HttpClientBuilder.create();
// CloseableHttpClient client = httpClientBuilder.build();
//// Or create a CloseableHttpClient directly
CloseableHttpClient client = HttpClients.createDefault();
// Create a local instance of cookie store
CookieStore cookieStore = new BasicCookieStore();
// Create local HTTP context
HttpClientContext localContext = HttpClientContext.create();
localContext.setCookieStore(cookieStore);
HttpPost httpPost = new HttpPost(loginUrl);
Map parameterMap = new HashMap();
parameterMap.put("username", "admin");
parameterMap.put("password", "password");
UrlEncodedFormEntity postEntity = new UrlEncodedFormEntity(
getParam(parameterMap), "UTF-8");
httpPost.setEntity(postEntity);
System.out.println("request line:" + httpPost.getRequestLine());
try {
CloseableHttpResponse httpResponse = client.execute(httpPost, localContext);
boolean loginFailedFlag = false;
try {
String responseString = printResponse(httpResponse, cookieStore);
loginFailedFlag = responseString.contains("Please correct the following errors");
} finally {
httpResponse.close();
}
System.out.println("loginFailedFlag?:" + loginFailedFlag);
if( !loginFailedFlag ) {
// Use a new CloseableHttpClient
CloseableHttpClient client2 = HttpClients.createDefault();
// Execute the GET request
HttpGet httpGet = new HttpGet(testUrl);
System.out.println("request line:" + httpGet.getRequestLine());
CloseableHttpResponse httpResponse2 = client2.execute(httpGet, localContext);
try {
printResponse(httpResponse2, cookieStore);
} finally {
httpResponse2.close();
client2.close();
}
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
// close HttpClient and release all system resources
client.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
private static String printResponse(HttpResponse httpResponse, CookieStore cookieStore)
throws ParseException, IOException {
HttpEntity entity = httpResponse.getEntity();
// response status code
System.out.println("status:" + httpResponse.getStatusLine());
System.out.println("headers:");
HeaderIterator iterator = httpResponse.headerIterator();
while (iterator.hasNext()) {
System.out.println("\t" + iterator.next());
}
System.out.println("cookies:");
List<Cookie> cookies = cookieStore.getCookies();
for (int i = 0; i < cookies.size(); i++) {
System.out.println("\t" + cookies.get(i));
}
// Check whether the response entity is null
String responseString = null;
if (entity != null) {
responseString = EntityUtils.toString(entity);
System.out.println("response length:" + responseString.length());
System.out.println("response content:"
+ responseString.replace("\r\n", ""));
}
return responseString;
}
private static List<NameValuePair> getParam(Map parameterMap) {
List<NameValuePair> param = new ArrayList<NameValuePair>();
Iterator it = parameterMap.entrySet().iterator();
while (it.hasNext()) {
Entry parmEntry = (Entry) it.next();
param.add(new BasicNameValuePair((String) parmEntry.getKey(),
(String) parmEntry.getValue()));
}
return param;
}
}
- Using CookieStore
Modify TestHttpContext so that the new CloseableHttpClient is built from the existing cookieStore: CloseableHttpClient client2 = HttpClients.custom().setDefaultCookieStore(cookieStore).build();
if( !loginFailedFlag ) {
// Build a new CloseableHttpClient from the existing cookieStore
CloseableHttpClient client2 = HttpClients.custom()
.setDefaultCookieStore(cookieStore).build();
// Execute the GET request
HttpGet httpGet = new HttpGet(testUrl);
System.out.println("request line:" + httpGet.getRequestLine());
CloseableHttpResponse httpResponse2 = client2.execute(httpGet);
try {
printResponse(httpResponse2, cookieStore);
} finally {
httpResponse2.close();
client2.close();
}
}
HTTP Protocol Interceptors
Protocol interceptors can add specific headers to an HTTP message while it is being processed, for example a special header on outgoing messages, or compress and decompress the content. They are usually implemented with the Decorator pattern.
Interceptors can share information through the context, for example to store processing state across several consecutive requests.
Protocol interceptors must be implemented as thread safe. Do not use instance variables unless access to them is synchronized.
public class TestHttpInterceptors {
    public static void main(String[] args) {
        TestHttpInterceptors test = new TestHttpInterceptors();
        try {
            test.testInterceptors();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public void testInterceptors() throws IOException {
        final HttpClientContext httpClientContext = HttpClientContext.create();
        // Shared counter stored in the context so the interceptor can see it on every request
        AtomicInteger count = new AtomicInteger(1);
        httpClientContext.setAttribute("Count", count);
        // request interceptor: adds a "Count" header with the current counter value
        HttpRequestInterceptor httpRequestInterceptor = new HttpRequestInterceptor() {
            public void process(HttpRequest httpRequest, HttpContext httpContext) throws HttpException, IOException {
                AtomicInteger count = (AtomicInteger) httpContext.getAttribute("Count");
                httpRequest.addHeader("Count", String.valueOf(count.getAndIncrement()));
            }
        };
        // response handler
        ResponseHandler<String> responseHandler = new ResponseHandler<String>() {
            public String handleResponse(HttpResponse httpResponse) throws ClientProtocolException, IOException {
                // HeaderIterator iterator = httpResponse.headerIterator();
                // while (iterator.hasNext()) {
                // System.out.println("\t" + iterator.next());
                // }
                HttpEntity entity = httpResponse.getEntity();
                if (entity != null) {
                    return EntityUtils.toString(entity);
                }
                return null;
            }
        };
        final CloseableHttpClient httpClient = HttpClients
                .custom()
                .addInterceptorLast(httpRequestInterceptor)
                .build();
        final HttpGet httpget = new HttpGet("http://192.168.1.24/");
        for (int i = 0; i < 20; i++) {
            String result = httpClient.execute(httpget, responseHandler, httpClientContext);
            // System.out.println(result);
        }
    }
}
Exception Handling
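The HTTP protocol processors can throw two kinds of exceptions: java.io.IOException for I/O failures such as a socket timeout or socket reset, and HttpException for HTTP failures. HttpClient re-throws HttpException as ClientProtocolException, a subclass of java.io.IOException, so catching IOException alone handles both kinds of errors.
HTTP is a simple request/response protocol with no support for transactional processing. By default HttpClient tries to recover automatically from I/O exceptions, following these rules:
- HttpClient makes no attempt to recover from logical or HTTP protocol errors (those derived from the HttpException class).
- HttpClient automatically retries methods that are assumed to be idempotent.
- HttpClient automatically retries methods that fail while the HTTP request is still being transmitted to the target server (that is, the request has not been fully transmitted).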
HttpRequestRetryHandler myRetryHandler = new HttpRequestRetryHandler() {
    public boolean retryRequest(IOException exception, int executionCount, HttpContext context) {
        if (executionCount >= 5) {
            // Do not retry once the maximum number of attempts is reached
            return false;
        }
        if (exception instanceof InterruptedIOException) {
            return false;
        }
        if (exception instanceof UnknownHostException) {
            return false;
        }
        if (exception instanceof ConnectTimeoutException) {
            return false;
        }
        if (exception instanceof SSLException) {
            return false;
        }
        HttpClientContext clientContext = HttpClientContext.adapt(context);
        HttpRequest request = clientContext.getRequest();
        // Requests that do not enclose an entity are treated as idempotent, so they are retried
        boolean idempotent = !(request instanceof HttpEntityEnclosingRequest);
        if (idempotent) {
            return true;
        }
        return false;
    }
};
CloseableHttpClient httpclient = HttpClients.custom().setRetryHandler(myRetryHandler).build();
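To make clearer where a retry handler sits in the control flow, here is a simplified, JDK-only sketch of a retry loop that asks a policy whether to try again after each IOException. It is an assumption-laden illustration rather than HttpClient's actual execution chain; RetryLoopSketch, IoCall, and RetryPolicy are names invented for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

final class RetryLoopSketch {
    // A request that may fail with an I/O error.
    interface IoCall<T> {
        T call() throws IOException;
    }

    // Decides whether another attempt is allowed after attempt number executionCount (1-based).
    interface RetryPolicy {
        boolean retryRequest(IOException exception, int executionCount);
    }

    // Runs the request until it succeeds or the policy refuses another attempt.
    static <T> T execute(IoCall<T> request, RetryPolicy policy) {
        for (int executionCount = 1; ; executionCount++) {
            try {
                return request.call();
            } catch (IOException ex) {
                if (!policy.retryRequest(ex, executionCount)) {
                    throw new UncheckedIOException(ex); // give up and surface the last error
                }
            }
        }
    }

    public static void main(String[] args) {
        final int[] attempts = {0};
        String result = execute(() -> {
            if (++attempts[0] < 3) {
                throw new IOException("connection reset"); // simulated transient failure
            }
            return "ok";
        }, (ex, count) -> count < 5);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

The policy only sees the exception and the attempt count here; the real handler additionally receives the HttpContext, which is how the example above inspects the request to judge idempotency.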
Aborting Requests
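A request can be terminated prematurely at any stage of execution by calling HttpUriRequest#abort(), which also unblocks a thread that is blocked in an I/O operation for that request. The method is thread safe and can be called from any thread. When an HTTP request is aborted, an InterruptedIOException is thrown in its execution thread.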
Redirect Handling
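HttpClient handles all types of redirects automatically, except those for which the HTTP specification requires user intervention. As the specification requires, 303 See Other redirects on POST and PUT requests are converted to GET requests. A custom redirect strategy can be used to override the behavior the HTTP specification prescribes.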
LaxRedirectStrategy redirectStrategy = new LaxRedirectStrategy();
CloseableHttpClient httpclient = HttpClients.custom()
.setRedirectStrategy(redirectStrategy)
.build();
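HttpClient often has to rewrite the request message while executing it. By default HTTP/1.0 and HTTP/1.1 generally use relative request URIs, and the original request may be redirected several times. The final absolute HTTP location can be built from the original request and the context. URIUtils#resolve builds the absolute URI that was used for the final request; the result includes the last fragment identifier from the redirect requests or the original request.
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpClientContext context = HttpClientContext.create();
// Placeholder address; a request needs a URI before it can be executed
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = httpclient.execute(httpget, context);
try {
    HttpHost target = context.getTargetHost();
    List<URI> redirectLocations = context.getRedirectLocations();
    URI location = URIUtils.resolve(httpget.getURI(), target, redirectLocations);
    System.out.println("Final HTTP location: " + location.toASCIIString());
} finally {
    response.close();
}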
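The notion of a final location after several redirects can also be illustrated with the JDK alone. The sketch below resolves a chain of Location values, relative or absolute, against the current URI with java.net.URI#resolve. It is not what URIUtils#resolve does internally; RedirectResolveSketch, finalLocation, and the example paths are made up for this illustration.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

final class RedirectResolveSketch {
    // Follows the Location values in order, resolving each one against the current URI.
    static URI finalLocation(URI original, List<String> locations) {
        URI current = original;
        for (String location : locations) {
            current = current.resolve(location);
        }
        return current;
    }

    public static void main(String[] args) {
        URI original = URI.create("http://localhost/a/b");
        // A server-relative redirect followed by a path-relative one.
        List<String> locations = Arrays.asList("/login", "home?x=1");
        System.out.println("Final HTTP location: "
                + finalLocation(original, locations).toASCIIString());
    }
}
```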