2018/2/12

Apache HTTP Client 4.5 Fundamentals


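Apache HTTP Client is currently the most widely used Java HTTP client library. It has gone through many revisions; the current stable release is 4.5, and 5.0 is still in testing. HttpClient is not a full browser, and the biggest difference is that it has no UI. It only provides data transfer and interaction over HTTP 1.0 and 1.1. It is typically used on the server side, when a server needs to exchange data with another server that exposes an HTTP interface, for example sending a search request to Google and parsing the HTML that Google returns.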


Besides the HTTP request and response messages themselves, every HTTP request is sent to the server using some HTTP method, most commonly GET and POST. An HTTP message carries multiple headers that describe metadata. A response may carry cookie data that the client stores and sends back to the server with the next request. A session is a sequence of HTTP request/response exchanges, and the session ID is usually tracked with a cookie.


HTTP Fundamentals


This is the most basic HttpGet request:


CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = null;

try {
    response = httpclient.execute(httpget);
} catch (IOException e) {
    // logger is assumed to be defined elsewhere in the enclosing class
    logger.error("error: ", e);
} finally {
    // response is still null when execute() throws, so guard the close
    if (response != null) {
        try {
            response.close();
        } catch (IOException e) {
            logger.error("error: ", e);
        }
    }
}



The HTTP methods are GET, HEAD, POST, PUT, DELETE, TRACE, and OPTIONS. HttpClient provides a dedicated class for each of them: HttpGet, HttpHead, HttpPost, HttpPut, HttpDelete, HttpTrace, and HttpOptions.


The request URI is a Uniform Resource Identifier that identifies the resource the request applies to. An HTTP request URI consists of a protocol scheme, host name, optional port, resource path, optional query, and optional fragment. URIBuilder can be used to build one.


// uri=http://www.google.com/search?q=httpclient&btnG=Google+Search&aq=f&oq=
URI uri = new URIBuilder()
                    .setScheme("http")
                    .setHost("www.google.com")
                    .setPath("/search")
                    .setParameter("q", "httpclient")
                    .setParameter("btnG", "Google Search")
                    .setParameter("aq", "f")
                    .setParameter("oq", "")
                    .build();
HttpGet httpget = new HttpGet(uri);
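
To see how the parts listed above map onto a concrete URI, the JDK's own java.net.URI can take an address apart. This is only a sketch using the standard library rather than URIBuilder; the class name UriParts and the sample port and fragment are made up for illustration.

```java
import java.net.URI;

// Hypothetical helper: splits a URI into the components named in the text.
class UriParts {
    static String describe(String text) {
        URI uri = URI.create(text);
        return "scheme=" + uri.getScheme()
                + " host=" + uri.getHost()
                + " port=" + uri.getPort()          // -1 when the port is omitted
                + " path=" + uri.getPath()
                + " query=" + uri.getQuery()        // null when there is no query
                + " fragment=" + uri.getFragment(); // null when there is no fragment
    }

    public static void main(String[] args) {
        System.out.println(describe("http://www.google.com:8080/search?q=httpclient&aq=f#top"));
        System.out.println(describe("http://www.google.com/search"));
    }
}
```

The optional parts show up as -1 or null when they are absent, which is why the text calls the port, query, and fragment optional.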



HTTP response


An HTTP response is the message the server sends back to the client.


HttpResponse httpResponse = new BasicHttpResponse(HttpVersion.HTTP_1_1,
        HttpStatus.SC_OK, "OK");
System.out.println(httpResponse.getProtocolVersion());
System.out.println(httpResponse.getStatusLine().getStatusCode());
System.out.println(httpResponse.getStatusLine().getReasonPhrase());
System.out.println(httpResponse.getStatusLine().toString());
            


HttpResponse httpResponse2 = new BasicHttpResponse(HttpVersion.HTTP_1_1,
        HttpStatus.SC_OK, "OK");
httpResponse2.addHeader("Set-Cookie",
        "c1=a; path=/; domain=localhost");
httpResponse2.addHeader("Set-Cookie",
        "c2=b; path=\"/\", c3=c; domain=\"localhost\"");
Header h1 = httpResponse2.getFirstHeader("Set-Cookie");
System.out.println(h1);
Header h2 = httpResponse2.getLastHeader("Set-Cookie");
System.out.println(h2);
Header[] hs = httpResponse2.getHeaders("Set-Cookie");
System.out.println(hs.length);

Output:


HTTP/1.1
200
OK
HTTP/1.1 200 OK


Set-Cookie: c1=a; path=/; domain=localhost
Set-Cookie: c2=b; path="/", c3=c; domain="localhost"
2

An HTTP message contains many headers. HeaderIterator processes them one header at a time, while BasicHeaderElementIterator takes the headers of a given name and iterates over all of their header elements.


HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
        HttpStatus.SC_OK, "OK");
response.addHeader("Set-Cookie",
        "c1=a; path=/; domain=localhost");
response.addHeader("Set-Cookie",
        "c2=b; path=\"/\", c3=c; domain=\"localhost\"");

// HeaderIterator
HeaderIterator it = response.headerIterator("Set-Cookie");
while (it.hasNext()) {
    System.out.println(it.next());
}

// HeaderElementIterator
HeaderElementIterator it2 = new BasicHeaderElementIterator(
        response.headerIterator("Set-Cookie"));
while (it2.hasNext()) {
    HeaderElement elem = it2.nextElement();
    System.out.println(elem.getName() + " = " + elem.getValue());
    NameValuePair[] params = elem.getParameters();
    for (int i = 0; i < params.length; i++) {
        System.out.println(" " + params[i]);
    }
}
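
To make the difference between a header and its header elements more concrete, here is a deliberately naive JDK-only sketch that splits one Set-Cookie value into elements (separated by commas) and their parameters (separated by semicolons). It is not the parser HttpClient uses, and it assumes that commas and semicolons never appear inside quoted values; the class name HeaderElements is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Naive illustration only: this is NOT how HttpClient parses headers.
// Assumes ',' and ';' never occur inside quoted values.
class HeaderElements {
    static List<String> parse(String headerValue) {
        List<String> lines = new ArrayList<>();
        for (String element : headerValue.split(",")) {
            String[] pairs = element.trim().split(";");
            // The first name=value pair is the element itself
            lines.add(format(pairs[0], " = "));
            // Any further pairs are parameters of that element
            for (int i = 1; i < pairs.length; i++) {
                lines.add(" " + format(pairs[i], "="));
            }
        }
        return lines;
    }

    private static String format(String pair, String separator) {
        int eq = pair.indexOf('=');
        if (eq < 0) {
            return pair.trim(); // attribute without a value
        }
        String name = pair.substring(0, eq).trim();
        String value = pair.substring(eq + 1).trim().replace("\"", "");
        return name + separator + value;
    }

    public static void main(String[] args) {
        for (String line : parse("c2=b; path=\"/\", c3=c; domain=\"localhost\"")) {
            System.out.println(line);
        }
    }
}
```

For the second Set-Cookie header used above, this yields two elements (c2 and c3), each with one parameter.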



HTTP entity


An HTTP message can carry a content entity associated with the request or response. Entities are optional and appear only in some requests and responses. Requests that use entities are called entity enclosing requests; HTTP defines two entity enclosing request methods: POST and PUT.


Responses are usually expected to enclose a content entity, except for responses to HEAD requests and 204 No Content, 304 Not Modified, and 205 Reset Content responses.


HttpClient distinguishes three kinds of entities: streamed, self-contained, and wrapping. Non-repeatable entities are usually treated as streamed, and repeatable entities as self-contained.


  1. streamed: the content is received from a stream. This is typical of responses. Streamed entities are generally not repeatable.

  2. self-contained: the content is held in memory or obtained by means independent of the connection. This kind of entity is repeatable and is mostly used for entity enclosing HTTP requests. Repeatable means the content can be read more than once, e.g. ByteArrayEntity or StringEntity.

  3. wrapping: the content is obtained from another entity.


Because an entity can hold both binary and character content, it supports character encodings.


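HttpEntity#getContentType() and HttpEntity#getContentLength() return the Content-Type and Content-Length metadata when they are available; otherwise getContentType() returns null and getContentLength() returns -1. When a Content-Type is present it is returned as a Header object. For text types the Content-Type value can also carry the character encoding (charset), which can be read with ContentType.getOrDefault(entity).getCharset(). Note that HttpEntity#getContentEncoding() returns the Content-Encoding header (for example gzip), not the charset.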


StringEntity myEntity = new StringEntity("important message",
        ContentType.create("text/plain", "UTF-8"));
    
System.out.println(myEntity.getContentType());
System.out.println(myEntity.getContentLength());
System.out.println(EntityUtils.toString(myEntity));
System.out.println(EntityUtils.toByteArray(myEntity).length);

Output:


Content-Type: text/plain; charset=UTF-8
17
important message
17
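
The 17 reported by getContentLength() is a byte count in the entity's charset; it only matches the character count here because the text is plain ASCII. A quick JDK-only sketch (the class name and the sample word are illustrative, not from the original post) shows the two numbers diverging once a non-ASCII character appears:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ContentLengthDemo {
    // Number of bytes the text occupies once encoded in the given charset
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // ASCII only: one byte per character
        System.out.println(encodedLength("important message", StandardCharsets.UTF_8));
        // "caf\u00e9" has 4 characters, but the accented letter takes 2 bytes in UTF-8
        System.out.println(encodedLength("caf\u00e9", StandardCharsets.UTF_8));
        // The same text takes 1 byte per character in ISO-8859-1
        System.out.println(encodedLength("caf\u00e9", StandardCharsets.ISO_8859_1));
    }
}
```

This is why the charset passed to ContentType.create matters for the Content-Length that ends up on the wire.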



Ensuring release of low level resources


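To make sure system resources are released, close either the content stream obtained from the entity or the response itself. Closing the content stream tries to keep the underlying connection alive by consuming the remaining content, whereas closing the response immediately shuts down and discards the connection.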


CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = httpclient.execute(httpget);
try {
    HttpEntity entity = response.getEntity();
    if (entity != null) {
        InputStream instream = entity.getContent();
        try {
            // do something useful
        } finally {
            instream.close();
        }
    }
} finally {
    response.close();
}

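HttpEntity#writeTo(OutputStream) also guarantees that resources are released once the entity has been fully written out. If you obtain a java.io.InputStream through HttpEntity#getContent(), you must close the stream yourself in a finally block. When working with streaming entities, EntityUtils#consume(HttpEntity) ensures that the entity content is fully consumed and the underlying stream is closed.

If only part of the response content is needed, call response.close() directly. The remaining content does not have to be consumed, but the connection cannot be reused.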


CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = httpclient.execute(httpget);
try {
    HttpEntity entity = response.getEntity();
    if (entity != null) {
        InputStream instream = entity.getContent();
        int byteOne = instream.read();
        int byteTwo = instream.read();
        
        // Do not need the rest
    }
} finally {
    response.close();
}



Consuming entity content


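The recommended way to consume content is HttpEntity#getContent() or HttpEntity#writeTo(OutputStream). HttpClient also ships the EntityUtils class with several static methods for reading content, but using EntityUtils is discouraged unless the response comes from a trusted HTTP server and is known to be of limited length.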


CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = httpclient.execute(httpget);
try {
    HttpEntity entity = response.getEntity();
    if (entity != null) {
        long len = entity.getContentLength();
        if (len != -1 && len < 2048) {
            System.out.println(EntityUtils.toString(entity));
        } else {
            // Stream content out
        }
    }
} finally {
    response.close();
}

If the entire entity content needs to be read more than once, the simplest approach is to wrap the original entity in a BufferedHttpEntity, which reads the content into an in-memory buffer.


CloseableHttpResponse response = <...>
HttpEntity entity = response.getEntity();
if (entity != null) {
    entity = new BufferedHttpEntity(entity);
}



Producing entity content


StringEntity, ByteArrayEntity, InputStreamEntity, and FileEntity can be used to stream data out over an HTTP connection. InputStreamEntity is not repeatable: its content can only be read once.


File file = new File("somefile.txt");
FileEntity entity = new FileEntity(file,
        ContentType.create("text/plain", "UTF-8"));
HttpPost httppost = new HttpPost("http://localhost/action.do");
httppost.setEntity(entity);



HTML forms


UrlEncodedFormEntity simulates submitting an HTML form. The following is equivalent to sending param1=value1&param2=value2 with the POST method.


List<NameValuePair> formparams = new ArrayList<NameValuePair>();
formparams.add(new BasicNameValuePair("param1", "value1"));
formparams.add(new BasicNameValuePair("param2", "value2"));
UrlEncodedFormEntity entity = new UrlEncodedFormEntity(formparams, Consts.UTF_8);

HttpPost httppost = new HttpPost("http://localhost/handler.do");
httppost.setEntity(entity);



Content chunking


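Calling setChunked(true) on the entity (StringEntity inherits the method from AbstractHttpEntity) tells HttpClient that chunk coding is preferred. This is only a hint: the value is ignored for HTTP versions that do not support chunk coding, such as HTTP/1.0.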


StringEntity entity = new StringEntity("important message",
        ContentType.create("text/plain", Consts.UTF_8));
entity.setChunked(true);
HttpPost httppost = new HttpPost("http://localhost/action.do");
httppost.setEntity(entity);
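
HttpClient does the chunk framing itself, so application code never writes it by hand. Purely to illustrate what chunk coding looks like on the wire in HTTP/1.1, the following JDK-only sketch frames a body as a series of chunks, each consisting of its size in hexadecimal, CRLF, the data, and CRLF, followed by a terminating zero-length chunk. The class name and the chunk size of 10 are arbitrary choices for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class ChunkedSketch {
    // Frames the body as HTTP/1.1 chunks of at most chunkSize bytes each.
    static byte[] encode(byte[] body, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int off = 0; off < body.length; off += chunkSize) {
            int len = Math.min(chunkSize, body.length - off);
            write(out, Integer.toHexString(len) + "\r\n"); // chunk size in hex
            out.write(body, off, len);                      // chunk data
            write(out, "\r\n");
        }
        write(out, "0\r\n\r\n"); // zero-length chunk terminates the body
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        byte[] body = "important message".getBytes(StandardCharsets.UTF_8);
        String wire = new String(encode(body, 10), StandardCharsets.UTF_8);
        // Show CRLF pairs as visible escape sequences
        System.out.println(wire.replace("\r\n", "\\r\\n"));
    }
}
```

Because the total length never has to be known up front, chunk coding is what allows an entity of unknown length to be sent over a persistent connection.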



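Response handlers


The simplest way to handle a response is the ResponseHandler interface and its handleResponse(HttpResponse response) method. The programmer does not have to deal with connection management: HttpClient automatically ensures that the connection is released back to the connection manager.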


public static void main(String[] args) {
    try {
        CloseableHttpClient httpClient = HttpClients.createDefault();
        HttpGet httpget = new HttpGet("http://localhost/json");

        ResponseHandler<MyJsonObject> rh = new ResponseHandler<MyJsonObject>() {
            public MyJsonObject handleResponse(final HttpResponse response) throws IOException {
                StatusLine statusLine = response.getStatusLine();
                HttpEntity entity = response.getEntity();
                if (statusLine.getStatusCode() >= 300) {
                    throw new HttpResponseException(statusLine.getStatusCode(), statusLine.getReasonPhrase());
                }
                if (entity == null) {
                    throw new ClientProtocolException("Response contains no content");
                }
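                Gson gson = new GsonBuilder().create();
                // getCharset() is null when Content-Type carries no charset, so fall back to UTF-8
                Charset charset = ContentType.getOrDefault(entity).getCharset();
                if (charset == null) {
                    charset = Consts.UTF_8;
                }
                Reader reader = new InputStreamReader(entity.getContent(), charset);
                return gson.fromJson(reader, MyJsonObject.class);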
            }
        };
        MyJsonObject myjson = httpClient.execute(httpget, rh);
        System.out.println(myjson.toString());

    } catch (Exception e) {
        e.printStackTrace();
    }
}

public class MyJsonObject {

}

HttpClient interface


HttpClient is thread safe. It is composed of a number of handler and strategy interface implementations, so it can be customized by plugging in your own.


ConnectionKeepAliveStrategy keepAliveStrat = new DefaultConnectionKeepAliveStrategy() {
    @Override
    public long getKeepAliveDuration(
            HttpResponse response,
            HttpContext context) {
        long keepAlive = super.getKeepAliveDuration(response, context);
        if (keepAlive == -1) {
            // Keep connections alive 5 seconds if a keep-alive value
            // has not been explicitly set by the server
            keepAlive = 5000;
        }
        return keepAlive;
    }
};
CloseableHttpClient httpclient = HttpClients.custom()
        .setKeepAliveStrategy(keepAliveStrat)
        .build();

When a CloseableHttpClient instance is no longer needed, call CloseableHttpClient#close() so that the connection manager associated with it is shut down.


CloseableHttpClient httpclient = HttpClients.createDefault();
try {
    <...>
} finally {
    httpclient.close();
}

HTTP execution context


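HTTP is a stateless, request-response protocol, but in practice applications need to keep state information across several request-response exchanges. An HTTP context works much like a java.util.Map<String, Object>: it is a collection of arbitrary named values.

HttpClient 4.x can maintain an HTTP session: as long as the same HttpClient instance is used and has not been closed, later requests share the same session and can access other services that require login.


If you use a pool of HttpClient instances and want a single login session to be shared by several of them, you have to keep the session information yourself. Because the client side of the session is stored in a cookie (JSESSIONID), it is enough to copy the cookies returned by the successful login to each HttpClient.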
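
As a rough picture of what copying the session cookie means at the protocol level, the sketch below uses the JDK's java.net.HttpCookie (not HttpClient's CookieStore) to parse a Set-Cookie response header and build the Cookie header value that a later request would send back. The class name and the session ID value abc123 are invented for the example.

```java
import java.net.HttpCookie;
import java.util.List;
import java.util.StringJoiner;

class SessionCookieSketch {
    // Turns a Set-Cookie response header into the value of a Cookie request header.
    static String toCookieHeader(String setCookieHeader) {
        List<HttpCookie> cookies = HttpCookie.parse(setCookieHeader);
        StringJoiner joiner = new StringJoiner("; ");
        for (HttpCookie cookie : cookies) {
            // Only the name and value go back to the server; attributes such as Path stay client-side
            joiner.add(cookie.getName() + "=" + cookie.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCookieHeader("Set-Cookie: JSESSIONID=abc123; Path=/; HttpOnly"));
    }
}
```

With HttpClient the same bookkeeping is handled by the CookieStore, which is why sharing one store between clients is enough to share the session.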


There are three ways to carry the cookies along: reuse the same HttpClient, store them in your own CookieStore, or maintain them through an HttpClientContext.


  • Reusing the same CloseableHttpClient

public class TestHttpClient {

    public static void main(String[] args) {
        TestHttpClient test = new TestHttpClient();

        try {
            test.testTheSameHttpClient();

        } catch (Exception e) {
            e.printStackTrace();
        }

    }

    String loginUrl = "http://192.168.1.24/admin/config.php";
    String testUrl = "http://192.168.1.24/admin/ajax.php?module=core&command=getExtensionGrid";

    public void testTheSameHttpClient() throws Exception {
        System.out.println("----testTheSameHttpClient");

        //// Create the CloseableHttpClient through HttpClientBuilder
        // HttpClientBuilder httpClientBuilder = HttpClientBuilder.create();
        // CloseableHttpClient client = httpClientBuilder.build();

        //// Create the CloseableHttpClient directly
        CloseableHttpClient client = HttpClients.createDefault();

        HttpPost httpPost = new HttpPost(loginUrl);
        Map parameterMap = new HashMap();
        parameterMap.put("username", "admin");
        parameterMap.put("password", "password");

        UrlEncodedFormEntity postEntity = new UrlEncodedFormEntity(
                getParam(parameterMap), "UTF-8");
        httpPost.setEntity(postEntity);

        System.out.println("request line:" + httpPost.getRequestLine());
        try {
            // Execute the POST request
            CloseableHttpResponse httpResponse = client.execute(httpPost);

            boolean loginFailedFlag = false;
            try {
                String responseString = printResponse(httpResponse);

                loginFailedFlag = responseString.contains("Please correct the following errors");

            } finally {
                httpResponse.close();
            }
            System.out.println("loginFailedFlag?:" + loginFailedFlag);

            if( !loginFailedFlag ) {
                // Execute the GET request
                System.out.println("----the same client");
                HttpGet httpGet = new HttpGet(testUrl);
                System.out.println("request line:" + httpGet.getRequestLine());
                CloseableHttpResponse httpResponse1 = client.execute(httpGet);

                try {
                    printResponse(httpResponse1);
                } finally {
                    httpResponse1.close();
                }
            }

        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                // close HttpClient and release all system resources
                client.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    private static String printResponse(HttpResponse httpResponse)
            throws ParseException, IOException {
        HttpEntity entity = httpResponse.getEntity();
        // response status code
        System.out.println("status:" + httpResponse.getStatusLine());
        System.out.println("headers:");
        HeaderIterator iterator = httpResponse.headerIterator();
        while (iterator.hasNext()) {
            System.out.println("\t" + iterator.next());
        }
        // Read the body only if the response has an entity
        String responseString = null;
        if (entity != null) {
            responseString = EntityUtils.toString(entity);
            System.out.println("response length:" + responseString.length());
            System.out.println("response content:"
                    + responseString.replace("\r\n", ""));
        }

        return responseString;
    }

    private static List<NameValuePair> getParam(Map parameterMap) {
        List<NameValuePair> param = new ArrayList<NameValuePair>();
        Iterator it = parameterMap.entrySet().iterator();
        while (it.hasNext()) {
            Entry parmEntry = (Entry) it.next();
            param.add(new BasicNameValuePair((String) parmEntry.getKey(),
                    (String) parmEntry.getValue()));
        }
        return param;
    }
}

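  • Using HttpContext

HttpContext can hold arbitrary objects, so sharing one context between threads may not be safe. It is recommended that each thread of execution maintain its own context.


While executing an HTTP request, HttpClient adds the following attributes to the context:


  1. HttpConnection instance: the actual connection to the target server.
  2. HttpHost instance: the target server of the connection.
  3. HttpRoute instance: the complete connection route.
  4. HttpRequest instance: the actual HTTP request. The final HttpRequest object in the context always represents the state of the message exactly as it was sent to the target server. By default HTTP/1.0 and HTTP/1.1 use relative request URIs, but when the request is sent through a proxy in non-tunneling mode the URI is absolute.
  5. HttpResponse instance: the actual HTTP response.
  6. java.lang.Boolean object: a flag indicating whether the actual request has been fully transmitted to the connection target.
  7. RequestConfig object: the actual request configuration.
  8. java.util.List<URI> object: the collection of all redirect locations received while executing the request.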

public class TestHttpContext {

    public static void main(String[] args) {
        TestHttpContext test = new TestHttpContext();

        try {
            test.testHttpContext();

        } catch (Exception e) {
            e.printStackTrace();
        }

    }

    String loginUrl = "http://192.168.1.24/admin/config.php";
    String testUrl = "http://192.168.1.24/admin/ajax.php?module=core&command=getExtensionGrid&type=all&order=asc";

    public void testHttpContext() throws Exception {
        System.out.println("----testHttpContext");

        //// Create the CloseableHttpClient through HttpClientBuilder
        // HttpClientBuilder httpClientBuilder = HttpClientBuilder.create();
        // CloseableHttpClient client = httpClientBuilder.build();

        //// Create the CloseableHttpClient directly
        CloseableHttpClient client = HttpClients.createDefault();

        // Create a local instance of cookie store
        CookieStore cookieStore = new BasicCookieStore();

        // Create local HTTP context
        HttpClientContext localContext = HttpClientContext.create();
        localContext.setCookieStore(cookieStore);


        HttpPost httpPost = new HttpPost(loginUrl);
        Map parameterMap = new HashMap();
        parameterMap.put("username", "admin");
        parameterMap.put("password", "password");

        UrlEncodedFormEntity postEntity = new UrlEncodedFormEntity(
                getParam(parameterMap), "UTF-8");
        httpPost.setEntity(postEntity);

        System.out.println("request line:" + httpPost.getRequestLine());
        try {

            CloseableHttpResponse httpResponse = client.execute(httpPost, localContext);

            boolean loginFailedFlag = false;
            try {
                String responseString = printResponse(httpResponse, cookieStore);

                loginFailedFlag = responseString.contains("Please correct the following errors");

            } finally {
                httpResponse.close();
            }

            System.out.println("loginFailedFlag?:" + loginFailedFlag);

            if( !loginFailedFlag ) {
                // Use a new CloseableHttpClient
                CloseableHttpClient client2 = HttpClients.createDefault();

                // Execute the GET request
                HttpGet httpGet = new HttpGet(testUrl);
                System.out.println("request line:" + httpGet.getRequestLine());
                CloseableHttpResponse httpResponse2 = client2.execute(httpGet, localContext);

                try {
                    printResponse(httpResponse2, cookieStore);
                } finally {
                    httpResponse2.close();
                    client2.close();
                }
            }

        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                // close HttpClient and release all system resources
                client.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    private static String printResponse(HttpResponse httpResponse, CookieStore cookieStore)
            throws ParseException, IOException {
        HttpEntity entity = httpResponse.getEntity();
        // response status code
        System.out.println("status:" + httpResponse.getStatusLine());
        System.out.println("headers:");
        HeaderIterator iterator = httpResponse.headerIterator();
        while (iterator.hasNext()) {
            System.out.println("\t" + iterator.next());
        }

        System.out.println("cookies:");
        List<Cookie> cookies = cookieStore.getCookies();
        for (int i = 0; i < cookies.size(); i++) {
            System.out.println("\t" + cookies.get(i));
        }
        // Read the body only if the response has an entity
        String responseString = null;
        if (entity != null) {
            responseString = EntityUtils.toString(entity);
            System.out.println("response length:" + responseString.length());
            System.out.println("response content:"
                    + responseString.replace("\r\n", ""));
        }

        return responseString;
    }

    private static List<NameValuePair> getParam(Map parameterMap) {
        List<NameValuePair> param = new ArrayList<NameValuePair>();
        Iterator it = parameterMap.entrySet().iterator();
        while (it.hasNext()) {
            Entry parmEntry = (Entry) it.next();
            param.add(new BasicNameValuePair((String) parmEntry.getKey(),
                    (String) parmEntry.getValue()));
        }
        return param;
    }
}

  • Using CookieStore

Modify TestHttpContext so that the new CloseableHttpClient is built from the existing cookieStore: CloseableHttpClient client2 = HttpClients.custom().setDefaultCookieStore(cookieStore).build();


    if( !loginFailedFlag ) {
        // Build a new CloseableHttpClient from the cookieStore
        CloseableHttpClient client2 = HttpClients.custom()
                .setDefaultCookieStore(cookieStore).build();

        // Execute the GET request
        HttpGet httpGet = new HttpGet(testUrl);
        System.out.println("request line:" + httpGet.getRequestLine());
        CloseableHttpResponse httpResponse2 = client2.execute(httpGet);

        try {
            printResponse(httpResponse2, cookieStore);
        } finally {
            httpResponse2.close();
            client2.close();
        }
    }

HTTP Protocol Interceptors


Protocol interceptors act on HTTP messages as they are processed. They can read specific headers of an incoming message, add specific headers to an outgoing message, or compress and decompress content. They are usually implemented with the "Decorator" pattern.

Interceptors can share information through the context, for example to store processing state across several consecutive requests.

Protocol interceptors must be implemented as thread-safe. Do not use instance variables unless access to them is synchronized.

public class TestHttpInterceptors {

    public static void main(String[] args) {
        TestHttpInterceptors test = new TestHttpInterceptors();

        try {
            test.testInterceptors();

        } catch (Exception e) {
            e.printStackTrace();
        }

    }

    public void testInterceptors() throws IOException {
        final HttpClientContext httpClientContext = HttpClientContext.create();

        // Shared counter stored in the context instead of an instance variable
        AtomicInteger count = new AtomicInteger(1);
        httpClientContext.setAttribute("Count", count);

        // request interceptor: adds a Count header to every outgoing request
        HttpRequestInterceptor httpRequestInterceptor = new HttpRequestInterceptor() {
            public void process(HttpRequest httpRequest, HttpContext httpContext) throws HttpException, IOException {
                AtomicInteger count = (AtomicInteger) httpContext.getAttribute("Count");

                httpRequest.addHeader("Count", String.valueOf(count.getAndIncrement()));
            }
        };

        // response handler
        ResponseHandler<String> responseHandler = new ResponseHandler<String>() {
            public String handleResponse(HttpResponse httpResponse) throws ClientProtocolException, IOException {

                // HeaderIterator iterator = httpResponse.headerIterator();
                // while (iterator.hasNext()) {
                //     System.out.println("\t" + iterator.next());
                // }

                HttpEntity entity = httpResponse.getEntity();
                if (entity != null) {
                    return EntityUtils.toString(entity);
                }
                return null;
            }
        };

        final CloseableHttpClient httpClient = HttpClients
                .custom()
                .addInterceptorLast(httpRequestInterceptor)
                .build();

        final HttpGet httpget = new HttpGet("http://192.168.1.24/");

        for (int i = 0; i < 20; i++) {

            String result = httpClient.execute(httpget, responseHandler, httpClientContext);

            // System.out.println(result);
        }

    }

}


Exception Handling


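The HTTP protocol processor can throw two kinds of exceptions: java.io.IOException for I/O failures such as a socket timeout or socket reset, and HttpException for HTTP failures. HttpClient re-throws HttpException as ClientProtocolException, a subclass of java.io.IOException, so catching IOException alone handles both cases.


HTTP is a simple request/response protocol with no support for transactional processing. By default HttpClient tries to recover from I/O exceptions automatically, under the following rules:


  1. HttpClient makes no attempt to recover from any logical or HTTP protocol error (those derived from the HttpException class).

  2. HttpClient automatically retries methods that are assumed to be idempotent.

  3. HttpClient automatically retries methods that fail while the HTTP request is still being transmitted to the target server (that is, the request has not been fully transmitted to the server).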


HttpRequestRetryHandler myRetryHandler = new HttpRequestRetryHandler() {
    public boolean retryRequest(IOException exception, int executionCount, HttpContext context) {
        if (executionCount >= 5) {
            // Do not retry once the maximum retry count is reached
            return false;
        }
        if (exception instanceof InterruptedIOException) {
            // Timeout
            return false;
        }
        if (exception instanceof UnknownHostException) {
            // Unknown host
            return false;
        }
        if (exception instanceof ConnectTimeoutException) {
            // Connection refused or timed out
            return false;
        }
        if (exception instanceof SSLException) {
            // SSL handshake failure
            return false;
        }
        HttpClientContext clientContext = HttpClientContext.adapt(context);
        HttpRequest request = clientContext.getRequest();
        // Treat requests that do not enclose an entity as idempotent
        boolean idempotent = !(request instanceof HttpEntityEnclosingRequest);
        // Retry only if the request is considered idempotent
        return idempotent;
    }
};
CloseableHttpClient httpclient = HttpClients.custom().setRetryHandler(myRetryHandler).build();
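
The handler above approximates idempotency by checking whether the request encloses an entity, which among other things classifies PUT as not idempotent. As an alternative way of thinking about the same decision, here is a JDK-only sketch that decides by method name, using the set of methods that RFC 7231 (section 4.2.2) defines as idempotent. The class name, the method signature, and the maxRetries parameter are assumptions made for the example and are not part of HttpClient.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class RetryPolicySketch {
    // Methods defined as idempotent by RFC 7231, section 4.2.2
    private static final Set<String> IDEMPOTENT = new HashSet<>(
            Arrays.asList("GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"));

    // executionCount is the number of attempts made so far (1 after the first failure)
    static boolean shouldRetry(String method, int executionCount, int maxRetries) {
        if (executionCount > maxRetries) {
            return false; // retry budget exhausted
        }
        return IDEMPOTENT.contains(method.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(shouldRetry("GET", 1, 3));
        System.out.println(shouldRetry("POST", 1, 3));
        System.out.println(shouldRetry("GET", 4, 3));
    }
}
```

Inside a real HttpRequestRetryHandler the method name would come from the request line of the request stored in the context; whether PUT and DELETE are actually safe to repeat still depends on the server.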

Aborting Requests


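A request can be terminated at any stage of execution by calling HttpUriRequest#abort(). This ends the request prematurely and unblocks the execution thread if it is blocked in an I/O operation. The method is thread-safe and can be called from any thread. When an HTTP request is aborted, its execution thread throws an InterruptedIOException.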


Redirect Handling


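HttpClient handles all types of redirects automatically, except those for which the HTTP specification requires user intervention. See Other (status code 303) redirects on POST and PUT requests are converted to GET requests, as the HTTP specification requires. A custom redirect strategy can override the behavior prescribed by the HTTP specification.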


LaxRedirectStrategy redirectStrategy = new LaxRedirectStrategy();
CloseableHttpClient httpclient = HttpClients.custom()
        .setRedirectStrategy(redirectStrategy)
        .build();

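HttpClient often has to rewrite the request message in the course of its execution. By default HTTP/1.0 and HTTP/1.1 generally use relative request URIs, and the original request may be redirected from one location to another several times. The final absolute HTTP location can be built from the original request and the context. The utility method URIUtils#resolve builds the absolute URI used for the final request; it includes the last fragment identifier from the redirect requests or the original request.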


CloseableHttpClient httpclient = HttpClients.createDefault();
HttpClientContext context = HttpClientContext.create();
// HttpGet needs a request URI; http://localhost/ matches the earlier examples
HttpGet httpget = new HttpGet("http://localhost/");
CloseableHttpResponse response = httpclient.execute(httpget, context);
try {
    // getTargetHost() returns an HttpHost, not an HttpPost
    HttpHost target = context.getTargetHost();
    List<URI> redirectLocations = context.getRedirectLocations();
    URI location = URIUtils.resolve(httpget.getURI(), target, redirectLocations);
    System.out.println("Final HTTP location: " + location.toASCIIString());
} finally {
    response.close();
}
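
To illustrate the idea of arriving at a final absolute location after several redirects, the following JDK-only sketch resolves a chain of Location values with java.net.URI#resolve. It is not URIUtils#resolve and ignores details such as fragment handling; the class name, the method, and the sample URLs are made up for the example.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

class RedirectResolveSketch {
    // Follows a chain of Location values, resolving each one against the current URI.
    static URI finalLocation(URI original, List<String> locations) {
        URI current = original;
        for (String location : locations) {
            // Relative values are resolved against the current URI; absolute values replace it
            current = current.resolve(location);
        }
        return current;
    }

    public static void main(String[] args) {
        URI start = URI.create("http://localhost/old/page");
        System.out.println(finalLocation(start, Arrays.asList("/new/page", "login")));
        System.out.println(finalLocation(start, Arrays.asList("/new/page", "https://example.com/final?x=1")));
    }
}
```

A relative Location such as login is interpreted against the path of the previous hop, while an absolute one simply becomes the new current location.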

