Hot questions for Using Azure in elasticsearch


Question:

I'm facing a SocketTimeoutException while retrieving data from / inserting data into Elasticsearch. This happens at around 10-30 requests/second; the requests are a mix of gets and puts.

Here is my elastic configuration:

  • 3 master nodes, 4 GB RAM each
  • 2 data nodes, 8 GB RAM each
  • An Azure load balancer in front of the data nodes (only port 9200 appears to be open on it); the Java client connects to this load balancer, as it is the only exposed endpoint
  • Elasticsearch version: 7.2.0
  • Rest High Level Client:

    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>7.2.0</version>
    </dependency>
    
    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>7.2.0</version>
    </dependency>
    

Index Information:

  • Index shards: 2
  • Index replica: 1
  • Index total fields: 10000
  • Size of index from kibana: Total-27.2 MB & Primaries: 12.2MB
  • Index structure:
    {
      "dev-index": {
        "mappings": {
          "properties": {
            "dataObj": {
              "type": "object",
              "enabled": false
            },
            "generatedID": {
              "type": "keyword"
            },
            "transNames": { // it's an array of strings
              "type": "keyword"
            }
          }
        }
      }
    }
    
  • Dynamic mapping is disabled.

Following is my Elasticsearch config class. It defines two client beans: one for reading from and one for writing to Elasticsearch.

ElasticConfig.java:

@Configuration
public class ElasticConfig {

    @Value("${elastic.host}")
    private String elasticHost;

    @Value("${elastic.port}")
    private int elasticPort;

    @Value("${elastic.user}")
    private String elasticUser;

    @Value("${elastic.pass}")
    private String elasticPass;

    @Value("${elastic-timeout:20}")
    private int timeout;

    @Bean(destroyMethod = "close")
    @Qualifier("readClient")
    public RestHighLevelClient readClient(){

        final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
        credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(elasticUser, elasticPass));

        RestClientBuilder builder = RestClient
                .builder(new HttpHost(elasticHost, elasticPort))
                .setHttpClientConfigCallback(httpClientBuilder -> 
                        httpClientBuilder
                                .setDefaultCredentialsProvider(credentialsProvider)
                                .setDefaultIOReactorConfig(IOReactorConfig.custom().setIoThreadCount(5).build())
                );

        builder.setRequestConfigCallback(requestConfigBuilder -> 
                requestConfigBuilder
                        .setConnectTimeout(10000)
                        .setSocketTimeout(60000)
                        .setConnectionRequestTimeout(0)
        );

        RestHighLevelClient restClient = new RestHighLevelClient(builder);
        return restClient;
    }

    @Bean(destroyMethod = "close")
    @Qualifier("writeClient")
    public RestHighLevelClient writeClient(){

        final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
        credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(elasticUser, elasticPass));

        RestClientBuilder builder = RestClient
                .builder(new HttpHost(elasticHost, elasticPort))
                .setHttpClientConfigCallback(httpClientBuilder -> 
                        httpClientBuilder
                                .setDefaultCredentialsProvider(credentialsProvider)
                                .setDefaultIOReactorConfig(IOReactorConfig.custom().setIoThreadCount(5).build())
                );

        builder.setRequestConfigCallback(requestConfigBuilder -> 
                requestConfigBuilder
                        .setConnectTimeout(10000)
                        .setSocketTimeout(60000)
                        .setConnectionRequestTimeout(0)
        );

        RestHighLevelClient restClient = new RestHighLevelClient(builder);
        return restClient;
    }

}

Here is the function that calls Elasticsearch: if the data is available there, it is returned; otherwise the data is generated and put into Elasticsearch.

public Object getData(Request request) {

    DataObj elasticResult = elasticService.getData(request);
    if(elasticResult!=null){
        return elasticResult;
    }
    else{
        //code to generate the data
        DataObj generatedData = getData(); //some function which generates the data
        //put the generated data into Elasticsearch via an async call
        elasticAsync.putData(generatedData);
        return generatedData;
    }
}

ElasticService.java getData Function:

@Service
public class ElasticService {

    private static final Logger LOGGER = Logger.getLogger(ElasticService.class.getName());

    @Value("${elastic.index}")
    private String elasticIndex;

    @Autowired
    @Qualifier("readClient")
    private RestHighLevelClient readClient;

    public DataObj getData(Request request){
        String generatedId = request.getGeneratedID();

        GetRequest getRequest = new GetRequest()
                .index(elasticIndex)   //elastic index name
                .id(generatedId);   //retrieving by index id from elastic _id field (as key-value)

        DataObj result = null;
        try {
            GetResponse response = readClient.get(getRequest, RequestOptions.DEFAULT);
            if(response.isExists()) {
                ObjectMapper objectMapper = new ObjectMapper();
                result = objectMapper.readValue(response.getSourceAsString(), DataObj.class);
            }
        }  catch (Exception e) {
            LOGGER.error("Exception occurred during fetch from elastic !!!!", e);
        }
        return result;
    }

}

ElasticAsync.java Async Put Data Function:

@Service
public class ElasticAsync {

    private static final Logger LOGGER = Logger.getLogger(ElasticAsync.class.getName());

    @Value("${elastic.index}")
    private String elasticIndex;

    @Autowired
    @Qualifier("writeClient")
    private RestHighLevelClient writeClient;

    @Async
    public void putData(DataObj generatedData){
        ElasticVO updatedRequest = toElasticVO(generatedData); //ElasticVO matches the structure of the index given above

        try {
            ObjectMapper objectMapper = new ObjectMapper();
            String jsonString = objectMapper.writeValueAsString(updatedRequest);

            IndexRequest request = new IndexRequest(elasticIndex);
            request.id(generatedData.getGeneratedID());
            request.source(jsonString, XContentType.JSON);
            request.setRefreshPolicy(WriteRequest.RefreshPolicy.NONE);
            request.timeout(TimeValue.timeValueSeconds(5));
            IndexResponse indexResponse = writeClient.index(request, RequestOptions.DEFAULT);
            LOGGER.info("response id: " + indexResponse.getId());
        } catch (Exception e) {
            LOGGER.error("Exception occurred during saving into elastic !!!!", e);
        }
    }

}

Here is part of the stack trace from an exception that occurred while saving data into Elasticsearch:

2019-07-19 07:32:19.997 ERROR [data-retrieval,341e6ecc5b10f3be,1eeb0722983062b2,true] 1 --- [askExecutor-894] a.c.s.a.service.impl.ElasticAsync        : Exception occurred during saving into elastic !!!!

java.net.SocketTimeoutException: 60,000 milliseconds timeout on connection http-outgoing-34 [ACTIVE]
at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:789) ~[elasticsearch-rest-client-7.2.0.jar!/:7.2.0]
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:225) ~[elasticsearch-rest-client-7.2.0.jar!/:7.2.0]
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:212) ~[elasticsearch-rest-client-7.2.0.jar!/:7.2.0]
    at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1448) ~[elasticsearch-rest-high-level-client-7.2.0.jar!/:7.2.0]
    at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1418) ~[elasticsearch-rest-high-level-client-7.2.0.jar!/:7.2.0]
    at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1388) ~[elasticsearch-rest-high-level-client-7.2.0.jar!/:7.2.0]
    at org.elasticsearch.client.RestHighLevelClient.index(RestHighLevelClient.java:836) ~[elasticsearch-rest-high-level-client-7.2.0.jar!/:7.2.0]


Caused by: java.net.SocketTimeoutException: 60,000 milliseconds timeout on connection http-outgoing-34 [ACTIVE]
    at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92) ~[httpasyncclient-4.1.3.jar!/:4.1.3]
    at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39) ~[httpasyncclient-4.1.3.jar!/:4.1.3]
    at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    ... 1 common frames omitted

Here is part of the stack trace from an exception that occurred while retrieving data from Elasticsearch:

2019-07-19 07:22:37.844 ERROR [data-retrieval,104cf6b2ab5b3349,b302d3d3cd7ebc84,true] 1 --- [o-8080-exec-346] a.c.s.a.service.impl.ElasticService      : Exception occurred during  fetch from elastic !!!! 

java.net.SocketTimeoutException: 60,000 milliseconds timeout on connection http-outgoing-30 [ACTIVE]
    at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:789) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:225) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:212) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
    at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1433) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
    at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1403) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
    at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1373) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
    at org.elasticsearch.client.RestHighLevelClient.get(RestHighLevelClient.java:699) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]



Caused by: java.net.SocketTimeoutException: 60,000 milliseconds timeout on connection http-outgoing-30 [ACTIVE]
        at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92) ~[httpasyncclient-4.1.3.jar!/:4.1.3]
    at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39) ~[httpasyncclient-4.1.3.jar!/:4.1.3]
    at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
    ... 1 common frames omitted

I've gone through a couple of Stack Overflow posts and Elasticsearch-related blogs that mention this issue could be due to the RAM and cluster configuration. I changed my shards from 5 to 2, since there are only two data nodes, and increased the data nodes' RAM from 4 GB to 8 GB, as I learned that Elasticsearch will use only 50% of the total RAM. The exceptions have become less frequent, but the problem still persists.

What are possible ways to solve this problem? What am I missing in my Java/Elasticsearch configuration that frequently causes this kind of SocketTimeoutException? Let me know if you need any more details about the configuration.


Answer:

We've had the same issue, and after quite some digging I found the root cause: a mismatch between the idle-connection timeout of the firewall sitting between the client and the Elasticsearch servers and the kernel's TCP keep-alive setting.

The firewall drops idle connections after 3600 seconds, while the kernel parameter for TCP keep-alive was set to 7200 seconds (the default on RedHat 6.x/7.x):

sysctl -n net.ipv4.tcp_keepalive_time
7200

So connections are dropped before a keep-alive probe is ever sent. The async HTTP client inside the Elasticsearch REST client doesn't seem to handle dropped connections very well; it just waits until the socket timeout expires.

So check whether there is any network device (load balancer, firewall, proxy, etc.) between your client and server that has a session timeout or similar, and either increase that timeout or lower the tcp_keepalive_time kernel parameter.
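If you cannot change the firewall or the kernel settings, a client-side mitigation is possible. This is only a sketch against the 7.x low-level REST client: it enables SO_KEEPALIVE on the reactor sockets and caps how long the pool may reuse a connection. The 30-second keep-alive cap is an illustrative value I chose, not something from the question; tune it so it stays well below your network device's idle timeout.

```java
import org.apache.http.HttpHost;
import org.apache.http.impl.nio.reactor.IOReactorConfig;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;

public class KeepAliveClientFactory {

    // Sketch: send TCP keep-alive probes on idle sockets and bound how long
    // pooled connections are kept alive, so the client never tries to reuse
    // a connection that an intermediate device has silently dropped.
    public static RestHighLevelClient build(String host, int port) {
        RestClientBuilder builder = RestClient
                .builder(new HttpHost(host, port))
                .setHttpClientConfigCallback(httpClientBuilder -> httpClientBuilder
                        .setDefaultIOReactorConfig(IOReactorConfig.custom()
                                .setSoKeepAlive(true) // enable TCP keep-alive on the sockets
                                .build())
                        // Illustrative value: keep pooled connections at most
                        // 30 s, well below the firewall's 3600 s idle timeout.
                        .setKeepAliveStrategy((response, context) -> 30_000L));
        return new RestHighLevelClient(builder);
    }
}
```

This is a client configuration sketch; it requires the `elasticsearch-rest-high-level-client` and Apache HttpAsyncClient dependencies already on the question's classpath.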

Question:

I downloaded Elasticsearch, ran bin/elasticsearch.bat, and it worked on my local machine. Then I added the elasticsearch folder to my repository and updated the deployment script (deploy.cmd), adding these lines:

echo starting ElasticSearch...
elasticsearch-1.7.2\bin\elasticsearch.bat
echo ElasticSearch started!

After pushing my repository to my Azure Website, this error occurs in the log:

starting ElasticSearch...
Error occurred during initialization of VM
Error: Could not create the Java Virtual Machine.
Could not reserve enough space for object heap
Error: A fatal exception has occurred. Program will exit.

Java is turned on in the configuration of my Web App. So what's the problem? Why couldn't the Java VM be created?

Edit: Could not reserve enough space for object heap suggests there is too little RAM, but I already tried it with 3.5 GB RAM and the error still occurs (Elasticsearch only uses 155 MB of RAM on my local machine).

Edit 2: After some more tries I get a new error log:

starting ElasticSearch...
[2015-10-15 12:59:18,879][INFO ][node                     ] [Marsha Rosenberg] version[1.7.2], pid[3728], build[e43676b/2015-09-14T09:49:53Z]
[2015-10-15 12:59:18,879][INFO ][node                     ] [Marsha Rosenberg] initializing ...
[2015-10-15 12:59:19,273][INFO ][plugins                  ] [Marsha Rosenberg] loaded [], sites []
[2015-10-15 12:59:20,692][INFO ][env                      ] [Marsha Rosenberg] using [1] data paths, mounts [[Windows (D:)]], net usable_space [13.5gb], net total_space [32gb], types [NTFS]
[2015-10-15 12:59:28,869][INFO ][node                     ] [Marsha Rosenberg] initialized
[2015-10-15 12:59:28,869][INFO ][node                     ] [Marsha Rosenberg] starting ...
{1.7.2}: Startup Failed ...
- ChannelException[Failed to create a selector.]
    IOException[Unable to establish loopback connection]
        SocketException[Address family not supported by protocol family: bind]

Answer:

I suspect that this is failing due to its attempt to make local (loopback) requests. Please see this document for more information on the restrictions of the Azure Web App sandbox.

Question:

I have an index in Azure Search whose documents have the following JSON:

    {
        "id": "1847234520751",
        "orderNo": "1847234520751",
        "orderType": "ONLINE",
        "orderState": "OPROCESSING",
        "orderDate": "2018-10-02T18:28:07Z",
        "lastModified": "2018-11-01T19:13:46Z",
        "docType": "SALES_ORDER",
        "paymentType": "PREPAID",
        "buyerInfo_primaryContact_name_firstName": "",
        "buyerInfo_primaryContact_name_lastName": "",
        "buyerInfo_primaryContact_email_emailAddress": "test@gmail.com"
    }

I have indexed almost 0.8 million documents and have written the following Java code to query Azure Search:

        IndexSearchOptions options = new IndexSearchOptions();
        options.setSearchFields("orderNo");
        long startTime1 = System.currentTimeMillis();
        IndexSearchResult result = indexClient.search(filter, options);
        long stopTime1 = System.currentTimeMillis();
        long elapsedTime1 = stopTime1 - startTime1;
        System.out.println("elapsed time " + elapsedTime1);

The timing for this comes out to around 1400 milliseconds. If anyone can help me reduce this time, it would be really helpful.


Answer:

If you are simply trying to return a document based on an orderNo, rather than doing a full-text search, I would recommend using the "Lookup" API:

https://docs.microsoft.com/en-us/rest/api/searchservice/lookup-document

Also, using a client-side timer to calculate elapsed time will not give you accurate results. The measured time is affected by many factors, including your client machine's configuration and your network performance. If you are interested in how much time the server took to process your request, I would suggest experimenting with the REST API and inspecting the "elapsed-time" value in the response headers of your search query. This is more useful for monitoring search performance, as it omits any time spent on the network. If you do so, I would suggest running multiple queries and taking the average elapsed time as your metric.
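The averaging suggestion can be sketched as a small helper: run the query once as an unmeasured warm-up, then time several runs and average them. The `Thread.sleep` workload below is a stand-in I've used so the sketch is self-contained; replace it with the question's `indexClient.search(filter, options)` call.

```java
import java.util.ArrayList;
import java.util.List;

public class LatencyProbe {

    // Runs the query several times and returns the average latency in
    // milliseconds, discarding one warm-up call so one-off connection
    // setup and JIT effects don't dominate the measurement.
    public static double averageMillis(Runnable query, int runs) {
        query.run(); // warm-up, not measured
        List<Long> samples = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.run();
            samples.add(System.nanoTime() - start);
        }
        return samples.stream().mapToLong(Long::longValue).average().orElse(0)
                / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Stand-in workload; substitute the real search call here.
        double avg = averageMillis(() -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 10);
        System.out.println("average ms: " + avg);
    }
}
```

Comparing this client-side average against the server's "elapsed-time" header tells you how much of the 1400 ms is network overhead versus query processing.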

If you see that the elapsed time is quick, but the search query is still relatively slow due to network performance issues, then make sure to re-use the search client object between calls, rather than creating a new one for each call, as this is a common reason queries do not achieve optimal latency.

Finally, here's a full article about tuning the performance of your Azure Search service:

https://docs.microsoft.com/en-us/azure/search/search-performance-optimization

In your case, it seems you are trying to speed up single-query performance rather than increase how many queries can be handled at once. If your query were particularly complex (e.g. returning a lot of documents while using sorting and faceting), increasing the number of partitions could help: your 0.8 million documents would be spread across multiple machines, allowing each of them to execute the search over a smaller number of documents in parallel, rather than relying on a single machine to process the full load. However, in your case the query looks relatively simple, so my suggestion stands as above: collect accurate metrics first to understand whether the bottleneck is in processing the request or is network-related.

Hope this helps