Elastic query bool must match issue

1/16/2020

Below is the query part of an Elasticsearch GET API call that I run from the command line inside an OpenShift pod. In the 2000 documents fetched, I get the matching documents as well as non-matching ones. How can I limit the results to only the matching documents? I specifically want only those with {"kubernetes.container_name":"xyz"}.

Any suggestions will be appreciated.

-d ' {\"query\": { \"bool\" :{\"must\" :{\"match\" :{\"kubernetes.container_name\":\"xyz\"}},\"filter\" : {\"range\": {\"@timestamp\": {\"gte\": \"now-2m\",\"lt\": \"now-1m\"}}}}},\"_source\":[\"@timestamp\",\"message\",\"kubernetes.container_name\"],\"size\":2000}'"
-- Sarbojeet Das
devops
elasticsearch
kubernetes
openshift

1 Answer

1/16/2020

For exact matches there are two things you would need to do: make sure the field is indexed as a keyword, and query it with a term query instead of match.

A field of the text datatype goes through an analysis phase.

For example, if your data is "This is a beautiful day", then during ingestion the text datatype breaks the sentence into tokens, lowercases them to [this, is, a, beautiful, day], and adds them to the inverted index. This is done by the Standard Analyzer, which is the default analyzer applied to text fields.

When you query with match, the same analyzer is applied to the query string at search time, and Elasticsearch checks whether any of the resulting tokens are present in each document. As a result, documents appear even when the field value is not an exact match.

To do an exact match, you need to use a keyword field, because it does not go through the analysis phase.
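To make the difference concrete, here is a small Python sketch that imitates the behaviour described above. It is not Elasticsearch code: `analyze` is only a rough stand-in for the Standard Analyzer (split on non-word characters, lowercase), and the container names `xyz-sidecar` and `abc` are made up for illustration.

```python
import re

def analyze(text):
    """Rough stand-in for the Standard Analyzer: split on non-word characters, lowercase."""
    return re.findall(r"\w+", text.lower())

def match_hits(query, field_value):
    """Imitates match on a text field: the query is analyzed too,
    and (with the default OR operator) any shared token counts as a hit."""
    indexed_tokens = set(analyze(field_value))
    return any(token in indexed_tokens for token in analyze(query))

def term_hits(query, field_value):
    """Imitates term on a keyword field: the whole, unanalyzed value must be equal."""
    return query == field_value

print(analyze("This is a beautiful day"))

# Hypothetical container names, compared against the search value "xyz"
for name in ["xyz", "xyz-sidecar", "abc"]:
    print(name, "match:", match_hits("xyz", name), "term:", term_hits("xyz", name))
```

In this sketch, `xyz-sidecar` is a hit for the match-style lookup because it contains the token `xyz`, but not for the term-style lookup, which is the kind of extra document the question describes.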

What I'd suggest is to create a keyword sibling field for the text field you have, as shown in the mapping below, and then re-ingest all the data:

Mapping:

PUT my_sample_index
{
  "mappings": {
    "properties": {
      "kubernetes":{
        "type": "object",
        "properties": {
          "container_name": {
            "type": "text",
            "fields":{               <--- Note this
              "keyword":{            <--- This is container_name.keyword field
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}

Note that I'm assuming kubernetes is mapped as an object type.

Request Query:

POST my_sample_index/_search
{
  "query":{
    "bool": {
      "must": [
        {
          "term": {
            "kubernetes.container_name.keyword": {
              "value": "xyz"
            }
          }
        }
      ]
    }
  }
}
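
If it helps, the term clause can be combined with the range filter, _source list and size from the question. The Python sketch below only builds the JSON body and prints it so it can be pasted after `-d`; the helper name `build_search_body` is hypothetical, and it assumes the `kubernetes.container_name.keyword` sub-field from the mapping above exists in your index.

```python
import json

def build_search_body(container_name, size=2000):
    """Build a search body: exact term match on the keyword sub-field,
    plus the time-range filter and _source list from the question."""
    return {
        "query": {
            "bool": {
                "must": [
                    # Assumes the .keyword sub-field from the mapping above
                    {"term": {"kubernetes.container_name.keyword": {"value": container_name}}}
                ],
                "filter": [
                    {"range": {"@timestamp": {"gte": "now-2m", "lt": "now-1m"}}}
                ],
            }
        },
        "_source": ["@timestamp", "message", "kubernetes.container_name"],
        "size": size,
    }

payload = json.dumps(build_search_body("xyz"))
print(payload)
```

Since json.dumps produces plain double-quoted JSON, you would still need to escape the quotes yourself if the body is nested inside another double-quoted shell command, as in the question.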

Hope this helps!

-- Opster ES Ninja - Kamal
Source: StackOverflow