Multi-filter aggregation: filters

Filters Aggregation

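Defines a multi-bucket aggregation in which each bucket is associated with a filter. Each bucket collects all documents that match its associated filter.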

PUT /logs/_doc/_bulk?refresh
{ "index" : { "_id" : 1 } }
{ "body" : "warning: page could not be rendered" }
{ "index" : { "_id" : 2 } }
{ "body" : "authentication error" }
{ "index" : { "_id" : 3 } }
{ "body" : "warning: connection timed out" }

GET logs/_search
{
  "size": 0,
  "aggs" : {
    "messages" : {
      "filters" : {
        "filters" : {
          "errors" :   { "match" : { "body" : "error"   }},
          "warnings" : { "match" : { "body" : "warning" }}
        }
      }
    }
  }
}

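The example above analyzes log messages. The aggregation builds two collections (buckets) of log messages: one for all messages containing an error, and another for all messages containing a warning. The response is: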

{
  "took": 9,
  "timed_out": false,
  "_shards": ...,
  "hits": ...,
  "aggregations": {
    "messages": {
      "buckets": {
        "errors": {
          "doc_count": 1
        },
        "warnings": {
          "doc_count": 2
        }
      }
    }
  }
}
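
To make the bucket semantics concrete, the following is a minimal Python sketch that models the aggregation in memory. It is not how Elasticsearch executes the request: the `tokens` helper is a hypothetical, rough stand-in for the standard analyzer, and each `match` query is reduced to a single-term lookup. It only illustrates that every filter is evaluated independently against every document.

```python
import re

def tokens(text):
    # Hypothetical approximation of the standard analyzer:
    # lowercase the text and split on non-alphanumeric characters.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

# The three documents indexed by the bulk request above.
docs = {
    1: "warning: page could not be rendered",
    2: "authentication error",
    3: "warning: connection timed out",
}

# Bucket name -> term, standing in for { "match": { "body": term } }.
named_filters = {"errors": "error", "warnings": "warning"}

def filters_agg(docs, named_filters):
    # Each filter is checked against every document, so a document
    # can be counted in several buckets or in none of them.
    return {
        name: {"doc_count": sum(term in tokens(body) for body in docs.values())}
        for name, term in named_filters.items()
    }

buckets = filters_agg(docs, named_filters)
print(buckets)
# {'errors': {'doc_count': 1}, 'warnings': {'doc_count': 2}}
```

The counts agree with the response shown above: one document contains the token `error` and two contain `warning`.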

Anonymous filters

The filters field can also be provided as an array of filters, as in the following request:

GET logs/_search
{
  "size": 0,
  "aggs" : {
    "messages" : {
      "filters" : {
        "filters" : [
          { "match" : { "body" : "error"   }},
          { "match" : { "body" : "warning" }}
        ]
      }
    }
  }
}

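The filtered buckets are returned in the same order as the filters were provided in the request. The response for this example is: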

{
  "took": 4,
  "timed_out": false,
  "_shards": ...,
  "hits": ...,
  "aggregations": {
    "messages": {
      "buckets": [
        {
          "doc_count": 1
        },
        {
          "doc_count": 2
        }
      ]
    }
  }
}
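
Because anonymous buckets carry no key, a client has to match them back to the filters by position. The sketch below shows one way this might be done in Python. The `label_buckets` helper is hypothetical (it is not part of any Elasticsearch client), and the two lists are copied by hand from the request and response above.

```python
import json

# Filters in the order they were sent in the request.
request_filters = [
    {"match": {"body": "error"}},
    {"match": {"body": "warning"}},
]

# The "buckets" array taken from the response.
response_buckets = [{"doc_count": 1}, {"doc_count": 2}]

def label_buckets(request_filters, response_buckets):
    # Hypothetical helper: pair each bucket with the filter at the same index,
    # relying on the documented guarantee that request order is preserved.
    return [
        (json.dumps(query, sort_keys=True), bucket["doc_count"])
        for query, bucket in zip(request_filters, response_buckets)
    ]

labelled = label_buckets(request_filters, response_buckets)
for query, count in labelled:
    print(query, "->", count)
```

If the positional coupling is inconvenient, named filters avoid it entirely, since each bucket is then returned under its own key.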

Other Bucket

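The other_bucket parameter can be set to add a bucket to the response that contains all documents matching none of the given filters. The value of this parameter can be one of the following: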

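  • false Does not compute the other bucket
  • true Returns the other bucket, either under its own key (named _other_ by default) if named filters are being used, or as the last bucket if anonymous filters are being used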
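
The two placements described above can be modelled with a short Python sketch. The `add_other_bucket` function is a hypothetical illustration of the response shape only, and the `doc_count` values passed to it are made-up inputs rather than output from a real cluster.

```python
def add_other_bucket(buckets, other_doc_count, other_bucket=False, other_bucket_key=None):
    # Hypothetical model of where the other bucket appears in a response.
    # Supplying other_bucket_key implies other_bucket = true.
    if not other_bucket and other_bucket_key is None:
        return buckets  # other bucket is not computed
    other = {"doc_count": other_doc_count}
    if isinstance(buckets, dict):
        # Named filters: the other bucket gets its own key, "_other_" by default.
        key = other_bucket_key if other_bucket_key is not None else "_other_"
        return {**buckets, key: other}
    # Anonymous filters: the other bucket is appended as the last element.
    return [*buckets, other]

named = {"errors": {"doc_count": 1}, "warnings": {"doc_count": 2}}
anonymous = [{"doc_count": 1}, {"doc_count": 2}]

print(add_other_bucket(named, 1, other_bucket=True))
print(add_other_bucket(anonymous, 1, other_bucket=True))
print(add_other_bucket(named, 1))  # other_bucket defaults to false
```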

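The other_bucket_key parameter sets the key of the other bucket to a value other than the default _other_. Setting this parameter implicitly sets other_bucket to true. The following snippet adds a fourth document and requests that the other bucket be named other_messages.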

PUT logs/_doc/4?refresh
{
  "body": "info: user Bob logged out"
}

GET logs/_search
{
  "size": 0,
  "aggs" : {
    "messages" : {
      "filters" : {
        "other_bucket_key": "other_messages",
        "filters" : {
          "errors" :   { "match" : { "body" : "error"   }},
          "warnings" : { "match" : { "body" : "warning" }}
        }
      }
    }
  }
}

The response is:

{
  "took": 3,
  "timed_out": false,
  "_shards": ...,
  "hits": ...,
  "aggregations": {
    "messages": {
      "buckets": {
        "errors": {
          "doc_count": 1
        },
        "warnings": {
          "doc_count": 2
        },
        "other_messages": {
          "doc_count": 1
        }
      }
    }
  }
}
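
As a rough cross-check of these counts, the sketch below recomputes them in plain Python over the four documents. It is an in-memory model under simplifying assumptions, not Elasticsearch itself: `tokens` is a hypothetical approximation of the standard analyzer, and each `match` query is treated as a single-term lookup.

```python
import re

def tokens(text):
    # Hypothetical approximation of the standard analyzer:
    # lowercase the text and split on non-alphanumeric characters.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

# The three original documents plus the one added by PUT logs/_doc/4.
docs = [
    "warning: page could not be rendered",
    "authentication error",
    "warning: connection timed out",
    "info: user Bob logged out",
]

# Bucket name -> term, standing in for { "match": { "body": term } }.
filters = {"errors": "error", "warnings": "warning"}

def filters_agg(docs, filters, other_bucket_key):
    buckets = {name: {"doc_count": 0} for name in filters}
    buckets[other_bucket_key] = {"doc_count": 0}
    for body in docs:
        matched = [name for name, term in filters.items() if term in tokens(body)]
        for name in matched:
            buckets[name]["doc_count"] += 1
        if not matched:
            # Only documents that match none of the filters land here.
            buckets[other_bucket_key]["doc_count"] += 1
    return buckets

result = filters_agg(docs, filters, "other_messages")
print(result)
# {'errors': {'doc_count': 1}, 'warnings': {'doc_count': 2}, 'other_messages': {'doc_count': 1}}
```

Only the "info: user Bob logged out" document matches neither filter, which is why other_messages has a doc_count of 1.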