
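Reverse search with percolate

Reverse query: store queries in an index, then supply a document through the percolate query to retrieve the stored queries that match it.

Use cases

Reverse queries are typically used for notifications.

Example: a user subscribes to news about the movie Avatar; when Avatar is released, a notification is sent to that user.

The requests below create an index with a percolator field and register a query that matches "avatar":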

PUT /my-index
{
"mappings": {
"_doc": {
"properties": {
"message": {
"type": "text"
},
"query": {
"type": "percolator"
}
}
}
}
}

PUT /my-index/_doc/1?refresh
{
"query" : {
"match" : {
"message" : "avatar"
}
}
}

Run the reverse query to match the document text against the registered queries:

GET /my-index/_search
{
"query" : {
"percolate" : {
"field" : "query",
"document" : {
"message" : "avatar is offer"
}
}
}
}

Response:

{
"took": 13,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped" : 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.5753642,
"hits": [
{
"_index": "my-index",
"_type": "_doc",
"_id": "1",
"_score": 0.5753642,
"_source": {
"query": {
"match": {
"message": "avatar"
}
}
},
"fields" : {
"_percolator_document_slot" : [0]
}
}
]
}
}
  1. The query with id 1 matches our document.
  2. The _percolator_document_slot field indicates which document matched this query.
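For the notification use case, the application mainly needs the ids of the stored queries that matched. The following is a minimal Python sketch, assuming the search response has already been parsed into a dict shaped like the one above. The subscribers mapping and the function names are hypothetical, not part of Elasticsearch:

```python
def matched_query_ids(response):
    """Return the _id of every stored query that matched the percolated document."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


def users_to_notify(response, subscribers):
    """Look up the subscriber registered for each matched query id.

    `subscribers` is a hypothetical application-side dict: query id -> user name.
    """
    return [subscribers[qid] for qid in matched_query_ids(response) if qid in subscribers]


# Trimmed-down copy of the response shown above.
response = {
    "hits": {
        "total": 1,
        "hits": [
            {"_id": "1", "fields": {"_percolator_document_slot": [0]}},
        ],
    }
}

subscribers = {"1": "alice"}  # hypothetical: alice registered query 1
print(users_to_notify(response, subscribers))
```

Keeping the id-to-user mapping outside the index is only one option; the subscriber could equally be stored as an extra field next to the query.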
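
Parameters

field (required): the field of type percolator that holds the indexed queries.
name (optional): the suffix to use for the _percolator_document_slot field when multiple percolate queries are specified.
document: the source of the document being percolated.
documents: like the document parameter, but accepts multiple documents as a JSON array.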
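The parameters above can be assembled programmatically. This is a sketch only: percolate_query is a hypothetical helper, and its check that exactly one of document or documents is given is a choice made for this example (the percolate query can also reference an already indexed document, which this helper does not cover):

```python
import json


def percolate_query(field, document=None, documents=None, name=None):
    """Build a `percolate` clause from the parameters listed above."""
    # Sketch-level rule: require exactly one of `document` / `documents`.
    if (document is None) == (documents is None):
        raise ValueError("pass exactly one of document or documents")
    clause = {"field": field}
    if document is not None:
        clause["document"] = document
    else:
        clause["documents"] = list(documents)
    if name is not None:
        clause["name"] = name
    return {"percolate": clause}


# Rebuild the body of the first search request in this article.
body = {"query": percolate_query("query", document={"message": "avatar is offer"})}
print(json.dumps(body, indent=2))
```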

Disabling scoring to improve percolation performance

GET /my-index/_search
{
"query" : {
"constant_score": {
"filter": {
"percolate" : {
"field" : "query",
"document" : {
"message" : "avatar is offer"
}
}
}
}
}
}

If scores are not needed, wrap the percolate query in a constant_score query or in the filter clause of a bool query to improve query performance.
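A small sketch of the two wrappings described above, written in Python. The helper name without_scoring and its use_bool flag are assumptions made for illustration; only the constant_score and bool/filter structures come from the text:

```python
def without_scoring(percolate_clause, use_bool=False):
    """Wrap a percolate clause so it runs in filter context (no scoring)."""
    if use_bool:
        # Variant 2: the filter clause of a bool query.
        return {"bool": {"filter": [percolate_clause]}}
    # Variant 1: a constant_score query, as in the request above.
    return {"constant_score": {"filter": percolate_clause}}


clause = {
    "percolate": {
        "field": "query",
        "document": {"message": "avatar is offer"},
    }
}

constant_score_body = {"query": without_scoring(clause)}
bool_filter_body = {"query": without_scoring(clause, use_bool=True)}
print(constant_score_body)
print(bool_filter_body)
```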

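Percolating multiple documents

A percolate query can match multiple documents against the indexed percolator queries at the same time. Percolating multiple documents in a single request can improve performance, because the queries only need to be parsed and matched once instead of multiple times. The _percolator_document_slot field returned with each matching percolator query matters when several documents are percolated at once: it indicates which documents matched that particular percolator query. The numbers correspond to the slots (positions) in the documents array specified in the percolate query: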

GET /my-index/_search
{
"query" : {
"percolate" : {
"field" : "query",
"documents" : [
{
"message" : "avatar is offer"
},
{
"message" : "offer movies"
},
{
"message" : "Sparta is offer"
}
]
}
}
}

Response:

{
"took": 13,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped" : 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.5606477,
"hits": [
{
"_index": "my-index",
"_type": "_doc",
"_id": "1",
"_score": 1.5606477,
"_source": {
"query": {
"match": {
"message": "avatar"
}
}
},
"fields" : {
"_percolator_document_slot" : [0]
}
}
]
}
}
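To turn slot numbers back into documents, the client can index into the same documents array it sent. A minimal Python sketch, assuming a parsed hit with a _percolator_document_slot field; documents_for_hit is a hypothetical helper and the hit below is illustrative data:

```python
def documents_for_hit(hit, documents, slot_field="_percolator_document_slot"):
    """Return the percolated documents that a matching stored query hit.

    `documents` must be the same list, in the same order, that was sent
    in the percolate query; slots are zero-based positions in that list.
    """
    slots = hit.get("fields", {}).get(slot_field, [])
    return [documents[slot] for slot in slots]


# The documents array from the request above.
documents = [
    {"message": "avatar is offer"},
    {"message": "offer movies"},
    {"message": "Sparta is offer"},
]

# Illustrative hit: the stored query matched only the first document.
hit = {"_id": "1", "fields": {"_percolator_document_slot": [0]}}

print(documents_for_hit(hit, documents))
```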

Percolating an existing document

PUT /my-index/_doc/2
{
"message" : "A new bonsai tree in the office"
}

Percolate the already indexed document by referencing it in a new search request:

GET /my-index/_search
{
"query" : {
"percolate" : {
"field": "query",
"index" : "my-index",
"type" : "_doc",
"id" : "2",
"version" : 1
}
}
}

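The version is optional but useful in some cases: it ensures that we percolate the document we have just indexed. If the document was changed after we indexed it, the search request fails with a version conflict error.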
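A request of this shape can be built from the metadata returned by the index request. A hedged Python sketch; percolate_existing is a hypothetical helper, and it only covers the parameters shown in this section:

```python
def percolate_existing(field, index, doc_id, doc_type=None, version=None):
    """Build a search body that percolates an already indexed document."""
    clause = {"field": field, "index": index, "id": doc_id}
    if doc_type is not None:
        clause["type"] = doc_type
    if version is not None:
        # Optional: the search fails with a version conflict if the
        # stored document no longer has this version.
        clause["version"] = version
    return {"query": {"percolate": clause}}


body = percolate_existing("query", "my-index", "2", doc_type="_doc", version=1)
print(body)
```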

Percolate query and highlighting

PUT /my-index/_doc/3?refresh
{
"query" : {
"match" : {
"message" : "brown fox"
}
}
}

PUT /my-index/_doc/4?refresh
{
"query" : {
"match" : {
"message" : "lazy dog"
}
}
}

Run the percolate query with highlighting enabled:

GET /my-index/_search
{
"query" : {
"percolate" : {
"field": "query",
"document" : {
"message" : "The quick brown fox jumps over the lazy dog"
}
}
},
"highlight": {
"fields": {
"message": {}
}
}
}

Response:

{
"took": 7,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped" : 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.5753642,
"hits": [
{
"_index": "my-index",
"_type": "_doc",
"_id": "4",
"_score": 0.5753642,
"_source": {
"query": {
"match": {
"message": "lazy dog"
}
}
},
"highlight": {
"message": [
"The quick brown fox jumps over the <em>lazy</em> <em>dog</em>"
]
},
"fields" : {
"_percolator_document_slot" : [0]
}
},
{
"_index": "my-index",
"_type": "_doc",
"_id": "3",
"_score": 0.5753642,
"_source": {
"query": {
"match": {
"message": "brown fox"
}
}
},
"highlight": {
"message": [
"The quick <em>brown</em> <em>fox</em> jumps over the lazy dog"
]
},
"fields" : {
"_percolator_document_slot" : [0]
}
}
]
}
}

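Instead of the search request's query highlighting the percolator hits, the percolator queries highlight the document defined in the percolate query. When multiple documents are percolated at once, as in the request below, the highlight response is different: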

GET /my-index/_search
{
"query" : {
"percolate" : {
"field": "query",
"documents" : [
{
"message" : "bonsai tree"
},
{
"message" : "new tree"
},
{
"message" : "the office"
},
{
"message" : "office tree"
}
]
}
},
"highlight": {
"fields": {
"message": {}
}
}
}

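The response is slightly different (this output assumes the query with id 1 is a match query on "bonsai tree" rather than "avatar"):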

{
"took": 13,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped" : 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.5606477,
"hits": [
{
"_index": "my-index",
"_type": "_doc",
"_id": "1",
"_score": 1.5606477,
"_source": {
"query": {
"match": {
"message": "bonsai tree"
}
}
},
"fields" : {
"_percolator_document_slot" : [0, 1, 3]
},
"highlight" : {
"0_message" : [
"<em>bonsai</em> <em>tree</em>"
],
"3_message" : [
"office <em>tree</em>"
],
"1_message" : [
"new <em>tree</em>"
]
}
}
]
}
}
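In the response above, each highlight key is the field name prefixed with the slot of the document it belongs to (0_message, 1_message, 3_message). A minimal Python sketch for grouping the fragments by slot, assuming that key layout; highlights_by_slot is a hypothetical helper, and filing keys without a numeric prefix (the single-document case) under slot 0 is a choice of this sketch:

```python
def highlights_by_slot(hit):
    """Group highlight fragments as {slot: {field: [fragments]}}."""
    grouped = {}
    for key, fragments in hit.get("highlight", {}).items():
        prefix, sep, field = key.partition("_")
        if sep and prefix.isdigit():
            slot = int(prefix)
        else:
            # No numeric prefix: treat the whole key as the field name.
            slot, field = 0, key
        grouped.setdefault(slot, {})[field] = fragments
    return grouped


# The highlight section of the hit shown above.
hit = {
    "highlight": {
        "0_message": ["<em>bonsai</em> <em>tree</em>"],
        "3_message": ["office <em>tree</em>"],
        "1_message": ["new <em>tree</em>"],
    }
}

for slot, fields in sorted(highlights_by_slot(hit).items()):
    print(slot, fields)
```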

Specifying multiple percolate queries

Multiple percolate queries can be specified in a single search request:

GET /my-index/_search
{
"query" : {
"bool" : {
"should" : [
{
"percolate" : {
"field" : "query",
"document" : {
"message" : "bonsai tree"
},
"name": "query1"
}
},
{
"percolate" : {
"field" : "query",
"document" : {
"message" : "tulip flower"
},
"name": "query2"
}
}
]
}
}
}

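The name parameter identifies which percolator document slots belong to which percolate query.

The _percolator_document_slot field name is suffixed with the value of the name parameter. If name is not specified, the field parameter is used instead, which in this case would be ambiguous because both percolate queries use the same field.

The search request above returns a response similar to this (again assuming the query with id 1 is a match query on "bonsai tree"):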

{
"took": 13,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped" : 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.5753642,
"hits": [
{
"_index": "my-index",
"_type": "_doc",
"_id": "1",
"_score": 0.5753642,
"_source": {
"query": {
"match": {
"message": "bonsai tree"
}
}
},
"fields" : {
"_percolator_document_slot_query1" : [0]
}
}
]
}
}
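When several named percolate queries are combined, the slot fields can be separated again by their suffix. A minimal Python sketch, assuming field names of the form _percolator_document_slot_<name> as in the response above; slots_by_name is a hypothetical helper:

```python
SLOT_PREFIX = "_percolator_document_slot"


def slots_by_name(hit):
    """Map each percolate query name to the document slots it matched.

    An unsuffixed slot field (no name given) is stored under the key None.
    """
    result = {}
    for key, slots in hit.get("fields", {}).items():
        if key == SLOT_PREFIX:
            result[None] = slots
        elif key.startswith(SLOT_PREFIX + "_"):
            result[key[len(SLOT_PREFIX) + 1:]] = slots
    return result


# The fields section of the hit shown above.
hit = {"_id": "1", "fields": {"_percolator_document_slot_query1": [0]}}

print(slots_by_name(hit))
```

Here only query1 appears in the result, which shows that the stored query matched the document passed to query1 and not the one passed to query2.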