Geo Distance Aggregation (geo_distance)

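A multi-bucket aggregation that works on geo_point fields and is conceptually very similar to the range aggregation. The user defines a point of origin and a set of distance range buckets. The aggregation evaluates the distance of each document value from the origin and assigns the document to a bucket based on the ranges: a document belongs to a bucket if the distance between the document and the origin falls within that bucket's distance range.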

PUT /museums
{
    "mappings": {
        "_doc": {
            "properties": {
                "location": {
                    "type": "geo_point"
                }
            }
        }
    }
}

POST /museums/_doc/_bulk?refresh
{"index":{"_id":1}}
{"location": "52.374081,4.912350", "name": "NEMO Science Museum"}
{"index":{"_id":2}}
{"location": "52.369219,4.901618", "name": "Museum Het Rembrandthuis"}
{"index":{"_id":3}}
{"location": "52.371667,4.914722", "name": "Nederlands Scheepvaartmuseum"}
{"index":{"_id":4}}
{"location": "51.222900,4.405200", "name": "Letterenhuis"}
{"index":{"_id":5}}
{"location": "48.861111,2.336389", "name": "Musée du Louvre"}
{"index":{"_id":6}}
{"location": "48.860000,2.327000", "name": "Musée d'Orsay"}

POST /museums/_search?size=0
{
    "aggs" : {
        "rings_around_amsterdam" : {
            "geo_distance" : {
                "field" : "location",
                "origin" : "52.3760, 4.894",
                "ranges" : [
                    { "to" : 100000 },
                    { "from" : 100000, "to" : 300000 },
                    { "from" : 300000 }
                ]
            }
        }
    }
}

Response:

{
    ...
    "aggregations": {
        "rings_around_amsterdam" : {
            "buckets": [
                {
                    "key": "*-100000.0",
                    "from": 0.0,
                    "to": 100000.0,
                    "doc_count": 3
                },
                {
                    "key": "100000.0-300000.0",
                    "from": 100000.0,
                    "to": 300000.0,
                    "doc_count": 1
                },
                {
                    "key": "300000.0-*",
                    "from": 300000.0,
                    "doc_count": 2
                }
            ]
        }
    }
}
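
The bucket counts above can be reproduced outside Elasticsearch. The following Python sketch is an illustration only, not Elasticsearch's implementation. It assumes a spherical Earth with a mean radius of 6,371,008.8 m, uses the haversine formula for the distance, and assumes that from is inclusive and to is exclusive, as in the range aggregation. The helper names haversine_m and bucket_counts are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bucket_counts(origin, points, ranges):
    """Count points per range; 'from' is inclusive and 'to' is exclusive (assumption)."""
    counts = [0] * len(ranges)
    for lat, lon in points:
        d = haversine_m(origin[0], origin[1], lat, lon)
        for i, r in enumerate(ranges):
            if r.get("from", 0.0) <= d < r.get("to", math.inf):
                counts[i] += 1
    return counts


# The six museum locations from the bulk request, as (lat, lon) pairs.
museums = [
    (52.374081, 4.912350),  # NEMO Science Museum
    (52.369219, 4.901618),  # Museum Het Rembrandthuis
    (52.371667, 4.914722),  # Nederlands Scheepvaartmuseum
    (51.222900, 4.405200),  # Letterenhuis
    (48.861111, 2.336389),  # Musée du Louvre
    (48.860000, 2.327000),  # Musée d'Orsay
]
ranges = [{"to": 100000}, {"from": 100000, "to": 300000}, {"from": 300000}]

counts = bucket_counts((52.3760, 4.894), museums, ranges)
print(counts)
```

The three Amsterdam museums lie within a couple of kilometers of the origin, Letterenhuis in Antwerp is roughly 130 km away, and the two Paris museums are roughly 430 km away, so the counts line up with the doc_count values 3, 1 and 2 in the response.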

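The field must be of type geo_point. It can also hold an array of geo_point values, in which case all of them are taken into account during aggregation.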

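origin accepts all formats supported by the geo_point type:

  • Object format: { "lat" : 52.3760, "lon" : 4.894 }. This is the safest format, because it is the most explicit about the lat and lon values.
  • String format: "52.3760, 4.894", where the first number is the lat and the second is the lon.
  • Array format: [4.894, 52.3760], which follows the GeoJSON standard, where the first number is the lon and the second is the lat.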
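
The difference in coordinate order between the string and array formats is easy to get wrong. The sketch below normalizes the three formats listed above into a (lat, lon) pair. It is a client-side illustration with a hypothetical helper name, parse_origin, and it covers only these three formats, not everything geo_point may accept.

```python
def parse_origin(origin):
    """Return (lat, lon) from an object-, string-, or array-style origin."""
    if isinstance(origin, dict):
        # Object format: explicit lat and lon keys.
        return float(origin["lat"]), float(origin["lon"])
    if isinstance(origin, str):
        # String format: "lat,lon".
        lat, lon = (float(part) for part in origin.split(","))
        return lat, lon
    if isinstance(origin, (list, tuple)):
        # Array format (GeoJSON order): [lon, lat].
        lon, lat = origin
        return float(lat), float(lon)
    raise TypeError(f"unsupported origin format: {origin!r}")


as_object = parse_origin({"lat": 52.3760, "lon": 4.894})
as_string = parse_origin("52.3760, 4.894")
as_array = parse_origin([4.894, 52.3760])
print(as_object, as_string, as_array)
```

All three calls describe the same point, which is why the object format is the least error-prone choice when writing requests by hand.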

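By default, the distance unit is m (meters), but the following are also accepted: mi (miles), in (inches), yd (yards), km (kilometers), cm (centimeters), and mm (millimeters).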

POST /museums/_search?size=0
{
    "aggs" : {
        "rings" : {
            "geo_distance" : {
                "field" : "location",
                "origin" : "52.3760, 4.894",
                "unit" : "km",
                "ranges" : [
                    { "to" : 100 },
                    { "from" : 100, "to" : 300 },
                    { "from" : 300 }
                ]
            }
        }
    }
}
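
As a sanity check, the kilometer ranges in this request describe the same rings as the meter ranges in the first example. The sketch below converts between units using the standard international definitions (for example, 1 mi = 1609.344 m). These factors are an assumption of this example, not values taken from Elasticsearch, and the names METERS_PER_UNIT and to_meters are hypothetical.

```python
# Meters per unit, using the standard international definitions (assumed).
METERS_PER_UNIT = {
    "m": 1.0,
    "km": 1000.0,
    "cm": 0.01,
    "mm": 0.001,
    "mi": 1609.344,
    "yd": 0.9144,
    "in": 0.0254,
}


def to_meters(value, unit="m"):
    """Convert a distance expressed in `unit` to meters."""
    return value * METERS_PER_UNIT[unit]


ranges_km = [{"to": 100}, {"from": 100, "to": 300}, {"from": 300}]
# Convert every bound of every range from kilometers to meters.
ranges_m = [{bound: to_meters(v, "km") for bound, v in r.items()} for r in ranges_km]
print(ranges_m)
```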

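There are two distance calculation modes: arc (the default) and plane. The arc calculation is the most accurate. The plane calculation is the fastest but the least accurate. Consider using plane when your search context is narrow and spans a small geographical area (about 5 km). For searches across very large areas (for example, cross-continent searches), plane yields larger errors. The distance calculation type is set with the distance_type parameter: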

POST /museums/_search?size=0
{
    "aggs" : {
        "rings" : {
            "geo_distance" : {
                "field" : "location",
                "origin" : "52.3760, 4.894",
                "unit" : "km",
                "distance_type" : "plane",
                "ranges" : [
                    { "to" : 100 },
                    { "from" : 100, "to" : 300 },
                    { "from" : 300 }
                ]
            }
        }
    }
}
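
To get a feel for why plane suits small areas only, the sketch below compares a great-circle distance (haversine) with a flat-Earth equirectangular approximation. Both formulas are stand-ins chosen for illustration: the formulas Elasticsearch uses internally for arc and plane may differ. The Earth radius, the function names, and the New York coordinates used as a far-away point are assumptions of this example.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius in meters


def arc_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def plane_m(lat1, lon1, lat2, lon2):
    """Equirectangular approximation: treats the area around the points as flat."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_M * math.hypot(x, y)


def relative_error(p, q):
    """Relative difference between the planar and great-circle distances."""
    arc = arc_m(*p, *q)
    return abs(plane_m(*p, *q) - arc) / arc


origin = (52.3760, 4.894)
nearby = (52.374081, 4.912350)  # NEMO Science Museum, about 1 km away
far = (40.7128, -74.0060)       # New York City, a hypothetical point not in the index

short_err = relative_error(origin, nearby)
long_err = relative_error(origin, far)
print(f"nearby: {short_err:.2e}, cross-continent: {long_err:.2e}")
```

With these stand-in formulas the planar error is negligible at city scale and grows to several percent across the Atlantic, which is consistent with the guidance to reserve plane for narrow search contexts.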

Keyed Response

Setting the keyed flag to true associates a unique string key with each bucket and returns the ranges as a hash rather than an array:

POST /museums/_search?size=0
{
    "aggs" : {
        "rings_around_amsterdam" : {
            "geo_distance" : {
                "field" : "location",
                "origin" : "52.3760, 4.894",
                "ranges" : [
                    { "to" : 100000 },
                    { "from" : 100000, "to" : 300000 },
                    { "from" : 300000 }
                ],
                "keyed": true
            }
        }
    }
}

Response:

{
    ...
    "aggregations": {
        "rings_around_amsterdam" : {
            "buckets": {
                "*-100000.0": {
                    "from": 0.0,
                    "to": 100000.0,
                    "doc_count": 3
                },
                "100000.0-300000.0": {
                    "from": 100000.0,
                    "to": 300000.0,
                    "doc_count": 1
                },
                "300000.0-*": {
                    "from": 300000.0,
                    "doc_count": 2
                }
            }
        }
    }
}
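
The keyed response carries the same information as the array response, only indexed by bucket key. The sketch below shows that relationship on the client side by turning the array form into the keyed form. The function name to_keyed is hypothetical, and nothing here calls Elasticsearch.

```python
def to_keyed(buckets):
    """Turn a list of buckets into a dict keyed by each bucket's 'key' field."""
    return {
        bucket["key"]: {k: v for k, v in bucket.items() if k != "key"}
        for bucket in buckets
    }


# Buckets as returned by the non-keyed request shown earlier.
buckets = [
    {"key": "*-100000.0", "from": 0.0, "to": 100000.0, "doc_count": 3},
    {"key": "100000.0-300000.0", "from": 100000.0, "to": 300000.0, "doc_count": 1},
    {"key": "300000.0-*", "from": 300000.0, "doc_count": 2},
]

keyed = to_keyed(buckets)
print(keyed["300000.0-*"]["doc_count"])
```

Requesting keyed output from the server avoids this step and makes lookups such as keyed["300000.0-*"] possible directly on the response.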

It is also possible to customize the key for each range:

POST /museums/_search?size=0
{
    "aggs" : {
        "rings_around_amsterdam" : {
            "geo_distance" : {
                "field" : "location",
                "origin" : "52.3760, 4.894",
                "ranges" : [
                    { "to" : 100000, "key": "first_ring" },
                    { "from" : 100000, "to" : 300000, "key": "second_ring" },
                    { "from" : 300000, "key": "third_ring" }
                ],
                "keyed": true
            }
        }
    }
}

Response:

{
    ...
    "aggregations": {
        "rings_around_amsterdam" : {
            "buckets": {
                "first_ring": {
                    "from": 0.0,
                    "to": 100000.0,
                    "doc_count": 3
                },
                "second_ring": {
                    "from": 100000.0,
                    "to": 300000.0,
                    "doc_count": 1
                },
                "third_ring": {
                    "from": 300000.0,
                    "doc_count": 2
                }
            }
        }
    }
}