Elasticsearch 7.x
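Introduction
- Elasticsearch is an open-source RESTful search engine built on the Apache Lucene library.
- Elasticsearch was released a few years after Solr. It provides a distributed, multi-tenant full-text search engine with an HTTP web interface (REST) and schema-free JSON documents. Official client libraries are available for Java, Groovy, PHP, Ruby, Perl, Python, .NET, and JavaScript.
Official website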
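Core concepts
- Index
  - An index is roughly analogous to a database in a relational system
- Type
  - A type is like a category of table, for example a user table or an order table
  - Note:
    - In ES 5.x an index can contain multiple types
    - In ES 6.x an index can contain only one type
    - In ES 7.x types are deprecated: the APIs are typeless and _doc is the only remaining type name; types are removed entirely in 8.x
- Mapping
  - A mapping defines the type and other properties of each field. It corresponds to a table schema in a relational database
- Document
  - A document corresponds to a row in a relational database
- Field
  - A field corresponds to a column of a relational table
- Cluster
  - A cluster consists of one or more nodes and has the default name "elasticsearch"
- Node
  - A member of a cluster: one running Elasticsearch process, typically one per machine
- Shards and replicas
  - Shards come in two kinds: primary shards and replica shards. A replica is a copy of a primary shard
  - The data of an index is physically distributed across several primary shards, and each primary shard holds only part of the data
  - Each primary shard can have several copies, called replica shards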
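The shard bullets above can be made concrete with a small sketch. It assumes the commonly documented routing rule (hash of the routing value, which defaults to the document id, modulo the number of primary shards). The function names are hypothetical, and `zlib.crc32` is only a stand-in for the Murmur3 hash that Elasticsearch uses internally, so the shard numbers printed here will not match a real cluster.

```python
import zlib


def pick_primary_shard(routing: str, number_of_primary_shards: int) -> int:
    """Map a routing value (by default the document id) to a primary shard number."""
    # Stand-in hash for illustration only; Elasticsearch uses Murmur3 internally.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


def total_shard_copies(number_of_primary_shards: int, number_of_replicas: int) -> int:
    """Every primary shard has number_of_replicas copies in addition to itself."""
    return number_of_primary_shards * (1 + number_of_replicas)


# Each document lands on exactly one primary shard, so every shard holds part of the data.
placement = {doc_id: pick_primary_shard(doc_id, 3) for doc_id in ["1", "2", "3", "4"]}
print(placement)
print(total_shard_copies(3, 1))
```

Because the shard number depends on the primary shard count, that count cannot be changed freely after the index is created, while the replica count can.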
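Field types
Core data types
Category | Type | Description |
---|---|---|
String | text | Used for full-text indexing; values of this type are split into terms by an analyzer |
String | keyword | Not analyzed; can only be matched on the complete value of the field |
Numeric | long, integer, short, byte, double, float, half_float, scaled_float | - |
Boolean | boolean | - |
Binary | binary | The value is treated as a Base64-encoded string; by default it is not stored and is not searchable |
Range | integer_range, float_range, long_range, double_range, date_range | A range type represents a range of values rather than a single value. For example, if age is an integer_range, a value can be {"gte" : 20, "lte" : 40}, and the query "term" : {"age": 21} matches it |
Date | date | JSON has no date type, so ES decides whether a string is a date by checking it against the formats defined in format. The default format is strict_date_optional_time||epoch_millis. Values can be formatted strings such as "2022-01-01" or "2022/01/01 12:10:30" (the latter needs a matching custom format), or the number of milliseconds since the epoch (1970-01-01 00:00 UTC) |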
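To see how the types in the table fit together, here is a minimal sketch of a mapping body built as a Python dict. The field names (including `age_limit`) and the custom date format are assumptions made for illustration; only the type names come from the table.

```python
import json

# Hypothetical field names; the type names are the ones listed in the table above.
mapping = {
    "properties": {
        "name": {"type": "text"},             # analyzed, full-text searchable
        "sex": {"type": "keyword"},           # exact-value matching only
        "age": {"type": "integer"},
        "age_limit": {"type": "integer_range"},
        # A custom format so that both "2022-01-01" and "2022/01/01 12:10:30" are accepted.
        "birthday": {
            "type": "date",
            "format": "yyyy-MM-dd||yyyy/MM/dd HH:mm:ss||epoch_millis",
        },
    }
}


def field_types(body):
    """Return a field name -> type name view of a mapping body."""
    return {field: spec["type"] for field, spec in body["properties"].items()}


print(json.dumps(mapping, indent=2))
print(field_types(mapping))
```

The printed JSON is the kind of body that a PUT to an index's _mapping endpoint expects.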
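Complex data types
- Array type
  - ES has no dedicated array type; use [] directly. All values in an array must be of the same data type; mixed-type arrays are not supported
  - String array [ "one", "two" ], integer array [ 1, 2 ]
  - Array of objects [ { "name": "Louis", "age": 18 }, { "name": "Daniel", "age": 17 }]
  - For example, [ 10, "some string" ] is invalid because it mixes types
- Object type
  - An object may contain inner objects
{ "name": "李蒙", "age": 14, "sex": "0", "class": "7(2)班", "birthday": "2005-10-15", "hobbies": [ "阅读", "跑步" ], "address": { "province": "山东", "location": { "city": "日照" } } }
- Specialized data types
  - IP type
An ip field stores IPv4 or IPv6 addresses (early Elasticsearch versions stored IPv4 addresses internally as long values)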
Indices
Operation | Method | URL | Parameters |
---|---|---|---|
Create | PUT (required) | localhost:9200/stu | - |
Get | GET | localhost:9200/stu | - |
Delete | DELETE | localhost:9200/stu | - |
Get multiple | GET | localhost:9200/stu,tea | - |
Get all (1) | GET | localhost:9200/_all | - |
Get all (2) | GET | localhost:9200/_cat/indices?v | - |
Exists | HEAD | localhost:9200/stu | - |
Close | POST | localhost:9200/stu/_close | - |
Open | POST | localhost:9200/stu/_open | - |
Auto-create index | PUT | localhost:9200/_cluster/settings | see below |
Copy data (reindex) | POST | localhost:9200/_reindex | see below |
- Create
PUT localhost:9200/stu
// Response
{ "acknowledged": true, "shards_acknowledged": true, "index": "stu" }
- Get
GET localhost:9200/stu
// Response: aliases, mappings, and settings (creation time, number of primary shards, number of replicas)
{ "stu": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139082806", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "-ocQkbgoSyG2vDTsugK_9Q", "version": { "created": "7020099" }, "provided_name": "stu" } } } }
- Delete
DELETE localhost:9200/stu
// Response
{ "acknowledged": true }
- Get multiple
GET localhost:9200/stu,tea
// Response
{ "stu": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139586417", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "H9dyTutEQg-4OsV2Byt-gA", "version": { "created": "7020099" }, "provided_name": "stu" } } }, "tea": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139593175", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "nYhKuggbT_Wa2RI-M_COGA", "version": { "created": "7020099" }, "provided_name": "tea" } } } }
- Get all (1)
GET localhost:9200/_all
// Response
{ "stu": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139586417", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "H9dyTutEQg-4OsV2Byt-gA", "version": { "created": "7020099" }, "provided_name": "stu" } } }, "tea": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139593175", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "nYhKuggbT_Wa2RI-M_COGA", "version": { "created": "7020099" }, "provided_name": "tea" } } } }
- Get all (2)
GET localhost:9200/_cat/indices?v
// Response
health status index                uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana_task_manager IjZxE0H9TtmTrpBgzjr-qg   1   0          2            0     12.8kb         12.8kb
yellow open   stu                  H9dyTutEQg-4OsV2Byt-gA   1   1          0            0       283b           283b
yellow open   tea                  nYhKuggbT_Wa2RI-M_COGA   1   1          0            0       283b           283b
- Exists
HEAD localhost:9200/stu
// Response when the index exists (404 when it does not)
200 OK
- Close
POST localhost:9200/stu/_close
// Response
{ "acknowledged": true, "shards_acknowledged": true }
- Open
POST localhost:9200/stu/_open
// Response
{ "acknowledged": true, "shards_acknowledged": true }
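- Auto-create index
Controls whether an index is created automatically when a document is inserted into an index that does not exist yet (document insertion is covered below)
Send GET http://localhost:9200/_cluster/settings to check the state of action.auto_create_index; true means indices are created automatically
  - Change the state of action.auto_create_index
PUT localhost:9200/_cluster/settings
// Parameters (the value is "true" or "false")
{ "persistent": { "action.auto_create_index": "true" } }
- Copy data (combined with index aliases, this makes it possible to rebuild an index and import the existing data)
POST localhost:9200/_reindex
{ "source": { "index": "stu" }, "dest": { "index": "stu_oth" } }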
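Index aliases
During development, as business requirements evolve, older business logic has to be updated or even refactored. For ES, adapting to the new logic may require changes to existing indices, such as adjusting some fields or even rebuilding an index. These operations can affect the business and may even require downtime. ES provides index aliases to address this. An index alias works like a shortcut or symbolic link: it can point to one or more indices and can be used in any API that expects an index name. Aliases give applications a great deal of flexibility
Several indices can share the same alias, and one index can have several aliases
Operation | Method | URL | Parameters |
---|---|---|---|
Query | GET | localhost:9200/_alias; localhost:9200/stu/_alias | - |
Add | POST | localhost:9200/_aliases | see below |
Add | PUT | localhost:9200/stu/_alias/stu_v1.0 | - |
Remove | POST | localhost:9200/_aliases | see below |
Remove | DELETE | localhost:9200/stu/_alias/stu_v1.0 | - |
Rename | POST | localhost:9200/_aliases | see below |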
- Add
POST localhost:9200/_aliases
{ "actions": [ { "add": { "index": "stu", "alias": "stu_1214" } } ] }
- Remove
POST localhost:9200/_aliases
{ "actions": [ { "remove": { "index": "stu", "alias": "stu_v1.1" } } ] }
- Rename
POST localhost:9200/_aliases
{ "actions": [ { "remove": { "index": "stu", "alias": "stu_1214" } }, { "add": { "index": "stu", "alias": "stu_1215" } } ] }
- When an alias points to multiple indices, you can designate which index receives writes
POST localhost:9200/_aliases
{ "actions": [ { "add": { "index": "stu", "alias": "alia_v1.0", "is_write_index": true } }, { "add": { "index": "tea", "alias": "alia_v1.0" } } ] }
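The zero-downtime scenario described at the start of this section is usually handled by repointing an alias from an old index to a rebuilt one. The following is a minimal sketch of the request body for that step; the helper name and the index names `stu_v1` and `stu_v2` are hypothetical. Both actions travel in a single _aliases request, so clients using the alias do not observe a moment where it points nowhere.

```python
import json


def swap_alias_actions(alias, old_index, new_index):
    """Build an _aliases body that moves `alias` from old_index to new_index in one request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: stu_v1 is the old index, stu_v2 the rebuilt one, stu the alias clients use.
body = swap_alias_actions("stu", "stu_v1", "stu_v2")
print(json.dumps(body, indent=2))
```

A typical sequence would be: create the new index with the new mapping, copy the data with _reindex, then POST this body to _aliases.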
Mappings
Operation | Method | URL | Parameters |
---|---|---|---|
Create | PUT | localhost:9200/stu/_mapping | see below |
Get | GET | localhost:9200/stu/_mapping | - |
Get multiple | GET | localhost:9200/stu,tea/_mapping | - |
Get all (1) | GET | localhost:9200/_mapping | - |
Get all (2) | GET | localhost:9200/_all/_mapping | - |
Update | PUT | localhost:9200/stu/_mapping | see below |
- Create
PUT localhost:9200/stu/_mapping
// Parameters
{ "properties": { "name": { "type": "text" }, "age": { "type": "long" }, "sex": { "type": "keyword" }, "class": { "type": "keyword" } } }
- Get
GET localhost:9200/stu/_mapping
// Response
{ "stu": { "mappings": { "properties": { "age": { "type": "long" }, "class": { "type": "keyword" }, "name": { "type": "text" }, "sex": { "type": "keyword" } } } } }
- Get multiple
GET localhost:9200/stu,tea/_mapping
// Response
{ "tea": { "mappings": {} }, "stu": { "mappings": { "properties": { "age": { "type": "long" }, "class": { "type": "keyword" }, "name": { "type": "text" }, "sex": { "type": "keyword" } } } } }
- Get all (1)
GET localhost:9200/_mapping
// Response
{ "stu": { "mappings": { "properties": { "age": { "type": "long" }, "class": { "type": "keyword" }, "name": { "type": "text" }, "sex": { "type": "keyword" } } } }, "tea": { "mappings": {} } }
- Get all (2)
GET localhost:9200/_all/_mapping
// Response
{ "stu": { "mappings": { "properties": { "age": { "type": "long" }, "class": { "type": "keyword" }, "name": { "type": "text" }, "sex": { "type": "keyword" } } } }, "tea": { "mappings": {} } }
- Update
Note:
When updating a mapping you can only add new fields; the type of an existing field cannot be changed, and existing fields cannot be deleted
PUT localhost:9200/stu/_mapping
// Parameters
{ "properties": { "name": { "type": "text" }, "age": { "type": "long" }, "sex": { "type": "keyword" }, "class": { "type": "keyword" }, "birthday": { "type": "date" } } }
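The add-only rule can be checked on the client side before sending the request. The sketch below is a hypothetical helper, not part of any Elasticsearch client; it compares the properties of the current mapping with a proposed one and separates new fields from fields whose definition differs. The comparison is deliberately conservative, since Elasticsearch itself performs the authoritative validation and does allow a few parameters of existing fields to be updated.

```python
def split_mapping_update(current, proposed):
    """Split proposed properties into brand-new fields and fields that differ from the current mapping."""
    added = {f: spec for f, spec in proposed.items() if f not in current}
    changed = {f: spec for f, spec in proposed.items() if f in current and spec != current[f]}
    return added, changed


current = {
    "name": {"type": "text"},
    "age": {"type": "long"},
    "sex": {"type": "keyword"},
    "class": {"type": "keyword"},
}
proposed = dict(current, birthday={"type": "date"})

added, changed = split_mapping_update(current, proposed)
print(added)    # fields that are safe to add
print(changed)  # fields that would need a new index plus _reindex

# Changing the type of an existing field shows up in `changed`.
_, conflicts = split_mapping_update(current, {"age": {"type": "keyword"}})
print(conflicts)
```

When `changed` is not empty, the usual route is the one from the alias section: create a new index, reindex, and repoint the alias.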
Documents
Operation | Method | URL | Parameters |
---|---|---|---|
Create (with id) | PUT | localhost:9200/stu/_doc/1 | see below |
Create (without id) | POST (required) | localhost:9200/stu/_doc | see below |
Specify the operation type | PUT | localhost:9200/stu/_doc/1?op_type=create | see below |
Get | GET | localhost:9200/stu/_doc/1 | - |
Get multiple documents | POST | localhost:9200/_mget | see below |
Update | POST | localhost:9200/stu/_update/1 | see below |
Delete | DELETE | localhost:9200/stu/_doc/1 | - |
Delete all | POST | localhost:9200/stu/_delete_by_query | see below |
- Create (with id)
PUT localhost:9200/stu/_doc/1
// Parameters
{ "name": "杨光", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" }
// Response
{ "_index": "stu", "_type": "_doc", "_id": "1", "_version": 1, "result": "created", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 0, "_primary_term": 3 }
- Create (without id)
When no id is given, the system assigns one automatically
POST localhost:9200/stu/_doc
// Parameters
{ "name": "张世杰", "age": 13, "sex": "0", "class": "7(5)班", "birthday": "2004-11-01" }
// Response (_id is the id assigned by the system)
{ "_index": "stu", "_type": "_doc", "_id": "C_SI-W4Bj7nk6pLmw4Er", "_version": 1, "result": "created", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 1, "_primary_term": 3 }
- Specify the operation type
If no operation type is given, writing to an id that already exists overwrites the original document and produces a new version
PUT localhost:9200/stu/_doc/1
// Parameters
{ "name": "杨光11", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" }
// Response: _version is incremented and result is "updated" rather than "created"
{ "_index": "stu", "_type": "_doc", "_id": "1", "_version": 2, "result": "updated", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 2, "_primary_term": 3 }
PUT localhost:9200/stu/_doc/1?op_type=create (writing to an id that already exists returns an error)
// Parameters
{ "name": "杨光22", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" }
// Response
{ "error": { "root_cause": [ { "type": "version_conflict_engine_exception", "reason": "[1]: version conflict, document already exists (current version [2])", "index_uuid": "H9dyTutEQg-4OsV2Byt-gA", "shard": "0", "index": "stu" } ], "type": "version_conflict_engine_exception", "reason": "[1]: version conflict, document already exists (current version [2])", "index_uuid": "H9dyTutEQg-4OsV2Byt-gA", "shard": "0", "index": "stu" }, "status": 409 }
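The difference between the default operation type and op_type=create can be modeled in a few lines. This is a toy in-memory model written for illustration (the class and exception names are hypothetical); it mimics only the version counter and the conflict check, not the real storage engine.

```python
class VersionConflict(Exception):
    """Raised when op_type=create targets an id that already exists."""


class ToyIndex:
    def __init__(self):
        self.docs = {}  # id -> (version, source)

    def put(self, doc_id, source, op_type="index"):
        exists = doc_id in self.docs
        if op_type == "create" and exists:
            current_version = self.docs[doc_id][0]
            raise VersionConflict(
                f"[{doc_id}]: version conflict, document already exists "
                f"(current version [{current_version}])"
            )
        # The default "index" operation overwrites and bumps the version.
        version = self.docs[doc_id][0] + 1 if exists else 1
        self.docs[doc_id] = (version, source)
        return {"_id": doc_id, "_version": version, "result": "updated" if exists else "created"}


index = ToyIndex()
first = index.put("1", {"name": "杨光"})
second = index.put("1", {"name": "杨光11"})
print(first, second)

try:
    index.put("1", {"name": "杨光22"}, op_type="create")
    conflict_message = None
except VersionConflict as exc:
    conflict_message = str(exc)
print(conflict_message)
```

In the real API the conflict is reported as HTTP status 409, as in the response above.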
- Get
GET localhost:9200/stu/_doc/1
// Response
{ "_index": "stu", "_type": "_doc", "_id": "1", "_version": 2, "_seq_no": 2, "_primary_term": 3, "found": true, "_source": { "name": "杨光11", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" } }
- Get multiple documents
Note: in 7.x, specifying _type in multi-get requests is deprecated; the typeless form is POST localhost:9200/stu/_mget with only _id in each docs entry
  - Approach 1
POST localhost:9200/_mget
// Parameters
{ "docs": [ { "_index": "stu", "_type": "_doc", "_id": "1" }, { "_index": "stu", "_type": "_doc", "_id": "C_SI-W4Bj7nk6pLmw4Er" }] }
// Response
{ "docs": [ { "_index": "stu", "_type": "_doc", "_id": "1", "_version": 3, "_seq_no": 3, "_primary_term": 3, "found": true, "_source": { "name": "杨光33", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" } }, { "_index": "stu", "_type": "_doc", "_id": "C_SI-W4Bj7nk6pLmw4Er", "_version": 1, "_seq_no": 1, "_primary_term": 3, "found": true, "_source": { "name": "张世杰", "age": 13, "sex": "0", "class": "7(5)班", "birthday": "2004-11-01" } } ] }
  - Approach 2
POST localhost:9200/stu/_mget
// Parameters
{ "docs": [ { "_type": "_doc", "_id": "1" }, { "_type": "_doc", "_id": "C_SI-W4Bj7nk6pLmw4Er" }] }
  - Approach 3
POST localhost:9200/stu/_doc/_mget
// Parameters
{ "docs": [ { "_id": "1" }, { "_id": "C_SI-W4Bj7nk6pLmw4Er" }] }
- Update
  - Update using a partial document
POST localhost:9200/stu/_update/1
// Parameters
{ "doc": { "name": "杨光33", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" } }
  - Add a field to _source
POST localhost:9200/stu/_update/1
// Parameters
{ "script": "ctx._source.height = \"173cm\"" }
  - Remove a field from _source
POST localhost:9200/stu/_update/1
// Parameters
{ "script": "ctx._source.remove(\"height\")" }
- Delete
DELETE localhost:9200/stu/_doc/1
// Response
{ "_index": "stu", "_type": "_doc", "_id": "1", "_version": 6, "result": "deleted", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 6, "_primary_term": 3 }
- Delete all
POST localhost:9200/stu/_delete_by_query
// Parameters
{ "query": { "match_all": { } } }
Query and search
Data preparation: bulk import. ES provides the bulk API for batch operations
- Data (one JSON object per line; the file must end with a newline)
{"index": {"_index": "stu", "_type": "_doc", "_id": 1}}
{"name":"杨光","age":14,"sex":"1","class":"7(2)班","birthday":"2005-08-26"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 2}}
{"name":"张世杰","age":13,"sex":"0","class":"7(5)班","birthday":"2004-11-01"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 3}}
{"name":"李蒙","age":14,"sex":"0","class":"7(2)班","birthday":"2005-10-15"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 4}}
{"name":"李沁","age":15,"sex":"0","class":"7(3)班","birthday":"2004-10-15"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 5}}
{"name":"王昭","age":14,"sex":"1","class":"7(3)班","birthday":"2005-01-26"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 6}}
{"name":"李明","age":14,"sex":"1","class":"7(2)班","birthday":"2005-03-26"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 7}}
{"name":"张璐","age":14,"sex":"1","class":"7(5)班","birthday":"2005-06-02"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 8}}
{"name":"李思敏","age":14,"sex":"1","class":"7(3)班","birthday":"2005-06-02"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 9}}
{"name":"吴民锡","age":13,"sex":"1","class":"7(5)班","birthday":"2006-04-02"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 10}}
{"name":"赵曦","age":14,"sex":"0","class":"7(2)班","birthday":"2005-09-02"}
- POST bulk (name is the file that holds the data above)
curl -X POST "localhost:9200/_bulk" -H 'Content-Type: application/json' --data-binary @name
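The bulk payload is newline-delimited JSON: an action line followed by a source line for each document, with a newline after the last line. Writing that file by hand is error-prone, so here is a minimal sketch that generates it. The helper name is hypothetical, and the sketch omits `_type` from the action line, which is an assumption based on the typeless style of 7.x.

```python
import json


def to_bulk_ndjson(index, docs):
    """Render (id, source) pairs as the NDJSON body expected by the _bulk endpoint."""
    lines = []
    for doc_id, source in docs:
        # Action line first, then the document itself on the following line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}, ensure_ascii=False))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    (1, {"name": "杨光", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26"}),
    (2, {"name": "张世杰", "age": 13, "sex": "0", "class": "7(5)班", "birthday": "2004-11-01"}),
]
payload = to_bulk_ndjson("stu", docs)
print(payload, end="")
```

Writing `payload` to a file with UTF-8 encoding gives something that can be passed to the curl command above through --data-binary.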
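Term queries
Term-level queries: a term query does not analyze the search input, so a document matches only when a term in the index is exactly equal to the query value. These queries are typically used for structured data such as number, date, and keyword fields rather than text fields. In other words, a full-text query analyzes the query text before searching, whereas a term-level query looks up the exact value directly in the inverted index of the field. Term-level queries are generally used on numeric, date, and similar fields.
Operation | Method | URL | Parameters | Description |
---|---|---|---|---|
Single-term query | POST | localhost:9200/stu/_search | see below | - |
Multi-term query | POST | localhost:9200/stu/_search | see below | - |
Exists Query | POST | localhost:9200/stu/_search | see below | Finds documents that have a non-null value in the given field |
Prefix Query | POST | localhost:9200/stu/_search | see below | Finds documents containing terms with the given prefix |
Wildcard Query | POST | localhost:9200/stu/_search | see below | Wildcard matching: * matches any sequence of characters (including none), ? matches any single character |
Regexp Query | POST | localhost:9200/stu/_search | see below | Regular expression query |
Ids Query | POST | localhost:9200/stu/_search | see below | Finds documents by id |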
- Single-term query
POST localhost:9200/stu/_search
// Parameters
{ "query":{ "term":{ "sex": "1" } } }
// Response
{ "took": 1, "timed_out": false, "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 }, "hits": { "total": { "value": 1, "relation": "eq" }, "max_score": 0.9808292, "hits": [ { "_index": "stu", "_type": "_doc", "_id": "1", "_score": 0.9808292, "_source": { "name": "杨光", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" } } ] } }
- Multi-term query
POST localhost:9200/stu/_search
// Parameters
{ "query":{ "terms":{ "sex": ["0","1"] } } }
- Exists Query
POST localhost:9200/stu/_search
// Parameters
{ "query": { "exists": { "field": "birthday" } } }
- Prefix Query
POST localhost:9200/stu/_search
// Parameters
{ "query": { "prefix": { "class": { "value": "7" } } } }
- Wildcard Query
POST localhost:9200/stu/_search
// Parameters
{ "query": { "wildcard": { "class": { "value": "*2*" } } } }
- Regexp Query
POST localhost:9200/stu/_search
// Parameters
{ "query": { "regexp": { "class": "7.*" } } }
- Ids Query
POST localhost:9200/stu/_search
// Parameters
{ "query": { "ids": { "values": [1,2] } } }
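Why a term query behaves differently from a full-text query on a text field can be shown with a toy inverted index. The sketch below assumes an analyzer that emits one token per character, which approximates how the standard analyzer treats Chinese text; all function names are hypothetical and the model ignores scoring entirely.

```python
def analyze(text):
    """Toy analyzer: one lowercase token per non-space character."""
    return [ch.lower() for ch in text if not ch.isspace()]


def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    inverted = {}
    for doc_id, text in docs.items():
        for token in analyze(text):
            inverted.setdefault(token, set()).add(doc_id)
    return inverted


def term_query(inverted, term):
    # The query value is looked up as-is, with no analysis.
    return inverted.get(term, set())


def match_query(inverted, text):
    # The query text is analyzed first; a document matches if it contains any token.
    hits = set()
    for token in analyze(text):
        hits |= inverted.get(token, set())
    return hits


names = {"2": "张世杰", "7": "张璐", "4": "李沁"}
inverted = build_inverted_index(names)

whole_value = term_query(inverted, "张世杰")   # the full name is not a token in the index
single_token = term_query(inverted, "张")
analyzed = match_query(inverted, "张世杰")
print(whole_value, single_token, analyzed)
```

This is the reason term-level queries are better suited to keyword, numeric, and date fields, where the indexed value is the whole value.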
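Full-text queries
Elasticsearch first analyzes the query string and splits it into terms. A document matches and is returned if the analyzed field contains any of the terms (or all of them, depending on the operator); if no document contains any of the terms, nothing matches
Type | Method | URL | Parameters | Description |
---|---|---|---|---|
match_all | POST | localhost:9200/stu/_search | see below | Return all documents |
match | POST | localhost:9200/stu/_search | see below | Analyzed (tokenized) matching |
multi_match | POST | localhost:9200/stu/_search | see below | Matching across several fields |
match_phrase | POST | localhost:9200/stu/_search | see below | Phrase matching: all terms must appear, in order |
match_phrase_prefix | POST | localhost:9200/stu/_search | see below | Phrase matching where the last term is treated as a prefix (text fields) |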
- match_all
POST localhost:9200/stu/_search
// Parameters
{ "query":{ "match_all":{} }, "from": 0, "size": 10 }
- match
POST localhost:9200/stu/_search
// Parameters
{ "query": { "match": { "name": "张" } } }
- multi_match
POST localhost:9200/stu/_search
// Parameters: query is the search text, fields lists the fields to search
{ "query": { "multi_match": { "query": "世", "fields": ["name","class"] } } }
- match_phrase
POST localhost:9200/stu/_search
// Parameters
{ "query": { "match_phrase": { "class": "7(2)班" } } }
- match_phrase_prefix
POST localhost:9200/stu/_search
// Parameters
{ "query": { "match_phrase_prefix": { "name": "世杰" } } }
Range queries
Range queries work on dates, numbers, or strings
POST localhost:9200/stu/_search
// Students aged 14 to 15
{
"query": {
"range": {
"age": {
"gte": 14,
"lte": 15
}
}
}
}
// Students born between 2003 and 2004
{
"query": {
"range": {
"birthday": {
"gte": "2003",
"lte": "31-12-2004",
"format": "dd-MM-yyyy||yyyy"
}
}
}
}
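Bool queries
Type | Method | URL | Parameters | Description |
---|---|---|---|---|
must | POST | localhost:9200/stu/_search | see below | The clause must match; it contributes to the score |
filter | POST | localhost:9200/stu/_search | see below | The clause must match, but it does not contribute to the score |
must_not | POST | localhost:9200/stu/_search | see below | The clause must not match |
should | POST | localhost:9200/stu/_search | see below | The clause should match; matching raises the score |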
- must
POST localhost:9200/stu/_search
// Students whose sex is "0" and whose name contains "曦"
{ "query": { "bool": { "must": [ { "match": { "name": "曦" } }, { "term": { "sex": { "value": "0" } } } ] } } }
// Response (_score is the relevance score)
{ "took" : 1, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 2.9985561, "hits" : [ { "_index" : "stu", "_type" : "_doc", "_id" : "10", "_score" : 2.9985561, "_source" : { "name" : "赵曦", "age" : 14, "sex" : "0", "class" : "7(2)班", "birthday" : "2005-09-02" } } ] } }
- filter
Same effect as must, but without scoring
POST localhost:9200/stu/_search
{ "query": { "bool": { "filter": [ { "match": { "name": "曦" } }, { "term": { "sex": { "value": "0" } } } ] } } }
- must_not
POST localhost:9200/stu/_search
// Students whose name contains "张" and whose sex is not "0"
{ "query": { "bool": { "must": [ { "match": { "name": "张" } } ], "must_not": [ { "term": { "sex": { "value": "0" } } } ] } } }
- should
POST localhost:9200/stu/_search
// Students whose sex is "1"
{ "query": { "bool": { "should": [ { "term": { "sex": { "value": "1" } } } ] } } }
When should is combined with other clauses such as must, documents that do not match the should clause are still returned; they only score lower
// Students whose name contains "李"; those aged 13 to 14 score higher
{ "query": { "bool": { "must": [ { "match": { "name": "李" } } ], "should": [ { "range": { "age": { "gte": 13, "lte": 14 } } } ] } } }
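Bool bodies get deeply nested quickly, so a small builder can help when they are generated from code. The following is a minimal sketch with a hypothetical helper name; it assembles the four clause types and leaves out the ones that are empty.

```python
import json


def bool_query(must=(), filter_=(), must_not=(), should=()):
    """Assemble a bool query body, leaving out clause types that are empty."""
    clauses = {"must": must, "filter": filter_, "must_not": must_not, "should": should}
    return {"query": {"bool": {name: list(items) for name, items in clauses.items() if items}}}


# Same query as the must_not example above: name contains "张", sex is not "0".
body = bool_query(
    must=[{"match": {"name": "张"}}],
    must_not=[{"term": {"sex": {"value": "0"}}}],
)
print(json.dumps(body, ensure_ascii=False))
```

Moving a clause from `must` to `filter_` keeps the result set the same while dropping its contribution to the score, which matches the must/filter distinction in the table.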
Sorting
POST localhost:9200/stu/_search
// Students in class 7(5)班, sorted by age in descending order
{
"query": {
"term": {
"class": {
"value": "7(5)班"
}
}
},
"sort": [
{
"age": {
"order": "desc"
}
}
]
}
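Aggregations
- Aggregation analysis is an important database feature: it computes aggregate values over the data set returned by a query, such as the maximum or minimum of a field (or of a computed expression), a sum, or an average. As a search engine that also serves as a database, ES provides powerful aggregation capabilities too
- Aggregations that compute metrics such as max, min, sum, and average over a data set are called metric aggregations in ES
- Besides aggregate functions, relational databases can also group query results with group by and then compute metrics per group. In ES this is called a bucket aggregation
Metric aggregations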
- max min sum avg
POST localhost:9200/stu/_search
// max: maximum age in class 7(3)班 (maxAge is a user-defined aggregation name)
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "maxAge": { "max": { "field": "age" } } }, "size": 0 }
// min: minimum age in class 7(3)班
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "minAge": { "min": { "field": "age" } } }, "size": 0 }
// sum: total of the ages in class 7(3)班
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "sumAge": { "sum": { "field": "age" } } }, "size": 0 }
// avg: average age in class 7(3)班 (avgAge is a user-defined aggregation name)
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "avgAge": { "avg": { "field": "age" } } }, "size": 0 }
- value_count
Counts the documents in which the field has a value
POST localhost:9200/stu/_search
// Number of students in class 7(3)班 whose age is not empty
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "countAge": { "value_count": { "field": "age" } } }, "size": 0 }
- Cardinality
Counts distinct values
POST localhost:9200/stu/_search
// Number of distinct ages in class 7(3)班
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "cardinalityAge": { "cardinality": { "field": "age" } } }, "size": 0 }
- stats
Returns five values: count, max, min, avg, and sum
POST localhost:9200/stu/_search
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "statsAge": { "stats": { "field": "age" } } }, "size": 0 }
- Extended stats
Returns four more results than stats: sum of squares, variance, standard deviation, and the interval of the mean plus/minus two standard deviations
POST localhost:9200/stu/_search
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "extendedAge": { "extended_stats": { "field": "age" } } }, "size": 0 }
- Percentiles
Returns the values at given percentiles; by default the percentiles are [ 1, 5, 25, 50, 75, 95, 99 ]
POST localhost:9200/stu/_search
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "percentilesAge": { "percentiles": { "field": "age" } } }, "size": 0 }
// With explicit percentiles
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "percentilesAge": { "percentiles": { "field": "age", "percents": [ 20, 50, 75 ] } } }, "size": 0 }
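To make the metric aggregations concrete, the sketch below computes the same figures in plain Python for the three students of class 7(3)班 from the bulk data (ages 15, 14, 14). It is a local cross-check rather than a call to Elasticsearch, and it assumes that the variance reported by extended_stats is the population variance.

```python
import statistics

ages = [15, 14, 14]  # ages of the three 7(3)班 students in the bulk data

# What the stats aggregation reports.
stats = {
    "count": len(ages),
    "min": min(ages),
    "max": max(ages),
    "avg": sum(ages) / len(ages),
    "sum": sum(ages),
}

# What the cardinality aggregation reports (exact here; ES uses an approximation on large sets).
cardinality = len(set(ages))

# Extra figures in the spirit of extended_stats.
sum_of_squares = sum(a * a for a in ages)
variance = statistics.pvariance(ages)
std_deviation = statistics.pstdev(ages)
bounds = (stats["avg"] - 2 * std_deviation, stats["avg"] + 2 * std_deviation)

print(stats, cardinality, sum_of_squares, variance, std_deviation, bounds)
```

Running the aggregation requests above against the sample data and comparing the numbers is a quick way to confirm that the query filter selects the intended documents.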
Bucket aggregations
- Terms Aggregation: group by field value
POST localhost:9200/stu/_search
// Class 7(3)班 grouped by age
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "aggsAge": { "terms": { "field": "age", "size": 5 } } }, "size": 0 }
- order: sorting the buckets
POST localhost:9200/stu/_search
// Class 7(3)班 grouped by age, buckets sorted by age from high to low
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "aggsAge": { "terms": { "field": "age", "size": 5, "order": { "_key": "desc" } } } }, "size": 0 }
// Class 7(3)班 grouped by age, buckets sorted by document count from high to low
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "aggsAge": { "terms": { "field": "age", "size": 5, "order": { "_count": "desc" } } } }, "size": 0 }
// Grouped by class, buckets sorted by the average age of each group in descending order
{ "aggs": { "aggsClass": { "terms": { "field": "class", "size": 10, "order": { "aggsAge": "desc" } }, "aggs": { "aggsAge": { "avg": { "field": "age" } } } } }, "size": 0 }
- Filtering the buckets
POST localhost:9200/stu/_search
// include lists the keys to keep, exclude lists the keys to drop
{ "aggs": { "aggsClass": { "terms": { "field": "class", "include": ["7(3)班", "7(2)班", "7(5)班"], "exclude": ["7(5)班"], "size": 10, "order": { "aggsAge": "desc" } }, "aggs": { "aggsAge": { "avg": { "field": "age" } } } } }, "size": 0 }
// Regular expression matching; include and exclude must be of the same kind, and parentheses must be escaped to match literally
{ "aggs": { "aggsClass": { "terms": { "field": "class", "include": "7.*", "exclude": "7\\(5\\)班", "size": 10, "order": { "aggsAge": "desc" } }, "aggs": { "aggsAge": { "avg": { "field": "age" } } } } }, "size": 0 }
- Range Aggregation: group by range
POST localhost:9200/stu/_search
// Buckets: below 13, 13 to 14, 15 and above (from is inclusive, to is exclusive)
{ "aggs": { "aggsrange": { "range": { "field": "age", "ranges": [ { "to": 13 }, { "from": 13, "to": 15 }, { "from": 15 } ] } } }, "size": 0 }
// Range buckets with custom keys
{ "aggs": { "aggsrange": { "range": { "field": "age", "ranges": [ { "to": 13, "key":"A" }, { "from": 13, "to": 15, "key":"B" }, { "from": 15, "key":"C" } ] } } }, "size": 0 }
- Date Range Aggregation: group by date range
POST localhost:9200/stu/_search
// to is exclusive, so each bucket ends where the next one starts
{ "aggs": { "aggsrange": { "date_range": { "field": "birthday", "format": "yyyy-MM", "ranges": [ { "to": "2005-01", "key":"A" }, { "from": "2005-01", "to": "2006-01", "key":"B" }, { "from": "2006-01", "key":"C" } ] } } }, "size": 0 }
- Date Histogram Aggregation
Aggregates by day, month, year, and so on. calendar_interval accepts year (1y), quarter (1q), month (1M), week (1w), day (1d), hour (1h), and minute (1m); second-level intervals such as 1s go through fixed_interval instead
POST localhost:9200/stu/_search
{ "aggs": { "aggsrange": { "date_histogram": { "field": "birthday", "format": "yyyy", "calendar_interval": "year" } } }, "size": 0 }
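A terms aggregation with a nested avg and an order on that avg is the ES counterpart of a group by with an ordered aggregate. The sketch below reproduces the "group by class, order by average age descending" request locally over the (class, age) pairs of the ten sample students; the function name and the `avg_age` key are hypothetical.

```python
from collections import defaultdict

# (class, age) pairs of the ten students in the bulk data, in id order.
students = [
    ("7(2)班", 14), ("7(5)班", 13), ("7(2)班", 14), ("7(3)班", 15), ("7(3)班", 14),
    ("7(2)班", 14), ("7(5)班", 14), ("7(3)班", 14), ("7(5)班", 13), ("7(2)班", 14),
]


def terms_with_avg(rows):
    """Group rows by key, then order the buckets by average age, highest first."""
    groups = defaultdict(list)
    for key, age in rows:
        groups[key].append(age)
    buckets = [
        {"key": key, "doc_count": len(ages), "avg_age": sum(ages) / len(ages)}
        for key, ages in groups.items()
    ]
    return sorted(buckets, key=lambda bucket: bucket["avg_age"], reverse=True)


buckets = terms_with_avg(students)
for bucket in buckets:
    print(bucket)
```

In the real response each bucket carries `key`, `doc_count`, and the nested aggregation under the name chosen in the request (aggsAge in the example above).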
query_string queries
- Single-field query
POST localhost:9200/stu/_search
{ "query": { "query_string": { "default_field": "name", "query": "李 AND 思 OR 敏" } } }
- Multi-field query
POST localhost:9200/stu/_search
{ "query": { "query_string": { "fields": ["name", "sex"], "query": "李 AND 0" } } }
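Analyzers
An analyzer is a tool that breaks a piece of input text into terms according to a set of rules
Built-in analyzers
- standard analyzer
The standard analyzer is the default; it is used when no analyzer is specified
- simple analyzer
The simple analyzer splits text into terms whenever it encounters a character that is not a letter, and lowercases all terms
- whitespace analyzer
The whitespace analyzer splits text into terms whenever it encounters a whitespace character
- stop analyzer
The stop analyzer is much like the simple analyzer; the only difference is that it also removes stop words, using the english stop word list by default
stopwords is a predefined list of stop words, such as the, a, an, this, of, at
- language analyzer
Analyzers for specific languages (for example english, the English analyzer). Built-in languages: arabic, armenian, basque, bengali, brazilian, bulgarian, catalan, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian, latvian, lithuanian, norwegian, persian, portuguese, romanian, russian, sorani, spanish, swedish, turkish, thai
- pattern analyzer
Splits text into terms with a regular expression; the default expression is \W+ (non-word characters)
Example: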
GET /_analyze
{
"analyzer": "simple",
"text": "Deploy a 14-day trial of Elasticsearch Service."
}
{
"tokens" : [
{
"token" : "deploy",
"start_offset" : 0,// start offset
"end_offset" : 6,// end offset
"type" : "word",
"position" : 0 // position of the token in the token stream
},
{
"token" : "a",
"start_offset" : 7,
"end_offset" : 8,
"type" : "word",
"position" : 1
},
{
"token" : "day",
"start_offset" : 12,
"end_offset" : 15,
"type" : "word",
"position" : 2
},
{
"token" : "trial",
"start_offset" : 16,
"end_offset" : 21,
"type" : "word",
"position" : 3
},
{
"token" : "of",
"start_offset" : 22,
"end_offset" : 24,
"type" : "word",
"position" : 4
},
{
"token" : "elasticsearch",
"start_offset" : 25,
"end_offset" : 38,
"type" : "word",
"position" : 5
},
{
"token" : "service",
"start_offset" : 39,
"end_offset" : 46,
"type" : "word",
"position" : 6
}
]
}
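The response above can be reproduced with a few lines of Python, which makes the rule of the simple analyzer easy to see: runs of letters become tokens, everything else is a separator, and tokens are lowercased. This is an emulation written for illustration (the function name is hypothetical), not the Lucene implementation.

```python
import re

# One or more letters: word characters minus digits and the underscore.
LETTERS = re.compile(r"[^\W\d_]+")


def simple_analyze(text):
    """Emulate the simple analyzer: split on non-letters and lowercase each token."""
    return [
        {
            "token": match.group().lower(),
            "start_offset": match.start(),
            "end_offset": match.end(),
            "type": "word",
            "position": position,
        }
        for position, match in enumerate(LETTERS.finditer(text))
    ]


tokens = simple_analyze("Deploy a 14-day trial of Elasticsearch Service.")
for token in tokens:
    print(token)
```

Note how "14-day" yields only the token day: the digits and the hyphen are not letters, so they act as separators and are dropped.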
Chinese analyzers
- smartCN
A simple analyzer for Chinese or mixed Chinese-English text
  - Install (restart the service before use)
sh elasticsearch-plugin install analysis-smartcn
  - Example:
GET /_analyze
{ "analyzer": "smartcn", "text": "有限公司" }
// Response
{ "tokens" : [ { "token" : "有限公司", "start_offset" : 0, "end_offset" : 4, "type" : "word", "position" : 0 } ] }
- IK analyzer
A smarter, friendlier Chinese analyzer
Download: https://github.com/medcl/elasticsearch-analysis-ik/releases (the plugin version must match the ES version)
Install: unzip it into the plugins directory under the ES installation directory
  - Example:
GET /_analyze
{ "analyzer": "ik_max_word", "text": "有限公司" }
// Response
{ "tokens" : [ { "token" : "有限公司", "start_offset" : 0, "end_offset" : 4, "type" : "CN_WORD", "position" : 0 }, { "token" : "有限", "start_offset" : 0, "end_offset" : 2, "type" : "CN_WORD", "position" : 1 }, { "token" : "公司", "start_offset" : 2, "end_offset" : 4, "type" : "CN_WORD", "position" : 2 } ] }
refresh
One might expect data to be searchable as soon as it has been added to an index, but that is not what happens: a new document becomes visible to search only after the next refresh
- Index a document and search immediately: the new document is not returned
curl -X PUT localhost:9200/stu/_doc/666 -H 'Content-Type:application/json' -d '{ "name": "王丝菲" }'
curl -X GET localhost:9200/stu/_doc/_search?pretty
- Force a refresh
curl -X PUT localhost:9200/stu/_doc/667?refresh -H 'Content-Type:application/json' -d '{ "name": "王豆豆" }'
curl -X GET localhost:9200/stu/_doc/_search?pretty
- Change the refresh interval (the default is 1s)
PUT localhost:9200/stu/_settings
{ "index": { "refresh_interval": "5s" } }
- Disable refresh
PUT localhost:9200/stu/_settings
{ "index": { "refresh_interval": "-1" } }
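The behavior described in this section can be pictured with a toy model in which newly indexed documents sit in a buffer until a refresh moves them to the searchable set. The class below is a deliberately simplified illustration with hypothetical names; it ignores segments, the translog, and the periodic timer that normally triggers the refresh.

```python
class ToyShard:
    def __init__(self):
        self.buffer = []      # indexed, but not yet visible to search
        self.searchable = []  # visible to search

    def index(self, doc, refresh=False):
        self.buffer.append(doc)
        if refresh:           # corresponds to the ?refresh request parameter
            self.refresh()

    def refresh(self):
        """Make everything in the buffer searchable."""
        self.searchable.extend(self.buffer)
        self.buffer.clear()

    def search(self):
        return list(self.searchable)


shard = ToyShard()
shard.index({"name": "王丝菲"})
before_refresh = shard.search()   # the document is not visible yet
shard.index({"name": "王豆豆"}, refresh=True)
after_refresh = shard.search()    # the forced refresh exposes both documents
print(before_refresh, after_refresh)
```

With refresh_interval set to -1, the real index behaves like this model when `refresh()` is never called automatically, which is why that setting is mostly useful during large imports.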
Highlighting
- Highlighted search
POST localhost:9200/stu/_search
// Parameters
{ "query": { "match": { "name": "赵" } }, "highlight": { "fields": { "name": {} } } }
// Response
{ "took" : 4, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 2.4191523, "hits" : [ { "_index" : "stu", "_type" : "_doc", "_id" : "10", "_score" : 2.4191523, "_source" : { "name" : "赵曦", "age" : 14, "sex" : "0", "class" : "7(2)班", "birthday" : "2005-09-02" }, "highlight" : { "name" : [ "<em>赵</em>曦" ] } } ] } }
- Highlighting with custom tags
POST localhost:9200/stu/_search
// Parameters
{ "query": { "match": { "name": "赵" } }, "highlight": { "fields": { "name": { "pre_tags": ["<p>"], "post_tags": ["</p>"] } } } }
// Response
{ "took" : 6, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 2.4191523, "hits" : [ { "_index" : "stu", "_type" : "_doc", "_id" : "10", "_score" : 2.4191523, "_source" : { "name" : "赵曦", "age" : 14, "sex" : "0", "class" : "7(2)班", "birthday" : "2005-09-02" }, "highlight" : { "name" : [ "<p>赵</p>曦" ] } } ] } }
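Suggesters
Suggesters give users a better search experience. They cover spelling correction of terms and auto-completion
Parameters
Parameter | Description |
---|---|
text | The text to get suggestions for |
field | The field the suggester reads candidate terms from |
analyzer | The analyzer to use |
size | Maximum number of suggestions returned per term |
sort | How suggestions are sorted. score: by score first, then document frequency, then the term itself; frequency: by document frequency first, then score, then the term itself |
suggest_mode | Controls when suggestions are offered. missing: only for terms that do not exist in the index (default); popular: only suggestions with a higher document frequency than the search term; always: suggest any matching term |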
suggester
- Term suggester
The term suggester analyzes the input text and offers term suggestions for each resulting token
POST localhost:9200/stu/_search
// Parameters
{ "suggest": { "MY_SUGGESTION": { "text": "7(6)班", "term": { "suggest_mode": "missing", "field": "class" } } } }
// Response
{ "took" : 105, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 0, "relation" : "eq" }, "max_score" : null, "hits" : [ ] }, "suggest" : { "MY_SUGGESTION" : [ { "text" : "7(6)班", "offset" : 0, "length" : 5, "options" : [ { "text" : "7(2)班", "score" : 0.8, "freq" : 4 }, { "text" : "7(3)班", "score" : 0.8, "freq" : 3 }, { "text" : "7(5)班", "score" : 0.8, "freq" : 3 } ] } ] } }
- Phrase suggester
The phrase suggester builds on the term suggester and also considers the relationship between terms, such as whether they occur together in the indexed text, how close they are to each other, and their frequency
POST localhost:9200/stu/_search
// Parameters
{ "suggest": { "MY_SUGGESTION": { "text": "7(2) 班", "phrase": { "field": "class" } } } }
// Response
{ "took" : 17, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 0, "relation" : "eq" }, "max_score" : null, "hits" : [ ] }, "suggest" : { "MY_SUGGESTION" : [ { "text" : "7(2) 班", "offset" : 0, "length" : 6, "options" : [ { "text" : "7(2)班", "score" : 0.4678218 }, { "text" : "7(3)班", "score" : 0.37474233 }, { "text" : "7(5)班", "score" : 0.37474233 } ] } ] } }
- Completion suggester
The completion suggester auto-completes what follows the text typed so far
POST localhost:9200/stu/_search
// The queried field must be of type completion; MY_SUGGESTION is a user-defined name
{ "suggest": { "MY_SUGGESTION": { "text": "I like", "completion": { "field": "selfDesc" } } } }