1. Prepare the data
PUT /lib4
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 0
},
"mappings": {
"user":{
"properties": {
"name":{
"type": "text","analyzer": "ik_max_word"
},
"address":{
"type": "text","analyzer": "ik_max_word"
},
"age":{
"type": "integer"
},
"interests":{
"type": "text","analyzer": "ik_max_word"
},
"birthday":{"type": "date"}
}
}
}
}
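The IK plugin provides two analyzers:
ik_max_word: splits the text at the finest granularity, producing as many terms as possible
ik_smart: splits the text at the coarsest granularity; text already assigned to one term is not reused by another term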
PUT /lib4/user/1
{
"name":"赵六",
"address":"黑龙江省铁岭",
"age":50,
"birthday":"1970-12-12",
"interests":"喜欢喝酒,锻炼,说相声"
}
PUT /lib4/user/2
{
"name":"赵明",
"address":"北京海淀区清河",
"age":20,
"birthday":"1998-10-12",
"interests":"喜欢喝酒,锻炼,唱歌"
}
PUT /lib4/user/3
{
"name":"李四",
"address":"北京海淀区清河",
"age":23,
"birthday":"1998-10-12",
"interests":"喜欢喝酒,锻炼,唱歌"
}
PUT /lib4/user/4
{
"name":"王五",
"address":"北京海淀区清河",
"age":26,
"birthday":"1995-10-12",
"interests":"喜欢编程,听音乐,旅游"
}
PUT /lib4/user/5
{
"name":"张三",
"address":"北京海淀区清河",
"age":29,
"birthday":"1988-10-12",
"interests":"喜欢摄影,听音乐,跳舞"
}
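2. term and terms queries
A term query looks up the exact term in the inverted index. It is unaware of analyzers, so the query text is not analyzed. It suits keyword, numeric, and date fields.
term: finds documents whose field contains the given term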
GET /lib4/user/_search
{
"query": {
"term": {
"interests":"唱歌"
}
}
}
terms: finds documents whose field contains any of the given terms
GET /lib4/user/_search
{
"query": {
"terms": {
"interests": [
"喝酒",
"唱歌"
]
}
}
}
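The exact-lookup behaviour of term and terms can be sketched in a few lines of Python. This is only an illustration: the token lists below are hand-written assumptions standing in for analyzer output (the real ik_max_word tokens may differ), and build_inverted_index, term_query, and terms_query are hypothetical helper names, not Elasticsearch APIs.

```python
from collections import defaultdict

# Hypothetical token lists standing in for what an analyzer produced
# at index time. The real ik_max_word output may differ.
DOC_TOKENS = {
    1: ["喜欢", "喝酒", "锻炼", "说", "相声"],
    2: ["喜欢", "喝酒", "锻炼", "唱歌"],
    4: ["喜欢", "编程", "听音乐", "旅游"],
}


def build_inverted_index(doc_tokens):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, tokens in doc_tokens.items():
        for token in tokens:
            index[token].add(doc_id)
    return index


def term_query(index, term):
    """Exact lookup: the query text is used as-is, never analyzed."""
    return sorted(index.get(term, set()))


def terms_query(index, terms):
    """A document matches if it contains any one of the given terms."""
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return sorted(hits)


INDEX = build_inverted_index(DOC_TOKENS)
print(term_query(INDEX, "唱歌"))              # a term that was indexed
print(terms_query(INDEX, ["喝酒", "唱歌"]))    # either term is enough
print(term_query(INDEX, "喝酒,唱歌"))          # unanalyzed text: no such term
```

The last call shows why a term query against an analyzed text field can return nothing: the whole query string is looked up as a single term.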
3. Control the number of results returned
from: offset of the first hit to return (starting at 0)
size: number of hits to return
GET /lib4/user/_search
{
"from": 0,
"size": 2,
"query": {
"terms": {
"interests":["唱歌","喝酒"]
}
}
}
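Conceptually, from and size cut a window out of the ordered hit list. A minimal Python sketch, where paginate is a hypothetical helper and the defaults mirror Elasticsearch's defaults of from 0 and size 10:

```python
def paginate(hits, from_=0, size=10):
    """Return the window of hits that starts at offset from_ and holds up to size items."""
    return hits[from_:from_ + size]


all_hits = ["doc1", "doc2", "doc3", "doc4", "doc5"]
first_page = paginate(all_hits, from_=0, size=2)
last_page = paginate(all_hits, from_=4, size=2)
print(first_page, last_page)
```

An offset past the end of the hit list simply yields an empty page.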
4. Return the version number
GET /lib4/user/_search
{
"version": true,
"query": {
"terms": {
"interests": [
"喝酒",
"唱歌"
]
}
}
}
5. match query
A match query is analyzer-aware: it first analyzes the query text with the field's analyzer and then searches for the resulting terms
GET /lib4/user/_search
{
"query": {
"match": {
"name": "赵六"
}
}
}
GET /lib4/user/_search
{
"query": {
"match": {
"age": 20
}
}
}
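The analyze-then-search behaviour can be sketched as follows. The analyze function here is a stand-in that emits one token per character, which is an assumption made for illustration and not how IK tokenizes; match_query is a hypothetical helper. A match query combines the analyzed terms with OR by default, so one shared token is enough.

```python
def analyze(text):
    """Stand-in analyzer (assumption, not IK): one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]


NAMES = {1: "赵六", 2: "赵明", 3: "李四", 4: "王五", 5: "张三"}


def match_query(docs, query_text):
    """Analyze the query text, then match documents sharing at least one token."""
    query_tokens = set(analyze(query_text))
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if query_tokens & set(analyze(text))
    )


print(match_query(NAMES, "赵六"))
```

Under this stand-in analyzer, searching for 赵六 also returns 赵明, because both names share the token 赵.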
match_all: matches all documents
GET /lib4/user/_search
{
"query": {
"match_all": {}
}
}
multi_match: runs the match query against several fields
GET /lib4/user/_search
{
"query": {
"multi_match": {
"query": "旅游",
"fields": ["interests","name"]
}
}
}
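match_phrase: phrase match query
Elasticsearch first analyzes the query string and builds a phrase query from the analyzed text. Every token of the phrase must match, and the tokens must keep their relative positions: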
GET /lib4/user/_search
{
"query": {
"match_phrase": {
"interests": "锻炼,说相声"
}
}
}
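The position requirement can be illustrated with a simplified check that looks for the query tokens as a contiguous run inside the document tokens. This sketch ignores slop and overlapping token positions, and the token lists are hand-written assumptions rather than real ik_max_word output; phrase_match is a hypothetical helper.

```python
def phrase_match(doc_tokens, query_tokens):
    """True if query_tokens occur in doc_tokens as a contiguous run, in order."""
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


# Hypothetical analyzer output for two of the interests fields.
doc1 = ["喜欢", "喝酒", "锻炼", "说", "相声"]
doc2 = ["喜欢", "喝酒", "锻炼", "唱歌"]

print(phrase_match(doc1, ["锻炼", "说", "相声"]))  # tokens present and in order
print(phrase_match(doc1, ["相声", "锻炼"]))        # tokens present, wrong order
print(phrase_match(doc2, ["锻炼", "说", "相声"]))  # some tokens missing
```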
6. Specify which fields to return
GET /lib4/user/_search
{
"_source": ["address","name"],
"query": {
"match": {
"interests": "唱歌"
}
}
}
7. Control which fields are loaded
GET /lib4/user/_search
{
"query": {
"match_all": {}
},
"_source": {
"includes": ["name","address"],
"excludes": ["age","birthday"]
}
}
Using the wildcard *
GET /lib4/user/_search
{
"_source": {
"includes": "addr*",
"excludes": ["name","bir*"]
},
"query": {
"match_all": {}
}
}
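The effect of includes and excludes with wildcards can be approximated with Python's fnmatch module. This is a sketch under the assumption that a field must match an include pattern and must not match any exclude pattern; filter_source is a hypothetical helper, not an Elasticsearch API.

```python
from fnmatch import fnmatchcase


def filter_source(source, includes=None, excludes=None):
    """Keep fields that match an include pattern and no exclude pattern."""
    includes = includes or ["*"]   # no includes given: every field qualifies
    excludes = excludes or []

    def matches(field, patterns):
        return any(fnmatchcase(field, pattern) for pattern in patterns)

    return {
        field: value
        for field, value in source.items()
        if matches(field, includes) and not matches(field, excludes)
    }


doc = {
    "name": "王五",
    "address": "北京海淀区清河",
    "age": 26,
    "birthday": "1995-10-12",
}
print(filter_source(doc, includes=["addr*"], excludes=["name", "bir*"]))
print(filter_source(doc, excludes=["age", "bir*"]))
```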
8. Sorting
Use sort to order the results: asc is ascending, desc is descending
GET /lib4/user/_search
{
"query": {"match_all": {}},
"sort": [
{
"age": {
"order": "desc"
}
}
]
}
GET /lib4/user/_search
{
"query": {"match_all": {}},
"sort": [
{
"age": {
"order": "asc"
}
}
]
}
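The two sort directions behave like an ordinary sort on the chosen field. A small Python sketch using the ages from the sample data, where sort_hits is a hypothetical helper:

```python
def sort_hits(hits, field, order="asc"):
    """Sort hits by one field; order is "asc" (ascending) or "desc" (descending)."""
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    return sorted(hits, key=lambda hit: hit[field], reverse=(order == "desc"))


users = [
    {"name": "赵六", "age": 50},
    {"name": "赵明", "age": 20},
    {"name": "李四", "age": 23},
    {"name": "王五", "age": 26},
    {"name": "张三", "age": 29},
]

ages_desc = [hit["age"] for hit in sort_hits(users, "age", "desc")]
ages_asc = [hit["age"] for hit in sort_hits(users, "age", "asc")]
print(ages_desc, ages_asc)
```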
9. Prefix match query
GET /lib4/user/_search
{
"query": {
"match_phrase_prefix": {
"name": {
"query": "赵"
}
}
}
}
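10. Range query
range: performs a range query
Parameters: from, to, include_lower, include_upper, boost
include_lower: whether the lower bound is included, default true
include_upper: whether the upper bound is included, default true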
GET /lib4/user/_search
{
"query": {
"range": {
"birthday": {
"from": "1990-10-10",
"to": "2018-05-01"
}
}
}
}
GET /lib4/user/_search
{
"query": {
"range": {
"age": {
"from": 20,
"to": 25,
"include_lower":true,
"include_upper":false
}
}
}
}
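The role of include_lower and include_upper can be checked against the sample ages with a short Python sketch. The in_range function is a hypothetical helper that mirrors the parameter names used above.

```python
def in_range(value, from_, to, include_lower=True, include_upper=True):
    """Test value against [from_, to], with each bound optionally exclusive."""
    lower_ok = value >= from_ if include_lower else value > from_
    upper_ok = value <= to if include_upper else value < to
    return lower_ok and upper_ok


ages = {"赵六": 50, "赵明": 20, "李四": 23, "王五": 26, "张三": 29}

# Same bounds as the age query above: 20 is included, 25 is excluded.
matched = [
    name
    for name, age in ages.items()
    if in_range(age, 20, 25, include_lower=True, include_upper=False)
]
print(matched)
```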
11. wildcard query
The wildcards * and ? can be used in the query
* matches zero or more characters
? matches exactly one character
GET /lib4/user/_search
{
"query": {
"wildcard": {
"name": "赵*"
}
}
}
GET /lib4/user/_search
{
"query": {
"wildcard": {
"name":"li?i"
}
}
}
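Python's fnmatch module uses the same meaning for * and ?, so it gives a quick way to see what a pattern accepts. Note that this sketch matches whole strings, whereas Elasticsearch applies the pattern to the terms stored in the index; wildcard_match is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def wildcard_match(term, pattern):
    """Case-sensitive match where * is zero or more characters and ? is exactly one."""
    return fnmatchcase(term, pattern)


print(wildcard_match("赵六", "赵*"))    # * absorbs the rest of the term
print(wildcard_match("赵六", "赵"))     # no wildcard: the whole term must equal 赵
print(wildcard_match("lisi", "li?i"))   # ? stands for one character
print(wildcard_match("lii", "li?i"))    # ? cannot match zero characters
```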
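12. Fuzzy matching with fuzzy
value: the search term
boost: weight of the query, default 1.0
min_similarity: minimum similarity required for a match, default 0.5. For strings the value ranges from 0 to 1 inclusive; for numeric fields it may be greater than 1; for date fields it takes values such as 1d or 1m, where 1d means one day. This is an older parameter; later Elasticsearch releases use fuzziness instead
prefix_length: length of the leading prefix that must match exactly and is not fuzzified, default 0
max_expansions: maximum number of terms the query may expand to; unbounded by default in old releases, while later releases default to 50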
GET /lib4/user/_search
{
"query": {
"fuzzy": {
"interests": "唱歌"
}
}
}
GET /lib4/user/_search
{
"query": {
"fuzzy": {
"interests": {
"value": "喝酒"
}
}
}
}
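Fuzzy matching is based on edit distance between terms. The sketch below uses plain Levenshtein distance as a simplification (it does not count a transposition as a single edit), and edit_distance, fuzzy_match, and the max_edits parameter are hypothetical names chosen for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def fuzzy_match(term, value, max_edits=1, prefix_length=0):
    """Match if the first prefix_length characters agree and the distance is small."""
    if term[:prefix_length] != value[:prefix_length]:
        return False
    return edit_distance(term, value) <= max_edits


print(fuzzy_match("唱戏", "唱歌"))                    # one substitution away
print(fuzzy_match("跳舞", "唱歌"))                    # two substitutions away
print(fuzzy_match("听歌", "唱歌", prefix_length=1))   # first character must agree
```

Raising prefix_length narrows the set of candidate terms, since any term that differs within the prefix is rejected before the distance is computed.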