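Full-Text Search
1. Full-text search concepts:
(1) Types of data:
· Structured: data with a fixed format or a limited length, such as databases and metadata
· Unstructured: data of variable length or with no fixed format, such as emails and Word documents
(2) Searching unstructured data:
· Sequential scan: suitable for small amounts of data
· Full-text search: extract structure from the unstructured data, build an index from it, and then search the index
(3) Definition: full-text search is a retrieval method that matches the search terms against all of the text in the stored files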
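The difference between the two approaches in (2) can be illustrated with a small sketch. It is not taken from any of the libraries discussed here: the class name `InvertedIndexSketch`, the naive tokenizer, and the sample documents are all assumptions made for illustration. A sequential scan re-reads every document on each query, while the index maps each term to the documents that contain it, so a lookup no longer touches the documents at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Lucene or ElasticSearch.
class InvertedIndexSketch {

    /** Split text into lowercase word tokens (a deliberately naive analyzer). */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Sequential scan: every document is re-read on every query. */
    static List<Integer> scan(List<String> docs, String term) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            if (tokenize(docs.get(id)).contains(term.toLowerCase())) {
                hits.add(id);
            }
        }
        return hits;
    }

    /** Build the index once: term -> sorted set of document ids. */
    static Map<String, Set<Integer>> buildIndex(List<String> docs) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String token : tokenize(docs.get(id))) {
                index.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
            }
        }
        return index;
    }

    /** Search the index: a single map lookup instead of a scan. */
    static Set<Integer> search(Map<String, Set<Integer>> index, String term) {
        return index.getOrDefault(term.toLowerCase(), Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "Spring Boot and Elasticsearch",
                "Lucene is a search library",
                "Elasticsearch is built on Lucene");
        Map<String, Set<Integer>> index = buildIndex(docs);
        System.out.println("scan:  " + scan(docs, "lucene"));
        System.out.println("index: " + search(index, "lucene"));
    }
}
```

Real engines add much more on top of this idea (language-aware analysis, relevance scoring, compressed on-disk structures), so this should be read only as a picture of the principle.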
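2. How full-text search is implemented
3. Technologies for full-text search: open-source, Java-based implementations include Lucene, ElasticSearch (which has its own distributed management features), and Solr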
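4. Introduction to ElasticSearch:
Overview:
(1) A highly scalable open-source full-text search and analytics engine
(2) Stores, searches, and analyzes large volumes of data quickly and in near real time
(3) Supports enterprise applications with complex data search requirements
Features:
(1) Distributed
(2) Highly available
(3) Multi-type: supports many data types
(4) Multiple APIs
(5) Document-oriented
(6) Asynchronous writes
(7) Near real-time: newly written data becomes searchable after a refresh every n seconds and is then written to disk
(8) Built on Lucene
(9) Apache License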
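To make the near real-time item in the list above more concrete, here is a toy model in plain Java. It is only a sketch of the idea and not how ElasticSearch is implemented: the class `NearRealTimeSketch` and its methods are hypothetical, and a real engine refreshes on a timer and works with index segments rather than lists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; ElasticSearch's real refresh mechanism is far more involved.
class NearRealTimeSketch {
    // Documents that have been accepted but are not yet visible to searches.
    private final List<String> buffer = new ArrayList<>();
    // Documents that searches can see.
    private final List<String> searchable = new ArrayList<>();

    /** Accept a write; at this point the document is only buffered. */
    void index(String doc) {
        buffer.add(doc);
    }

    /** Periodic refresh: everything buffered so far becomes searchable. */
    void refresh() {
        searchable.addAll(buffer);
        buffer.clear();
    }

    /** Return the searchable documents that contain the keyword. */
    List<String> search(String keyword) {
        List<String> hits = new ArrayList<>();
        for (String doc : searchable) {
            if (doc.contains(keyword)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        NearRealTimeSketch engine = new NearRealTimeSketch();
        engine.index("hello lucene");
        System.out.println("before refresh: " + engine.search("lucene"));
        engine.refresh();
        System.out.println("after refresh:  " + engine.search("lucene"));
    }
}
```

The short window between `index` and `refresh` is why the behavior is described as near real-time rather than real-time.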
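5. Integrating ElasticSearch with Spring Boot
(1) Environment: ElasticSearch, Spring Data ElasticSearch, JNA
(2) Install ElasticSearch: download the package, unpack it, and start it directly. Two common sources of errors are worth calling out: the ElasticSearch version must match the client libraries, and the port numbers must be correct
(3) Create a Spring Boot project
(4) Edit pom.xml and add the required dependencies
(5) Before writing any project code, ElasticSearch must be installed locally in a version that is compatible with the Spring Boot version. Also mind the ports: in this integration the ElasticSearch service listens on port 9200, while the client connects on port 9300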
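Since the port numbers are a frequent source of trouble, a quick check that both ports accept connections can save some debugging. The following is a minimal sketch using only the JDK; the class name `PortCheckSketch`, the host `127.0.0.1`, and the one-second timeout are assumptions, and a successful TCP connection only shows that something is listening, not that it is a compatible ElasticSearch version.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for local troubleshooting; not part of Spring or ElasticSearch.
class PortCheckSketch {

    /** Return true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not resolvable.
            return false;
        }
    }

    public static void main(String[] args) {
        // 9200: service port, 9300: client port, as described in step (5).
        System.out.println("9200 reachable: " + isReachable("127.0.0.1", 9200, 1000));
        System.out.println("9300 reachable: " + isReachable("127.0.0.1", 9300, 1000));
    }
}
```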
Next, start the locally installed ElasticSearch and then start the project. The pom.xml with the dependencies added is as follows:
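<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.dhtt.spring.boot.blog</groupId>
    <artifactId>spring.data.action</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>spring.data.action</name>
    <description>Demo project for Spring Boot</description>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.1.0.RELEASE</version>
        <relativePath /> <!-- lookup parent from repository -->
    </parent>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-jpa</artifactId>
        </dependency>
        <!-- Spring Boot integration with Elasticsearch -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.data</groupId>
            <artifactId>spring-data-elasticsearch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-thymeleaf</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!-- Hot deployment -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <!-- JNA dependency -->
        <dependency>
            <groupId>net.java.dev.jna</groupId>
            <artifactId>jna</artifactId>
            <version>4.5.1</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
        </dependency>
        <!-- In-memory database H2 -->
        <!-- <dependency> <groupId>com.h2database</groupId> <artifactId>h2</artifactId>
            </dependency> -->
        <!-- MySQL database driver -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.46</version>
        </dependency>
        <!-- Hibernate persistence framework -->
        <dependency>
            <groupId>org.hibernate</groupId>
            <artifactId>hibernate-core</artifactId>
            <version>5.3.7.Final</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>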
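Start the project as a test to confirm that the configuration is correct and that the project starts successfully. Once it does, continue with the next step.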
(6) Next, configure the application.properties file:
# Thymeleaf configuration
spring.thymeleaf.encoding=UTF-8
# Hot deployment of static files: disable caching so template changes show up immediately
spring.thymeleaf.cache=false
# Use the HTML5 standard
spring.thymeleaf.mode=HTML5
spring.thymeleaf.suffix=.html
spring.resources.chain.strategy.content.enabled=true

# Elasticsearch server address
spring.data.elasticsearch.cluster-nodes=127.0.0.1:9300
# Connection timeout
spring.data.elasticsearch.properties.transport.tcp.connect_timeout=120s
# Cluster name, defaults to elasticsearch
#spring.data.elasticsearch.cluster-name=elasticsearch
#spring.data.elasticsearch.repositories.enabled=true
#spring.data.elasticsearch.properties.path.logs=./elasticsearch/log
#spring.data.elasticsearch.properties.path.data=./elasticsearch/data

# Database connection configuration
spring.datasource.url=jdbc:mysql://localhost:3306/blog_test?useUnicode=true&characterEncoding=UTF-8&serverTimezone=GMT%2B8&useSSL=false
spring.datasource.username=root
spring.datasource.password=qitao1996
spring.datasource.driver-class-name=com.mysql.jdbc.Driver

# JPA configuration
spring.jpa.show-sql=true
spring.jpa.hibernate.ddl-auto=create-drop
(7) Write the back-end code:
The document class EsBlog:
package com.dhtt.spring.boot.blog.spring.data.action.entity;

import java.io.Serializable;

// Kept as in the original; Spring Data's own
// org.springframework.data.annotation.Id is the usual choice for Elasticsearch documents.
import javax.persistence.Id;

import org.springframework.data.elasticsearch.annotations.Document;

/**
 * EsBlog entity (document) class
 *
 * @author QiTao
 */
@Document(indexName = "blog", type = "blog") // the index and type this document is stored in
public class EsBlog implements Serializable {

    private static final long serialVersionUID = 4745983033416635193L;

    @Id
    private String id;
    private String title;
    private String summary;
    private String content;

    // No-argument constructor for the framework
    protected EsBlog() {
        super();
    }

    public EsBlog(String title, String summary, String content) {
        super();
        this.title = title;
        this.summary = summary;
        this.content = content;
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getSummary() {
        return summary;
    }

    public void setSummary(String summary) {
        this.summary = summary;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }

    @Override
    public String toString() {
        return "EsBlog [id=" + id + ", title=" + title + ", summary=" + summary + ", content=" + content + "]";
    }
}
The repository, which defines the data query interface:
package com.dhtt.spring.boot.blog.spring.data.action.repository;

import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

import com.dhtt.spring.boot.blog.spring.data.action.entity.EsBlog;

/**
 * EsBlogRepository interface
 *
 * @author QiTao
 */
public interface EsBlogRepository extends ElasticsearchRepository<EsBlog, String> {

    /**
     * Paged, de-duplicated query over title, summary, and content
     *
     * @param title       text to look for in the title
     * @param summary     text to look for in the summary
     * @param content     text to look for in the content
     * @param pageRequest page index and page size
     * @return one page of matching documents
     */
    Page<EsBlog> findDistinctEsBlogByTitleContainingOrSummaryContainingOrContentContaining(String title, String summary,
            String content, PageRequest pageRequest);
}
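The long method name reads as "title contains ... OR summary contains ... OR content contains ...", returned one page at a time. As a purely conceptual illustration, the intended semantics can be sketched over an in-memory list. This is not what Spring Data ElasticSearch executes (it builds an ElasticSearch query, and matching there depends on how the fields are analyzed); the class `DerivedQuerySketch`, the nested `Post` type, and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model of "A containing OR B containing OR C containing" with paging.
class DerivedQuerySketch {

    /** Minimal stand-in for a document with the three searched fields. */
    static class Post {
        final String title;
        final String summary;
        final String content;

        Post(String title, String summary, String content) {
            this.title = title;
            this.summary = summary;
            this.content = content;
        }
    }

    /** Keep posts where any field contains its keyword, then cut out one zero-based page. */
    static List<Post> find(List<Post> all, String title, String summary, String content,
            int pageIndex, int pageSize) {
        List<Post> matches = new ArrayList<>();
        for (Post p : all) {
            if (p.title.contains(title) || p.summary.contains(summary) || p.content.contains(content)) {
                matches.add(p);
            }
        }
        int from = Math.min(pageIndex * pageSize, matches.size());
        int to = Math.min(from + pageSize, matches.size());
        return new ArrayList<>(matches.subList(from, to));
    }

    public static void main(String[] args) {
        List<Post> posts = Arrays.asList(
                new Post("Spring", "intro", "boot basics"),
                new Post("Lucene", "search library", "inverted index"),
                new Post("Notes", "misc", "about lucene and spring"));
        for (Post p : find(posts, "Lucene", "zzz", "lucene", 0, 10)) {
            System.out.println(p.title);
        }
    }
}
```

Page indexes start at 0, which matches the `pageIndex` default used by the controller.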
Finally, write the Controller class:
package com.dhtt.spring.boot.blog.spring.data.action.web.user;

import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import com.dhtt.spring.boot.blog.spring.data.action.entity.EsBlog;
import com.dhtt.spring.boot.blog.spring.data.action.repository.EsBlogRepository;

@RestController
@RequestMapping("/blogs")
public class BlogController {

    @Autowired
    private EsBlogRepository esBlogRepository;

    @GetMapping
    public List<EsBlog> list(@RequestParam(value = "title") String title,
            @RequestParam(value = "summary") String summary,
            @RequestParam(value = "content") String content,
            @RequestParam(value = "pageIndex", defaultValue = "0") int pageIndex,
            @RequestParam(value = "pageSize", defaultValue = "10") int pageSize) {
        // Add test data (the index is cleared and re-seeded on every request)
        esBlogRepository.deleteAll();
        esBlogRepository.save(new EsBlog("登鹳雀楼", "王之涣的登鹳雀楼", "白日依山尽,黄河入海流,欲穷千里目,更上一层楼"));
        esBlogRepository.save(new EsBlog("相思", "王维的相思", "红豆生南国,春来发几枝,愿君多采撷,此物最相思"));
        esBlogRepository.save(new EsBlog("静夜思", "李白的静夜思", "床前明月光,疑是地上霜,举头望明月,低头思故乡"));

        // Run the paged query
        PageRequest pageRequest = PageRequest.of(pageIndex, pageSize);
        Page<EsBlog> page = esBlogRepository
                .findDistinctEsBlogByTitleContainingOrSummaryContainingOrContentContaining(title, summary, content,
                        pageRequest);
        return page.getContent();
    }
}
Start the project and access it from the front end:
The front end prints the results successfully, so the Elasticsearch + Spring Boot integration works
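For readers who prefer to call the endpoint from code rather than a browser, the request could look roughly like the sketch below. It assumes the application runs on `http://localhost:8080` (Spring Boot's default port, since the configuration above does not set one); the class `BlogSearchClientSketch` and its methods are hypothetical. The parameter names come from the controller, and the values are URL-encoded because the test data is Chinese text.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical client for the /blogs endpoint; the base URL is an assumption.
class BlogSearchClientSketch {

    /** Build the GET URL with URL-encoded query parameters. */
    static String buildUrl(String base, String title, String summary, String content,
            int pageIndex, int pageSize) {
        return base + "/blogs?title=" + encode(title)
                + "&summary=" + encode(summary)
                + "&content=" + encode(content)
                + "&pageIndex=" + pageIndex
                + "&pageSize=" + pageSize;
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Send the request and return the response body as text. */
    static String get(String url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestMethod("GET");
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line);
            }
            return body.toString();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Requires the Spring Boot application and ElasticSearch to be running.
        String url = buildUrl("http://localhost:8080", "相思", "相思", "相思", 0, 10);
        System.out.println(get(url));
    }
}
```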
Reposted from: http://blog.51cto.com/13501268/2322430