Implementing full-text search with Solr

Background

I have been learning Java for almost two years and built a personal blog on Spring Boot. One requirement is to find blog posts by an arbitrary keyword.
Analysis: querying the database directly means a LIKE '%keyword%' query. Because the pattern starts with a wildcard, the database cannot use an index and has to scan the whole table, which is too slow. Some research turned up the following:

With large amounts of data, fuzzy or conditional queries in MySQL or Oracle are inefficient when no index can be used.
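
To make the difference concrete, here is a toy sketch in Java. It is not how Lucene is implemented, only the idea: a LIKE-style scan has to touch every row, while an inverted index maps each term to the documents that contain it, so a lookup is a single map access. The class and method names are made up for illustration, and the tokenizer is a naive split on anything that is not a letter or digit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy comparison of a LIKE-style scan and an inverted-index lookup.
class InvertedIndexSketch {

    // term -> sorted ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    // Index one document: lower-case it, split it into terms, record each term.
    int add(String text) {
        int id = documents.size();
        documents.add(text);
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    // What LIKE '%keyword%' has to do: look at every row.
    List<Integer> scan(String keyword) {
        List<Integer> hits = new ArrayList<>();
        String needle = keyword.toLowerCase(Locale.ROOT);
        for (int id = 0; id < documents.size(); id++) {
            if (documents.get(id).toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(id);
            }
        }
        return hits;
    }

    // What a full-text index does: one map lookup for the term.
    List<Integer> lookup(String term) {
        Set<Integer> ids = postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
        return new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add("Spring Boot blog setup");
        index.add("Full-text search with Solr");
        index.add("Solr and Spring Data");
        System.out.println(index.scan("solr"));   // prints [1, 2]
        System.out.println(index.lookup("solr")); // prints [1, 2]
    }
}
```

Note that the scan matches substrings while the index matches whole terms. Deciding how text is cut into terms is the job of the analyzer, which is why a Chinese tokenizer is needed for Chinese blog content.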

Search solutions:
(1) Search built on Apache Lucene
(2) Search built on the Google API
(3) Search built on the Baidu API (its advantage is coordinate search on maps)

Solr is an open-source search and analytics solution built on Apache Lucene. It is a high-performance full-text search server written in Java, and is essentially a Java web application.

Preparation

(1) Download and install Solr; this article uses solr-7.7.3
(2) A MySQL database; this article uses mysql-5.7.11-winx64
(3) A Chinese tokenizer; this article uses ik-analyzer-7.6.0.jar

Getting started

(1) Create a folder named search in the D:\Solr\solr-7.7.3\server\solr directory
(2) Copy the conf folder from D:\Solr\solr-7.7.3\server\solr\configsets\_default into search
(3) Add the search core to the Solr server


(4) Install the Chinese tokenizer

Put the ik-analyzer .jar file into server\solr-webapp\webapp\WEB-INF\lib

(5) Edit managed-schema in the conf folder of the search core

<!-- Chinese tokenizer -->
<field name="zh_all" type="text_zh_all" indexed="true" stored="true"/>
<field name="zh_smart" type="text_zh_smart" indexed="true" stored="true"/>

<!-- useSmart="false": fine-grained segmentation -->
<fieldType name="text_zh_all" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false" conf="ik.conf"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false" conf="ik.conf"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<!-- useSmart="true": coarser, "smart" segmentation -->
<fieldType name="text_zh_smart" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="true" conf="ik.conf"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="true" conf="ik.conf"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

(6) Configure the request handler in solrconfig.xml of the search core

  <!-- Added configuration -->
   <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">dataimport.xml</str>  
    </lst>
  </requestHandler>

(7) Create dataimport.xml in the same conf folder and configure the database query

<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/blog"
              user="root"
              password="Root1234/"/>
  <document>
    <entity name="BlogProduct" query="SELECT id,content from t_search">
      <!-- Write each column of the query result into the matching index field -->
      <field column="id" name="id"/>
      <field column="content" name="zh_all"/>
    </entity>
  </document>
</dataConfig>

(8) Copy the required jar files

  • the MySQL JDBC driver jar
  • the two DataImport jars

Location: solr-7.7.3\server\solr-webapp\webapp\WEB-INF\lib

The two DataImport jars are in Solr\solr-7.7.3\dist\ (solr-dataimporthandler-7.7.3.jar and solr-dataimporthandler-extras-7.7.3.jar).

(9) Import the database data and test
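
The import can be started with a plain HTTP request to the handler configured in step (6). The sketch below assumes Solr listens on its default address http://localhost:8983/solr and uses the core name search and the /dataimport handler from the steps above; the class and method names are hypothetical. command=full-import, clean and commit are standard DataImportHandler request parameters.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Sketch: trigger a full import on the DataImportHandler over HTTP.
class DataImportTrigger {

    // Builds the full-import URL for one core, e.g.
    // http://localhost:8983/solr/search/dataimport?command=full-import&clean=true&commit=true
    static String buildImportUrl(String solrBaseUrl, String core) {
        return solrBaseUrl + "/" + core
                + "/dataimport?command=full-import&clean=true&commit=true";
    }

    // Sends a GET request and returns Solr's raw response body.
    static String trigger(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5_000);
        conn.setReadTimeout(30_000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default address; change it if Solr runs elsewhere.
        String url = buildImportUrl("http://localhost:8983/solr", "search");
        System.out.println(trigger(url));
    }
}
```

The request returns right away while the import runs in the background; calling the same handler with command=status reports the progress and the number of documents processed.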


(10) Maven configuration in Spring Boot


```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.2.8.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.cqu</groupId>
    <artifactId>newblog</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>newblog</name>
    <description>Demo project for Spring Boot</description>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.8</java.version>
    </properties>


    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>redis.clients</groupId>
            <artifactId>jedis</artifactId>
        </dependency>
        <dependency>
            <groupId>commons-codec</groupId>
            <artifactId>commons-codec</artifactId>
            <version>1.6</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-thymeleaf</artifactId>
        </dependency>
        <dependency>
            <groupId>net.sourceforge.nekohtml</groupId>
            <artifactId>nekohtml</artifactId>
            <version>1.9.22</version>
        </dependency>
        <dependency>
            <groupId>org.mybatis.spring.boot</groupId>
            <artifactId>mybatis-spring-boot-starter</artifactId>
            <version>2.1.3</version>
        </dependency>

        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.38</version>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>druid</artifactId>
            <version>1.0.5</version>
        </dependency>
        <!-- Solr integration; the starter's version is managed by spring-boot-starter-parent -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-solr</artifactId>
        </dependency>
        <!-- Parameter validation -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-validation</artifactId>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-amqp</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.data</groupId>
            <artifactId>spring-data-solr</artifactId>
            <version>4.3.9</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.solr</groupId>
            <artifactId>solr-solrj</artifactId>
            <version>8.8.2</version>
            <scope>compile</scope>
        </dependency>
    </dependencies>
    <build>
        <finalName>${project.artifactId}</finalName>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <!-- Main class of the application: the class annotated with @SpringBootApplication that contains the main method -->
                    <mainClass>com.cqu.newblog.MainApplication</mainClass>
                </configuration>
                <executions>
                    <execution>
                        <goals>
                            <goal>repackage</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.mybatis.generator</groupId>
                <artifactId>mybatis-generator-maven-plugin</artifactId>
                <version>1.3.7</version>
                <!-- Add a MySQL dependency so the generator can find the driver class -->
                <dependencies>
                    <dependency>
                        <groupId>mysql</groupId>
                        <artifactId>mysql-connector-java</artifactId>
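                        <!-- 5.7.11 is a MySQL server version, not a Connector/J release; 8.0.20 is an assumed replacement -->
                        <version>8.0.20</version>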
                        <scope>runtime</scope>
                    </dependency>
                </dependencies>
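                <!-- MyBatis Generator configuration -->
                <configuration>
                    <!-- Location of the generator configuration file -->
                    <configurationFile>src/main/resources/generatorConfig.xml</configurationFile>
                    <!-- Whether to overwrite existing files -->
                    <!-- Note: without this setting, every run generates the files again in the same directory -->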
                    <overwrite>true</overwrite>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
```

(11) application.properties configuration

spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.datasource.type=com.alibaba.druid.pool.DruidDataSource
spring.resources.add-mappings=true
spring.resources.cache.period=P7D
spring.resources.chain.cache=true
spring.resources.chain.enabled=true
spring.resources.chain.compressed=true
spring.resources.chain.html-application-cache=true

(12) Entity class

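import lombok.Data;
import org.apache.solr.client.solrj.beans.Field;
import org.springframework.data.annotation.Id;
import org.springframework.data.solr.core.mapping.SolrDocument;

// Maps one document of the "search" core
@SolrDocument(collection = "search")
@Data
public class BlogVo {
    @Id
    private String id;
    // dataimport.xml writes the content column into the zh_all index field
    @Field("zh_all")
    private String content;
}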

(13) Implementing the search function in the controller

    @PostMapping("/value")
    public String searchBlog(Model model, @RequestParam("searchValue") String searchValue) {
        // Echo the search text back to the page
        model.addAttribute("searchContent", searchValue);

        // Spring Data criteria: match the keyword against the zh_all field
        Criteria criteria = Criteria.where("zh_all").is(searchValue);
        Query query = Query.query(criteria);
        // Paging: first page, at most 100 results
        query.setPageRequest(PageRequest.of(0, 100));
        // Sort by id ascending (this replaces Solr's relevance ordering)
        query.addSort(Sort.by(Sort.Direction.ASC, "id"));

        // Run the query against the "search" core; a read needs no commit
        ScoredPage<BlogVo> page = solrTemplate.queryForPage("search", query, BlogVo.class);
        List<BlogVo> blogVos = page.getContent();
        if (blogVos.isEmpty()) {
            System.out.println("No blog matches the keyword");
        }
        model.addAttribute("searchNum", blogVos.size());

        List<IndexVo> indexVos = blogService.getIndexVoListByBlogVoList(blogVos);
        if (indexVos != null) {
            model.addAttribute("indexVos", indexVos);
        }

        // Contact me
        User user = userService.getUserByUserId(1);
        model.addAttribute("user", user);
        // Latest blogs
        List<Blog> blogList = blogService.getRecentBlog(3);
        model.addAttribute("blogList", blogList);
        return "search";
    }
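
For debugging it can help to send the same search to Solr directly, without Spring. The sketch below builds a /select URL that roughly corresponds to the criteria above (field zh_all, paging, sort by id). It assumes the default address http://localhost:8983/solr. The class, its methods and the hand-written escaping are illustrative only; SolrJ ships its own ClientUtils.escapeQueryChars, which a real client would normally use.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch: build the raw Solr /select URL for a keyword search on one field.
class SolrQueryUrlSketch {

    // Characters that Solr's standard query parser treats as syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    // Backslash-escape query syntax characters and whitespace in user input.
    static String escape(String input) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // URL-encode a parameter value as UTF-8.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // q=<field>:<escaped keyword>, with paging and ascending sort on id.
    static String buildSelectUrl(String solrBaseUrl, String core, String field,
                                 String keyword, int start, int rows) {
        String q = field + ":" + escape(keyword);
        return solrBaseUrl + "/" + core + "/select?q=" + encode(q)
                + "&start=" + start + "&rows=" + rows
                + "&sort=" + encode("id asc");
    }

    public static void main(String[] args) {
        System.out.println(buildSelectUrl(
                "http://localhost:8983/solr", "search", "zh_all", "solr", 0, 100));
        // prints http://localhost:8983/solr/search/select?q=zh_all%3Asolr&start=0&rows=100&sort=id+asc
    }
}
```

Opening the printed URL in a browser shows the raw Solr response, which makes it easier to tell whether an empty result page comes from the index or from the Spring mapping.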