Sphinx Real-Time Indexing

Author: 杍劼 | Published 2017-01-04 17:21 | 464 reads

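When a database holds a large amount of data and new rows keep arriving that should also be searchable, rebuilding the entire index each time is expensive. The "main index + delta index" approach addresses this. It works by defining two data sources and two indexes: one pair covers the bulk of the data, the other covers only the recently added rows.

1. Create the counter table:

A simple implementation is to add a counter table to the database. It records the document ID that splits the document set into two parts, and it is updated every time the main index is rebuilt.

First, create the counter table in MySQL: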

CREATE TABLE sph_counter (
    counter_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    max_doc_id INT
);

2. Modify the configuration file:

Main data source:

source main {
    # Before indexing the main source, update the counter table
    sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM post
    sql_query = SELECT id, title, content FROM post WHERE id <= (SELECT max_doc_id FROM sph_counter WHERE counter_id = 1)
}

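Delta data source:

source delta : main {
    # Overriding sql_query_pre replaces the one inherited from main,
    # so indexing delta does not move the counter
    sql_query_pre = SET NAMES utf8
    sql_query = SELECT id, title, content FROM post WHERE id > (SELECT max_doc_id FROM sph_counter WHERE counter_id = 1)
}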
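The two queries above split the post table at a watermark stored in sph_counter. Here is a minimal sketch of that split. It uses Python's built-in sqlite3 in place of MySQL purely so that it runs standalone; the helper names build_main and build_delta and the sample rows are made up for illustration and are not part of Sphinx.

```python
import sqlite3

# Subquery shared by both sources: the ID recorded at the last main rebuild.
WATERMARK = "(SELECT max_doc_id FROM sph_counter WHERE counter_id = 1)"


def build_main(conn):
    """Mimic source main: move the watermark, then select rows up to it."""
    # Same statement as sql_query_pre in source main.
    conn.execute("REPLACE INTO sph_counter SELECT 1, MAX(id) FROM post")
    rows = conn.execute(f"SELECT id FROM post WHERE id <= {WATERMARK} ORDER BY id")
    return [row[0] for row in rows]


def build_delta(conn):
    """Mimic source delta: select only rows above the stored watermark."""
    rows = conn.execute(f"SELECT id FROM post WHERE id > {WATERMARK} ORDER BY id")
    return [row[0] for row in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, content TEXT)")
    conn.execute("CREATE TABLE sph_counter (counter_id INTEGER PRIMARY KEY, max_doc_id INTEGER)")
    add = "INSERT INTO post (id, title, content) VALUES (?, ?, ?)"
    # Three rows exist when the main index is built.
    conn.executemany(add, [(i, f"title {i}", f"body {i}") for i in (1, 2, 3)])
    main_ids = build_main(conn)
    # Two rows arrive afterwards; only the delta source sees them.
    conn.executemany(add, [(i, f"title {i}", f"body {i}") for i in (4, 5)])
    delta_ids = build_delta(conn)
    return main_ids, delta_ids


if __name__ == "__main__":
    print(demo())
```

Every row lands in exactly one of the two result sets, which is why the delta source must not run the REPLACE statement itself: doing so would move the watermark and leave the delta empty.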

Main index: no changes needed.

Delta index:

index delta : main
{
    source = delta
    path = /opt/coreseek/var/data/delta
    morphology = stem_en
}

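Rebuild the delta index (this re-indexes only the rows added since the last main rebuild; folding delta into main is a separate operation, indexer --merge main delta --rotate):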

./indexer delta --rotate

3. Scripts for near-real-time indexing:

(1) Create the scripts that build the main index and the delta index: main.sh and delta.sh

cd /opt/coreseek
mkdir init
cd init
touch main.sh
touch delta.sh
chmod a+x *

main.sh:

#!/bin/bash
# main.sh: rebuild the main index and append the indexer output to main.log
/opt/coreseek/bin/indexer main --rotate >> /opt/coreseek/var/log/main.log

delta.sh:

#!/bin/bash
# delta.sh: rebuild the delta index and append the indexer output to delta.log
/opt/coreseek/bin/indexer delta --rotate >> /opt/coreseek/var/log/delta.log

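(2) Create the log files:

cd /opt/coreseek/var/log
touch main.log
touch delta.log

(3) Add the cron jobs:

crontab -e

# Rebuild the delta index every 5 minutes
*/5 * * * * /opt/coreseek/init/delta.sh
# Rebuild the main index every day at 03:00
00 03 * * * /opt/coreseek/init/main.sh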
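To make the schedule concrete, the sketch below evaluates the two crontab entries by hand for a given minute. It is only an illustration of what those two lines mean, not a cron parser; the function name due_jobs is made up.

```python
from datetime import datetime

DELTA_SCRIPT = "/opt/coreseek/init/delta.sh"
MAIN_SCRIPT = "/opt/coreseek/init/main.sh"


def due_jobs(now):
    """Return the scripts the two crontab entries would start at minute `now`."""
    jobs = []
    # "*/5 * * * *": every minute divisible by 5, every hour, every day.
    if now.minute % 5 == 0:
        jobs.append(DELTA_SCRIPT)
    # "00 03 * * *": minute 0 of hour 3, every day.
    if now.hour == 3 and now.minute == 0:
        jobs.append(MAIN_SCRIPT)
    return jobs


if __name__ == "__main__":
    for moment in (datetime(2017, 1, 4, 3, 0),
                   datetime(2017, 1, 4, 3, 2),
                   datetime(2017, 1, 4, 3, 5)):
        print(moment.strftime("%H:%M"), due_jobs(moment))
```

With this schedule a new row becomes searchable within roughly five minutes plus the time the delta rebuild takes. Note that at 03:00 both entries match, so the two indexer runs start in the same minute; whether that matters depends on how long the main rebuild takes.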

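4. Distributed indexes:

Distributed searching reduces query latency and raises throughput in multi-server, multi-CPU, or multi-core environments. It is essential for applications that search very large data sets (billions of records and terabytes of text).

The idea is to partition the data horizontally (HP, horizontal partitioning) and process the partitions in parallel. When searchd receives a query against a distributed index, it does the following:

1. Connects to the remote agents

2. Issues the query

3. Searches the local indexes

4. Retrieves the search results from the remote agents

5. Merges all the results and removes duplicates

6. Returns the merged results to the client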
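The steps above can be sketched as a small fan-out-and-merge routine. This is not how searchd is implemented; it is a toy model in which each index is a plain Python callable returning (doc_id, weight) pairs. The function name search_distributed is made up, and keeping the highest weight for a duplicated document is an assumption made here for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def search_distributed(query, local_search, agent_searches, limit=20):
    """Fan a query out to agents and a local index, then merge the matches.

    Every search callable takes the query string and returns a list of
    (doc_id, weight) pairs.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(agent_searches))) as pool:
        # Steps 1-2: hand the query to every remote agent.
        futures = [pool.submit(search, query) for search in agent_searches]
        # Step 3: search the local index while the agents are working.
        results = list(local_search(query))
        # Step 4: collect the agents' results.
        for future in futures:
            results.extend(future.result())

    # Step 5: merge and drop duplicates (assumption: keep the best weight).
    best = {}
    for doc_id, weight in results:
        if doc_id not in best or weight > best[doc_id]:
            best[doc_id] = weight

    # Step 6: return the merged matches, highest weight first.
    merged = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return merged[:limit]


if __name__ == "__main__":
    local = lambda q: [(1, 10), (2, 5)]
    agents = [lambda q: [(2, 7), (3, 1)], lambda q: [(4, 10)]]
    print(search_distributed("test", local, agents))
```

Because the local search runs while the agents are busy, the total latency in this model is close to that of the slowest partition rather than the sum of all of them.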

index dist {
    type = distributed
    local = chunk1
    agent = localhost:9312:chunk2 # local machine
    agent = 192.168.100.2:9312:chunk3 # remote
    agent = 192.168.100.3:9312:chunk4 # remote
}
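The configuration above names four chunks but does not say how documents are divided among them. One common way to partition horizontally is by document ID modulo the number of chunks; the sketch below assumes that scheme, which is an illustration and not something the configuration itself prescribes. The names chunk_for and partition are made up.

```python
# Chunk names taken from the index dist definition.
CHUNKS = ["chunk1", "chunk2", "chunk3", "chunk4"]


def chunk_for(doc_id):
    """Pick a chunk for a document (assumption: id modulo chunk count)."""
    return CHUNKS[doc_id % len(CHUNKS)]


def partition(doc_ids):
    """Group document IDs by the chunk they would be indexed into."""
    shards = {name: [] for name in CHUNKS}
    for doc_id in doc_ids:
        shards[chunk_for(doc_id)].append(doc_id)
    return shards


if __name__ == "__main__":
    for name, ids in partition(range(1, 9)).items():
        print(name, ids)
```

Under this assumption each chunk's data source would select only its own share of the rows, so every document is indexed exactly once and the chunks stay roughly equal in size.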


Article link: https://www.haomeiwen.com/subject/axxavttx.html
