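For article search I currently use a MySQL fuzzy query: a LIKE match against keywords in the title.
I also tried MySQL full-text indexes earlier, but they performed poorly, so I stopped matching against article content.
Later I came across Chinese word segmentation, which looked like exactly what I needed. The better-known full-text search engines that can be used from PHP are Solr (written in Java) and Sphinx (written in C++).
Solr needs a Java runtime, which I would rather not install, so I ruled it out first.
That leaves Sphinx as the better choice.
However, Sphinx itself does not support Chinese word segmentation, so most results on Baidu recommend coreseek, which is built on the Sphinx core, combined with the mmseg segmenter to get Chinese segmentation plus full-text search.
There is one problem: coreseek is no longer maintained.
The official site is no longer reachable: http://www.coreseek.cn/
The latest version I could find is coreseek 4.1.
I could not get coreseek 4.1 to compile on an Alibaba Cloud CentOS 7.8 host, so I recommend coreseek 3.2 here (based on Sphinx 0.9.9). It is a rather old version.
Download:
https://gitee.com/sdagfsdh/coreseek?_from=gitee_search
I mainly use the archive marked with the red box on that page (coreseek-3.2.14 in the commands below).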
I: Install the build environment
yum -y install gcc gcc-c++ autoconf python python-devel libiconv libtool
Skip this step if these packages are already installed.
II: Install mmseg3
My source packages are in /usr/local/download.
cd /usr/local/download/coreseek-3.2.14
cd mmseg-3.2.14
chmod +x ./configure # make the configure script executable
./configure --prefix=/usr/local/mmseg3 # install into /usr/local/mmseg3
make && make install
1: Possible errors
(1): config.status: error: cannot find input file: src/Makefile.in
Fix:
yum -y install libtool
aclocal
libtoolize --force
automake --add-missing
autoconf
autoheader
make clean
./configure --prefix=/usr/local/mmseg3
make && make install
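The error above means the generated Makefile.in files are missing from the source tree, which is why the autotools steps have to be re-run. A minimal sketch of a pre-check for that condition follows; the function name needs_autotools_regen is my own illustration and not part of mmseg.

```shell
# Hypothetical helper (not part of mmseg): succeeds when the source tree
# lacks the generated src/Makefile.in that ./configure expects to find.
needs_autotools_regen() {
    src_dir=$1
    [ ! -f "$src_dir/src/Makefile.in" ]
}
```

Run from the mmseg-3.2.14 directory, `needs_autotools_regen . && echo "regenerate first"` tells you whether the aclocal/libtoolize/automake/autoconf sequence is needed before ./configure.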
2: Output when configure succeeds
------------------------------------------------------------------------
Configuration:
Source code location: .
Compiler: gcc
Compiler flags: -g -O2
Host System Type: x86_64-redhat-linux-gnu
Install path: /usr/local/mmseg3
See config.h for further configuration information.
------------------------------------------------------------------------
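After make install it can be worth confirming that the install prefix really contains what coreseek will look for later. The sketch below assumes the usual mmseg3 layout (a bin/mmseg binary and an etc/uni.lib dictionary). That layout and the function name check_mmseg_install are assumptions on my part, so adjust the file list if your build differs.

```shell
# Hypothetical helper: report whether an mmseg3 install prefix contains
# the files assumed to be needed later (bin/mmseg and etc/uni.lib).
check_mmseg_install() {
    prefix=$1
    for f in bin/mmseg etc/uni.lib; do
        if [ ! -e "$prefix/$f" ]; then
            echo "missing: $prefix/$f"
            return 1
        fi
    done
    echo "mmseg3 looks installed under $prefix"
}
```

For the prefix used in this post the call would be `check_mmseg_install /usr/local/mmseg3`.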
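III: Install coreseek
1: Install dependencies
yum -y install expat expat-devel
2: Configure the build
# enter the csft source directory
cd /usr/local/download/coreseek-3.2.14/csft-3.2.14
# make the configure script executable
chmod +x ./configure
# run configure; adjust the paths below to match your own installation
./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql=/usr/local/mariadb # my MySQL (MariaDB) install directory
Note that I built my MySQL (MariaDB) from source, with everything (config files, data files) under one directory, /usr/local/mariadb. If your database was installed from a yum repository, the configure command above may not work as written.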
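With a package-manager install, the MySQL/MariaDB headers and client libraries usually live in separate system directories rather than under one prefix. As far as I know, the Sphinx configure script also accepts --with-mysql-includes and --with-mysql-libs for that case. The sketch below only assembles those two flags after checking that mysql.h is present. The function name is hypothetical, and the directories you pass in are your own to look up (for example with `mysql_config --include`).

```shell
# Hypothetical helper: print configure flags for a MySQL/MariaDB client
# whose headers and libraries are in separate directories.
mysql_configure_flags() {
    inc_dir=$1
    lib_dir=$2
    if [ ! -f "$inc_dir/mysql.h" ]; then
        echo "mysql.h not found in $inc_dir" >&2
        return 1
    fi
    printf '%s %s\n' "--with-mysql-includes=$inc_dir" "--with-mysql-libs=$lib_dir"
}
```

The output can be appended to the configure command in place of --with-mysql=/usr/local/mariadb, for example `mysql_configure_flags /usr/include/mysql /usr/lib64/mysql` (both paths are examples only, not values from this post).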
Output when configure succeeds:
generating configuration files
------------------------------
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/Makefile
config.status: creating libstemmer_c/Makefile
config.status: creating sphinx.conf.dist
config.status: creating sphinx-min.conf.dist
config.status: creating config/config.h
config.status: executing depfiles commands
configuration done
------------------
Build and install:
make && make install
1: Possible errors
make[2]: *** [sphinxexpr.o] Error 1
make[2]: Leaving directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14/src'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14/src'
make: *** [all-recursive] Error 1
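Fix:
The compiler output just above these lines already points to the cause.
In sphinxexpr.cpp, replace the calls to "ExprEval" with "this->ExprEval" (there are several such lines),
then re-run ./configure ... with the same options as before,
and build and install again:
make && make install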
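The manual edit can also be scripted. The sketch below assumes the affected lines have the form `T val = ExprEval ( ... );`, that is, a call on the right-hand side of an assignment. I have not checked every occurrence in the file, so review the result against the .bak copy it leaves behind. The function name patch_expreval is my own.

```shell
# Hypothetical helper: prefix assignment-style calls to ExprEval with
# "this->". Keeps the original as FILE.bak; safe to run more than once.
patch_expreval() {
    file=$1
    cp "$file" "$file.bak" || return 1
    sed 's/= *ExprEval *(/= this->ExprEval (/g' "$file.bak" > "$file"
}
```

From the csft-3.2.14 directory this would be called as `patch_expreval src/sphinxexpr.cpp`, followed by `diff src/sphinxexpr.cpp.bak src/sphinxexpr.cpp` to see exactly which lines changed.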
Output when the install succeeds:
make[2]: Leaving directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14/src'
make[1]: Leaving directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14/src'
Making all in test
make[1]: Entering directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14/test'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14/test'
make[1]: Entering directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14'
make[1]: Nothing to be done for `all-am'.
make[1]: Leaving directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14'
At this point the build and installation are complete.
IV: Fixing startup errors
Start coreseek with:
/usr/local/coreseek/bin/searchd
Error:
/usr/local/coreseek/bin/searchd: error while loading shared libraries: libmariadb.so.3: cannot open shared object file: No such file or directory
Fix:
ln -s /usr/local/mariadb/lib/libmariadb.so.3 /usr/lib64/libmariadb.so.3
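Before creating symlinks one by one, it helps to list every shared library the binary cannot resolve. The sketch below filters the output of ldd for "not found" entries. The function name missing_libs is my own, and it assumes the usual ldd line format `libname => not found`.

```shell
# Hypothetical helper: read `ldd` output on stdin and print the names
# of libraries that the dynamic linker could not resolve.
missing_libs() {
    awk '/not found/ { print $1 }'
}
```

Usage: `ldd /usr/local/coreseek/bin/searchd | missing_libs`. An empty result means every dependency resolves.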
Start again:
/usr/local/coreseek/bin/searchd
Error:
Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)
FATAL: no readable config file (looked in /usr/local/coreseek/etc/csft.conf, ./csft.conf).
There is no config file yet. Fix:
cp /usr/local/coreseek/etc/sphinx-min.conf.dist /usr/local/coreseek/etc/csft.conf
Start again:
/usr/local/coreseek/bin/searchd
Error:
Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)
using config file '/usr/local/coreseek/etc/csft.conf'...
listening on all interfaces, port=9312
WARNING: index 'test1': preload: failed to open /usr/local/coreseek/var/data/test1.sph: No such file or directory; NOT SERVING
FATAL: no valid indexes to serve
searchd cannot find the index files, because no index has been built yet.
Let's edit csft.conf:
#
# Minimal Sphinx configuration sample (clean, simple, functional)
#
source src1
{
type = mysql
# your database connection details
sql_host = localhost
sql_user = mysql
sql_pass =
sql_db = test
sql_port = 3306 # optional, default is 3306
sql_query = \
SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content \
FROM documents
sql_attr_uint = group_id
sql_attr_timestamp = date_added
sql_query_info = SELECT * FROM documents WHERE id=$id
}
index test1
{
source = src1
# make sure this path exists; create it beforehand if it does not
path = /usr/local/coreseek/var/data/test1
docinfo = extern
charset_type = sbcs
}
indexer
{
mem_limit = 32M
}
searchd
{
port = 9312
# make sure this path exists; create it beforehand if it does not
log = /usr/local/coreseek/var/log/searchd.log
# make sure this path exists; create it beforehand if it does not
query_log = /usr/local/coreseek/var/log/query.log
read_timeout = 5
max_children = 30
# make sure this path exists; create it beforehand if it does not
pid_file = /usr/local/coreseek/var/log/searchd.pid
max_matches = 1000
seamless_rotate = 1
preopen_indexes = 0
unlink_old = 1
}
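The config above refers to two directories under the install prefix: var/data for the index files and var/log for the logs and the pid file. A small sketch for creating both up front follows; the function name prepare_coreseek_dirs is my own.

```shell
# Hypothetical helper: create the data and log directories that the
# config above points at. mkdir -p is a no-op if they already exist.
prepare_coreseek_dirs() {
    prefix=$1
    mkdir -p "$prefix/var/data" "$prefix/var/log"
}
```

For the paths used in this post: `prepare_coreseek_dirs /usr/local/coreseek`.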
Next, import example.sql from /usr/local/coreseek/etc (under the install directory) into the database:
# switch to the test database
MariaDB [(none)]> use test;
Database changed
# import the SQL file
MariaDB [test]> source /usr/local/coreseek/etc/example.sql
Query OK, 0 rows affected, 1 warning (0.018 sec)
Query OK, 0 rows affected (0.011 sec)
Query OK, 4 rows affected (0.003 sec)
Records: 4 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected, 1 warning (0.002 sec)
Query OK, 0 rows affected (0.010 sec)
Query OK, 10 rows affected (0.001 sec)
Records: 10 Duplicates: 0 Warnings: 0
Build the index:
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf --all --rotate
Output on success:
Coreseek Fulltext 3.2 [ Sphinx 0.9.9-release (r2117)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)
using config file '/usr/local/coreseek/etc/csft.conf'...
indexing index 'test1'...
collected 4 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4 docs, 193 bytes
total 0.003 sec, 56581 bytes/sec, 1172.67 docs/sec
total 2 reads, 0.000 sec, 0.1 kb/call avg, 0.0 msec/call avg
total 7 writes, 0.000 sec, 0.1 kb/call avg, 0.0 msec/call avg
WARNING: failed to scanf pid from pid_file '/usr/local/coreseek/var/log/searchd.pid'.
WARNING: indices NOT rotated.
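The last two warnings appear because searchd is not running yet: its pid file does not exist, so --rotate has no process to signal.
Do not create the pid file by hand. Restart the server and then start coreseek again; searchd writes the pid file itself when it starts.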
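Since --rotate only makes sense when a searchd process is there to receive the signal, a wrapper script can decide whether to pass it. The sketch below checks the pid file from the config above and tests whether that process is alive. Both function names are my own illustration.

```shell
# Hypothetical helper: succeed if the pid file exists, is non-empty,
# and names a process that is currently alive.
searchd_is_running() {
    pid_file=$1
    [ -s "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    kill -0 "$pid" 2>/dev/null
}

# Print "--rotate" only when searchd is running; print nothing otherwise.
rotate_flag() {
    if searchd_is_running "$1"; then
        echo "--rotate"
    fi
}
```

Usage: `/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf --all $(rotate_flag /usr/local/coreseek/var/log/searchd.pid)`.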
V: Common coreseek commands
1: Start
/usr/local/coreseek/bin/searchd
2: Stop
/usr/local/coreseek/bin/searchd --stop
3: Build the index
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf --all --rotate
4: Test a search
/usr/local/coreseek/bin/search -c /usr/local/coreseek/etc/csft.conf -a abc
5: When rebuilding the index while coreseek is running, add --rotate so that the new index takes effect as soon as it is built
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf --all --rotate
For other usage, refer to the Sphinx documentation.
If you have suggestions, please leave a comment below.
You are welcome to visit my personal blog:
https://guanchao.site