CSFT Coreseek MMSEG Sphinx: Installation, Configuration, and Troubleshooting
@Install LibMMSeg
#wget http://www.coreseek.cn/uploads/csft/3.2/mmseg-3.2.14.tar.gz
#tar zxf mmseg-3.2.14.tar.gz
#cd mmseg-3.2.14
#./configure --prefix=/usr/local/webserver/mmseg
Check the configure output for errors before continuing.
#make
Error: cannot find input file: src/Makefile.in
Fix:
#aclocal
#libtoolize --force   This reports one error; ignore it.
#automake --add-missing
#autoconf
#autoheader
#make clean
Configure again
#./configure --prefix=/usr/local/webserver/mmseg
Build again
#make
#make install
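The regeneration steps above can be wrapped so they only run when the file named in the error is actually missing. This is a minimal sketch, not part of LibMMSeg: the function name `bootstrap_autotools` and the example path are assumptions made for illustration.

```shell
#!/bin/sh
# bootstrap_autotools is a hypothetical helper name, not part of LibMMSeg.
# It regenerates the autotools files only when src/Makefile.in is missing.
bootstrap_autotools() {
    src_dir="$1"
    if [ -f "$src_dir/src/Makefile.in" ]; then
        echo "src/Makefile.in present, nothing to do"
        return 0
    fi
    (
        cd "$src_dir" || exit 1
        aclocal || exit 1
        # These notes report that libtoolize prints an error that can be ignored.
        libtoolize --force || true
        automake --add-missing || exit 1
        autoconf || exit 1
        autoheader || exit 1
    )
}

# Example call (the source path is an assumption):
# bootstrap_autotools /usr/local/src/mmseg-3.2.14
```

After it returns successfully, rerun ./configure and make as described above.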
@Install CSFT
#wget http://www.coreseek.cn/uploads/csft/3.2/coreseek-3.2.14.tar.gz
#tar zxvf coreseek-3.2.14.tar.gz
#cd coreseek-3.2.14/csft-3.2.14
#./configure --prefix=/usr/local/webserver/sphinx --with-python --with-mysql=/usr/local/webserver/mysql/ --with-mysql-includes=/usr/local/webserver/mysql/include --with-mysql-libs=/usr/local/webserver/mysql/lib --with-mmseg=/usr/local/webserver/mmseg/ --with-mmseg-includes=/usr/local/webserver/mmseg/include/mmseg/ --with-mmseg-libs=/usr/local/webserver/mmseg/lib/
#make
This may fail with errors similar to: /usr/local/coreseek-3.2.14/csft-3.2.14/src/tokenizer_zhcn.h:86: undefined reference to `libiconv_open'
/usr/local/coreseek-3.2.14/csft-3.2.14/src/tokenizer_zhcn.h:89: undefined reference to `libiconv'
Fix:
#vi src/Makefile
Change
LIBS = -lm -lexpat -L/usr/local/lib
to
LIBS = -lm -lexpat -liconv -L/usr/local/lib
Then run #make again. Do not rerun ./configure, because it regenerates src/Makefile and discards the change.
#make install
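The same Makefile edit can be scripted instead of done by hand in vi. The sketch below assumes the LIBS line contains -lexpat, as in the line quoted above; `add_iconv` is a hypothetical helper name, and it keeps a .bak copy of the file.

```shell
#!/bin/sh
# add_iconv is a hypothetical helper, not part of Coreseek.
# It inserts -liconv after -lexpat on the LIBS line, once only.
add_iconv() {
    makefile="$1"
    # Do nothing if the LIBS line already links against iconv.
    if grep -q '^LIBS = .*-liconv' "$makefile"; then
        return 0
    fi
    # Keep a backup, then rewrite the LIBS line from the backup copy.
    cp "$makefile" "$makefile.bak"
    sed '/^LIBS = /s/-lexpat/-lexpat -liconv/' "$makefile.bak" > "$makefile"
}

# Example call, run from the csft-3.2.14 directory:
# add_iconv src/Makefile && make
```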
@Generate the mmseg dictionary and configuration file
#/usr/local/webserver/mmseg/bin/mmseg -u /usr/local/src/mmseg-3.2.14/data/unigram.txt
This generates unigram.txt.uni in /usr/local/src/mmseg-3.2.14/data/
#cd /usr/local/webserver/sphinx
#mkdir dict   Create the dictionary directory
#cp /usr/local/src/mmseg-3.2.14/data/unigram.txt.uni dict/uni.lib   Copy the generated dictionary into dict and rename it
#vi dict/mmseg.ini   Create the mmseg configuration file
================================
Paste in the following content
[mmseg]
merge_number_and_ascii=1;
number_and_ascii_joint=-;
compress_space=0;
seperate_number_ascii=1;
================================
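If you prefer not to paste the file by hand, a here-document can write it. This is a sketch only: `write_mmseg_ini` is a hypothetical helper name, and the settings are copied unchanged from the block above.

```shell
#!/bin/sh
# write_mmseg_ini is a hypothetical helper, not part of mmseg.
# It creates the dictionary directory if needed and writes mmseg.ini into it.
write_mmseg_ini() {
    dict_dir="$1"
    mkdir -p "$dict_dir"
    cat > "$dict_dir/mmseg.ini" <<'EOF'
[mmseg]
merge_number_and_ascii=1;
number_and_ascii_joint=-;
compress_space=0;
seperate_number_ascii=1;
EOF
}

# Example call, using the dictionary directory from these notes:
# write_mmseg_ini /usr/local/webserver/sphinx/dict
```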
@Test Sphinx
#cd /usr/local/webserver/sphinx/etc
#cp sphinx.conf.dist sphinx.conf
#vi sphinx.conf
In the source src1 section, set the MySQL username, password, and database for your server. The database here is test, used for testing.
#/usr/local/webserver/mysql/bin/mysql -u root -p test < /usr/local/webserver/sphinx/etc/example.sql
Note: the test database must already exist before this import is run.
#cd /usr/local/webserver/sphinx/etc
Build the index
#/usr/local/webserver/sphinx/bin/indexer --config /usr/local/webserver/sphinx/etc/sphinx.conf --all
The index is built from the example.sql data imported above.
Error: /usr/local/webserver/sphinx/bin/indexer: error while loading shared libraries: libmysqlclient.so.18: cannot open shared object file: No such file or directory
Fix:
#ln -s /usr/local/webserver/mysql/lib/libmysqlclient.so.18 /usr/lib/libmysqlclient.so.18
Error: /usr/local/webserver/sphinx/bin/indexer: error while loading shared libraries: libiconv.so.2: cannot open shared object file: No such file or directory
Fix:
#ln -s /usr/local/lib/libiconv.so.2 /usr/lib/libiconv.so.2
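Rather than hitting these loader errors one at a time, ldd can list every unresolved library in a single pass. The filter below is a small sketch (`missing_libs` is a hypothetical name) that reads ldd output on stdin and prints only the libraries marked "not found".

```shell
#!/bin/sh
# missing_libs is a hypothetical helper name.
# It reads ldd output on stdin and prints the name of each library
# that the dynamic linker could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example call against the indexer binary from these notes:
# ldd /usr/local/webserver/sphinx/bin/indexer | missing_libs
```

Each name it prints needs either a symlink like the ones above or, as a commonly used alternative, its directory added to the dynamic linker configuration followed by running ldconfig.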
Error: FATAL: index 'test1': 'synonyms': failed to open '/data/exceptions.txt'
Fix:
#vi sphinx.conf   Comment out the line exceptions = /data/exceptions.txt
Build the index again
#/usr/local/webserver/sphinx/bin/indexer --config /usr/local/webserver/sphinx/etc/sphinx.conf --all
Message: WARNING: stopwords: failed to get file size for 'G:datastopwords.txt'
Fix: none needed; this warning can be ignored.
Search test
#/usr/local/webserver/sphinx/bin/search --config /usr/local/webserver/sphinx/etc/sphinx.conf test
This searches for the keyword test. Output such as words: 1. 'test': 3 documents, 5 hits means the setup works.
@Configure Chinese search
#cd /usr/local/webserver/sphinx/etc
#vi sphinx.conf
Change charset_type to "zh_cn.utf-8" (without the quotes). The default is "sbcs".
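@Create your own search configuration
Rename the existing sphinx.conf to keep it as a backup
#mv sphinx.conf sphinx.conf-
=============================================
Create a new sphinx.conf with the following content. The database used here is theme; set the password to your own.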
source theme
{
type = mysql
sql_host = 127.0.0.1
sql_user = root
sql_pass = *****
sql_db = theme
sql_port = 3306
sql_query_pre = SET NAMES UTF8
sql_query = SELECT `id` as `id`,`id`,`name`,`tid` FROM `t_theme`
sql_attr_uint = id
sql_ranged_throttle = 0
}
index theme
{
source = theme
path = /usr/local/webserver/sphinx/var/data/theme
docinfo = extern
mlock = 0
morphology = none
min_word_len = 1
charset_type = zh_cn.utf-8
charset_dictpath = /usr/local/webserver/sphinx/dict
min_prefix_len = 0
min_infix_len = 0
html_strip = 0
}
indexer
{
mem_limit = 32M
}
searchd
{
listen = 127.0.0.1:3312
log = /usr/local/webserver/sphinx/var/log/searchd.log
query_log = /usr/local/webserver/sphinx/var/log/query.log
read_timeout = 5
max_children = 30
pid_file = /usr/local/webserver/sphinx/var/log/searchd.pid
max_matches = 1000
seamless_rotate = 1
}
=============================================
Build the index
#/usr/local/webserver/sphinx/bin/indexer --config /usr/local/webserver/sphinx/etc/sphinx.conf --all
Start Sphinx
#/usr/local/webserver/sphinx/bin/searchd --config /usr/local/webserver/sphinx/etc/sphinx.conf
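Check that the process is running
#ps -ef|grep sphinx
When new data arrives, rebuild the index and refresh Sphinx.
To restart, kill the searchd process and start it again.
Refresh command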
#/usr/local/webserver/sphinx/bin/indexer --config /usr/local/webserver/sphinx/etc/sphinx.conf --all --rotate
Scheduled refresh
The entry below runs at 09:30, 13:30, 17:30, and 21:30 every day.
#crontab -e
Add the following line
30 9,13,17,21 * * * /usr/local/webserver/sphinx/bin/indexer --config /usr/local/webserver/sphinx/etc/sphinx.conf --all --rotate
#/etc/init.d/crond restart
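The --rotate option is meant for reindexing while searchd is running. The sketch below chooses the flag from the pid file, so the same command also works when searchd is stopped. The helper name `rotate_flag` is an assumption; the pid file path is the one from the searchd section above.

```shell
#!/bin/sh
# rotate_flag is a hypothetical helper name.
# It prints --rotate only when the pid file names a live process.
rotate_flag() {
    pid_file="$1"
    # -s: the file exists and is not empty; kill -0 only tests for existence.
    if [ -s "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
        echo "--rotate"
    fi
}

# Example call, for instance from a script invoked by cron:
# /usr/local/webserver/sphinx/bin/indexer \
#     --config /usr/local/webserver/sphinx/etc/sphinx.conf --all \
#     $(rotate_flag /usr/local/webserver/sphinx/var/log/searchd.pid)
```

Note that kill -0 can also fail for a live process owned by another user, so run the check as the same user that runs searchd.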