CSFT Coreseek MMSEG Sphinx: Installation, Configuration, and Troubleshooting

Date: 2012-05-18  Category: Linux

@Install LibMMSeg
    #wget http://www.coreseek.cn/uploads/csft/3.2/mmseg-3.2.14.tar.gz
    #tar zxf mmseg-3.2.14.tar.gz
    #cd mmseg-3.2.14
    #./configure --prefix=/usr/local/webserver/mmseg
        Check the configure output for errors before continuing.
    #make
        Error: cannot find input file: src/Makefile.in
            Fix:
                #aclocal
                #libtoolize --force              This reports one error when run; ignore it.
                #automake --add-missing
                #autoconf
                #autoheader
                #make clean
                Configure again
                #./configure --prefix=/usr/local/webserver/mmseg
                Then run #make again
    #make install
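
After make install, it can be worth confirming that the files the CSFT build expects are in place. The following is a minimal sketch, not part of the original procedure: the function name check_mmseg_install is hypothetical, and the paths it checks (bin/mmseg, include/mmseg, lib) are taken from the --prefix above and from the --with-mmseg options passed to the CSFT configure step.

```shell
# Hypothetical helper: verify that a LibMMSeg install prefix looks complete.
# Usage: check_mmseg_install /usr/local/webserver/mmseg
check_mmseg_install() {
    prefix="$1"
    # The mmseg binary is needed later to build the dictionary.
    [ -x "$prefix/bin/mmseg" ] || { echo "missing: $prefix/bin/mmseg"; return 1; }
    # Headers and libraries are needed by the CSFT configure step.
    [ -d "$prefix/include/mmseg" ] || { echo "missing: $prefix/include/mmseg"; return 1; }
    [ -d "$prefix/lib" ] || { echo "missing: $prefix/lib"; return 1; }
    echo "ok: $prefix"
}
```

If the function prints a "missing" line, re-check the make and make install output before moving on.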
@Install CSFT
    #wget http://www.coreseek.cn/uploads/csft/3.2/coreseek-3.2.14.tar.gz
    #tar zxvf coreseek-3.2.14.tar.gz
    #cd coreseek-3.2.14/csft-3.2.14
    #./configure --prefix=/usr/local/webserver/sphinx --with-python --with-mysql=/usr/local/webserver/mysql/ --with-mysql-includes=/usr/local/webserver/mysql/include --with-mysql-libs=/usr/local/webserver/mysql/lib --with-mmseg=/usr/local/webserver/mmseg/ --with-mmseg-includes=/usr/local/webserver/mmseg/include/mmseg/ --with-mmseg-libs=/usr/local/webserver/mmseg/lib/
    #make
        Errors similar to the following may appear:
            /usr/local/coreseek-3.2.14/csft-3.2.14/src/tokenizer_zhcn.h:86: undefined reference to `libiconv_open'
            /usr/local/coreseek-3.2.14/csft-3.2.14/src/tokenizer_zhcn.h:89: undefined reference to `libiconv'
            Fix:
                #vi src/Makefile
                Change
                LIBS = -lm -lexpat -L/usr/local/lib
                to
                LIBS = -lm -lexpat -liconv -L/usr/local/lib
                Then run #make again. Do not re-run ./configure, because it regenerates src/Makefile and discards this change.
    #make install
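
The manual LIBS edit can also be scripted. This is a minimal sketch, assuming the LIBS line has the form shown above (with -lexpat present); the function name add_iconv_to_libs is hypothetical. It is written so that running it twice does not add -liconv a second time.

```shell
# Hypothetical helper: insert -liconv after -lexpat on the LIBS line of a Makefile.
# Usage: add_iconv_to_libs src/Makefile
add_iconv_to_libs() {
    makefile="$1"
    # Do nothing if the LIBS line already links against iconv.
    if grep -q '^LIBS = .*-liconv' "$makefile"; then
        return 0
    fi
    # Write to a temporary file first so a failed sed leaves the original untouched.
    sed 's/^\(LIBS = .*-lexpat\)/\1 -liconv/' "$makefile" > "$makefile.tmp" &&
        mv "$makefile.tmp" "$makefile"
}
```

As with the manual edit, run make afterwards without re-running ./configure.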

@Generate the mmseg dictionary and configuration file
    #/usr/local/webserver/mmseg/bin/mmseg -u /usr/local/src/mmseg-3.2.14/data/unigram.txt
            This generates unigram.txt.uni in /usr/local/src/mmseg-3.2.14/data/
    #cd /usr/local/webserver/sphinx
    #mkdir dict               Create the dictionary directory
    #cp /usr/local/src/mmseg-3.2.14/data/unigram.txt.uni dict/uni.lib     Copy the generated dictionary into dict and rename it to uni.lib
    #vi dict/mmseg.ini         Create the mmseg configuration file
================================
Paste in the following:
    [mmseg]
    merge_number_and_ascii=1;
    number_and_ascii_joint=-;
    compress_space=0;
    seperate_number_ascii=1;
================================
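
Instead of pasting into vi, the same file can be written from a script. This is a minimal sketch; the function name write_mmseg_ini is hypothetical, and the file content is copied unchanged from the block above (including the key seperate_number_ascii as spelled there).

```shell
# Hypothetical helper: write mmseg.ini into the given dictionary directory.
# Usage: write_mmseg_ini /usr/local/webserver/sphinx/dict
write_mmseg_ini() {
    dict_dir="$1"
    mkdir -p "$dict_dir" || return 1
    # The quoted heredoc delimiter means the content is written literally.
    cat > "$dict_dir/mmseg.ini" <<'EOF'
[mmseg]
merge_number_and_ascii=1;
number_and_ascii_joint=-;
compress_space=0;
seperate_number_ascii=1;
EOF
}
```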

@Test sphinx
    #cd /usr/local/webserver/sphinx/etc
    #cp sphinx.conf.dist sphinx.conf
    #vi sphinx.conf
        In the source src1 section, edit the lines that set the MySQL user name, password, and database. The database here is test, used for testing.

    #/usr/local/webserver/mysql/bin/mysql -u root -p test < /usr/local/webserver/sphinx/etc/example.sql
            Make sure the test database has been created first.
    #cd /usr/local/webserver/sphinx/etc
    Build the index
    #/usr/local/webserver/sphinx/bin/indexer --config /usr/local/webserver/sphinx/etc/sphinx.conf --all
           The index is built from the example.sql data imported above.

            Error: /usr/local/webserver/sphinx/bin/indexer: error while loading shared libraries: libmysqlclient.so.18: cannot open shared object file: No such file or directory

                Fix:

                    #ln -s /usr/local/webserver/mysql/lib/libmysqlclient.so.18 /usr/lib/libmysqlclient.so.18

            Error: /usr/local/webserver/sphinx/bin/indexer: error while loading shared libraries: libiconv.so.2: cannot open shared object file: No such file or directory

                Fix:

                    #ln -s /usr/local/lib/libiconv.so.2 /usr/lib/libiconv.so.2
            Error: FATAL: index 'test1': 'synonyms': failed to open '/data/exceptions.txt'
                Fix:
                    #vi sphinx.conf        Comment out the line exceptions = /data/exceptions.txt
                    Build the index again
                    #/usr/local/webserver/sphinx/bin/indexer --config /usr/local/webserver/sphinx/etc/sphinx.conf --all
                        Error: WARNING: stopwords: failed to get file size for 'G:datastopwords.txt'
                            Fix: none needed; this warning can be ignored.
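
Both "error while loading shared libraries" failures above have the same cause, so it may save time to list every unresolved library at once with ldd rather than hitting them one by one. This is a minimal sketch; the function name missing_libs is hypothetical, and it assumes the usual ldd output format in which an unresolved library appears as "name => not found".

```shell
# Hypothetical helper: read `ldd` output on stdin and print each library
# that is reported as "not found".
# Usage: ldd /usr/local/webserver/sphinx/bin/indexer | missing_libs
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Each name it prints needs a symlink like the ones shown above.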
    Search test
    #/usr/local/webserver/sphinx/bin/search --config /usr/local/webserver/sphinx/etc/sphinx.conf test
        This searches for the keyword test. Output such as words: 1. 'test': 3 documents, 5 hits means the search works.
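
To check the result from a script rather than by eye, the document count can be pulled out of the search output. This is a minimal sketch; the function name search_doc_count is hypothetical, and it assumes the output contains a line of the form 'keyword': N documents, M hits as quoted above.

```shell
# Hypothetical helper: read `search` output on stdin and print the
# document count reported for the given keyword.
# Usage: /usr/local/webserver/sphinx/bin/search --config /usr/local/webserver/sphinx/etc/sphinx.conf test | search_doc_count test
search_doc_count() {
    keyword="$1"
    sed -n "s/.*'$keyword': \([0-9][0-9]*\) documents.*/\1/p"
}
```

An empty result means no matching line was found in the output.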

@Configure Chinese search
    #cd /usr/local/webserver/sphinx/etc
    #vi sphinx.conf
        Change charset_type to "zh_cn.utf-8" (without the quotes); the default is "sbcs"
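
The same change can be made non-interactively. This is a minimal sketch; the function name set_zh_charset is hypothetical, and it assumes the setting appears on its own line as charset_type = sbcs. Commented-out lines are left alone. Note that the index block in the custom configuration later in this document also sets charset_dictpath to the dict directory; this helper does not add that line.

```shell
# Hypothetical helper: change charset_type from sbcs to zh_cn.utf-8 in a config file.
# Usage: set_zh_charset /usr/local/webserver/sphinx/etc/sphinx.conf
set_zh_charset() {
    conf="$1"
    # Match only active (uncommented) charset_type lines whose value is sbcs.
    sed 's/^\([[:space:]]*charset_type[[:space:]]*=[[:space:]]*\)sbcs[[:space:]]*$/\1zh_cn.utf-8/' "$conf" > "$conf.tmp" &&
        mv "$conf.tmp" "$conf"
}
```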


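@Create your own search configuration
    Rename the existing sphinx.conf, then create a new sphinx.conf
    #mv sphinx.conf sphinx.conf-
=============================================
Add the following. The database used here is theme; remember to set the password.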
source theme
{
    type                        = mysql
    sql_host                  = 127.0.0.1
    sql_user                  = root
    sql_pass                  = *****
    sql_db                     = theme
    sql_port                   = 3306
    sql_query_pre           = SET NAMES UTF8
    sql_query                 = SELECT `id` as `id`,`id`,`name`,`tid` FROM `t_theme`
    sql_attr_uint             = id
    sql_ranged_throttle    = 0
}

index theme
{
    source                      = theme
    path                          = /usr/local/webserver/sphinx/var/data/theme
    docinfo                      = extern
    mlock                      = 0
    morphology              = none
    min_word_len              = 1
    charset_type              = zh_cn.utf-8
    charset_dictpath          = /usr/local/webserver/sphinx/dict
    min_prefix_len          = 0
    min_infix_len              = 0
    html_strip                  = 0
}

indexer
{
        mem_limit           = 32M
}

searchd
{
        listen                              = 127.0.0.1:3312
        log                                 = /usr/local/webserver/sphinx/var/log/searchd.log
        query_log                        = /usr/local/webserver/sphinx/var/log/query.log
        read_timeout                    = 5
        max_children                    = 30
        pid_file                            = /usr/local/webserver/sphinx/var/log/searchd.pid
        max_matches                    = 1000
        seamless_rotate                 = 1
}
=============================================
    Build the index
    #/usr/local/webserver/sphinx/bin/indexer --config /usr/local/webserver/sphinx/etc/sphinx.conf --all
    Start sphinx
    #/usr/local/webserver/sphinx/bin/searchd --config /usr/local/webserver/sphinx/etc/sphinx.conf
    Check that the process exists
    #ps -ef|grep sphinx
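
Since the configuration above sets pid_file, the pid file can be used instead of grepping the process list. This is a minimal sketch; the function name searchd_running is hypothetical, and it assumes the pid file contains a single process ID.

```shell
# Hypothetical helper: succeed if the PID recorded in a pid file belongs to a live process.
# Usage: searchd_running /usr/local/webserver/sphinx/var/log/searchd.pid
searchd_running() {
    pid_file="$1"
    [ -r "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    [ -n "$pid" ] || return 1
    # kill -0 sends no signal; it only checks that the process exists and can be signalled.
    kill -0 "$pid" 2>/dev/null
}
```

The function returns 0 when the process is alive and non-zero otherwise, so it can be used directly in an if statement.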


    When there is new data, rebuild the index and refresh sphinx.
    To restart, kill the process and start it again.
    Refresh command
    #/usr/local/webserver/sphinx/bin/indexer --config /usr/local/webserver/sphinx/etc/sphinx.conf --all --rotate

 

    Scheduled refresh

                The entry below runs at 09:30, 13:30, 17:30, and 21:30 every day.

    #crontab -e

                Add the following line

                30 9,13,17,21 * * * /usr/local/webserver/sphinx/bin/indexer --config /usr/local/webserver/sphinx/etc/sphinx.conf --all --rotate

    #/etc/init.d/crond restart
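
The first two fields of a crontab entry are the minute and the hour, so it is easy to misread when an entry fires. The sketch below expands those two fields into clock times. The function name cron_times is hypothetical, and it handles only plain numbers and comma-separated lists, not ranges or step values.

```shell
# Hypothetical helper: print the HH:MM times produced by a crontab
# minute field and hour field (comma-separated lists only).
# Usage: cron_times 30 9,13,17,21
cron_times() {
    minutes="$1"
    hours="$2"
    for h in $(echo "$hours" | tr ',' ' '); do
        for m in $(echo "$minutes" | tr ',' ' '); do
            printf '%02d:%02d\n' "$h" "$m"
        done
    done
}
```

For the entry above, cron_times 30 9,13,17,21 prints 09:30, 13:30, 17:30, and 21:30, one per line. To run every 30 minutes within those hours instead, the minute field would be 0,30.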
