Main repository for the Mapa76 project (in Spanish).
- Ruby 1.9+
- Bundler (bundler gem)
- MongoDB server
- Redis server
- Docsplit dependencies
- FreeLing 3.0
- Poppler 0.20+
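Before continuing, it can help to check which of the requirements above are already installed. The following is a minimal sketch, not part of the project: missing_commands is a hypothetical helper, and the executable names in the usage note are assumptions that may differ on your distribution.

```shell
#!/bin/sh
# Hypothetical helper: print every given command that is not found in PATH.
# Returns 0 if all commands exist, 1 if at least one is missing.
missing_commands() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            rc=1
        fi
    done
    return $rc
}
```

For example, missing_commands ruby bundle mongod redis-server pdftotext lists whichever of those executables are absent (pdftotext ships with Poppler's utilities). It only checks for presence, not for the minimum versions listed above.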
We recommend rvm for installing and managing the Ruby interpreter and environment. Refer to the installation page for instructions on installing Ruby 1.9+ with rvm.
On Debian / Ubuntu machines, install MongoDB and Redis from the package manager:
# apt-get install mongodb mongodb-server redis-server
$ git clone git@github.com:hhba/mapa76.git
First, run bundle install to install all gem dependencies.
$ bundle install
Create your MongoDB, Resque and Elasticsearch configuration files based on the sample files, and modify the connection options to suit your needs:
$ cp config/mongoid.yml.sample config/mongoid.yml
$ cp config/resque.yml.sample config/resque.yml
$ cp config/elasticsearch.yml.sample config/elasticsearch.yml
If the servers will be running on the same machine as Mapa76, you don't need to change anything.
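If you rerun the setup later, a plain cp would overwrite any connection options you have already edited. As a hedged sketch (copy_samples is a hypothetical helper, not something shipped with Mapa76), the copy step can be made safe to repeat by skipping files that already exist:

```shell
#!/bin/sh
# Hypothetical helper: copy every *.yml.sample in a directory to *.yml,
# leaving existing configuration files untouched.
# $1: configuration directory (defaults to "config").
copy_samples() {
    dir=${1:-config}
    for sample in "$dir"/*.yml.sample; do
        # If the glob matched nothing, it stays literal; skip it.
        [ -e "$sample" ] || continue
        target=${sample%.sample}
        if [ -e "$target" ]; then
            echo "skipping $target (already exists)"
        else
            cp "$sample" "$target"
            echo "created $target"
        fi
    done
}
```

Running copy_samples from the repository root has the same effect as the three cp commands on a fresh checkout, and does nothing to files you have customized.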
You should also install most of the dependencies listed in the Docsplit documentation.
Download the tarball of Poppler 0.20.1 and extract it somewhere, like /usr/local/src. Run apt-get build-dep poppler-utils to make sure you have all of its dependencies. Then, just execute ./configure, make and make install as usual.
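The build steps can be sketched as a small script. This is an illustration under assumptions: run and build_and_install are hypothetical helpers, and the directory in the usage note is only a guess at where the tarball extracts. Setting DRY_RUN=1 prints the commands without executing them, which is a cheap way to review the sequence first.

```shell
#!/bin/sh
# Hypothetical helper: print a command, then run it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Hypothetical helper: configure, build and install an extracted source tree.
# $1: path to the extracted source directory.
# Runs in a subshell so the caller's working directory is unchanged,
# and stops at the first step that fails.
build_and_install() {
    (
        cd "$1" || exit 1
        run ./configure &&
        run make &&
        run make install
    )
}
```

For example, build_and_install /usr/local/src/poppler-0.20.1 (assumed directory name). The make install step normally needs root privileges.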
The NER module currently uses FreeLing, an open source suite of language analyzers written in C++.
This has been tested on FreeLing 3.0a1 only. Because this is an alpha release, there are no binary packages available. You can download the source here (~114 MB) and compile it. If you are a happy Ubuntu user, check out this link, where you will find easy-to-use .deb files.
For compiling the source, you need the build-essential, libboost and libicu libraries. On Debian / Ubuntu machines, you can run:
# apt-get install build-essential libboost-dev libboost-filesystem-dev \
libboost-program-options-dev libboost-regex-dev \
libicu-dev
Then, just execute ./configure, make and make install as usual.
To start workers for document processing, you need to run at least one Resque worker:
$ QUEUE=* bundle exec rake resque:work
You can run multiple workers with the resque:workers task:
$ COUNT=2 QUEUE=* bundle exec rake resque:workers
To remove everything from the database and start again from a clean state, run:
$ rake mi:drop && redis-cli FLUSHALL && rake anm:load && rake convicted:load
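Note that redis-cli FLUSHALL deletes the keys of every database on the Redis server, not only the ones used by Mapa76. Since the reset is destructive, one option is to put a confirmation prompt in front of it. The sketch below is an assumption about how you might do that; confirm is a hypothetical helper, not a project task.

```shell
#!/bin/sh
# Hypothetical helper: show a message and succeed only if the user types "yes".
# The prompt goes to stderr so it does not mix with command output.
confirm() {
    printf '%s [type "yes" to continue] ' "$1" >&2
    read -r answer
    [ "$answer" = "yes" ]
}

# Usage sketch, wrapping the reset commands shown above:
# confirm "This wipes the MongoDB data and ALL Redis databases." &&
#     rake mi:drop && redis-cli FLUSHALL && rake anm:load && rake convicted:load
```

Any answer other than the literal word yes aborts before the first rake task runs.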