forked from jgasthaus/libPLUMP
Library for the Sequence Memoizer and related models
libPLUMP -- Library for implementing the Sequence Memoizer and related models
===============================================================================

Jan Gasthaus <[email protected]>


0. Getting started
===============================================================================

0.1 Prerequisites
-------------------------------------------------------------------------------

To get started with the code you will need a few additional things:

  - CMake
  - Boost C++ libraries (a recent version; 1.37 and 1.44 are known to work,
    but 1.33 is known not to work due to some changes to Boost.Serialization)
    www.boost.org
  - GNU Scientific Library (tested with 1.12 and 1.14)
    www.gnu.org/software/gsl
  - SWIG
  - Python 2.5 or higher (but not 3.x)
    www.python.org


0.2 Compiling
-------------------------------------------------------------------------------

If you have the dependencies listed above installed in default locations, then
simply running

  # mkdir build && cd build
  # cmake ..
  # make

should configure libPLUMP (including the Python bindings) and build the
library.


0.3 Check whether the build was successful
-------------------------------------------------------------------------------

libPLUMP comes with an example program that allows you to compute the
perplexity on a test file. To see how well the SM model can, for example,
predict this file, run

  # src/score_file README

This will incrementally build the context tree for this file and at the same
time incrementally compute the probability of each next symbol given the
previous ones. The result should be around 3 bits/symbol. You can run

  # src/score_file --help

to find out about the various options that change the behaviour of the model.
More interestingly, have a look at src/utils/score_file.cc, as it shows how to
use most of the high-level interface of libPLUMP.

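To make the bits/symbol figure above concrete, here is a minimal standalone
sketch of the same predict-then-update loop. It does NOT use libPLUMP or its
Python bindings, and it does not implement the Sequence Memoizer: as a stated
assumption, the model is a plain adaptive order-0 byte model with add-one
smoothing, chosen only because it fits in a few lines. The function names
(bits_per_symbol, perplexity) are hypothetical, and the sketch targets
Python 3 rather than the Python 2.x required by the bindings.

```python
import math
from collections import Counter


def bits_per_symbol(data, alphabet_size=256):
    """Average log-loss in bits/symbol of an adaptive order-0 model.

    Illustrative stand-in only: add-one smoothed symbol counts,
    not the Sequence Memoizer used by score_file.
    """
    counts = Counter()
    seen = 0
    total_bits = 0.0
    for symbol in data:
        # Predict the next symbol from everything seen so far ...
        p = (counts[symbol] + 1.0) / (seen + alphabet_size)
        total_bits -= math.log2(p)
        # ... and only then update the model with the observed symbol.
        counts[symbol] += 1
        seen += 1
    return total_bits / len(data) if data else 0.0


def perplexity(bits):
    """Convert an average log-loss in bits/symbol to per-symbol perplexity."""
    return 2.0 ** bits


if __name__ == "__main__":
    sample = b"abracadabra abracadabra abracadabra"
    bits = bits_per_symbol(sample)
    print("%.3f bits/symbol, perplexity %.3f" % (bits, perplexity(bits)))
```

A model that conditions on long contexts, as the Sequence Memoizer does, would
normally be expected to score lower on natural text than this context-free
baseline; the loop structure (predict, accumulate the loss, update) is the
part that carries over.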
0.4 Testing the Python wrapper
-------------------------------------------------------------------------------

To see if the Python wrapper is working, try the following:

  # cd examples
  # source init.sh
  # python python_examples.py

Here init.sh sets up the environment so that libPLUMP and the Python bindings
can be found, and the Python script is a simple example calling libPLUMP.
There are some further examples in the examples/ directory that you can look
at to find out how to use the Python interface.

The Python bindings are generated by SWIG, so it should be relatively easy to
generate bindings for other languages that are supported by SWIG (e.g. Ruby,
Octave, R, Java, Lua).


0.5 Where to go from here
-------------------------------------------------------------------------------

Currently, the only documentation is in the form of source code comments, so
have a look at the source if you want to know more about the internal workings
of the library.

If you care about performance, you may want to enable optimizations and
disable all assertions in the code by using

  # export CXXFLAGS="-O3 -march=native -DNDEBUG"

and then rerunning cmake and make, which should yield significant speedups.

If you have problems, find bugs, or want to contribute, let me know at
<[email protected]>