This sample application contains the code for the text search tutorial. Refer to that tutorial for more information.
Executable example:
$ git clone --depth 1 https://github.com/vespa-engine/sample-apps.git
$ VESPA_SAMPLE_APPS=`pwd`/sample-apps
$ cd $VESPA_SAMPLE_APPS/text-search && mvn clean package
$ docker run --detach --name vespa --hostname vespa-container --privileged \
  --volume $VESPA_SAMPLE_APPS:/apps --publish 8080:8080 vespaengine/vespa
Wait for the configserver to start:
$ docker exec vespa bash -c 'curl -s --head http://localhost:19071/ApplicationStatus'
Deploy the application:
$ docker exec vespa bash -c '/opt/vespa/bin/vespa-deploy prepare /apps/text-search/target/application.zip && \
  /opt/vespa/bin/vespa-deploy activate'
Wait for the application to start:
$ curl -s --head http://localhost:8080/ApplicationStatus
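The status check above returns immediately, so it may need to be repeated until the container answers. A minimal sketch of a retry loop follows, assuming curl is available on the host and that an HTTP 200 from /ApplicationStatus means the application is up. The function name wait_for_http_200 and its retry defaults are hypothetical and not part of the sample app.

```shell
# Hypothetical helper: poll a URL until it answers HTTP 200.
# $1: URL, $2: max attempts (default 60), $3: seconds between attempts (default 5)
wait_for_http_200() {
  url=$1
  attempts=${2:-60}
  delay=${3:-5}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Print only the status code; curl reports 000 if the connection fails.
    code=$(curl -s -o /dev/null --head -w '%{http_code}' "$url" || true)
    if [ "$code" = "200" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  # Gave up: the URL never returned 200 within the allowed attempts.
  return 1
}
```

Possible usage:

$ wait_for_http_200 http://localhost:8080/ApplicationStatus && echo "application is up"

Port 19071 is not published by the docker run command above, so the same loop for the configserver would have to run inside the container via docker exec.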
Create data feed:
To use the entire MS MARCO data set, use the download script. Here we use the sample data.
$ ./bin/convert-msmarco.sh
Feed data:
$ docker exec vespa bash -c 'java -jar /opt/vespa/lib/jars/vespa-http-client-jar-with-dependencies.jar \
  --file /apps/text-search/msmarco/vespa.json --host localhost --port 8080'
Test the application:
$ curl -s 'http://localhost:8080/search/?query=what+is+dad+bod'
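The query above encodes spaces by hand as "+". For other queries, curl can do the URL-encoding itself. A small sketch follows, assuming the same /search/ endpoint and query parameter as above. The function name vespa_query is hypothetical and not part of the sample app.

```shell
# Hypothetical helper: send a free-text query to the search endpoint.
# $1: query text, $2: optional endpoint (defaults to the one used in this README)
vespa_query() {
  # --get appends the data to the URL as a query string;
  # --data-urlencode URL-encodes the value after "query=".
  curl -s --get --data-urlencode "query=$1" "${2:-http://localhost:8080/search/}"
}
```

Possible usage:

$ vespa_query "what is dad bod"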
Browse the site:
Install python dependencies:
$ pip3 install -qqq -r src/python/requirements.txt
Collect training data:
$ ./src/python/collect_training_data.py msmarco/sample collect_rank_features 99
Train TF-Ranking models:
$ ./src/python/tfrank.py msmarco/sample
Shut down and remove the container:
$ docker rm -f vespa