Vespa sample application - text search tutorial

This sample application contains the code for the text search tutorial. Refer to the tutorial for more information.

Executable example:

$ git clone --depth 1 https://github.com/vespa-engine/sample-apps.git
$ VESPA_SAMPLE_APPS=`pwd`/sample-apps
$ cd $VESPA_SAMPLE_APPS/text-search && mvn clean package
$ docker run --detach --name vespa --hostname vespa-container --privileged \
  --volume $VESPA_SAMPLE_APPS:/apps --publish 8080:8080 vespaengine/vespa

Wait for the config server to start (it is ready once the request returns 200 OK):

$ docker exec vespa bash -c 'curl -s --head http://localhost:19071/ApplicationStatus'

Deploy the application:

$ docker exec vespa bash -c '/opt/vespa/bin/vespa-deploy prepare /apps/text-search/target/application.zip && \
    /opt/vespa/bin/vespa-deploy activate'

Wait for the application to start:

$ curl -s --head http://localhost:8080/ApplicationStatus
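
Rather than re-running the status check by hand, the wait can be scripted. The following is a minimal sketch, not part of the sample application: `wait_for_ok` is a hypothetical helper name, and the default retry count and delay are arbitrary assumptions. It polls a URL with `curl` until the response status is 200 or the attempts run out.

```shell
# wait_for_ok is a hypothetical helper, not shipped with Vespa or this sample app.
# Usage: wait_for_ok URL [MAX_ATTEMPTS] [DELAY_SECONDS]
# Returns 0 once the URL answers with HTTP 200, 1 if it never does.
wait_for_ok() {
  url=$1
  max_attempts=${2:-60}   # assumed default: 60 attempts
  delay=${3:-5}           # assumed default: 5 seconds between attempts
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # --head sends a HEAD request, -o /dev/null discards the headers,
    # -w prints only the numeric status code (000 if the connection failed).
    status=$(curl -s -o /dev/null --head -w '%{http_code}' "$url" || true)
    if [ "$status" = "200" ]; then
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "gave up waiting for $url" >&2
  return 1
}
```

With the function defined in the current shell, `wait_for_ok http://localhost:8080/ApplicationStatus` blocks until the application responds. The config server port (19071) is not published by the `docker run` command above, so the same check against it would have to run inside the container.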

Create data feed:

To use the entire MS MARCO data set, use the download script. Here we use the sample data.

$ ./bin/convert-msmarco.sh

Feed data:

$ docker exec vespa bash -c 'java -jar /opt/vespa/lib/jars/vespa-http-client-jar-with-dependencies.jar \
    --file /apps/text-search/msmarco/vespa.json --host localhost --port 8080'

Test the application:

$ curl -s 'http://localhost:8080/search/?query=what+is+dad+bod'
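
To try other queries, the free-text query has to be encoded into the URL the same way as in the example, with spaces written as `+`. A small sketch of that step follows; `search_url` is a hypothetical helper name, and it deliberately handles only spaces, so it is suitable for plain-word queries only.

```shell
# search_url is a hypothetical helper that builds the query URL used above.
# It only replaces spaces with '+'. Other reserved characters (&, #, %, ...)
# are NOT escaped, so keep queries to plain words.
search_url() {
  query=$(printf '%s' "$1" | tr ' ' '+')
  printf 'http://localhost:8080/search/?query=%s\n' "$query"
}

# Example (requires the running application):
#   curl -s "$(search_url 'what is dad bod')"
```

For queries containing other special characters, `curl -G --data-urlencode 'query=...' http://localhost:8080/search/` lets curl perform the full encoding instead.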

Browse the site:

http://localhost:8080/site

Install Python dependencies:

$ pip3 install -qqq -r src/python/requirements.txt

Collect training data:

$ ./src/python/collect_training_data.py msmarco/sample collect_rank_features 99

Train TF-Ranking models:

$ ./src/python/tfrank.py msmarco/sample

Shut down and remove the container:

$ docker rm -f vespa