Search tool API

Thomas Piller edited this page Mar 6, 2023 · 16 revisions

The MapX Search tool API is built on top of MeiliSearch, an open-source (MIT License) search engine.

The configuration of MeiliSearch specific to MapX as well as an example of API usage are described in this document.

⚠️ Take a few minutes to read the MeiliSearch documentation: it covers everything you need to know about the search engine and how to use it.

How to use the MapX search tool via the API?

The API key and host needed to search the MapX public data catalog through the API are available in the Toolbox -> Get search API Key and configuration. To access this information, you must be logged into MapX with a valid account.

The index to query depends on the language of the search: each index references the MapX public views in one language (see the Indexes section below for details).

Once all parameters have been retrieved, it is possible to query the API like this (basic example using cURL):

curl '<search engine host>/indexes/<index>/search' \
-H 'X-Meili-API-Key: <search engine API key>' \
--data-raw '{"q":"water"}' \
--compressed;
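
For readers who prefer a script over the command line, the same request can be prepared in Python. This is a minimal sketch using only the standard library, not an official client: `HOST`, `API_KEY` and the helper `build_search_request` are placeholders introduced for illustration, and the explicit `Content-Type` header is an assumption that does not appear in the cURL example.

```python
import json
import urllib.request

# Placeholders: use the values shown in Toolbox -> Get search API Key and configuration.
HOST = "https://search.example.org"  # hypothetical host
API_KEY = "<search engine API key>"


def build_search_request(host: str, api_key: str, index: str, query: str) -> urllib.request.Request:
    """Prepare a POST request equivalent to the cURL example (nothing is sent yet)."""
    body = json.dumps({"q": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/indexes/{index}/search",
        data=body,
        headers={
            "X-Meili-API-Key": api_key,
            "Content-Type": "application/json",  # assumption, not part of the cURL example
        },
        method="POST",
    )


request = build_search_request(HOST, API_KEY, "views_en", "water")

# To actually send the request and decode the JSON response:
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```

Building the request separately from sending it makes it easy to inspect the URL, headers and body before spending one of the rate-limited calls.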

To learn more about how to use the search parameters, refer directly to the MeiliSearch documentation.

Special case: with an empty query string and a sufficiently large limit, it is possible to extract an entire index:

curl '<search engine host>/indexes/<index>/search' \
-H 'X-Meili-API-Key: <search engine API key>' \
--data-raw '{"q":"", "limit": 10000}' \
--compressed > index.json;
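
If an index holds more documents than a single `limit` covers, the extraction could also be done page by page with the `offset` and `limit` search parameters. The sketch below only illustrates the paging logic: `iter_all_hits` and `fetch_page` are hypothetical names, `fetch_page` stands for whatever function sends the search request and returns the decoded JSON response, and the documents are made up.

```python
from typing import Callable, Dict, Iterator


def iter_all_hits(fetch_page: Callable[[Dict], Dict], page_size: int = 1000) -> Iterator[Dict]:
    """Yield every document of an index by paging with offset/limit."""
    offset = 0
    while True:
        response = fetch_page({"q": "", "offset": offset, "limit": page_size})
        hits = response["hits"]
        if not hits:
            return  # an empty page means the index is exhausted
        yield from hits
        offset += len(hits)


# Stand-in for the real HTTP call, used here only to show the behaviour.
_documents = [{"view_id": f"view_{i}"} for i in range(5)]  # made-up data


def fake_fetch(params: Dict) -> Dict:
    start = params["offset"]
    return {"hits": _documents[start:start + params["limit"]]}


all_hits = list(iter_all_hits(fake_fetch, page_size=2))
```

Each page is a separate request, so a small page size uses up the request quota faster than a single large `limit`.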

⚠️ The number of requests per user is limited within a given time frame. Once the limit is reached, further requests are denied until the next time frame.

Indexes

Indexes referencing MapX public views are updated hourly by an automatic routine. A newly created view therefore appears in the MapX Search tool only after the next run of the routine. The same applies to deleted views.

One index is generated per language supported by MapX, which lets users choose the language of their search. When indexes are generated from the database, any view or source metadata field that is not filled in for a given language falls back to English.

Indexes (format = views_{ISO 639-1 language code}) are available for the following languages:

  • Arabic: views_ar
  • Bengali: views_bn
  • Chinese: views_zh
  • English: views_en
  • French: views_fr
  • German: views_de
  • Pashto: views_ps
  • Persian: views_fa
  • Russian: views_ru
  • Spanish: views_es
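
The naming rule above can be captured in a small helper. This is only an illustrative sketch: `index_for_language` and `SUPPORTED_LANGUAGES` are hypothetical names, and rejecting unsupported codes with an error is a design choice made here, not documented MapX behaviour.

```python
# Language codes taken from the list above.
SUPPORTED_LANGUAGES = ("ar", "bn", "zh", "en", "fr", "de", "ps", "fa", "ru", "es")


def index_for_language(language: str) -> str:
    """Return the index name (views_<code>) for an ISO 639-1 language code."""
    code = language.strip().lower()
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"No MapX search index for language {language!r}")
    return f"views_{code}"


french_index = index_for_language("fr")
```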

Searchable attributes

searchableAttributes designates the fields that are searchable in the indexes.

In MapX, some fields are more relevant to search than others. The attributes' order in searchableAttributes determines their impact on relevancy, from most impactful to least.

searchableAttributes: [
    'view_title',
    'view_abstract',
    'source_title',
    'source_abstract',
    'source_keywords', // custom keywords
    'source_keywords_m49', // geographic keywords
    'source_keywords_gemet', // keywords from the GEMET thesaurus
    'source_notes',
    'project_title',
    'project_abstract',
    'view_id',
    'project_id',
    'view_type',
    'view_modified_at',
    'view_created_at',
    'source_start_at', // start date if the data has a temporal component
    'source_end_at', // end date if the data has a temporal component
    'source_released_at',
    'source_modified_at',
    'range_start_at', // see below
    'range_end_at', // see below
    'range_start_at_year', // year extracted from range_start_at
    'range_end_at_year', // year extracted from range_end_at
    'range_years', // generate_series(range_start_at_year, range_end_at_year)
    'projects_data', // information related to projects where the view has been shared
    'projects_id' // list of project ids containing the view
]

In MapX metadata, publishers may have left some date fields empty. This hurts date filtering in the MapX UI: in our initial tests, many views were not returned, which distorted the search results. The MapX team therefore decided to generate new date fields to work around this:

LEAST(
    source_start_at,
    source_released_at,
    source_modified_at,
    view_created_at,
    view_modified_at
) AS range_start_at

GREATEST(
    source_end_at,
    source_released_at,
    source_modified_at,
    view_created_at,
    view_modified_at
) AS range_end_at

See the dedicated page in the MeiliSearch documentation.
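
To make the effect of the generated range fields concrete, here is a hedged Python sketch of the same logic. It assumes that missing dates are stored as NULL and that LEAST and GREATEST skip NULL values (as they do in PostgreSQL), and that `generate_series` includes both bounds. The helper names and the sample record are invented for illustration.

```python
from typing import List, Optional


def least(*values: Optional[int]) -> Optional[int]:
    """Smallest non-null value, or None when every value is missing."""
    present = [v for v in values if v is not None]
    return min(present) if present else None


def greatest(*values: Optional[int]) -> Optional[int]:
    """Largest non-null value, or None when every value is missing."""
    present = [v for v in values if v is not None]
    return max(present) if present else None


def range_years(start_year: int, end_year: int) -> List[int]:
    """Every year from start_year to end_year, both included."""
    return list(range(start_year, end_year + 1))


# Made-up record in which the publisher left both temporal fields empty.
record = {
    "source_start_at": None,
    "source_end_at": None,
    "source_released_at": 1500,
    "source_modified_at": 1700,
    "view_created_at": 1600,
    "view_modified_at": 1800,
}

shared = ("source_released_at", "source_modified_at", "view_created_at", "view_modified_at")
range_start_at = least(record["source_start_at"], *(record[k] for k in shared))
range_end_at = greatest(record["source_end_at"], *(record[k] for k in shared))
years = range_years(2015, 2018)
```

Even though `source_start_at` and `source_end_at` are empty, the record still gets a usable range, so it is not dropped when results are filtered by date.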

Attributes for faceting

attributesForFaceting designates the fields that can be used to filter and facet search results (in MeiliSearch v0.20, through the facetFilters and facetsDistribution search parameters).

attributesForFaceting: [
    'view_type',
    'source_keywords',
    'source_keywords_m49',
    'source_keywords_gemet',
    'projects_id'
]

See the dedicated page in the MeiliSearch documentation.
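
As an illustration of how these attributes might be used, the sketch below builds a search payload with facet filters. It assumes the MeiliSearch v0.20 syntax, where `facetFilters` is a list whose inner lists are combined with OR and whose top-level entries are combined with AND, and where `facetsDistribution` lists the attributes to count. `build_faceted_query` is a hypothetical helper and the filter values are made-up examples.

```python
from typing import Dict, List

# Attributes declared in attributesForFaceting above.
FACET_ATTRIBUTES = {
    "view_type",
    "source_keywords",
    "source_keywords_m49",
    "source_keywords_gemet",
    "projects_id",
}


def build_faceted_query(text: str, any_of: Dict[str, List[str]]) -> Dict:
    """Build a search payload: OR within an attribute, AND across attributes."""
    facet_filters = []
    for attribute, values in any_of.items():
        if attribute not in FACET_ATTRIBUTES:
            raise ValueError(f"{attribute!r} is not declared for faceting")
        facet_filters.append([f"{attribute}:{value}" for value in values])
    return {
        "q": text,
        "facetFilters": facet_filters,
        "facetsDistribution": sorted(any_of),
    }


# Vector or raster views tagged with a (made-up) region keyword.
payload = build_faceted_query(
    "water",
    {"view_type": ["vt", "rt"], "source_keywords_m49": ["m49_150"]},
)
```

The resulting dictionary can be serialized with `json.dumps` and sent as the request body of a search call.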

Ranking rules

Search responses are sorted according to a set of rules (i.e. ranking rules).

Whenever a search query is made, MeiliSearch uses a bucket sort to rank documents. The first ranking rule is applied to all documents, while each subsequent rule is only applied to documents that are considered equal under the previous rule (i.e. as a tiebreaker).

The order in which ranking rules are applied matters. The first rule in the array has the most impact, and the last rule has the least.

MeiliSearch Documentation v0.20

rankingRules: [
    'attribute', // attribute ranking order (see searchableAttributes)
    'exactness', // similarity of the matched words with the query words
    'proximity', // increasing distance between matched query terms
    'words', // decreasing number of matched query terms
    'wordsPosition', // location of the query word in the field
    'typo', // increasing number of typos
    'asc(view_modified_at)' // custom rule: ascending sort on the view_modified_at attribute (see searchableAttributes)
]

See the dedicated page in the MeiliSearch documentation.
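
The tiebreaker behaviour described above can be illustrated with a toy model. This is not how MeiliSearch is implemented: it only shows that applying each rule to the documents left tied by the previous one gives the same order as sorting on a tuple of per-rule scores. The rule functions, score fields and documents are all invented, and lower scores are assumed to rank first.

```python
from typing import Callable, Dict, List, Sequence

Rule = Callable[[Dict], int]  # lower score = better rank (assumption of this toy model)


def rank(documents: Sequence[Dict], rules: Sequence[Rule]) -> List[Dict]:
    """Order documents by the first rule, breaking ties with each following rule."""
    return sorted(documents, key=lambda doc: tuple(rule(doc) for rule in rules))


# Invented scores: position of the matched attribute, number of typos, modification date.
documents = [
    {"view_id": "a", "matched_attribute": 1, "typos": 0, "view_modified_at": 30},
    {"view_id": "b", "matched_attribute": 0, "typos": 1, "view_modified_at": 20},
    {"view_id": "c", "matched_attribute": 0, "typos": 1, "view_modified_at": 10},
]

rules = [
    lambda doc: doc["matched_attribute"],  # stands in for 'attribute'
    lambda doc: doc["typos"],              # stands in for 'typo'
    lambda doc: doc["view_modified_at"],   # stands in for 'asc(view_modified_at)'
]

ranked_ids = [doc["view_id"] for doc in rank(documents, rules)]
```

Documents b and c tie on the first two rules, so only the last rule separates them; document a loses on the first rule even though it has no typo, which is why the order of the rules matters.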

Response structure

| Field | Description | Type |
|---|---|---|
| hits | Results of the query | [result] |
| offset | Number of documents skipped | number |
| limit | Number of documents to take | number |
| nbHits | Total number of matches | number |
| exhaustiveNbHits | Whether nbHits is exhaustive | boolean |
| facetsDistribution | Distribution of the given facets | object |
| exhaustiveFacetsCount | Whether facetsDistribution is exhaustive | boolean |
| processingTimeMs | Processing time of the query | number |
| query | Query originating the response | string |

See the dedicated page in the MeiliSearch documentation.
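
A short sketch of how a client might read these fields, for example to decide whether another page of results is worth requesting. The helper names and the sample response are made up for illustration; only the field names come from the table above.

```python
from typing import Dict


def has_more_results(response: Dict) -> bool:
    """True when the hits returned so far do not cover the reported number of matches."""
    return response["offset"] + len(response["hits"]) < response["nbHits"]


def next_offset(response: Dict) -> int:
    """Offset to use for the following page."""
    return response["offset"] + len(response["hits"])


# Made-up response containing only the fields used here.
sample_response = {
    "hits": [{"view_id": "view_0"}, {"view_id": "view_1"}],
    "offset": 0,
    "limit": 2,
    "nbHits": 5,
    "exhaustiveNbHits": False,
    "processingTimeMs": 3,
    "query": "water",
}

more = has_more_results(sample_response)
following_offset = next_offset(sample_response)
```

When `exhaustiveNbHits` is false, `nbHits` is not guaranteed to be exact, so a client relying on this check should also stop when a page comes back empty.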

Results structure

| Field | Description | Type |
|---|---|---|
| view_id | View id | string |
| view_type | View type coded as follows: vt = vector, cc = custom code, rt = raster, and sm = story map | string |
| source_keywords | List of user-defined keywords | [string] |
| source_keywords_m49 | List of geographic keywords: countries are coded using ISO 3166-1 alpha-3, regions using m49 codes (prefixed by "m49_") | [string] |
| source_keywords_gemet | List of keywords from the GEMET thesaurus | [string] |
| view_modified_at | Last modified date of the view | integer |
| view_title | View title | string |
| view_abstract | View abstract | string |
| source_title | Source title (several views can be based on the same source) | string |
| source_abstract | Source abstract | string |
| source_notes | Source notes | string |
| project_id | Id of the project in which the view was published | string |
| view_created_at | Creation date of the view | integer |
| source_start_at | Start date if the data has a temporal component | integer |
| source_end_at | End date if the data has a temporal component | integer |
| source_released_at | Creation date of the source | integer |
| source_modified_at | Last modified date of the source | integer |
| range_start_at | Start date of the time range (see Searchable attributes) | integer |
| range_end_at | End date of the time range (see Searchable attributes) | integer |
| range_start_at_year | Start year of the time range | integer |
| range_end_at_year | End year of the time range | integer |
| range_years | List of years between range_start_at_year and range_end_at_year | [integer] |
| projects_data | Information on projects where the view is present (see projects_data structure) | object |
| projects_id | List of project ids where the view is present | [string] |
| _formatted | Fields formatted for use in the MapX UI | object |
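
The coded fields of a result can be decoded on the client side. The sketch below is illustrative only: the helper names and the sample hit are invented, while the code table and the "m49_" prefix come from the descriptions above.

```python
from typing import Dict, List, Tuple

# Codes documented for the view_type field.
VIEW_TYPES = {"vt": "vector", "cc": "custom code", "rt": "raster", "sm": "story map"}


def view_type_label(hit: Dict) -> str:
    """Human-readable label for the view_type code of a result."""
    return VIEW_TYPES.get(hit["view_type"], "unknown")


def split_geographic_keywords(keywords: List[str]) -> Tuple[List[str], List[str]]:
    """Separate country codes from region codes (regions carry the 'm49_' prefix)."""
    regions = [k for k in keywords if k.startswith("m49_")]
    countries = [k for k in keywords if not k.startswith("m49_")]
    return countries, regions


# Made-up result containing only the fields used here.
hit = {
    "view_id": "view_0",
    "view_type": "rt",
    "source_keywords_m49": ["CHE", "m49_150", "FRA"],
}

label = view_type_label(hit)
countries, regions = split_geographic_keywords(hit["source_keywords_m49"])
```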

projects_data structure

| Field | Description | Type |
|---|---|---|
| title | Project title | string |
| origin | Origin of the view coded as follows: project = the view was published in this project; exported or imported = the view has been shared to this project | string |
| id_project | Project id | string |
| description | Project description | string |