Upgrade solr to 9.4.0 (#317)
[minor] Requires regenerating solr config.
nigelgbanks authored Dec 31, 2023
1 parent 92abab3 commit b88f2b5
Showing 9 changed files with 183 additions and 136 deletions.
10 changes: 5 additions & 5 deletions solr/Dockerfile
@@ -2,10 +2,10 @@
FROM java

ARG TARGETARCH
ARG SOLR_VERSION="8.11.2"
ARG SOLR_FILE="solr-${SOLR_VERSION}.tgz"
ARG SOLR_URL="https://archive.apache.org/dist/lucene/solr/${SOLR_VERSION}/${SOLR_FILE}"
ARG SOLR_FILE_SHA256="54d6ebd392942f0798a60d50a910e26794b2c344ee97c2d9b50e678a7066d3a6"
ARG SOLR_VERSION=9.4.0
ARG SOLR_FILE=solr-${SOLR_VERSION}.tgz
ARG SOLR_URL=https://archive.apache.org/dist/solr/solr/${SOLR_VERSION}/solr-${SOLR_VERSION}.tgz
ARG SOLR_FILE_SHA256=5ff28fe3a9d92804d53c0072a8459bb1d0c280e212a288a9efd31f923fe1a9d4

EXPOSE 8983

@@ -30,7 +30,7 @@ RUN create-service-user.sh --name solr /data && \
# Defaults environment variables to be overloaded.
ENV \
SOLR_JAVA_OPTS= \
SOLR_JETTY_OPTS= \
SOLR_JETTY_OPTS=-Dsolr.jetty.host=0.0.0.0 \
SOLR_LOG_LEVEL=INFO \
SOLR_MEMORY=512m
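
Note on the build arguments above: `SOLR_FILE_SHA256` only protects the build if the downloaded archive is actually compared against it. The following is a minimal sketch of such a check, not code from this commit. The helper names `sha256_of` and `verify_sha256` are hypothetical, and it assumes either `sha256sum` or `shasum` is available on the host.

```shell
# Values copied from the updated Dockerfile build arguments.
SOLR_VERSION="9.4.0"
SOLR_FILE="solr-${SOLR_VERSION}.tgz"
SOLR_URL="https://archive.apache.org/dist/solr/solr/${SOLR_VERSION}/${SOLR_FILE}"
SOLR_FILE_SHA256="5ff28fe3a9d92804d53c0072a8459bb1d0c280e212a288a9efd31f923fe1a9d4"

# Hypothetical helper: print the hex SHA-256 digest of the file given as $1.
# Uses sha256sum when present and falls back to shasum (common on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
}

# Hypothetical helper: succeed (exit status 0) only when the digest of the
# file in $1 equals the expected digest in $2.
verify_sha256() {
  [ "$(sha256_of "$1")" = "$2" ]
}

# Intended usage after downloading the archive (not executed here):
#   wget --quiet "${SOLR_URL}"
#   verify_sha256 "${SOLR_FILE}" "${SOLR_FILE_SHA256}" || echo "checksum mismatch" >&2
```

When bumping `SOLR_VERSION`, running the same check locally against the new archive is one way to confirm the digest before committing it.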

54 changes: 25 additions & 29 deletions solr/README.md
@@ -1,10 +1,10 @@
# Solr

Docker image for [Solr] version 8.11.2.
Docker image for [solr] version 9.4.0.

Please refer to the [Solr Documentation] for more in-depth information.

As a quick example this will bring up an instance of [Solr], and allow you
As a quick example this will bring up an instance of [solr], and allow you
to view on <http://localhost:8983/solr/>.

```bash
@@ -19,12 +19,12 @@ additional settings, volumes, ports, etc.

## Settings

| Environment Variable | Default | Description |
| :------------------- | :------ | :----------------------------------------------------------------------------- |
| SOLR_JAVA_OPTS | | Additional parameters to pass to the JVM when starting Solr |
| SOLR_JETTY_OPTS | | Additional parameters to pass to Jetty when starting Solr. |
| SOLR_LOG_LEVEL | INFO | Log level. Possible Values: OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE or ALL |
| SOLR_MEMORY | 512m | Sets the min (-Xms) and max (-Xmx) heap size for the JVM |
| Environment Variable | Default | Description |
| :------------------- | :-------------------------- | :----------------------------------------------------------------------------- |
| SOLR_JAVA_OPTS | | Additional parameters to pass to the JVM when starting Solr |
| SOLR_JETTY_OPTS | `-Dsolr.jetty.host=0.0.0.0` | Additional parameters to pass to Jetty when starting Solr. |
| SOLR_LOG_LEVEL | `INFO` | Log level. Possible Values: OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE or ALL |
| SOLR_MEMORY | `512m` | Sets the min (-Xms) and max (-Xmx) heap size for the JVM |

## Ports

@@ -42,29 +42,25 @@ additional settings, volumes, ports, etc.

- [Solr Logging]

[Solr Documentation]: https://lucene.apache.org/solr/guide/7_1/
[Solr Logging]: https://lucene.apache.org/solr/guide/7_1/configuring-logging.html
[Solr]: https://lucene.apache.org/solr/

## Changing versions
## Updating

There is 2 values you need to update/change the version.

1. Solr version: found at [archive.apache.org](https://archive.apache.org/dist/lucene/solr)
1. SOLR_FILE_SHA256: sha256sum of the tgz file

```dockerfile
ARG SOLR_VERSION="8.11.2"
ARG SOLR_FILE_SHA256="54d6ebd392942f0798a60d50a910e26794b2c344ee97c2d9b50e678a7066d3a6"
```
You can change the version used for [solr] by modifying the build argument
`SOLR_VERSION` and `SOLR_FILE_SHA256` in the `Dockerfile`.

Go to [archive.apache.org](https://archive.apache.org/dist/lucene/solr) and find the version you want. There will be several files but the one to use have the following naming convention.

* solr-${SOLR_VERSION}.tgz

Download the two files and run and replace the _1.1.1_ with the version you have.
Change `SOLR_VERSION` and then generate the `SOLR_FILE_SHA256` with the following
commands:

```bash
# This outputs the value to use for $SOLR_FILE_SHA256.
sha256sum solr-1.1.1.tgz
SOLR_VERSION=$(cat solr/Dockerfile | grep -o 'SOLR_VERSION=.*' | cut -f2 -d=)
SOLR_FILE=$(cat solr/Dockerfile | grep -o 'SOLR_FILE=.*' | cut -f2 -d=)
SOLR_URL=$(cat solr/Dockerfile | grep -o 'SOLR_URL=.*' | cut -f2 -d=)
SOLR_FILE=$(eval "echo $SOLR_FILE")
SOLR_URL=$(eval "echo $SOLR_URL")
wget --quiet "${SOLR_URL}"
shasum -a 256 "${SOLR_FILE}" | cut -f1 -d' '
rm "${SOLR_FILE}"
```

[Solr Documentation]: https://lucene.apache.org/solr/guide/7_1/
[Solr Logging]: https://lucene.apache.org/solr/guide/7_1/configuring-logging.html
[solr]: https://lucene.apache.org/solr/
71 changes: 37 additions & 34 deletions test/rootfs/opt/solr/server/solr/default/conf/schema.xml
@@ -30,7 +30,7 @@
It should be kept correct and concise, usable out-of-the-box.
For more information, on how to customize this file, please see
http://wiki.apache.org/solr/SchemaXml
https://solr.apache.org/guide/solr/latest/indexing-guide/schema-elements.html
PERFORMANCE NOTE: this schema includes many optional features and should not
be used for benchmarking. To improve performance one could
@@ -49,7 +49,7 @@
that avoids logging every request
-->

<schema name="drupal-4.2.9-solr-8.x-0" version="1.6">
<schema name="drupal-4.3.0-solr-9.x-0" version="1.6">
<!-- attribute "name" is the name of this schema and is only used for display purposes.
version="x.y" is Solr's version number for the schema syntax and
semantics. It should not normally be changed by applications.
@@ -156,7 +156,7 @@

<!-- Currently the suggester context filter query (suggest.cfq) accesses the tags using the stored values, neither the indexed terms nor the docValues.
Therefore the dynamicField sm_* isn't suitable at the moment -->
<field name="sm_context_tags" type="string" indexed="true" stored="true" multiValued="true" docValues="false"/>
<field name="sm_context_tags" type="strings" indexed="true" stored="true" docValues="false"/>

<!-- Dynamic field definitions. If a field name is not found, dynamicFields
will be used if the name matches any of the patterns.
@@ -170,56 +170,56 @@
the last letter is 's' for single valued, 'm' for multi-valued -->

<!-- We use plong for integer since 64 bit ints are now common in PHP. -->
<dynamicField name="is_*" type="plong" indexed="true" stored="false" multiValued="false" docValues="true" termVectors="true"/>
<dynamicField name="im_*" type="plong" indexed="true" stored="false" multiValued="true" docValues="true" termVectors="true"/>
<dynamicField name="is_*" type="plong" indexed="true" stored="false" docValues="true" termVectors="true"/>
<dynamicField name="im_*" type="plongs" indexed="true" stored="false" docValues="true" termVectors="true"/>
<!-- List of floats can be saved in a regular float field -->
<dynamicField name="fs_*" type="pfloat" indexed="true" stored="false" multiValued="false" docValues="true"/>
<dynamicField name="fm_*" type="pfloat" indexed="true" stored="false" multiValued="true" docValues="true"/>
<dynamicField name="fs_*" type="pfloat" indexed="true" stored="false" docValues="true"/>
<dynamicField name="fm_*" type="pfloats" indexed="true" stored="false" docValues="true"/>
<!-- List of doubles can be saved in a regular double field -->
<dynamicField name="ps_*" type="pdouble" indexed="true" stored="false" multiValued="false" docValues="true"/>
<dynamicField name="pm_*" type="pdouble" indexed="true" stored="false" multiValued="true" docValues="true"/>
<dynamicField name="ps_*" type="pdouble" indexed="true" stored="false" docValues="true"/>
<dynamicField name="pm_*" type="pdoubles" indexed="true" stored="false" docValues="true"/>
<!-- List of booleans can be saved in a regular boolean field -->
<dynamicField name="bm_*" type="boolean" indexed="true" stored="false" multiValued="true" docValues="true" termVectors="true"/>
<dynamicField name="bs_*" type="boolean" indexed="true" stored="false" multiValued="false" docValues="true" termVectors="true"/>
<dynamicField name="bm_*" type="booleans" indexed="true" stored="false" docValues="true" termVectors="true"/>
<dynamicField name="bs_*" type="boolean" indexed="true" stored="false" docValues="true" termVectors="true"/>
<!-- Regular text (without processing) can be stored in a string field-->
<dynamicField name="ss_*" type="string" indexed="true" stored="false" multiValued="false" docValues="true" termVectors="true"/>
<dynamicField name="ss_*" type="string" indexed="true" stored="false" docValues="true" termVectors="true"/>
<!-- For field types using SORTED_SET, multiple identical entries are collapsed into a single value.
Thus if I insert values 4, 5, 2, 4, 1, my return will be 1, 2, 4, 5 when enabling docValues.
If you need to preserve the order and duplicate entries, consider to store the values as zm_* (twice). -->
<dynamicField name="sm_*" type="string" indexed="true" stored="false" multiValued="true" docValues="true" termVectors="true"/>
<dynamicField name="sm_*" type="strings" indexed="true" stored="false" docValues="true" termVectors="true"/>
<!-- Special-purpose text fields -->
<dynamicField name="tws_*" type="text_ws" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="twm_*" type="text_ws" indexed="true" stored="true" multiValued="true"/>

<dynamicField name="ds_*" type="pdate" indexed="true" stored="false" multiValued="false" docValues="true"/>
<dynamicField name="dm_*" type="pdate" indexed="true" stored="false" multiValued="true" docValues="true"/>
<dynamicField name="ds_*" type="pdate" indexed="true" stored="false" docValues="true"/>
<dynamicField name="dm_*" type="pdates" indexed="true" stored="false" docValues="true"/>
<!-- This field is used to store date ranges -->
<dynamicField name="drs_*" type="date_range" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="drm_*" type="date_range" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="drs_*" type="date_range" indexed="true" stored="true"/>
<dynamicField name="drm_*" type="date_ranges" indexed="true" stored="true"/>
<!-- Trie fields are deprecated. Point fields solve all needs. But we keep the dedicated field names for backward compatibility. -->
<dynamicField name="its_*" type="plong" indexed="true" stored="false" multiValued="false" docValues="true" termVectors="true"/>
<dynamicField name="itm_*" type="plong" indexed="true" stored="false" multiValued="true" docValues="true" termVectors="true"/>
<dynamicField name="fts_*" type="pfloat" indexed="true" stored="false" multiValued="false" docValues="true"/>
<dynamicField name="ftm_*" type="pfloat" indexed="true" stored="false" multiValued="true" docValues="true"/>
<dynamicField name="pts_*" type="pdouble" indexed="true" stored="false" multiValued="false" docValues="true"/>
<dynamicField name="ptm_*" type="pdouble" indexed="true" stored="false" multiValued="true" docValues="true"/>
<dynamicField name="its_*" type="plong" indexed="true" stored="false" docValues="true" termVectors="true"/>
<dynamicField name="itm_*" type="plongs" indexed="true" stored="false" docValues="true" termVectors="true"/>
<dynamicField name="fts_*" type="pfloat" indexed="true" stored="false" docValues="true"/>
<dynamicField name="ftm_*" type="pfloats" indexed="true" stored="false" docValues="true"/>
<dynamicField name="pts_*" type="pdouble" indexed="true" stored="false" docValues="true"/>
<dynamicField name="ptm_*" type="pdoubles" indexed="true" stored="false" docValues="true"/>
<!-- Binary fields can be populated using base64 encoded data. Useful e.g. for embedding
a small image in a search result using the data URI scheme -->
<dynamicField name="xs_*" type="binary" indexed="false" stored="true" multiValued="false"/>
<dynamicField name="xm_*" type="binary" indexed="false" stored="true" multiValued="true"/>
<dynamicField name="xs_*" type="binary" indexed="false" stored="true" multiValued="false"/>
<dynamicField name="xm_*" type="binary" indexed="false" stored="true" multiValued="true"/>
<!-- Trie fields are deprecated. Point fields solve all needs. But we keep the dedicated field names for backward compatibility. -->
<dynamicField name="dds_*" type="pdate" indexed="true" stored="false" multiValued="false" docValues="true"/>
<dynamicField name="ddm_*" type="pdate" indexed="true" stored="false" multiValued="true" docValues="true"/>
<dynamicField name="dds_*" type="pdate" indexed="true" stored="false" docValues="true"/>
<dynamicField name="ddm_*" type="pdates" indexed="true" stored="false" docValues="true"/>
<!-- In case a 32 bit int is really needed, we provide these fields. 'h' is mnemonic for 'half word', i.e. 32 bit on 64 arch -->
<dynamicField name="hs_*" type="pint" indexed="true" stored="false" multiValued="false" docValues="true"/>
<dynamicField name="hm_*" type="pint" indexed="true" stored="false" multiValued="true" docValues="true"/>
<dynamicField name="hs_*" type="pint" indexed="true" stored="false" docValues="true"/>
<dynamicField name="hm_*" type="pints" indexed="true" stored="false" docValues="true"/>
<!-- Trie fields are deprecated. Point fields solve all needs. But we keep the dedicated field names for backward compatibility. -->
<dynamicField name="hts_*" type="pint" indexed="true" stored="false" multiValued="false" docValues="true"/>
<dynamicField name="htm_*" type="pint" indexed="true" stored="false" multiValued="true" docValues="true"/>
<dynamicField name="hts_*" type="pint" indexed="true" stored="false" docValues="true"/>
<dynamicField name="htm_*" type="pints" indexed="true" stored="false" docValues="true"/>

<!-- Unindexed string fields that can be used to store values that won't be searchable -->
<dynamicField name="zs_*" type="string" indexed="false" stored="true" multiValued="false"/>
<dynamicField name="zm_*" type="string" indexed="false" stored="true" multiValued="true"/>
<dynamicField name="zs_*" type="string" indexed="false" stored="true"/>
<dynamicField name="zm_*" type="strings" indexed="false" stored="true"/>

<!-- Fields for location searches.
http://wiki.apache.org/solr/SpatialSearch#geodist_-_The_distance_function -->
@@ -264,9 +264,11 @@
single-valued and either required or have a default value.
-->
<fieldType name="string" class="solr.StrField"/>
<fieldType name="strings" class="solr.StrField" multiValued="true"/>

<!-- boolean type: "true" or "false" -->
<fieldType name="boolean" class="solr.BoolField"/>
<fieldType name="booleans" class="solr.BoolField" multiValued="true"/>

<!-- sortMissingLast and sortMissingFirst attributes are optional attributes are
currently supported on types that are sorted internally as strings
@@ -331,6 +333,7 @@

<!-- A date range field -->
<fieldType name="date_range" class="solr.DateRangeField"/>
<fieldType name="date_ranges" class="solr.DateRangeField" multiValued="true"/>

<!--Binary data type. The data should be sent/retrieved in as Base64 encoded Strings -->
<fieldType name="binary" class="solr.BinaryField"/>
@@ -369,7 +372,7 @@
-->

<!-- A text field that only splits on whitespace for exact matching of words -->
<fieldType name="text_ws" class="solr.TextField" omitNorms="true" positionIncrementGap="100">
<fieldType name="text_ws" class="solr.TextField" omitNorms="true" positionIncrementGap="100" storeOffsetsWithPositions="true">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>