[DOCS] Documents configurable chunking (elastic#115300)
Co-authored-by: David Kyle <[email protected]>
szabosteve and davidkyle committed Oct 25, 2024
1 parent b6c921f commit f7a756d
Showing 15 changed files with 354 additions and 3 deletions.
62 changes: 61 additions & 1 deletion docs/reference/inference/inference-apis.asciidoc
@@ -35,7 +35,6 @@ Elastic –, then create an {infer} endpoint by the <<put-inference-api>>.
Now use <<semantic-search-semantic-text, semantic text>> to perform
<<semantic-search, semantic search>> on your data.


[discrete]
[[default-enpoints]]
=== Default {infer} endpoints
@@ -53,6 +52,67 @@ For these models, the minimum number of allocations is `0`.
If there is no {infer} activity that uses the endpoint, the number of allocations will scale down to `0` automatically after 15 minutes.


[discrete]
[[infer-chunking-config]]
=== Configuring chunking

{infer-cap} endpoints have a limit on the amount of text they can process at once, determined by the model's input capacity.
Chunking is the process of splitting the input text into pieces that remain within these limits.
It occurs when ingesting documents into <<semantic-text,`semantic_text` fields>>.
Chunking also helps produce sections that are digestible for humans.
Returning a long document in search results is less useful than providing the most relevant chunk of text.

Each chunk will include the text subpassage and the corresponding embedding generated from it.

By default, documents are split into sentences and grouped into sections of up to 250 words, with an overlap of one sentence so that each chunk shares a sentence with the previous chunk.
Overlapping ensures continuity and prevents vital contextual information in the input text from being lost by a hard break.

{es} uses the https://unicode-org.github.io/icu-docs/[ICU4J] library to detect word and sentence boundaries for chunking.
https://unicode-org.github.io/icu/userguide/boundaryanalysis/#word-boundary[Word boundaries] are identified by following a series of rules, not just the presence of a whitespace character.
For written languages that do not use whitespace, such as Chinese or Japanese, dictionary lookups are used to detect word boundaries.


[discrete]
==== Chunking strategies

Two strategies are available for chunking: `sentence` and `word`.

The `sentence` strategy splits the input text at sentence boundaries.
Each chunk contains one or more complete sentences, which preserves sentence-level context, except when a sentence causes a chunk to exceed `max_chunk_size` words, in which case the sentence is split across chunks.
The `sentence_overlap` option defines the number of sentences from the previous chunk to include in the current chunk. It can be either `0` or `1`.
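
The grouping described above can be illustrated with a minimal Python sketch. It is not the {es} implementation: the function name `chunk_by_sentence` is hypothetical, sentences are found with a naive regular expression rather than ICU4J rules, words are counted by splitting on whitespace, and only a sentence that is itself longer than `max_chunk_size` is split across chunks.

```python
import re


def chunk_by_sentence(text, max_chunk_size=250, sentence_overlap=1):
    """Group sentences into chunks of at most max_chunk_size words.

    Hypothetical helper for illustration only.
    """
    if sentence_overlap not in (0, 1):
        raise ValueError("sentence_overlap must be 0 or 1")

    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks = []
    current = []  # sentences (or sentence pieces) in the chunk being built
    count = 0     # number of words currently in `current`

    for sentence in sentences:
        words = sentence.split()
        # A sentence longer than the limit is cut into word-bounded pieces.
        pieces = [
            " ".join(words[i:i + max_chunk_size])
            for i in range(0, len(words), max_chunk_size)
        ]
        for piece in pieces:
            n = len(piece.split())
            if current and count + n > max_chunk_size:
                chunks.append(" ".join(current))
                # Carry the last sentence over when overlap is enabled,
                # unless doing so would leave no room for the new piece.
                carry = current[-1:] if sentence_overlap else []
                carry_words = sum(len(s.split()) for s in carry)
                if carry_words + n > max_chunk_size:
                    carry, carry_words = [], 0
                current, count = carry, carry_words
            current.append(piece)
            count += n

    if current:
        chunks.append(" ".join(current))
    return chunks
```

With `sentence_overlap=1`, every chunk after the first starts with the last sentence of the chunk before it, provided that sentence still fits; with `sentence_overlap=0`, the chunks do not share text.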

The `word` strategy splits the input text on individual words up to the `max_chunk_size` limit.
The `overlap` option is the number of words from the previous chunk to include in the current chunk.
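
As a rough illustration of the `word` strategy, the following Python sketch slides a window of `max_chunk_size` words over the text and steps forward by `max_chunk_size - overlap` words each time. The function name `chunk_by_word` is hypothetical, and splitting on whitespace is a simplifying assumption that stands in for ICU4J word-boundary detection.

```python
def chunk_by_word(text, max_chunk_size=250, overlap=100):
    """Split text into word windows that share `overlap` words.

    Hypothetical helper for illustration only.
    """
    if max_chunk_size < 1:
        raise ValueError("max_chunk_size must be positive")
    # The overlap may not exceed half of max_chunk_size.
    if overlap < 0 or overlap * 2 > max_chunk_size:
        raise ValueError("overlap must be between 0 and max_chunk_size / 2")

    words = text.split()  # assumption: whitespace-delimited words
    step = max_chunk_size - overlap
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_chunk_size]))
        if start + max_chunk_size >= len(words):
            break  # the last window already reached the end of the text
        start += step
    return chunks
```

Because the overlap is capped at half of the chunk size, the window always advances by at least one word, so the loop terminates.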

The default chunking strategy is `sentence`.

NOTE: The default chunking strategy for {infer} endpoints created before 8.16 is `word`.


[discrete]
==== Example of configuring the chunking behavior

The following example creates an {infer} endpoint with the `elasticsearch` service that deploys the ELSER model by default and configures the chunking behavior.

[source,console]
------------------------------------------------------------
PUT _inference/sparse_embedding/small_chunk_size
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 100,
    "sentence_overlap": 0
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
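
Before sending a request like the one above, the documented limits for `chunking_settings` can be checked on the client side. The following Python sketch is a hypothetical helper, not part of any {es} client. It only inspects values that are present, assumes the `sentence` strategy when none is given, and encodes the limits stated in the parameter descriptions: a maximum chunk size between `20` (`sentence`) or `10` (`word`) and `300`, a sentence overlap of `0` or `1`, and a word overlap of at most half the maximum chunk size.

```python
def validate_chunking_settings(settings):
    """Return a list of problems found in a chunking_settings dict.

    Hypothetical helper for illustration only; the server remains the
    authority on what it accepts.
    """
    strategy = settings.get("strategy", "sentence")
    if strategy not in ("sentence", "word"):
        return [f"unknown strategy: {strategy!r}"]

    errors = []
    size = settings.get("max_chunk_size", 250)
    minimum = 20 if strategy == "sentence" else 10
    if not minimum <= size <= 300:
        errors.append(
            f"max_chunk_size must be between {minimum} and 300, got {size}"
        )

    if strategy == "sentence" and "sentence_overlap" in settings:
        if settings["sentence_overlap"] not in (0, 1):
            errors.append("sentence_overlap must be 0 or 1")

    if strategy == "word" and "overlap" in settings:
        if settings["overlap"] * 2 > size:
            errors.append("overlap cannot exceed half of max_chunk_size")

    return errors
```

An empty list means that no documented limit is violated, for example for `{"strategy": "sentence", "max_chunk_size": 100, "sentence_overlap": 0}`.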


include::delete-inference.asciidoc[]
include::get-inference.asciidoc[]
include::post-inference.asciidoc[]
34 changes: 33 additions & 1 deletion docs/reference/inference/inference-shared.asciidoc
@@ -31,4 +31,36 @@ end::task-settings[]

tag::task-type[]
The type of the {infer} task that the model will perform.
end::task-type[]

tag::chunking-settings[]
Chunking configuration object.
Refer to <<infer-chunking-config>> to learn more about chunking.
end::chunking-settings[]

tag::chunking-settings-max-chunking-size[]
Specifies the maximum size of a chunk in words.
Defaults to `250`.
This value cannot be higher than `300` or lower than `20` (for `sentence` strategy) or `10` (for `word` strategy).
end::chunking-settings-max-chunking-size[]

tag::chunking-settings-overlap[]
Only for `word` chunking strategy.
Specifies the number of overlapping words for chunks.
Defaults to `100`.
This value cannot be higher than half of `max_chunk_size`.
end::chunking-settings-overlap[]

tag::chunking-settings-sentence-overlap[]
Only for `sentence` chunking strategy.
Specifies the number of overlapping sentences for chunks.
It can be either `1` or `0`.
Defaults to `1`.
end::chunking-settings-sentence-overlap[]

tag::chunking-settings-strategy[]
Specifies the chunking strategy.
It can be either `sentence` or `word`.
end::chunking-settings-strategy[]


21 changes: 20 additions & 1 deletion docs/reference/inference/service-alibabacloud-ai-search.asciidoc
@@ -34,6 +34,26 @@ Available task types:
[[infer-service-alibabacloud-ai-search-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string) The type of service supported for the specified task type.
In this case,
@@ -108,7 +128,6 @@ To modify this, set the `requests_per_minute` setting of this object in your ser
include::inference-shared.asciidoc[tag=request-per-minute-example]
--


`task_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=task-settings]
20 changes: 20 additions & 0 deletions docs/reference/inference/service-amazon-bedrock.asciidoc
@@ -32,6 +32,26 @@ Available task types:
[[infer-service-amazon-bedrock-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string) The type of service supported for the specified task type.
In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-anthropic.asciidoc
@@ -32,6 +32,26 @@ Available task types:
[[infer-service-anthropic-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-azure-ai-studio.asciidoc
@@ -33,6 +33,26 @@ Available task types:
[[infer-service-azure-ai-studio-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-azure-openai.asciidoc
@@ -33,6 +33,26 @@ Available task types:
[[infer-service-azure-openai-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-cohere.asciidoc
@@ -34,6 +34,26 @@ Available task types:
[[infer-service-cohere-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-elasticsearch.asciidoc
@@ -36,6 +36,26 @@ Available task types:
[[infer-service-elasticsearch-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-elser.asciidoc
@@ -36,6 +36,26 @@ Available task types:
[[infer-service-elser-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-google-ai-studio.asciidoc
@@ -33,6 +33,26 @@ Available task types:
[[infer-service-google-ai-studio-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,
20 changes: 20 additions & 0 deletions docs/reference/inference/service-google-vertex-ai.asciidoc
@@ -33,6 +33,26 @@ Available task types:
[[infer-service-google-vertex-ai-api-request-body]]
==== {api-request-body-title}

`chunking_settings`::
(Optional, object)
include::inference-shared.asciidoc[tag=chunking-settings]

`max_chunk_size`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size]

`overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-overlap]

`sentence_overlap`:::
(Optional, integer)
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap]

`strategy`:::
(Optional, string)
include::inference-shared.asciidoc[tag=chunking-settings-strategy]

`service`::
(Required, string)
The type of service supported for the specified task type. In this case,