0.27.0 release notes (#128)
Gallaecio authored Jan 16, 2025
1 parent ca375d0 commit e329c56
Showing 2 changed files with 36 additions and 9 deletions.
14 changes: 14 additions & 0 deletions CHANGELOG.rst
@@ -2,6 +2,20 @@
 Changelog
 =========

+0.27.0 (2025-01-16)
+===================
+
+* The :class:`~zyte_common_items.pipelines.DropLowProbabilityItemPipeline` now
+  supports nested items, i.e. :class:`dict` objects with items as values.
+
+* Added an add-on to make :ref:`Scrapy configuration <scrapy-config>` easier.
+
+* :class:`~zyte_common_items.Metadata` now also has all fields from
+  :class:`~zyte_common_items.SerpMetadata`.
+
+* Messages about dropped items, e.g. due to low probability, are now logged as
+  information and not as warnings.
+
 .. _0.26.2:

 0.26.2 (2024-11-12)
31 changes: 22 additions & 9 deletions zyte_common_items/pipelines.py
@@ -65,19 +65,32 @@ def process_item(self, item, spider):


 class DropLowProbabilityItemPipeline:
-    """This pipeline drops an item if its probability, defined in the settings,
-    is less than the specified threshold.
+    """:ref:`Item pipeline <topics-item-pipeline>` that drops items that have
+    a low probability.

-    By default, 0.1 threshold is used, i.e. items with probabillity < 0.1 are dropped.
+    The :setting:`ITEM_PROBABILITY_THRESHOLDS` setting determines the
+    probability thresholds. By default, items with probability < 0.1 are
+    dropped.

-    You can customize the thresholds by using the ITEM_PROBABILITY_THRESHOLDS setting that offers
-    greater flexibility, allowing you to define thresholds for each Item class separately or
-    set a default threshold for all other item classes.
+    :class:`dict` objects with items as values are supported. For those, the
+    probability of each item is evaluated, and items with a low probability are
+    removed from the :class:`dict`. If the :class:`dict` ends up empty, it is
+    dropped entirely.

-    Thresholds for Item classes can be defined using either the path to the Item class or
-    directly using the Item classes themselves.
+    .. setting:: ITEM_PROBABILITY_THRESHOLDS

-    The example of using ITEM_PROBABILITY_THRESHOLDS:
+    ITEM_PROBABILITY_THRESHOLDS
+    ---------------------------
+
+    Default: ``{"default": 0.1}``
+
+    Allows defining a threshold for each item class and a default threshold
+    for any other item class.
+
+    Thresholds for item classes can be defined using either an import path of
+    the item class or directly using the item class itself.
+
+    For example:

     .. code-block:: python
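The behavior described in the new docstring (per-class thresholds with a default fallback, plus filtering of :class:`dict` outputs) can be sketched in plain Python. This is a minimal standalone illustration, not the code from ``zyte_common_items/pipelines.py``. The ``Product`` and ``Article`` dataclasses, the flat ``probability`` attribute, and the helpers ``import_path``, ``threshold_for``, ``keep`` and ``filter_output`` are all hypothetical names introduced for this example. It also assumes that items without a probability are kept, which the diff does not state.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Union


# Hypothetical stand-ins for item classes (not zyte-common-items classes).
# Assumption: probability is a flat attribute here, purely for brevity.
@dataclass
class Product:
    name: str
    probability: Optional[float] = None


@dataclass
class Article:
    headline: str
    probability: Optional[float] = None


Thresholds = Dict[Union[str, type], float]


def import_path(cls: type) -> str:
    """Return the dotted import path of a class, e.g. 'package.module.Class'."""
    return f"{cls.__module__}.{cls.__qualname__}"


def threshold_for(item: Any, thresholds: Thresholds) -> float:
    """Look up a threshold by class, then by import path, then the default."""
    cls = type(item)
    if cls in thresholds:
        return thresholds[cls]
    path = import_path(cls)
    if path in thresholds:
        return thresholds[path]
    return thresholds.get("default", 0.1)


def keep(item: Any, thresholds: Thresholds) -> bool:
    """Keep an item unless its probability is below its threshold."""
    if item.probability is None:
        return True  # Assumption: no probability means nothing to compare.
    return item.probability >= threshold_for(item, thresholds)


def filter_output(output: Any, thresholds: Thresholds) -> Any:
    """Return the filtered output, or None if it should be dropped."""
    if isinstance(output, dict):
        # Nested case: evaluate each item and remove low-probability ones.
        kept = {key: item for key, item in output.items() if keep(item, thresholds)}
        return kept or None  # An empty dict is dropped entirely.
    return output if keep(output, thresholds) else None


thresholds: Thresholds = {
    Product: 0.5,                # keyed by the item class itself
    import_path(Article): 0.3,   # keyed by the import path of the class
    "default": 0.1,              # used for any other item class
}

nested = {
    "product": Product("Widget", probability=0.9),
    "article": Article("Draft", probability=0.2),
}
result = filter_output(nested, thresholds)
```

Here ``result`` keeps only the ``"product"`` entry, because the article's probability of 0.2 is below its 0.3 threshold. A :class:`dict` whose items all fall below their thresholds yields ``None``, mirroring the "dropped entirely" case from the docstring.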
