Initial support for typing.Annotated, support for extractFrom #141

wRAR · 2023-10-11T19:23:21Z

Requires scrapinghub/andi#25 and scrapinghub/scrapy-poet#169

codecov · 2023-10-11T19:24:58Z

Codecov Report

Merging #141 (412475e) into main (d701f48) will decrease coverage by 0.33%.
The diff coverage is 100.00%.

❗ Current head 412475e differs from pull request most recent head f5bb894. Consider uploading reports for the commit f5bb894 to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #141      +/-   ##
==========================================
- Coverage   98.96%   98.64%   -0.33%     
==========================================
  Files          10       10              
  Lines         776      810      +34     
==========================================
+ Hits          768      799      +31     
- Misses          8       11       +3

Files	Coverage Δ
scrapy_zyte_api/providers.py	`96.87% <100.00%> (-3.13%)`	⬇️

wRAR · 2023-10-16T10:59:22Z

scrapy_zyte_api/providers.py

+                for option in ExtractFrom:
+                    if option in metadata:
+                        product_options = zyte_api_meta.setdefault("productOptions", {})
+                        if "extractFrom" in product_options:


This doesn't check whether the previous value is the same as the new one; maybe we should do that instead of always raising.

Alternatively, we should treat values from different sources with different priorities and use the top one instead of raising this error.
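For illustration, the first option (only raising when the values actually differ) could look roughly like the sketch below. The `ExtractFrom` enum here is a stand-in with assumed members, and `apply_extract_from` is a hypothetical helper, not the code in this PR.

```python
from enum import Enum
from typing import Any, Collection, Dict


class ExtractFrom(str, Enum):
    # Stand-in for the enum iterated in the diff above; members are assumed.
    httpResponseBody = "httpResponseBody"
    browserHtml = "browserHtml"


def apply_extract_from(
    zyte_api_meta: Dict[str, Any], metadata: Collection[Any]
) -> None:
    """Set productOptions.extractFrom, tolerating a repeated identical value."""
    for option in ExtractFrom:
        if option not in metadata:
            continue
        product_options = zyte_api_meta.setdefault("productOptions", {})
        previous = product_options.get("extractFrom")
        # Only a *different* previous value is treated as a conflict.
        if previous is not None and previous != option.value:
            raise ValueError(
                f"Conflicting extractFrom values: {previous!r} and {option.value!r}"
            )
        product_options["extractFrom"] = option.value
```

With this variant, two dependencies that both ask for the same source would pass, while mixing `httpResponseBody` and `browserHtml` would still fail.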

wRAR · 2023-10-16T10:59:59Z

tests/test_providers.py

+    @attrs.define
+    class AnnotatedProductPage(BasePage):
+        product: Annotated[Product, ExtractFrom.httpResponseBody]
+        product2: Annotated[Product, ExtractFrom.httpResponseBody]


This doesn't raise a double-extractFrom exception because there is only one unique dep.
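A small sketch of why both fields collapse into one dependency: two `Annotated` aliases with the same origin and the same metadata compare equal and hash equally, so any set- or dict-based collection of dependencies sees a single entry. `Product` and `ExtractFrom` below are minimal stand-ins, and the sketch assumes Python 3.9+ for `typing.Annotated`.

```python
from enum import Enum
from typing import Annotated


class ExtractFrom(str, Enum):
    # Stand-in enum with one assumed member.
    httpResponseBody = "httpResponseBody"


class Product:
    # Stand-in for the real item class.
    pass


first = Annotated[Product, ExtractFrom.httpResponseBody]
second = Annotated[Product, ExtractFrom.httpResponseBody]

# Equal aliases hash the same, so a set keeps only one of them.
unique_deps = {first, second}
```

Here `first == second` holds and `unique_deps` has one element, which matches the behaviour described in the comment above.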

wRAR · 2023-11-14T15:22:09Z

tests/mockserver.py

@@ -117,6 +117,14 @@ def render_POST(self, request):
                "price": "10",
                "currency": "USD",
            }
+            extract_from = request_data.get("productOptions", {}).get("extractFrom")
+            if extract_from:
+                from scrapy_zyte_api.providers import ExtractFrom


Alternatively, ExtractFrom can be moved to a different module (importing providers needs additional deps).

wRAR · 2023-11-14T15:24:35Z

tests/mockserver.py

+                    assert isinstance(response_data["product"], dict)
+                    assert isinstance(response_data["product"]["name"], str)


These are needed because of how response_data is typed; otherwise mypy thinks the values could be anything supported by JSON.
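A minimal sketch of the narrowing pattern, assuming `response_data` is typed as a dict of generic JSON values. The `JsonValue` alias and the `rename_product` helper are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Union

# Hypothetical alias for "anything supported by JSON".
JsonValue = Union[
    str, int, float, bool, None, List["JsonValue"], Dict[str, "JsonValue"]
]


def rename_product(response_data: Dict[str, JsonValue], suffix: str) -> None:
    product = response_data["product"]
    # Without this assert a type checker only knows `product` is a JsonValue,
    # so indexing it by key would be rejected.
    assert isinstance(product, dict)
    name = product["name"]
    # Same idea: narrow the nested value to str before concatenating.
    assert isinstance(name, str)
    product["name"] = name + suffix
```

The asserts double as runtime checks in the mock server, so a malformed fixture fails loudly instead of producing a confusing error later.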

wRAR · 2023-11-14T15:27:21Z

tests/test_providers.py

+        product2: Annotated[Product, ExtractFrom.httpResponseBody]
+
+    class AnnotatedZyteAPISpider(ZyteAPISpider):
+        def parse_(self, response: DummyResponse, page: AnnotatedProductPage):  # type: ignore[override]


error: Argument 2 of "parse_" is incompatible with supertype "ZyteAPISpider"; supertype defines the argument type as "ProductPage" [override]
note: This violates the Liskov substitution principle
note: See https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides

I think this is a general problem with scrapy-poet-annotated callbacks. It will get worse when we release typed Scrapy, as Scrapy.parse has Response but we often use DummyResponse.

Could you please open an issue for this in scrapy-poet?

I will, but I'm not sure what to do with it :)

scrapinghub/scrapy-poet#179
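For reference, the mypy complaint can be reproduced without Scrapy at all. In the sketch below every class is a hypothetical stand-in: the subclass changes the parameter type of an overridden method to an unrelated class, which mypy reports as an incompatible override unless it is silenced.

```python
class ProductPage:
    pass


class AnnotatedProductPage:
    # Deliberately not a subclass of ProductPage.
    pass


class BaseSpider:
    def parse_(self, page: ProductPage) -> str:
        return type(page).__name__


class AnnotatedSpider(BaseSpider):
    # Method parameters are checked contravariantly, so accepting a type that
    # is not a supertype of ProductPage violates the base class signature.
    def parse_(self, page: AnnotatedProductPage) -> str:  # type: ignore[override]
        return type(page).__name__
```

The code runs fine at runtime; the ignore comment only affects static checking, which is why the test suite can keep using it until the linked issue is resolved.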

wRAR · 2023-12-12T14:55:43Z

So the last question is where we should define ExtractFrom. Do we want to expose it in the top-level __init__.py? If so, we shouldn't put it in scrapy_zyte_api.providers, as importing that module needs additional deps.

Gallaecio · 2023-12-12T15:07:17Z

What about a scrapy_zyte_api._annotations module?

Initial support for typing.Annotated.

2cabbe0

wRAR added 2 commits October 13, 2023 19:04

Switch to scrapy_poet.AnnotatedResult, improve the test.

ab34b93

Forbid multiple extractFrom.

ecc42d4

wRAR added 5 commits November 14, 2023 17:31

Merge remote-tracking branch 'origin/main' into annotated-support

87bd23a

Fix removing *Options.

4b5c088

Fixes and improvements.

b63ad66

Fix CI issues.

15fc06e

More fixes.

08a5f48

wRAR added 2 commits November 14, 2023 20:19

Install more deps for the mypy test.

e0ba154

Fix an old typing issue.

2d936fd

wRAR changed the title ~~Initial support for typing.Annotated.~~ Initial support for typing.Annotated, support for extractFrom Dec 11, 2023

kmike approved these changes Dec 11, 2023

wRAR added 5 commits December 12, 2023 15:55

Merge remote-tracking branch 'origin/main' into annotated-support

cad9c3a

Bump the andi version.

e9be1a3

Fix and improve extractFrom tests.

3ee24f0

Roll back capture_exceptions.

2b0d659

Bump the scrapy-poet version.

de01fc2

Bump the web-poet version.

412475e

wRAR closed this Dec 12, 2023

wRAR reopened this Dec 12, 2023

wRAR added 2 commits December 12, 2023 19:27

Move ExtractFrom into scrapy_zyte_api/_annotations.py.

5305b46

Add docs for ExtractFrom.

f5bb894

Gallaecio approved these changes Dec 12, 2023

wRAR merged commit 11a9b03 into main Dec 12, 2023
18 checks passed

wRAR deleted the annotated-support branch December 12, 2023 17:05
