               .
              ..:
            Hultnér
          Technologies

@ahultner | https://hultner.se/

Give your dataclasses super powers with pydantic

Index

  • Quick refresher on python data classes
  • Pydantic introduction
    • Prior art
    • Minimal example from dataclass
    • Runtime type-checking
    • JSON (de)serialisation
    • JSONSchema
    • Validators
    • FastAPI framework
      • OpenAPI Specifications
      • Autogenerated tests
  • Cool features worth mentioning
  • Future
  • Conclusion

Let's start with a quick @dataclass-refresher.

We'll use pizza-based examples in the spirit of python.pizza 🐍🍕

from dataclasses import dataclass
from typing import Tuple
@dataclass
class Pizza:
    style: str
    toppings: Tuple[str, ...]
    
Pizza(1, ("cheese", "ham"))
Pizza(style=1, toppings=('cheese', 'ham'))

Now we may want to constrain the toppings to ones we actually offer. Our pizzeria doesn't offer pineapple 🚫🍍 as a valid topping, hate it 😡 or love it 💕

from enum import Enum
class Topping(str, Enum):
    mozzarella = 'mozzarella'
    tomato_sauce = 'tomato sauce'
    prosciutto = 'prosciutto'
    basil = 'basil'
    rucola = 'rucola'
    

@dataclass
class Pizza:
    style: str
    toppings: Tuple[Topping, ...]

Let's see what happens if we try to create a pizza with pineapple 🍍 topping.

Pizza(2, ("pineapple", 24))
Pizza(style=2, toppings=('pineapple', 24))

With dataclasses the types aren't enforced. This can of course be implemented by hand, but in this case we'll stand on the shoulders of a giant: pydantic 🧹🐍🧐
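For comparison, here is a rough sketch of what enforcing the types by hand could look like with a plain standard library dataclass. The class name CheckedPizza and the shortened Topping enum are made up for this illustration, and the checks are only one possible way of doing it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Topping(str, Enum):
    mozzarella = "mozzarella"
    tomato_sauce = "tomato sauce"


@dataclass
class CheckedPizza:  # hypothetical name, used only in this sketch
    style: str
    toppings: Tuple[Topping, ...]

    def __post_init__(self):
        # Hand-written checks: every new field needs its own code here.
        if not isinstance(self.style, str):
            raise TypeError(f"style must be a str, got {type(self.style).__name__}")
        # Topping(value) raises ValueError for anything that is not a member.
        self.toppings = tuple(Topping(t) for t in self.toppings)


try:
    CheckedPizza(2, ("pineapple", 24))
except (TypeError, ValueError) as err:
    print(err)
```

This works, but the checks have to be repeated for every field and every class, which is exactly the chore we want a library to take over.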

from pydantic.dataclasses import dataclass

    
@dataclass
class Pizza:
    style: str
    toppings: Tuple[Topping, ...]

As you can see, the only change in this example is that we import the dataclass decorator from pydantic.

from pydantic import ValidationError
try:
    Pizza(2, ("pineapple", 24))
except ValidationError as err:
    print(err)
2 validation errors for Pizza
toppings -> 0
  value is not a valid enumeration member; permitted: 'mozzarella', 'tomato sauce', 'prosciutto', 'basil', 'rucola' (type=type_error.enum; enum_values=[<Topping.mozzarella: 'mozzarella'>, <Topping.tomato_sauce: 'tomato sauce'>, <Topping.prosciutto: 'prosciutto'>, <Topping.basil: 'basil'>, <Topping.rucola: 'rucola'>])
toppings -> 1
  value is not a valid enumeration member; permitted: 'mozzarella', 'tomato sauce', 'prosciutto', 'basil', 'rucola' (type=type_error.enum; enum_values=[<Topping.mozzarella: 'mozzarella'>, <Topping.tomato_sauce: 'tomato sauce'>, <Topping.prosciutto: 'prosciutto'>, <Topping.basil: 'basil'>, <Topping.rucola: 'rucola'>])

And with that simple change we can see that creating an invalid pizza now raises errors 🚫🚨

Additionally, these errors are very readable!

So let's try to create a valid pizza 🍕✅

Pizza("Napoli", (Topping.tomato_sauce, Topping.prosciutto, Topping.mozzarella, Topping.basil))
Pizza(style='Napoli', toppings=(<Topping.tomato_sauce: 'tomato sauce'>, <Topping.prosciutto: 'prosciutto'>, <Topping.mozzarella: 'mozzarella'>, <Topping.basil: 'basil'>))

So what about JSON? 🧑‍💻
The drop-in replacement dataclass decorator from pydantic is great for compatibility, but by using pydantic.BaseModel we can get even more out of pydantic. One of those things is (de)serialisation: pydantic has native support for JSON encoding and decoding.

from pydantic import BaseModel



class Pizza(BaseModel):
    style: str
    toppings: Tuple[Topping, ...]

Disclaimer: Pydantic is primarily a parsing library and does validation as a means to an end, so make sure it makes sense for you.

With BaseModel, the default behaviour requires us to pass the init arguments by keyword, as shown below.

Pizza(style="Napoli", toppings=(Topping.tomato_sauce, Topping.prosciutto, Topping.mozzarella, Topping.basil))
Pizza(style='Napoli', toppings=(<Topping.tomato_sauce: 'tomato sauce'>, <Topping.prosciutto: 'prosciutto'>, <Topping.mozzarella: 'mozzarella'>, <Topping.basil: 'basil'>))

We can now easily encode this object as JSON. There's also built-in support for dict, pickle, and immutable copy(). Pydantic will also (de)serialise subclasses.

_.json()
'{"style": "Napoli", "toppings": ["tomato sauce", "prosciutto", "mozzarella", "basil"]}'

And we can also reconstruct our original object using the parse_raw method.

Pizza.parse_raw('{"style": "Napoli", "toppings": ["tomato sauce", "prosciutto", "mozzarella", "basil"]}')
Pizza(style='Napoli', toppings=(<Topping.tomato_sauce: 'tomato sauce'>, <Topping.prosciutto: 'prosciutto'>, <Topping.mozzarella: 'mozzarella'>, <Topping.basil: 'basil'>))
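The dict and copy() helpers mentioned earlier follow the same pattern as the JSON tools. The snippet below is a small sketch assuming the pydantic 1.x method names used throughout this article (newer releases may rename them); the Topping enum is shortened and the variable names are made up for the example.

```python
from enum import Enum
from typing import Tuple

from pydantic import BaseModel


class Topping(str, Enum):
    tomato_sauce = "tomato sauce"
    mozzarella = "mozzarella"
    basil = "basil"


class Pizza(BaseModel):
    style: str
    toppings: Tuple[Topping, ...]


margherita = Pizza(style="Napoli", toppings=("tomato sauce", "mozzarella"))

# dict() exports the fields as a plain dictionary
as_dict = margherita.dict()

# copy(update=...) returns a new object and leaves the original untouched.
# Note that values passed via update are not validated.
with_basil = margherita.copy(
    update={"toppings": margherita.toppings + (Topping.basil,)}
)

print(as_dict["style"])
print(len(margherita.toppings), len(with_basil.toppings))
```

Since update skips validation, it is best used with values that already went through the model once.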

Invalid input raises a ValidationError, which can also be represented as JSON.

try:
    Pizza(style="Napoli", toppings=(2,))
except ValidationError as err:
    print(err.json())
[
  {
    "loc": [
      "toppings",
      0
    ],
    "msg": "value is not a valid enumeration member; permitted: 'mozzarella', 'tomato sauce', 'prosciutto', 'basil', 'rucola'",
    "type": "type_error.enum",
    "ctx": {
      "enum_values": [
        "mozzarella",
        "tomato sauce",
        "prosciutto",
        "basil",
        "rucola"
      ]
    }
  }
]
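If you want to work with the errors in Python rather than as a JSON string, ValidationError.errors() gives you the same information as a list of dicts. Below is a minimal sketch of turning that list into short messages; describe_problems is a hypothetical helper written for this example and the Topping enum is shortened.

```python
from enum import Enum
from typing import Tuple

from pydantic import BaseModel, ValidationError


class Topping(str, Enum):
    tomato_sauce = "tomato sauce"
    mozzarella = "mozzarella"


class Pizza(BaseModel):
    style: str
    toppings: Tuple[Topping, ...]


def describe_problems(data: dict) -> list:  # hypothetical helper
    """Return one 'field path: message' string per validation error."""
    try:
        Pizza(**data)
    except ValidationError as err:
        # errors() is the list-of-dicts counterpart of err.json()
        return [
            " -> ".join(str(part) for part in problem["loc"]) + ": " + problem["msg"]
            for problem in err.errors()
        ]
    return []


for line in describe_problems({"style": "Napoli", "toppings": ("pineapple",)}):
    print(line)
```

The exact wording of msg depends on the pydantic version, so it is safer to branch on loc or type than on the message text.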

We can also export a JSONSchema directly from our model. This is very useful, for instance, if we want to use our model to feed a Swagger/OpenAPI spec. 📜✅

Caution: Pydantic uses the latest draft 7 of JSONSchema. This will be used in the upcoming OpenAPI 3.1 spec, but the current 3.0.x spec uses draft 4. I spoke with Samuel Colvin, the creator of pydantic, about this and his recommendation is to write a schema_extra function to use the older JSONSchema version if you want strict compatibility. The FastAPI framework doesn't do this and is slightly incompatible with the current OpenAPI spec.

Pizza.schema()
{'title': 'Pizza',
 'type': 'object',
 'properties': {'style': {'title': 'Style', 'type': 'string'},
  'toppings': {'title': 'Toppings',
   'type': 'array',
   'items': {'enum': ['mozzarella',
     'tomato sauce',
     'prosciutto',
     'basil',
     'rucola'],
    'type': 'string'}}},
 'required': ['style', 'toppings']}

Those were the basics using the built-in validators. But what if you want to implement your own business rules in a custom validator? We're going to look at this next.

We now want to add a new property for oven_temperature, but in our case we also want to ensure that we are close to the ideal of roughly 375°C for Neapolitan pizzas, which is our imaginary restaurant's house style.

from pydantic import validator, root_validator
class BakedPizza(Pizza):
    # For simplicity in the example we use int for temperature
    oven_temperature: int
        
    # A validator looking at a single property
    @validator('style')
    def check_style(cls, style):
        house_styles = ("Napoli", "Roman", "Italian")
        if style not in house_styles:
            raise ValueError(f"We only cook the following styles: {house_styles}, given: {style}")
        return style
    
    # Root validators check the entire model
    @root_validator
    def check_temp(cls, values):
        style, temp = values.get("style"), values.get("oven_temperature")
        
        if style != "Napoli":
            # We don't have any special rules yet for the other styles
            return values

        if 350 <= temp <= 400: 
            # Target temperature 350 - 400°C, ideally around 375°C
            return values

        raise ValueError(f"Napoli pizzas require a oven_temperature in the range of 350 - 400°C, given: {temp}°C")

Now let's see what happens if we try to create some invalid pizzas ⚠️🚨

try: 
    BakedPizza(style="Panpizza", toppings=["tomato sauce"], oven_temperature=250 )
except ValidationError as err:
    print(err)
1 validation error for BakedPizza
style
  We only cook the following styles: ('Napoli', 'Roman', 'Italian'), given: Panpizza (type=value_error)

try: 
    BakedPizza(style="Napoli", toppings=["tomato sauce"], oven_temperature=300 )
except ValidationError as err:
    print(err)
1 validation error for BakedPizza
__root__
  Napoli pizzas require a oven_temperature in the range of 350 - 400°C, given: 300°C (type=value_error)

Now let's create a pizza 🍕 allowed by our rules! ✨

BakedPizza(style="Napoli", toppings=["tomato sauce"], oven_temperature=350)
BakedPizza(style='Napoli', toppings=(<Topping.tomato_sauce: 'tomato sauce'>,), oven_temperature=350)

Gosh, these runtime type checkers are rather useful, but what about functions?

Pydantic has you covered with @validate_arguments. It is still in beta and the API may change; it was released on 2020-04-18 in version 1.5.

from pydantic import validate_arguments

# Validator on function
# Ensure that we use a valid pizza when making orders
@validate_arguments
def make_order(pizza: Pizza):
    ...
    
try:
    make_order({
        "style":"Napoli",
        "toppings":("tomato sauce", "mozzarella", "prosciutto", "pineapple")
    })
except ValidationError as err:
    print(err)
1 validation error for MakeOrder
pizza -> toppings -> 3
  value is not a valid enumeration member; permitted: 'mozzarella', 'tomato sauce', 'prosciutto', 'basil', 'rucola' (type=type_error.enum; enum_values=[<Topping.mozzarella: 'mozzarella'>, <Topping.tomato_sauce: 'tomato sauce'>, <Topping.prosciutto: 'prosciutto'>, <Topping.basil: 'basil'>, <Topping.rucola: 'rucola'>])

FastAPI

FastAPI is a lean microframework similar to Flask which makes heavy use of pydantic models. It will also automatically generate OpenAPI specifications from your application based on your models.

This gives you framework-agnostic models while still letting you leverage tight integration with a modern and easy-to-use framework. If you're going to start a new API project, I highly recommend trying FastAPI.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

def make_order(pizza: Pizza):
    # Business logic for making an order
    pass

def dispatch_order(pizza: BakedPizza):
    # Hand over pizza to delivery company
    pass

# Deliver a baked pizza
@app.post("/delivery/pizza")
async def deliver_pizza_order(pizza: BakedPizza):
    dispatch = dispatch_order(pizza)
    return dispatch

@app.post("/order/pizza")
async def order_pizza(pizza: Pizza):
    order = make_order(pizza)
    return order

This is everything we need to create a small API around our models.


That's it, a quick introduction to pydantic!

But this is just the tip of the iceberg 🗻 and I want to give you a hint about what more can be done.
I'm not going to go into detail on any of this, but feel free to ask me about it in the chat, on Twitter/LinkedIn or via email. 💬📨

Cool features worth mentioning

  • Post 1.0, reached this milestone about a year ago
  • Support for standard library types
  • Offers useful extra types for everyday use
    • Email
    • HttpUrl (and more, stricturl for custom validation)
    • PostgresDsn
    • IPvAnyAddress (as well as IPv4Address and IPv6Address from ipaddress)
    • PositiveInt
    • PaymentCardNumber, PaymentCardBrand.[amex, mastercard, visa, other], checks Luhn, str of digits and BIN-based length.
    • Constrained types (e.g. conlist, conint, etc.)
    • and more…
  • Supports custom datatypes
  • Settings management
    • Typed configuration management
    • Automatically reads from environment variables
    • Dotenv (.env) support via the de facto standard python-dotenv.
  • ORM-mode
  • Recursive models
  • Works with mypy out of the box, mypy plugin further improves experience.
  • Postponed annotations, self-referencing models, PEP-563-style.
  • python-devtools integration
  • PyCharm plugin
  • Fast compared to popular alternatives!
    But always make your own benchmarks for your own use case if performance is important to you.

Future

  • A strict mode is being worked on. In the future this will let us choose between strict validation and coercion at the model level instead of relying on the Strict* types.
  • The project is very active and a lot of improvements are constantly being made to the library.
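Until a model-level strict mode exists, the difference between coercion and the Strict* types can be seen field by field. A short sketch, with LenientOven and StrictOven as made-up model names:

```python
from pydantic import BaseModel, StrictInt, ValidationError


class LenientOven(BaseModel):  # hypothetical models for illustration
    temperature: int


class StrictOven(BaseModel):
    temperature: StrictInt


# A plain int field coerces a numeric string
print(LenientOven(temperature="375").temperature)

try:
    # StrictInt refuses anything that is not already an int
    StrictOven(temperature="375")
except ValidationError as err:
    print(err.errors()[0]["loc"])
```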

Conclusion

  • Pure python syntax
  • Better validation
  • Very useful JSON-tools for APIs
  • Easy to migrate from dataclasses
  • Lots of useful features
  • Try it out!

Want to hear more from me?

I'm making a course on property based testing in python using Hypothesis.
Sign up here