pydantic for validation #71

robertodr · 2019-04-08T12:48:12Z

There exists this neat library https://pydantic-docs.helpmanual.io that could help us with validation. At a glance, the advantage seems to be that defining new data types is equivalent to adding a class with decorated methods. This could certainly help in making parselglossy customizable and possibly to solve #51. The disadvantage is that this requires Python 3.6+

robertodr · 2019-04-28T13:50:48Z

I've looked at this and I don't think it's something we can use, unless we bend over backwards.

What pydantic does

pydantic can be used to validate data based on a model In principle, this is exactly what we want to do, but we have additional design requirements that, I think are hard to fit in the pydantic way of doing things. This is an example copy-pasted from their documentation:

from pydantic import BaseModel, ValidationError, validator


class UserModel(BaseModel):
    name: str
    password1: str
    password2: str

    @validator('name')
    def name_must_contain_space(cls, v):
        if ' ' not in v:
            raise ValueError('must contain a space')
        return v.title()

    @validator('password2')
    def passwords_match(cls, v, values, **kwargs):
        if 'password1' in values and v != values['password1']:
            raise ValueError('passwords do not match')
        return v

here I have a model for our User class. Doing print(UserModel(name='samuel colvin', password1='zxcvbn', password2='zxcvbn')) will check types and run the validator methods (decorated methods of the class). Default values can be given to the class attributes. All very nifty!

How I intended to use pydantic

I intended to transform parselglossy into a code generator:

parselglossy generate --grammar=getkw --template=foo.yml --docs=input.rst

this command would give:

the input.rst documentation file
a handful of Python files to put under source control in the client. This would include a copy of pyparsing.py, a copy of the chosen grammar source file, a translation of the templating YAML into source code with validation models for the various keywords.

The validation models would have been generated à la pydantic, see example above. I think having a generator is better:

No need to install parselglossy for users of the client code.
The client code can decide how to distribute the input parsing facilities and how to modify them, (almost) independently of how parselglossy evolves.
Translating the template YAML into code makes the whole approach less fragile, I think. (What if the client code forgets to install the YAML? What if it gets accidentally deleted?)
The way things are now, the template.yml gets validated every time we parse input. In this way it's validated once and for all, skipping one unnecessary step (Yes, this is a micro-optimization and not an argument with any weight)

Why we can't use pydantic

We have two powerful mechanisms for defaulting and validation that I do not want to get rid of:

Arbitrary Python code to compute defaults based on values of other keywords in the input tree.
Arbitrary Python code to check keywords based on other keywords on the input tree.

The need to have the input tree available when computing defaults and checking predicates, effectively bars us from using pydantic. I can see no way to get the tree as an additional parameter

Conclusions

a. I am pretty much convinced that we need to transform this tool into a code generator, for the reasons given above.
b. pydantic would be good to get type checking done. It will impose an additional dependency, without effectively adding much value.

bast · 2019-04-29T06:18:50Z

Thank you! Both conclusions are convincing to me.

robertodr self-assigned this Apr 8, 2019

robertodr closed this as completed Jul 18, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pydantic for validation #71

pydantic for validation #71

robertodr commented Apr 8, 2019

robertodr commented Apr 28, 2019

bast commented Apr 29, 2019

pydantic for validation #71

pydantic for validation #71

Comments

robertodr commented Apr 8, 2019

robertodr commented Apr 28, 2019

What pydantic does

How I intended to use pydantic

Why we can't use pydantic

Conclusions

bast commented Apr 29, 2019