Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pydantic for validation #71

Closed
robertodr opened this issue Apr 8, 2019 · 2 comments
Closed

pydantic for validation #71

robertodr opened this issue Apr 8, 2019 · 2 comments
Assignees

Comments

@robertodr
Copy link
Contributor

There exists this neat library https://pydantic-docs.helpmanual.io that could help us with validation. At a glance, the advantage seems to be that defining new data types is equivalent to adding a class with decorated methods. This could certainly help in making parselglossy customizable and possibly to solve #51. The disadvantage is that this requires Python 3.6+

@robertodr robertodr self-assigned this Apr 8, 2019
@robertodr
Copy link
Contributor Author

I've looked at this and I don't think it's something we can use, unless we bend over backwards.

What pydantic does

pydantic can be used to validate data based on a model In principle, this is exactly what we want to do, but we have additional design requirements that, I think are hard to fit in the pydantic way of doing things. This is an example copy-pasted from their documentation:

from pydantic import BaseModel, ValidationError, validator


class UserModel(BaseModel):
    name: str
    password1: str
    password2: str

    @validator('name')
    def name_must_contain_space(cls, v):
        if ' ' not in v:
            raise ValueError('must contain a space')
        return v.title()

    @validator('password2')
    def passwords_match(cls, v, values, **kwargs):
        if 'password1' in values and v != values['password1']:
            raise ValueError('passwords do not match')
        return v

here I have a model for our User class. Doing print(UserModel(name='samuel colvin', password1='zxcvbn', password2='zxcvbn')) will check types and run the validator methods (decorated methods of the class). Default values can be given to the class attributes. All very nifty!

How I intended to use pydantic

I intended to transform parselglossy into a code generator:

parselglossy generate --grammar=getkw --template=foo.yml --docs=input.rst

this command would give:

  • the input.rst documentation file
  • a handful of Python files to put under source control in the client. This would include a copy of pyparsing.py, a copy of the chosen grammar source file, a translation of the templating YAML into source code with validation models for the various keywords.

The validation models would have been generated à la pydantic, see example above. I think having a generator is better:

  • No need to install parselglossy for users of the client code.
  • The client code can decide how to distribute the input parsing facilities and how to modify them, (almost) independently of how parselglossy evolves.
  • Translating the template YAML into code makes the whole approach less fragile, I think. (What if the client code forgets to install the YAML? What if it gets accidentally deleted?)
  • The way things are now, the template.yml gets validated every time we parse input. In this way it's validated once and for all, skipping one unnecessary step (Yes, this is a micro-optimization and not an argument with any weight)

Why we can't use pydantic

We have two powerful mechanisms for defaulting and validation that I do not want to get rid of:

  • Arbitrary Python code to compute defaults based on values of other keywords in the input tree.
  • Arbitrary Python code to check keywords based on other keywords on the input tree.

The need to have the input tree available when computing defaults and checking predicates, effectively bars us from using pydantic. I can see no way to get the tree as an additional parameter

Conclusions

a. I am pretty much convinced that we need to transform this tool into a code generator, for the reasons given above.
b. pydantic would be good to get type checking done. It will impose an additional dependency, without effectively adding much value.

@bast
Copy link
Member

bast commented Apr 29, 2019

Thank you! Both conclusions are convincing to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants