Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add required field to generated ToSchema instances #137

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

rashadg1030
Copy link
Collaborator

@rashadg1030 rashadg1030 commented Aug 4, 2020

This is my second attempt to have the required fields specified in the generated swagger.json file.

@rashadg1030 rashadg1030 self-assigned this Aug 4, 2020
@rashadg1030
Copy link
Collaborator Author

I'm now going to point my version of end-to-end towards this commit and test the code generation.

@rashadg1030 rashadg1030 changed the title Add required field to genrated ToSchema instances Add required field to generated ToSchema instances Aug 5, 2020
@rashadg1030
Copy link
Collaborator Author

I found doctests in proto3-suite that tests the code that generates the swagger.json file. I've been using these to test the changes I'm making to Generate.hs. All of them pass except this one doctest >>> schemaOf @SomethingPickOne.

@Gabriella439
Copy link
Contributor

Gabriella439 commented Aug 5, 2020

@rashadg1030: Yeah, I saw that. Is it possibly related to the fact that you still have some places in the code that use toSchemaInstanceDeclarationOld?

@rashadg1030
Copy link
Collaborator Author

rashadg1030 commented Aug 5, 2020

@Gabriel439 Thank you, yes it is related. I think I see what the problem is now. The reason I still use the old version there is because I wasn't sure how to pass in the messageParts. By that point I only have fieldNames and the messageParts information is lost. I'm trying to figure out how to pass it down to where the old version is used, so I can use the new version.

@Gabriella439
Copy link
Contributor

Assigning to @lambdadog to review so that we get a fresh pair of eyes on this

@Gabriella439 Gabriella439 requested a review from lambdadog August 11, 2020 00:50
@rashadg1030
Copy link
Collaborator Author

rashadg1030 commented Aug 12, 2020

@lambdadog The most recent changes I've made to proto3-suite were to fix this issue here: https://github.com/awakesecurity-dev/end-to-end/pull/13866#issuecomment-671613298. After trying different things, I believe the line of code I added to Generate.hs should make enum fields required in the generated swagger schema. Tests are passing here, but I wasn't able to test this change on the .proto files in end-to-end to see if it works. I'm currently working on getting end-to-end to point to the most recent version of this branch, and run the codegen to see if the changes I made work.

Copy link
Collaborator Author

@rashadg1030 rashadg1030 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doctest has been fixed

tests/TestCodeGen.hs Outdated Show resolved Hide resolved
@rashadg1030
Copy link
Collaborator Author

rashadg1030 commented Aug 12, 2020

Okay so after seeing the swagger.json output from end-to-end, it is incorrect. Here's why:

let isRequired :: DotProtoField -> Bool = \case
  DotProtoField _ (Prim (Named (Dots (Path ("google" :| ["protobuf", _]))))) _ _ _ -> False
  DotProtoField _ (Prim (Named (Single _))) _ _ _ -> True -- This line makes enums and message required.
  DotProtoField _ (Prim (Named _)) _ _ _ -> False                                 
  DotProtoField _ _ _ _ _ -> True
  DotProtoEmptyField -> False

The second pattern match matches on enums and messages, when we only want it to match on enums. As far as I can see, there isn't a way to tell if a DotProtoField is enum type or message type. Both enum and message DotProtoFields have Named identifier as the DotProtoPrimType. You can't tell if it is enum or message based on the identifier. That's why in the second case statement I just matched on Named (Single _), but it makes both message and enum fields required.

Here's my solution to fix the problem:

Change DotProtoPrimType from:

data DotProtoPrimType
  = Int32
  | Int64
  | SInt32
  | SInt64
  | UInt32
  | UInt64
  | Fixed32
  | Fixed64
  | SFixed32
  | SFixed64
  | String
  | Bytes
  | Bool
  | Float
  | Double
  | Named DotProtoIdentifier
  -- ^ A named type, referring to another message or enum defined in the same file
  deriving (Show, Eq)

to ->:

data DotProtoPrimType
  = Int32
  | Int64
  | SInt32
  | SInt64
  | UInt32
  | UInt64
  | Fixed32
  | Fixed64
  | SFixed32
  | SFixed64
  | String
  | Bytes
  | Bool
  | Float
  | Double
  | Message DotProtoIdentifier
  | Enum DotProtoIndentifier
  -- ^ Two separate constructors to differentiate between a message and enum
  deriving (Show, Eq)

This way we can pattern match better and affect enums only. This could be a widespread change because I have to fix the parsing I believe and I'm adding another constructor to match on, so I want to make sure there isn't an easier way to do this before I do it.

Edit: Wait nevermind I think I can match on the identifier. If it's enum it should be "enum". I was confused and thought the identifier was referring to the name of the field.

Edit2: My first edit is wrong.

@rashadg1030
Copy link
Collaborator Author

rashadg1030 commented Aug 12, 2020

I think it's a good idea to add a test or two for testing the output of proto3-suite for enums.

@Gabriella439
Copy link
Contributor

@rashadg1030: I think it should be possible to pattern match on just oneofs without changing DotProtoPrimType. As a general rule of thumb we want the data structures for the AST to match the .proto file format as closely as possible

@Gabriella439
Copy link
Contributor

@rashadg1030: Also, you might not need to make enums required anyway

@rashadg1030
Copy link
Collaborator Author

rashadg1030 commented Aug 13, 2020

Okay I think I know what I need to do. I need to lookup up the field in the TypeContext and get the DotProtoTypeInfo. There is field inside DotProtoTypeInfo that has type DotProtoKind. DotProtoKind is defined as:

data DotProtoKind = DotProtoKindEnum
                  | DotProtoKindMessage
                  deriving (Show, Eq, Ord, Enum, Bounded)

This should allow me to distinguish between enum and message fields.

@rashadg1030
Copy link
Collaborator Author

rashadg1030 commented Aug 13, 2020

@Gabriel439

message WithEnum {
  enum TestEnum {
    ENUM1 = 0;
    ENUM2 = 1;
    ENUM3 = 2;
  }
  TestEnum enumField = 1;
}

generates this datatype:

newtype WithEnum = WithEnum { withEnumEnumField :: HsProtobuf.Enumerated WithEnum_TestEnum }

@rashadg1030: Also, you might not need to make enums required anyway

Does this mean this should be?:

newtype WithEnum = WithEnum { withEnumEnumField :: Maybe (HsProtobuf.Enumerated WithEnum_TestEnum) }

@Gabriella439
Copy link
Contributor

@rashadg1030 and I pair-programmed on this and concluded that we might be able to correctly mark enums as required. One of the relevant functions we'll have to change, though, is this one:

dotProtoTypeContext :: MonadError CompileError m => DotProto -> m TypeContext
dotProtoTypeContext DotProto{..} =
foldMapM (definitionTypeContext (metaModulePath protoMeta)) protoDefinitions
definitionTypeContext :: MonadError CompileError m
=> Path -> DotProtoDefinition -> m TypeContext
definitionTypeContext modulePath (DotProtoMessage _ msgIdent parts) = do
let updateParent = tiParent (concatDotProtoIdentifier msgIdent)
childTyContext <- foldMapOfM (traverse . _DotProtoMessageDefinition)
(definitionTypeContext modulePath >=> traverse updateParent)
parts
qualifiedChildTyContext <- mapKeysM (concatDotProtoIdentifier msgIdent) childTyContext
let tyInfo = DotProtoTypeInfo { dotProtoTypeInfoPackage = DotProtoNoPackage
, dotProtoTypeInfoParent = Anonymous
, dotProtoTypeChildContext = childTyContext
, dotProtoTypeInfoKind = DotProtoKindMessage
, dotProtoTypeInfoModulePath = modulePath
}
pure $ M.singleton msgIdent tyInfo <> qualifiedChildTyContext

… since it's currently ignoring enums completely when processing definitions.

@rashadg1030
Copy link
Collaborator Author

rashadg1030 commented Aug 14, 2020

definitionTypeContext :: MonadError CompileError m
                      => Path -> DotProtoDefinition -> m TypeContext
definitionTypeContext modulePath (DotProtoMessage _ msgIdent parts) = do
  let updateParent = tiParent (concatDotProtoIdentifier msgIdent)


  childTyContext <- foldMapOfM (traverse . _DotProtoMessageDefinition)
                               (definitionTypeContext modulePath >=> traverse updateParent)
                               parts


  qualifiedChildTyContext <- mapKeysM (concatDotProtoIdentifier msgIdent) childTyContext


  let tyInfo = DotProtoTypeInfo { dotProtoTypeInfoPackage = DotProtoNoPackage
                                , dotProtoTypeInfoParent =  Anonymous
                                , dotProtoTypeChildContext = childTyContext
                                , dotProtoTypeInfoKind = DotProtoKindMessage
                                , dotProtoTypeInfoModulePath = modulePath
                                }


  pure $ M.singleton msgIdent tyInfo <> qualifiedChildTyContext


definitionTypeContext modulePath (DotProtoEnum _ enumIdent _) = do
  let tyInfo = DotProtoTypeInfo { dotProtoTypeInfoPackage = DotProtoNoPackage
                                , dotProtoTypeInfoParent =  Anonymous
                                , dotProtoTypeChildContext = mempty
                                , dotProtoTypeInfoKind = DotProtoKindEnum
                                , dotProtoTypeInfoModulePath = modulePath
                                }
  pure (M.singleton enumIdent tyInfo)


definitionTypeContext _ _ = pure mempty

We believe that definitionTypeContext function isn't adding the necessary type context for enums. I believe we have to fix the second pattern-match in this function. The one on DotProtoEnum _ _ _.

data DotProtoDefinition
  = DotProtoMessage String DotProtoIdentifier [DotProtoMessagePart]
  | DotProtoEnum    String DotProtoIdentifier [DotProtoEnumPart]
  | DotProtoService String DotProtoIdentifier [DotProtoServicePart]
  deriving (Show, Eq)

The issue I think is that [DotProtoEnumPart] isn't being added to the type context when they should be.

Edit: Actually I think the second pattern match of definitionTypeContext is correct. Enums don't have a childContext. The issue is that the enum identifier isn't associated with the typeInfo it seems like for some reason.

@rashadg1030
Copy link
Collaborator Author

rashadg1030 commented Aug 18, 2020

Going to briefly explain the changes I made yesterday here, but before I do I'd like to share this interesting thought which can be found in this comment: https://github.com/awakesecurity-dev/end-to-end/pull/13866#issuecomment-673656038

I'll note that the semantics are confusing because the API is not congruent:

  • There are the fields that we are guaranteed to send you.
  • There are the fields that you must send us.

This would imply always having different rules and messages based on which direction you are requesting. I'm primarily concerned with the first case, since that's 90% of what we deal with. I have no problems providing a required enum if it's technically not necessary. I would very much like for this to be based on what "we are guaranteed to send you".

@rashadg1030
Copy link
Collaborator Author

With that being said, the motivation for the most recent changes were this:

  • Want to make enum fields required
  • To do that we should be able to look up the type of a field in the TypeContext by the DotProtoIdentifier
  • Turns out fields aren't added to the TypeContext, only definitions
  • Added a function that adds DotProtoField to TypeContext, but it doesn't make sense. It seems like DotProtoFields. weren't meant to be added to the TypeContext

-> [String]
-- ^ Field names
-> [(String, Bool)]
-- ^ Field and if it is nested oneof field
Copy link
Collaborator Author

@rashadg1030 rashadg1030 Aug 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should be fixed. The field represents whether a field is required or not

@rashadg1030
Copy link
Collaborator Author

rashadg1030 commented Aug 18, 2020

Okay so this is were I realized that adding enum fields to the type context is troublesome:

fieldTypeContext :: MonadError CompileError m => Path -> DotProtoField -> m TypeContext
fieldTypeContext modulePath (DotProtoField _ fieldType fieldName _ _) =
  case fieldType of
    Prim (Named _) -> do
    -- Not a good pattern match here. I want to pattern match specifically on enum fields,
    -- but that's not possible. All enums are named, but not all named types are enums


      let tyInfo = DotProtoTypeInfo { dotProtoTypeInfoPackage = DotProtoNoPackage
                                    , dotProtoTypeInfoParent = Anonymous
                                    , dotProtoTypeChildContext = mempty
                                    , dotProtoTypeInfoKind = DotProtoKindEnum
                                    , dotProtoTypeInfoModulePath = modulePath
                                    }


      pure $ M.singleton fieldName tyInfo


    _ -> do
      pure mempty


fieldTypeContext modulePath DotProtoEmptyField = do
  pure $ mempty

There is no way to specifically grab a field that is an enum. Once again the only solution I see to this is doing this:

data DotProtoPrimType
  = Int32
  | Int64
  | SInt32
  | SInt64
  | UInt32
  | UInt64
  | Fixed32
  | Fixed64
  | SFixed32
  | SFixed64
  | String
  | Bytes
  | Bool
  | Float
  | Double
  | Message DotProtoIdentifier
  | Enum DotProtoIndentifier
  -- ^ Two separate constructors to differentiate between a message and enum
  deriving (Show, Eq)

but as @Gabriel439 said this is an invasive change and doesn't follow the .proto specification. In this comment https://github.com/awakesecurity-dev/end-to-end/pull/13866#issuecomment-673656038, Nate said we might not have make enums required at all so maybe all of this unnecessary? If enums don't need to be required, then the code we had before this commit works.

@Gabriella439
Copy link
Contributor

@rashadg1030: Yeah, there is always the option of not making enums required at all. I think this route is only worth exploring if it's easy, but if it's not then we can go with the simpler approach of not requiring enums

@rashadg1030
Copy link
Collaborator Author

@Gabriel439 @natefaubion I'm going ahead with making enum type fields not required. For example:

message WithEnum {
  enum TestEnum {
    ENUM1 = 0;
    ENUM2 = 1;
    ENUM3 = 2;
  }
  TestEnum enumField = 1; 
}

will generate the following schema:

"WithEnum": {"properties":{"enumField":{"$ref":"#/definitions/WithEnum_TestEnum"}},"type":"object"}

which would generate this PureScript code:

data WithEnum = WithEnum { enumField :: Maybe WithEnum_TestEnum }

@natefaubion
Copy link

natefaubion commented Aug 18, 2020

Currently in our UI these are all considered required. If we do that, we will need to maintain special cases in the UI codegen. I do not want to introduce partiality for every enum for data that we get from the API. Most of our use cases our consuming data from the API, and this is just going to be saying that they may not be there (yet they are), and we will have to pervasively handle that case otherwise. I do not think this is the correct decision for our UI code, but since we already maintain special cases, we will just continue to do so.

@rashadg1030
Copy link
Collaborator Author

rashadg1030 commented Aug 18, 2020

@natefaubion Yes, I would also like the frontend developers to get the easiest data to work with. You shouldn't have to maintain those special cases. Making enum fields not required is something I would like to work on, but it would require a more involved change (that I'm willing to take on). I proposed making a change to how the types are parsed so that enum fields can be added to the type context, and can be made required. Read above for more about that. There is probably a better way to do it. This will have to be fixed eventually right? So we can either fix this now or make it technical debt? @Gabriel439

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants