Skip to content

Offer validation with a simple function#11

Draft
akikoskinen wants to merge 8 commits into
elchiapp:masterfrom
City-of-Helsinki:validation-functions
Draft

Offer validation with a simple function#11
akikoskinen wants to merge 8 commits into
elchiapp:masterfrom
City-of-Helsinki:validation-functions

Conversation

@akikoskinen
Copy link
Copy Markdown
Collaborator

The @validated class decorator doesn't work with mutation classes inherited from Graphene's relay.ClientIDMutation, as issue #10 explains.

The @validated class decorator also always performs validation before the mutation class's actual mutate function is entered. That might not always be what is needed. There are use cases (e.g authorisation) where some work should be done by the mutate function before input validation is done.

The validation function proposed in this PR solves (or can be used to circumvent) both of these problems.

TODO:

This PR is a draft for getting comments about the proposed idea. The follwing items need to be completed before this should be merged in.

  • write documentation

Pytest tries to find tests from everywhere starting with "Test". It's a
good practice to give names starting with "Test" to only things that
actually contain tests.
The tests so far have been dealing with the @validated decorator and its
functionality. Having a dedicated directory for tests is also beneficial
in organising code.
The test suite can be used in different ways to check the validation
functionality.
Extract an `_execute_query` method in `ValidationTestSuite` that is used
in the various test functions. This reduces repetition in the test
functions.
The main validation logic isn't tied to any decorators, so `decorators`
wasn't the most suitable module for it.
…e place

The `utils._unpack_input_tree` helper should collect all the validatable
sub trees, including the root tree. This change also prevents adding a
null root input tree into the validatable sub trees.
This validation function is just extracted out of the @validated class
decorator. It can be called directly from within a Mutation class's
`mutate` function. The decorator now just uses this new function.
The `validate` function can now also be used with ClientIDMutations. The
input arguments are constructed differently in ClientIDMutations and
regular Mutations. ClientIDMutations use an Input inner class, regular
Mutations use Arguments inner class. That difference needs to be taken
into account. Otherwise the validation happens in a similar way.
@akikoskinen
Copy link
Copy Markdown
Collaborator Author

@chpmrc Would you like to check out this PR and give some comments?

@elchiapp
Copy link
Copy Markdown
Owner

Yep I'll take a look, thanks! Can we break things down into smaller PRs next time though? I can quickly review smaller PRs whenever I have some time ;)

def mutate(cls, root, info, _input=None):
validate(cls, root, info, _input=_input)

if _input is None:
Copy link
Copy Markdown
Owner

@elchiapp elchiapp Sep 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super nit: we can just use input. Also this can be simplified to input = input or {}.


@classmethod
def mutate_and_get_payload(cls, root, info, **_input):
validate(cls, root, info, **_input)
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unclear on whether this approach requires explicitly calling validate within a mutation or if it can be done transparently to the decorated mutation.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With graphene.Mutation based classed there are options. One can choose to not use the decorator and call the validate function at any point within the mutate method that they want. Or one can use the class decorator -- which has now been modified to do no more additional work than just call the validate function as the first thing within the mutate method.

With graphene.relay.ClientIDMutation based classes there are no options, as the class decorator doesn't work for them. The only option is to call the validate function explicitly.

It could be possible to craft a function decorator that makes the mutate (or mutate_and_get_payload) method call the validate function as the first thing. But I haven't tried that since most of our use cases have turned out to be such that the validate call is not the first thing to do within the mutate function, so such decorator wouldn't work for us any way.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. From my perspective the validation function should be the very first thing to be called. Auth stuff could easily be encapsulated in a BaseInput class that checks it in the general validate. It would be helpful to see an example where validating the mutation input is not necessarily the first step. This way there would be a single approach no matter the kind of mutation and the behavior would be standardized (i.e. middleware (if any) -> validation -> mutation body).

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's an example how we have currently done it (more can be found by searching for mutate_and_get_payload in that same file). So there are different kinds of permission checks done. No need to study them too closely. The authorization checks can depend on:

  • The mutation input data
  • Other data on the request, for example authentication headers
  • State in the database, for example (Django) permissions

...and usually it's a combination of all of them.

One design guideline is also important to mention, that we are following: all authorization checks should be performed before any input validation. That's because we don't want to return any validation errors to the requester, if the requester isn't even authorized to perform the action they are asking. That's why all authorization related things are checked first, and if all of them pass, then input validation can be performed.

I also think that authorization and input validation are different things, so putting authorization checks in validate functions feels wrong.

We have used function decorators for some authorization checks (starting from the very basic case, is the requester authenticated) that are easy to extract into a decorator. But some checks are quite special (you could say complex), possibly depend on multiple data sources, that they could be harder to extract into decorators. Also extracting authorization into decorators just so that validate can be the first thing within mutate seems like the wrong reason to do that. Decorators should probably be implemented so that code becomes easier to read and reuse.

This is how we have currently implemented the authorization logic. For sure there are other ways to do this too. I hope this explanation sheds more light to the issue.

Copy link
Copy Markdown
Owner

@elchiapp elchiapp Oct 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because we don't want to return any validation errors to the requester, if the requester isn't even authorized to perform the action they are asking

We could probably change the decorator's logic so that this can be done in a base validator and chain them together, maybe by adding a pre_validate method or something (validate is only called after per-field validation). Since there is no builtin "authentication" concept in GraphQL that should either be part of an external middleware (e.g. sitting within the HTTP server or, e.g., a Django middleware) or part of the validation chain. Doing anything before the validation itself would break a key assumption IMO, that validation is the very first thing that happens when any input is sent over. Whether that's performed on the input, the headers, etc. is secondary.

Re: having to explicitly call validate if that's the only way then it's totally fine, as long as we maintain compatibility with the more transparent mutate and have a dedicated section in the README for Relay.

Let me know if this makes sense to you 👍

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to point out that authentication and authorization are different things. Django mixes them together but that leads to the situation that we can't use Django's authentication/authorization framework at all with GraphQL. So we do have a home grown (Django) middleware that does authentication (and authentication only) early in the HTTP request handling, based on HTTP headers. Then with GraphQL requests, authorization happens later, at some point where the GraphQL queries are handled.

Authorization isn't also a single operation. It can happen that the caller requests GraphQL operations X, Y and Z in the request. In this case authorization can be performed for example three times, once for each GraphQL operation, potentially every time with different rules. Maybe the requester is authorized to do X and Y, but not Z. They get results for X and Y, and an error code for Z. Note that X, Y and Z aren't necessarily mutations, they can as well be queries. Authorization is also needed in queries.

So it still looks to me that authorization should be completely separate from GraphQL input validation. That's why I wouldn't put authorization into a pre_validate function either, even if such possibility existed.

It could be possible to move authorization into some sort of combination of decorators, Graphene middleware, and maybe something else, if it would be desired to keep the mutate clear of any authorization logic. I haven't studied this, so can't really say if it would be a good or bad design. Perhaps I should experiment what it could look like.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, makes sense. The way I see it (without Django, since this is a generic graphene validation package) there would be something that performs authentication and attaches an ID or a whole user instance (however it's represented, could be a Django model instance, could be a plain object, etc.) that could even be external to graphene. Then a generic validator that all other validators can extend and that can be specialized for the specific needs of the related mutation. For example an AuthValidator could require the subclass to implement is_authorized or something like that.

This way when mutate (or a resolver) gets called we know the user has been authenticated, they are authorized to perform that operation and the input they provided is valid. I completely agree that auth(entication|orization) stuff should not be performed in mutate.

Basically an inversion of control where the top validator(s) allows data to flow down to the other validators or raises a validation error. IMO by composing/extending validators one can implement however many layers are required.

Hope this isn't too confusing.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, we can leave Django out of this discussion. It's not important here. It works as an example but nothing else.

I think graphene-validator shouldn't care about authorization at all as it's an unrelated concept. So it doesn't need to specifically support authorization, but it should not hinder it either.

Let's consider an example mutation with some input data:

mutation {
  mutateObject(input: {
    id: "123"
    foo: "foodata"
    bar: "bardata"
  }) {
    ...
  }
}

We may want to perform authorization in many levels here.

  • Is the user authorized to modify the object with id "123" at all?
  • Is the user authorized to modify field "foo" for object "123"?
  • Is the user authorized to modify field "bar" for object "123"?

Perhaps fields "foo" and "bar" are optional, so we would like to perform authorization regarding the fields only if they are present in the input. Authorization may need some specialized code for a mutation and is dependent on the given input. After all those authorization checks have been done, it's graphene-validator's turn to do input validation, like check if "foodata" and "bardata" are valid etc.

I can see that authorization and input validation may have some similarities: they may need to inspect the input tree and make decisions based on it. But still I think authorization isn't part of validation. Perhaps some other library can be used for authorization. Graphene-validator should stay out of the way and not hinder the usage of such solutions.

@elchiapp elchiapp mentioned this pull request Oct 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants