Offer validation with a simple function#11
Conversation
Pytest tries to find tests from everywhere starting with "Test". It's a good practice to give names starting with "Test" to only things that actually contain tests.
The tests so far have been dealing with the @validated decorator and its functionality. Having a dedicated directory for tests is also beneficial in organising code.
The test suite can be used in different ways to check the validation functionality.
Extract an `_execute_query` method in `ValidationTestSuite` that is used in the various test functions. This reduces repetition in the test functions.
The main validation logic isn't tied to any decorators, so `decorators` wasn't the most suitable module for it.
…e place The `utils._unpack_input_tree` helper should collect all the validatable sub trees, including the root tree. This change also prevents adding a null root input tree into the validatable sub trees.
This validation function is just extracted out of the @validated class decorator. It can be called directly from within a Mutation class's `mutate` function. The decorator now just uses this new function.
The `validate` function can now also be used with ClientIDMutations. The input arguments are constructed differently in ClientIDMutations and regular Mutations. ClientIDMutations use an Input inner class, regular Mutations use Arguments inner class. That difference needs to be taken into account. Otherwise the validation happens in a similar way.
|
@chpmrc Would you like to check out this PR and give some comments? |
|
Yep I'll take a look, thanks! Can we break things down into smaller PRs next time though? I can quickly review smaller PRs whenever I have some time ;) |
| def mutate(cls, root, info, _input=None): | ||
| validate(cls, root, info, _input=_input) | ||
|
|
||
| if _input is None: |
There was a problem hiding this comment.
Super nit: we can just use input. Also this can be simplified to input = input or {}.
|
|
||
| @classmethod | ||
| def mutate_and_get_payload(cls, root, info, **_input): | ||
| validate(cls, root, info, **_input) |
There was a problem hiding this comment.
I'm unclear on whether this approach requires explicitly calling validate within a mutation or if it can be done transparently to the decorated mutation.
There was a problem hiding this comment.
With graphene.Mutation based classed there are options. One can choose to not use the decorator and call the validate function at any point within the mutate method that they want. Or one can use the class decorator -- which has now been modified to do no more additional work than just call the validate function as the first thing within the mutate method.
With graphene.relay.ClientIDMutation based classes there are no options, as the class decorator doesn't work for them. The only option is to call the validate function explicitly.
It could be possible to craft a function decorator that makes the mutate (or mutate_and_get_payload) method call the validate function as the first thing. But I haven't tried that since most of our use cases have turned out to be such that the validate call is not the first thing to do within the mutate function, so such decorator wouldn't work for us any way.
There was a problem hiding this comment.
Got it. From my perspective the validation function should be the very first thing to be called. Auth stuff could easily be encapsulated in a BaseInput class that checks it in the general validate. It would be helpful to see an example where validating the mutation input is not necessarily the first step. This way there would be a single approach no matter the kind of mutation and the behavior would be standardized (i.e. middleware (if any) -> validation -> mutation body).
There was a problem hiding this comment.
Here's an example how we have currently done it (more can be found by searching for mutate_and_get_payload in that same file). So there are different kinds of permission checks done. No need to study them too closely. The authorization checks can depend on:
- The mutation input data
- Other data on the request, for example authentication headers
- State in the database, for example (Django) permissions
...and usually it's a combination of all of them.
One design guideline is also important to mention, that we are following: all authorization checks should be performed before any input validation. That's because we don't want to return any validation errors to the requester, if the requester isn't even authorized to perform the action they are asking. That's why all authorization related things are checked first, and if all of them pass, then input validation can be performed.
I also think that authorization and input validation are different things, so putting authorization checks in validate functions feels wrong.
We have used function decorators for some authorization checks (starting from the very basic case, is the requester authenticated) that are easy to extract into a decorator. But some checks are quite special (you could say complex), possibly depend on multiple data sources, that they could be harder to extract into decorators. Also extracting authorization into decorators just so that validate can be the first thing within mutate seems like the wrong reason to do that. Decorators should probably be implemented so that code becomes easier to read and reuse.
This is how we have currently implemented the authorization logic. For sure there are other ways to do this too. I hope this explanation sheds more light to the issue.
There was a problem hiding this comment.
That's because we don't want to return any validation errors to the requester, if the requester isn't even authorized to perform the action they are asking
We could probably change the decorator's logic so that this can be done in a base validator and chain them together, maybe by adding a pre_validate method or something (validate is only called after per-field validation). Since there is no builtin "authentication" concept in GraphQL that should either be part of an external middleware (e.g. sitting within the HTTP server or, e.g., a Django middleware) or part of the validation chain. Doing anything before the validation itself would break a key assumption IMO, that validation is the very first thing that happens when any input is sent over. Whether that's performed on the input, the headers, etc. is secondary.
Re: having to explicitly call validate if that's the only way then it's totally fine, as long as we maintain compatibility with the more transparent mutate and have a dedicated section in the README for Relay.
Let me know if this makes sense to you 👍
There was a problem hiding this comment.
I want to point out that authentication and authorization are different things. Django mixes them together but that leads to the situation that we can't use Django's authentication/authorization framework at all with GraphQL. So we do have a home grown (Django) middleware that does authentication (and authentication only) early in the HTTP request handling, based on HTTP headers. Then with GraphQL requests, authorization happens later, at some point where the GraphQL queries are handled.
Authorization isn't also a single operation. It can happen that the caller requests GraphQL operations X, Y and Z in the request. In this case authorization can be performed for example three times, once for each GraphQL operation, potentially every time with different rules. Maybe the requester is authorized to do X and Y, but not Z. They get results for X and Y, and an error code for Z. Note that X, Y and Z aren't necessarily mutations, they can as well be queries. Authorization is also needed in queries.
So it still looks to me that authorization should be completely separate from GraphQL input validation. That's why I wouldn't put authorization into a pre_validate function either, even if such possibility existed.
It could be possible to move authorization into some sort of combination of decorators, Graphene middleware, and maybe something else, if it would be desired to keep the mutate clear of any authorization logic. I haven't studied this, so can't really say if it would be a good or bad design. Perhaps I should experiment what it could look like.
There was a problem hiding this comment.
Yep, makes sense. The way I see it (without Django, since this is a generic graphene validation package) there would be something that performs authentication and attaches an ID or a whole user instance (however it's represented, could be a Django model instance, could be a plain object, etc.) that could even be external to graphene. Then a generic validator that all other validators can extend and that can be specialized for the specific needs of the related mutation. For example an AuthValidator could require the subclass to implement is_authorized or something like that.
This way when mutate (or a resolver) gets called we know the user has been authenticated, they are authorized to perform that operation and the input they provided is valid. I completely agree that auth(entication|orization) stuff should not be performed in mutate.
Basically an inversion of control where the top validator(s) allows data to flow down to the other validators or raises a validation error. IMO by composing/extending validators one can implement however many layers are required.
Hope this isn't too confusing.
There was a problem hiding this comment.
Yep, we can leave Django out of this discussion. It's not important here. It works as an example but nothing else.
I think graphene-validator shouldn't care about authorization at all as it's an unrelated concept. So it doesn't need to specifically support authorization, but it should not hinder it either.
Let's consider an example mutation with some input data:
mutation {
mutateObject(input: {
id: "123"
foo: "foodata"
bar: "bardata"
}) {
...
}
}We may want to perform authorization in many levels here.
- Is the user authorized to modify the object with id "123" at all?
- Is the user authorized to modify field "foo" for object "123"?
- Is the user authorized to modify field "bar" for object "123"?
Perhaps fields "foo" and "bar" are optional, so we would like to perform authorization regarding the fields only if they are present in the input. Authorization may need some specialized code for a mutation and is dependent on the given input. After all those authorization checks have been done, it's graphene-validator's turn to do input validation, like check if "foodata" and "bardata" are valid etc.
I can see that authorization and input validation may have some similarities: they may need to inspect the input tree and make decisions based on it. But still I think authorization isn't part of validation. Perhaps some other library can be used for authorization. Graphene-validator should stay out of the way and not hinder the usage of such solutions.
The
@validatedclass decorator doesn't work with mutation classes inherited from Graphene'srelay.ClientIDMutation, as issue #10 explains.The
@validatedclass decorator also always performs validation before the mutation class's actualmutatefunction is entered. That might not always be what is needed. There are use cases (e.g authorisation) where some work should be done by themutatefunction before input validation is done.The validation function proposed in this PR solves (or can be used to circumvent) both of these problems.
TODO:
This PR is a draft for getting comments about the proposed idea. The follwing items need to be completed before this should be merged in.