HTTP Authorization

A discussion on authentication and authorization in HTTP

Posted on Sunday, 3 January, 2016

In Liberator, there are two callback decision functions that are used to implement security.

The first is authorized?, which determines whether request processing continues or a 401 response is returned, while the second is allowed?, which determines whether request processing continues or a 403 (Forbidden) is returned.

In the past, when developers have asked me which callback to use, I've recommended using the authorized? callback for authentication and the allowed? callback for authorization.

So authorized? for authentication? Obviously, that can't be right. I've justified this as a historical naming quirk of HTTP: because the reason phrase of the 401 response is 'Unauthorized', I've suggested that this is the only reason Liberator's callback is named authorized?, and that people shouldn't worry about it further.

This explanation aligns with how most HTTP implementations do authentication and authorization for protected resources :-

  1. Authenticate the user contained in the request. If there is no user, send a 401 Unauthorized response with a WWW-Authenticate header.

  2. Authorize the request, to make sure the user contained in it is allowed to access the resource in question. If they aren't, send a 403 Forbidden.
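The two steps above can be sketched in a few lines. This is a minimal Python illustration of the conventional ordering, not Liberator's implementation; the Request type, the USERS table and the required_role parameter are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical user database: user name -> set of roles.
USERS = {"alice": {"admin"}, "bob": {"reader"}}


@dataclass
class Request:
    method: str
    user: Optional[str] = None  # user name taken from the credentials, if any


def conventional_check(request: Request, required_role: str):
    """Return (status, headers) using the usual authenticate-then-authorize order."""
    # Step 1: authenticate. A missing or unknown user always yields a 401.
    if request.user not in USERS:
        return 401, {"WWW-Authenticate": 'Basic realm="example"'}
    # Step 2: authorize. A known user without the required role gets a 403.
    if required_role not in USERS[request.user]:
        return 403, {}
    return 200, {}
```

Note that in this sketch the 401 decision never consults the resource or the method: the absence of a user is enough on its own.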

Recently, I've been working on the security in yada, which is intended as a faithful implementation of the HTTP standards, so I've been re-reading the RFCs, notably https://tools.ietf.org/html/rfc7235.

As a result, I now believe my prior interpretation of HTTP security, and that of many existing implementations, while approximately correct, is still not accurate.

Here I offer a more nuanced explanation of how HTTP security is supposed to work. For completeness, I'll work in the concept of realms.

(Realms) allow the protected resources on a server to be partitioned into a set of protection spaces, each with its own authentication scheme and/or authorization database. The realm value is a string, generally assigned by the origin server, that can have additional semantics specific to the authentication scheme. Note that a response can have multiple challenges with the same auth-scheme but with different realms. See RFC 7235 section 2.2.
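To make the last point of that quotation concrete, here is a small Python sketch that builds a WWW-Authenticate field value carrying two challenges with the same auth-scheme but different realms. The helper names challenge and www_authenticate and the realm names are hypothetical, and the sketch assumes parameter values contain no characters that need escaping inside a quoted string.

```python
from typing import Iterable


def challenge(scheme: str, realm: str, **params: str) -> str:
    """Format one challenge: an auth-scheme followed by comma-separated auth-params."""
    pairs = [("realm", realm), *params.items()]
    return scheme + " " + ", ".join(f'{key}="{value}"' for key, value in pairs)


def www_authenticate(challenges: Iterable[str]) -> str:
    """Join several challenges into a single WWW-Authenticate field value."""
    return ", ".join(challenges)


header = www_authenticate([
    challenge("Basic", "staff"),
    challenge("Basic", "customers", charset="UTF-8"),
])
print(header)
# Basic realm="staff", Basic realm="customers", charset="UTF-8"
```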

So here's the revised algorithm :-

  1. For each realm governing this resource,

     a. validate the user's credentials according to the authentication scheme used by the request (the credentials could be contained in the Authorization header itself, or in another header or cookie, or elsewhere).

     b. determine the user's authorization rights (roles, permissions, entitlements).

     c. if no valid credentials are specified, particularly if the resource's representation depends on whether they are, set the WWW-Authenticate header on the response, containing a comma-separated list of the realm's authentication schemes and parameters. See RFC 7235 section 4.1.

  2. Load the resource's access requirements. (I've made this an actual step to map onto yada's get-properties step.)

  3. If the resource requires a valid user, but there is no valid user in the request, return with a 401 response.

  4. If the resource does not require a valid user for the given method, proceed with the request. The WWW-Authenticate header is still set on the response, to indicate to the user agent that providing credentials might influence the response.

  5. Otherwise, if there is a valid user, but the user's authorization rights (roles, permissions, entitlements) are not sufficient to access the resource, return a 403 (or 404). See RFC 7235 section 2.1.
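The revised algorithm can be sketched in Python as follows. This is an illustration under simplifying assumptions, not yada's implementation: it handles a single realm, treats the Authorization value as an opaque token looked up in a hypothetical CREDENTIALS table, and models access requirements as one optional role per method. The names Resource, handle and CREDENTIALS are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical credential store: token -> (user name, roles).
CREDENTIALS = {"t-alice": ("alice", {"editor"}), "t-bob": ("bob", {"reader"})}


@dataclass
class Resource:
    realm: str
    # Role required per method; a method that is absent here needs no user.
    required_role: Dict[str, str] = field(default_factory=dict)


def handle(method: str, authorization: Optional[str],
           resource: Resource) -> Tuple[int, Dict[str, str]]:
    headers: Dict[str, str] = {}
    # Validate the credentials and determine the user's roles.
    user, roles = CREDENTIALS.get(authorization or "", (None, set()))
    # Without valid credentials, advertise the realm's challenge.
    if user is None:
        headers["WWW-Authenticate"] = f'Bearer realm="{resource.realm}"'
    # Load the resource's access requirements for this method.
    needed = resource.required_role.get(method)
    # A user is required but none is present: 401.
    if needed is not None and user is None:
        return 401, headers
    # No user is required for this method: proceed, keeping any challenge header.
    if needed is None:
        return 200, headers
    # A valid user whose rights are insufficient: 403.
    if needed not in roles:
        return 403, headers
    return 200, headers


page = Resource(realm="wiki", required_role={"PUT": "editor"})
print(handle("GET", None, page))       # (200, {'WWW-Authenticate': 'Bearer realm="wiki"'})
print(handle("PUT", None, page))       # (401, {'WWW-Authenticate': 'Bearer realm="wiki"'})
print(handle("PUT", "t-bob", page))    # (403, {})
print(handle("PUT", "t-alice", page))  # (200, {})
```

The first call shows an anonymous GET succeeding while still carrying a challenge; the same anonymous request with PUT gets a 401, because only that method requires a user.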

According to this algorithm, the absence of a user in the request does not, in itself, warrant a 401 response. It is only when we look at the method involved and the resource itself that we can make that determination. It's quite possible that a resource might allow access via safe methods but refuse unsafe methods (which might mutate the resource) from unauthorized users.

It should also be possible to communicate to a user agent that providing an Authorization header might influence what they get back from a request, while at the same time providing a 200 response.

These cases are tricky to implement with Liberator, but the revised algorithm makes these situations possible to implement, at the expense of a more complicated resource model.

In this interpretation, a 401 does have to do with authorization, and the Unauthorized message is not merely a quirk of HTTP. It is not enough simply to check that there is no valid user; it is also important to check that the resource needs one.