The problem is that OAuth 2.0 is a Delegated Authorization protocol, not an Authentication protocol.
This is generally a four-party model: User, Website, Authorization Server, and Protected Resource.
This leads people to make what turn out to be very bad security decisions around authentication when they follow the basic OAuth flow.
Part of this is due to the activities being quite different. OAuth provides an access token to a client, so that it can access a protected resource, based on the permission of the resource owner.
I am going to use Facebook as my example. They are not the only ones to use this pattern, but they are the best known, so they make an easy target. (Sorry guys.)
In the Facebook Connect case:

Diagram by Amanda Anganes (mitre.org)

1. The Website (client in OAuth speak) redirects the user to Facebook, asking for access to the user's portion of the Facebook Graph API.
2. Facebook gets the user's authorization to give that access.
3. Facebook then redirects the user back to the client, passing an access token in the URL fragment.
4. The client performs a GET on the Facebook Graph API endpoint using the access token from step 3.
5. The Graph API endpoint returns a JSON object that contains a Facebook user_id and other public, and perhaps private, information based on what access the user granted.
6. The Website logs the user in as the user_id from the Graph API endpoint.
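The client side of this Facebook Connect flow can be sketched roughly as follows. This is a sketch, not Facebook's actual SDK: the endpoint path and parameter handling follow the generic OAuth 2.0 implicit grant, and the scope value is illustrative.

```python
from urllib.parse import urlencode, urlparse, parse_qs

def build_authorization_url(client_id, redirect_uri, state):
    # Step 1: the Website (client) sends the user to the authorization endpoint.
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "token",  # client-side (implicit) flow
        "scope": "public_profile",
        "state": state,
    }
    return "https://www.facebook.com/dialog/oauth?" + urlencode(params)

def extract_access_token(callback_url):
    # Step 3: the access token comes back in the URL fragment, not the query string.
    fragment = urlparse(callback_url).fragment
    return parse_qs(fragment).get("access_token", [None])[0]

url = build_authorization_url("my_app_id", "https://client.example/cb", "xyz")
token = extract_access_token("https://client.example/cb#access_token=abc123&state=xyz")
```

Note that all the client ends up holding is a bare access token string; that observation drives the rest of this post.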
The disadvantage is that they have no security. Not to put too fine a point on it, they have not authenticated the user. They have gotten delegated access to the user's information.
Now I know many of you are thinking John is just being overly dramatic. This is just a semantic difference that is only important to obsessives. That may be true, but in this case the difference leads to security holes that even Facebook recognizes and attempts to address in their code if not their documentation.
The problem is that the motivations of the participants are different between Authentication and Authorization. In the authorization case the client can be trusted with the access token because it has no real motivation to share it. It could give the token to a third party and thereby grant them access to the information (protected resource), but a bad client can just share the information itself anyway.
In the above Facebook example there is a naive expectation that the access token is coming from the resource owner. In the Facebook case that is true only at the original token issuance.
The user is handing over access tokens all the time to Websites, which use those tokens to access the Facebook Graph API.
The hard reality is that people go to questionable websites. One of the things people do at those sites is use Facebook, OpenID, or Twitter to log in. This avoids sharing their email and password with the site.
The problem is that in the authentication case Websites do have a motivation to inappropriately reuse the access token. The token is no longer just for accessing the protected resource; it now carries with it the implicit notion that the possessor is the resource owner.
So we wind up in the situation where any site the user logs into with their Facebook account can impersonate that user at any other site that accepts Facebook logins through the Client-side Flow (the default for Facebook). The less common Server-side Flow is more secure, but more complicated.
The attack really is quite trivial: once you have an access token for the given user, you can cut and paste it into an authorization response in the browser.
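Concretely, the substitution is nothing more than URL editing. A minimal sketch (the site names are hypothetical): a malicious site that legitimately received a token for a user simply pastes it into another RP's callback URL.

```python
# A malicious Website that legitimately received an access token for a user
# can impersonate that user at any client-side-flow RP: nothing in the token
# or the callback response binds the token to the client it was issued to.
def forge_callback(victim_redirect_uri, stolen_token, expected_state):
    return f"{victim_redirect_uri}#access_token={stolen_token}&state={expected_state}"

# The victim RP cannot distinguish this from a genuine redirect from Facebook.
forged = forge_callback("https://victim-rp.example/cb", "stolen123", "s1")
```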
Some people believe that using the state parameter in OAuth protects against token substitution.
The only thing binding a browser cookie to state protects against is Cross-Site Request Forgery. In the attack I am proposing the client generates a legitimate request to a bad user agent that captures it and provides a response that includes state unchanged from the request. There is nothing in the OAuth client-side flow that proves the issuer you sent the request to through the browser ever received it and is the one responding. Only the access_token parameter is generated by the Authorization Server; all the other parameters are dropped or echoed back. The client has no way to tell who the Authorization Server thought it was issuing that access_token to.
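To see why state does not help, consider everything a client-side-flow RP can actually check on the callback (a sketch; the validation logic is illustrative, not any particular SDK's): the attacker echoes the expected state unchanged, and the check still passes.

```python
from urllib.parse import urlparse, parse_qs

def validate_callback(callback_url, expected_state):
    # Everything a client-side-flow RP can check: state echoes back and a
    # token is present. Nothing reveals WHICH client the token was issued to.
    params = parse_qs(urlparse(callback_url).fragment)
    if params.get("state", [None])[0] != expected_state:
        return None  # state mismatch: reject
    return params.get("access_token", [None])[0]

# An attacker echoes state unchanged and substitutes a token that was
# issued to a completely different client; validation still succeeds.
token = validate_callback(
    "https://rp.example/cb#access_token=token_for_evil_app&state=s1", "s1")
```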
This is a security hole that you can drive a house through. As long as you don't have anything worth protecting at the client websites it works OK; however, OpenID Connect needs to address more than the simple no-security use case.
Traditionally, authentication involves three parties: User, Relying Party (RP) Web Site, and Authentication Server. The Authentication Server produces some sort of token/assertion for the consumption of the Relying Party. This assertion in OpenID 2.0 and SAML 2.0 contains a way to identify who (what RP) the token was created for. The RP only accepts tokens addressed to it. This prevents RPs and others from replaying tokens at a different RP.
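The audience restriction boils down to one comparison the RP makes on the assertion itself. A sketch over an already-decoded claims dictionary (the field name varies between OpenID 2.0 and SAML 2.0; "audience" here is illustrative):

```python
def accept_assertion(claims, my_rp_identifier):
    # The RP only accepts assertions addressed to it, so a token captured
    # at one RP is useless when replayed at another.
    return claims.get("audience") == my_rp_identifier

claims = {"audience": "https://rp-a.example", "user": "alice"}
```

An assertion created for rp-a.example verifies there and nowhere else, which is exactly the property the bare OAuth access token lacks.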
In OAuth the token is intended for consumption by the protected resource, and is intentionally opaque to the client (RP). The RP has no way to tell from the token whether it was generated for it or for another RP.
You could say the audience for the OAuth token is the protected resource and the audience for an authentication token is the RP. They are not the same endpoint!
How is Connect Different from plain OAuth?
We needed to add the appropriate security and semantics for authentication without compromising OAuth's functionality as an Authorization protocol.
One idea we explored was adding an audience claim to the protected resource (UserInfo Endpoint).
This requires communicating the OAuth client_id from the authorization endpoint to the protected resource. The biggest problem is that the RP is required to do a network operation in the background to get the user_id before logging the user in.
Many RPs, including Facebook, stated that the extra latency of the network round trip was unacceptable.
Now you are thinking: how can that be, if that is the way Facebook is doing it?
Well it turns out that they have some undocumented enhancements to get around the latency issue.
They are using something they call signed_request.
The idea is that they are encapsulating the OAuth code inside a "signed" JSON object.
The SDK they provide looks for the user_id in the signed_request and uses that to log the user in before doing any back-channel OAuth requests. This provides an implicit audience restriction: the client verifies an HMAC over the token that was generated with the client's own symmetric secret.
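Verification of such a signed_request is roughly the following. Treat the details as approximate: the wire format sketched here (base64url-encoded HMAC-SHA256 signature, a dot, then a base64url-encoded JSON payload, keyed with the app secret) matches Facebook's documented scheme at the time, but field names and sample values are illustrative.

```python
import base64, hashlib, hmac, json

def b64url_decode(s):
    # base64url without padding; restore padding before decoding
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

def parse_signed_request(signed_request, app_secret):
    sig_b64, payload_b64 = signed_request.split(".", 1)
    # The HMAC key is this client's own secret, so a payload that verifies
    # was necessarily created for THIS client: an implicit audience check.
    expected = hmac.new(app_secret.encode(), payload_b64.encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(b64url_decode(sig_b64), expected):
        return None  # signature mismatch: not created for this client
    return json.loads(b64url_decode(payload_b64))

# Build a sample signed_request the way the server side would:
payload = base64.urlsafe_b64encode(json.dumps(
    {"algorithm": "HMAC-SHA256", "user_id": "42", "code": "abc"}).encode()
).rstrip(b"=").decode()
sig = base64.urlsafe_b64encode(
    hmac.new(b"s3cret", payload.encode(), hashlib.sha256).digest()
).rstrip(b"=").decode()
data = parse_signed_request(sig + "." + payload, "s3cret")
```

Because the user_id arrives inside the verified payload, the SDK can log the user in without waiting on a network round trip.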
To be clear, Facebook is NOT using the signed_request as the access_token; they extract the code from the signed_request and exchange it for an access_token at the Token Endpoint.
One thing OAuth 2.0 allows is defining new Response Types. Based on OAuth 2.0 Multiple Response Type Encoding Practices by Google and Facebook, we decided to use a separate id_token parameter in the response rather than overload the access_token. That allows the tokens to have separate audience restrictions and formats. Facebook is doing the same thing, but using signed_request rather than id_token as the response type. We both return the extra token fragment-encoded in an additional parameter: in Facebook's case signed_request, and in OpenID Connect's case id_token.
Connect documents having two tokens: one for authentication and one for delegated access to protected resources. Facebook's documentation on this could use some improvements.
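The authentication token's core check mirrors the audience restriction discussed earlier. A minimal sketch that decodes an id_token's claims and checks aud and iss; a real RP must also verify the JOSE signature, which is omitted here, and the issuer and client values are made up.

```python
import base64, json

def jwt_claims(token):
    # A JWT is header.payload.signature, each part base64url-encoded.
    payload_b64 = token.split(".")[1]
    return json.loads(base64.urlsafe_b64decode(
        payload_b64 + "=" * (-len(payload_b64) % 4)))

def accept_id_token(id_token, my_client_id, expected_issuer):
    claims = jwt_claims(id_token)
    # Unlike an access_token, the id_token names the RP it was issued to,
    # so a token minted for another client is rejected here.
    return (claims.get("aud") == my_client_id
            and claims.get("iss") == expected_issuer)

# A toy unsigned token for illustration only:
header = base64.urlsafe_b64encode(b'{"alg":"none"}').rstrip(b"=").decode()
body = base64.urlsafe_b64encode(json.dumps(
    {"iss": "https://op.example", "aud": "client_a", "sub": "alice"}).encode()
).rstrip(b"=").decode()
fake_id_token = header + "." + body + "."
```

This is exactly the check the bare OAuth access_token cannot support, since it is opaque to the RP.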
Connect is also using a proposed IETF standard for JavaScript Object Signing and Encryption (JOSE) rather than a proprietary method. We hope that the availability of standard libraries will improve security and reduce developer effort.
Connect has had to take into consideration that RPs will be dealing with multiple Identity Providers, unlike single-provider protocols.
When creating a protocol there is always a balance to be struck between having a single flexible tool and multiple highly optimized ones. I don't think having each identity provider deploy their own incompatible identity layer on top of OAuth is good for RPs or for adoption. We came up with a compromise that is not optimized to serve only a single use case, though we have made the simple social login use case no more complicated for the RP than Facebook Connect.
Many of the optional features for higher security and distributed claims will only be used by those sites that need them; however, they give a clear migration path for sites starting with social login.
The specification looks a bit scary because for interoperability reasons we need to document the server side. Nat's blog post on OpenID Connect in a Nutshell lays out how simple it is for an RP to implement.