Zend_Oauth

Introduction to OAuth

OAuth allows you to approve access by any application to your private data stored on a website without being forced to disclose your username or password. If you think about it, the practice of handing over your username and password for sites like Yahoo Mail or Twitter has been endemic for quite a while. This has raised some serious concerns because there's nothing to prevent other applications from misusing this data. Yes, some services may appear trustworthy, but that is never guaranteed. OAuth resolves this problem by eliminating the need for any username and password sharing, replacing it with a user-controlled authorization process.

This authorization process is token-based. If you authorize an application (and by application we mean any web-based or desktop application) to access your data, it will receive an Access Token associated with your account. Using this Access Token, the application can access your private data without continually requiring your credentials. In all, this authorization-delegation style of protocol is simply a more secure solution to the problem of accessing private data via any web service API.

OAuth is not a completely new idea; rather, it is a standardized protocol building on the existing properties of protocols such as Google AuthSub, Yahoo BBAuth, Flickr API, etc. These all, to some extent, operate on the basis of exchanging user credentials for an Access Token of some description. The power of a standardized specification like OAuth is that it only requires a single implementation as opposed to many disparate ones depending on the web service. This standardization has not occurred independently of the major players, and indeed many now support OAuth as an alternative and future replacement for their own solutions.

Zend Framework's Zend_Oauth currently implements a full OAuth Consumer conforming to the OAuth Core 1.0 Revision A Specification (24 June 2009) via the Zend_Oauth_Consumer class.

Protocol Workflow

Before implementing OAuth it makes sense to understand how the protocol operates. To do so we'll take the example of Twitter which currently implements OAuth based on the OAuth Core 1.0 Revision A Specification. This example looks at the protocol from the perspectives of the User (who will approve access), the Consumer (who is seeking access) and the Provider (who holds the User's private data). Access may be read-only or read and write.

By chance, our User has decided that they want to utilise a new service called TweetExpress which claims to be capable of reposting your blog posts to Twitter in a matter of seconds. TweetExpress is a registered application on Twitter, meaning that it has access to a Consumer Key and a Consumer Secret (all OAuth applications must obtain these from the Provider they will be accessing). The Consumer Key identifies its requests to Twitter, and the Consumer Secret is used to sign those requests so that their origin can be verified.

To use TweetExpress you are asked to register for a new account, and after your registration is confirmed you are informed that TweetExpress will seek to associate your Twitter account with the service.

In the meantime TweetExpress has been busy. Before gaining your approval from Twitter, it has sent an HTTP request, signed using TweetExpress' Consumer Secret, to Twitter's service asking for a new unauthorized Request Token. This token is not User specific from Twitter's perspective, but TweetExpress may use it specifically for the current User and should associate it with their account and store it for future use. TweetExpress now redirects the User to Twitter so they can approve TweetExpress' access. The URL for this redirect will contain the unauthorized Request Token as a parameter.
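As a rough illustration of the redirect step, the following Python sketch builds a URL that carries the Request Token in the oauth_token parameter. The endpoint address and the token value are made up for the example and do not correspond to any real Provider; a real Consumer would use the authorization URL published by its Provider.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_authorize_url(authorize_endpoint: str, request_token: str) -> str:
    """Return the URL the User is sent to in order to approve access."""
    query = urlencode({"oauth_token": request_token})
    # Append to an existing query string if the endpoint already has one.
    separator = "&" if "?" in authorize_endpoint else "?"
    return f"{authorize_endpoint}{separator}{query}"


# Hypothetical endpoint and token, for illustration only.
url = build_authorize_url(
    "https://provider.example/oauth/authorize", "hh5s93j4hdidpola"
)
print(url)
```

The Consumer would typically store the Request Token against the current User before issuing this redirect, so that it can be matched up again when the User returns.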

At this point the User may be asked to log into Twitter and will now be faced with a Twitter screen asking if they approve this request by TweetExpress to access Twitter's API on the User's behalf. Twitter will record the response which we'll assume was positive. Based on the User's approval, Twitter will record the current unauthorized Request Token as having been approved by the User (thus making it User specific) and will generate a new value in the form of a verification code. The User is now redirected back to a specific callback URL used by TweetExpress (this callback URL may be registered with Twitter or dynamically set using an oauth_callback parameter in requests). The redirect URL will contain the newly generated verification code.

TweetExpress' callback URL will trigger an examination of the response to determine whether the User has granted their approval to Twitter. Assuming so, it may now exchange its Request Token (which the User has just authorized) for a fully authorized Access Token by sending a request back to Twitter including the Request Token and the received verification code. Twitter should now send back a response containing this Access Token, which must be used in all requests made to Twitter's API on behalf of the User. Twitter will only do this once it has confirmed that the attached Request Token has not already been used to retrieve another Access Token. At this point, TweetExpress may confirm the receipt of the approval to the User and delete the original Request Token, which is no longer needed.
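The token lifecycle described so far can be sketched from the Provider's side as a small in-memory model. This is only a toy, written in Python under simplifying assumptions: the class and method names are invented for the example, no requests are signed, and nothing is persisted. It does show the three checks the text mentions: the Request Token must exist, it must have been approved, and it may be exchanged only once.

```python
import secrets


class ToyProvider:
    """Hypothetical in-memory model of a Provider's Request Token handling."""

    def __init__(self):
        # request token -> {"verifier": str or None, "used": bool}
        self._request_tokens = {}

    def issue_request_token(self) -> str:
        """Step 1: hand the Consumer a new, unauthorized Request Token."""
        token = secrets.token_hex(8)
        self._request_tokens[token] = {"verifier": None, "used": False}
        return token

    def approve(self, token: str) -> str:
        """Step 2: the User approves; a verification code is generated."""
        verifier = secrets.token_hex(4)
        self._request_tokens[token]["verifier"] = verifier
        return verifier

    def exchange(self, token: str, verifier: str) -> str:
        """Step 3: swap an approved Request Token for an Access Token."""
        record = self._request_tokens.get(token)
        if record is None or record["verifier"] is None:
            raise PermissionError("request token unknown or not approved")
        if record["used"]:
            raise PermissionError("request token already exchanged")
        if not secrets.compare_digest(record["verifier"], verifier):
            raise PermissionError("verification code mismatch")
        record["used"] = True
        return secrets.token_hex(16)


provider = ToyProvider()
request_token = provider.issue_request_token()
verification_code = provider.approve(request_token)
access_token = provider.exchange(request_token, verification_code)
print("access token issued:", bool(access_token))
```

A second call to exchange() with the same Request Token raises PermissionError, mirroring the rule that a Request Token can yield only one Access Token.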

From this point forward, TweetExpress may use Twitter's API to post new tweets on the User's behalf simply by accessing the API endpoints with a request that has been digitally signed (via HMAC-SHA1) with a combination of TweetExpress' Consumer Secret and the secret issued along with the Access Token being used (its Token Secret).

Although Twitter do not currently expire Access Tokens, the User is free to deauthorize TweetExpress from their Twitter account settings. Once deauthorized, TweetExpress' access will be cut off and their Access Token rendered invalid.

Security Architecture

OAuth was designed specifically to operate over an insecure HTTP connection, so the use of HTTPS is not required, though obviously it would be desirable if available. Should an HTTPS connection be feasible, OAuth offers a signature method called PLAINTEXT which may be utilised. Over a typical unsecured HTTP connection, the use of PLAINTEXT must be avoided and an alternative signature method used instead. The OAuth specification defines two such signature methods: HMAC-SHA1 and RSA-SHA1. Both are fully supported by Zend_Oauth.

These signature methods are quite easy to understand. As you can imagine, a PLAINTEXT signature method does nothing that bears mentioning since it relies on HTTPS. If you were to use PLAINTEXT over HTTP, you are left with a significant problem: there's no way to be sure that the content of any OAuth enabled request (which would include the OAuth Access Token) was not altered en route. This is because unsecured HTTP requests are always at risk of eavesdropping, Man In The Middle (MITM) attacks, or other risks whereby a request can be retooled, so to speak, to perform tasks on behalf of the attacker by masquerading as the origin application without being noticed by the service provider.
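To make the weakness concrete, here is a minimal Python sketch of how a PLAINTEXT signature is formed under OAuth Core 1.0: the percent-encoded Consumer Secret and Token Secret joined by an ampersand. The function name and the secret values are placeholders chosen for the example. Note that the result does not depend on the request being sent, which is why it cannot reveal tampering and why it exposes both secrets to anyone who can read the traffic.

```python
from urllib.parse import quote


def plaintext_signature(consumer_secret: str, token_secret: str = "") -> str:
    """Join the two percent-encoded secrets with '&' (PLAINTEXT method)."""
    def encode(value: str) -> str:
        # Encode everything except the unreserved characters.
        return quote(value, safe="-._~")

    # The '&' is always present, even when the token secret is empty.
    return f"{encode(consumer_secret)}&{encode(token_secret)}"


# Placeholder secrets, for illustration only.
signature = plaintext_signature("consumer-secret", "token-secret")
print(signature)
```

The value is percent-encoded once more when it is placed in the request as the oauth_signature parameter.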

HMAC-SHA1 and RSA-SHA1 alleviate this risk by digitally signing all OAuth requests with the original application's registered Consumer Secret. Assuming only the Consumer and the Provider know what this secret is, a middle-man can alter requests all they wish, but they will not be able to validly sign them, and unsigned or invalidly signed requests would be discarded by both parties. Digital signatures therefore offer a guarantee that validly signed requests do come from the expected party and have not been altered en route. This is the core of why OAuth can operate over an unsecured connection.

How these digital signatures operate depends on the method used, i.e. HMAC-SHA1, RSA-SHA1 or perhaps another method defined by the service provider. HMAC-SHA1 is a simple mechanism which generates a Message Authentication Code (MAC) using a cryptographic hash function (i.e. SHA1) in combination with a secret key known only to the message sender and receiver (i.e. the OAuth Consumer Secret combined with the secret of the authorized Access Token). This hashing mechanism is applied to the parameters and content of any OAuth request, which are concatenated into a "Signature Base String" as defined by the OAuth specification.
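The HMAC-SHA1 calculation can be sketched in a few lines of Python using only the standard library. This is a simplified illustration rather than a complete implementation: it assumes the URL has already been normalized and carries no query string, that each parameter appears only once, and that all values are plain strings. The function names, endpoint and secrets are invented for the example.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value: str) -> str:
    """Encode everything except unreserved characters (letters, digits, -._~)."""
    return quote(value, safe="-._~")


def signature_base_string(method: str, url: str, params: dict) -> str:
    """Concatenate the method, URL and sorted parameters into one string."""
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(f"{key}={value}" for key, value in pairs)
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign_hmac_sha1(base_string: str, consumer_secret: str, token_secret: str = "") -> str:
    """Sign the base string with the combined secrets; return base64 text."""
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    digest = hmac.new(key.encode("ascii"), base_string.encode("ascii"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical request values, for illustration only.
params = {
    "oauth_consumer_key": "example-consumer-key",
    "oauth_token": "example-access-token",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1245801600",
    "oauth_nonce": "4572616e48616d6d",
    "oauth_version": "1.0",
    "status": "hello world",
}
base = signature_base_string("POST", "https://api.example/statuses", params)
print(sign_hmac_sha1(base, "example-consumer-secret", "example-token-secret"))
```

Because the parameters are part of the signed text, changing any one of them (or either secret) produces a different signature, which is what allows the receiver to detect tampering.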

RSA-SHA1 operates on similar principles except that the secret is, as you would expect, each party's RSA private key. Both sides would have the other's public key with which to verify digital signatures. This does pose a level of risk compared to HMAC-SHA1 since the RSA method does not use the Access Token's secret as part of the signing key. This means that if the RSA private key of any Consumer is compromised, then all Access Tokens assigned to that Consumer are compromised as well. RSA imposes an all or nothing scheme. In general, the majority of service providers offering OAuth authorization have therefore tended to use HMAC-SHA1 by default, and those who offer RSA-SHA1 may offer fallback support to HMAC-SHA1.

While digital signatures add to OAuth's security, they are still vulnerable to other forms of attack, such as replay attacks, which copy earlier requests that were intercepted and validly signed at that time. An attacker can then resend the exact same request to a Provider at will at any time and intercept its results. This poses a significant risk but it is quite simple to defend against: add a unique string (i.e. a nonce) to all requests which changes per request (thus continually changing the signature string) but which can never be reused because Providers actively track used nonces within a certain window defined by the timestamp also attached to a request. You might first suspect that once you stop tracking a particular nonce, the replay could work, but this ignores the timestamp, which can be used to determine a request's age at the time it was validly signed. One can assume that a week old request used in an attempted replay should be summarily discarded!
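The nonce and timestamp check can be sketched from the Provider's side as follows. This Python example is only illustrative: the class name, the five minute window and the in-memory dictionary are assumptions made for the sketch, and the current time is passed in explicitly so the behaviour is easy to follow. It would run only after the request's signature has been verified.

```python
class ReplayGuard:
    """Hypothetical tracker that rejects stale or repeated requests."""

    def __init__(self, window_seconds: int = 300):
        self.window = window_seconds
        self._seen = {}  # (consumer_key, nonce) -> request timestamp

    def accept(self, consumer_key: str, nonce: str, timestamp: int, now: int) -> bool:
        # Requests outside the window are refused regardless of their nonce.
        if abs(now - timestamp) > self.window:
            return False
        # Nonces older than the window can be forgotten safely, because a
        # replay carrying such a timestamp fails the check above anyway.
        self._seen = {
            key: seen_at for key, seen_at in self._seen.items()
            if now - seen_at <= self.window
        }
        key = (consumer_key, nonce)
        if key in self._seen:
            return False  # same nonce seen before: treat as a replay
        self._seen[key] = timestamp
        return True


guard = ReplayGuard(window_seconds=300)
print(guard.accept("consumer", "nonce-1", timestamp=1000, now=1000))  # first use
print(guard.accept("consumer", "nonce-1", timestamp=1000, now=1010))  # replayed
print(guard.accept("consumer", "nonce-1", timestamp=1000, now=2000))  # too old
```

The first call is accepted and the other two are refused: the second because the nonce has already been seen, the third because the timestamp has fallen outside the window even though the nonce is no longer being tracked.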

As a final point, this is not an exhaustive look at the security architecture in OAuth. For example, what if HTTP requests which contain both the Access Token and the Consumer Secret are eavesdropped? The system relies on at least one in-the-clear transmission of each unless HTTPS is active, so the obvious conclusion is that HTTPS is to be preferred where feasible, leaving unsecured HTTP in place only where HTTPS is not possible or affordable.

