Introducing Cache Contexts, or: Why the browser does not know you are logged out

One of the Opera features I often see mentioned as "good" is our history navigation: you can move back and forward through the history very quickly, without any revalidation or refetching of the content from the server.

This is made possible by our adherence to the principle stated in RFC 2616 (HTTP 1.1) section 13.13 about history list navigation.

13.13 History Lists

[…]History mechanisms and caches are different. In particular history mechanisms SHOULD NOT try to show a semantically transparent view of the current state of a resource. Rather, a history mechanism is meant to show exactly what the user saw at the time when the resource was retrieved.

By default, an expiration time does not apply to history mechanisms. If the entity is still in storage, a history mechanism SHOULD display it even if the entity has expired, unless the user has specifically configured the agent to refresh expired history documents.[…]

Unfortunately, now and then (such as just last week) there's somebody reporting this as a bug or even a "security issue". They are in particular using the Security Issue tag when the navigation happens on a site handling sensitive data, such as online banking, after the user has logged out.

Well, I can understand that online banking sites and other sites handling sensitive information do not want the client to show information retrieved from the site while the user was logged in, after the user has logged out. It's just that they forget a simple fact, inherent in HTTP: The browser does not know the user has (been) logged out, and currently there is no way for the browser to find out!

Somebody will probably now say "but cookies/no-cache/must-revalidate can be used for that". Not really. Let's look at each of these options:

  • Cookies: More precisely known as "HTTP State Management Cookies", these are used to keep information about what is going on with respect to a given client and user. However, the browser does not know anything about what a cookie means; it just stores it and sends it back to the server(s) that are supposed to receive it. That a cookie is deleted (such as during logout) does not mean anything special to the client. In fact, a site need not delete a cookie to mark a user as logged out; it can just change a flag in its own database without telling the browser.
  • no-cache: This Cache-Control response directive may very well be the most misunderstood parameter in all of HTTP, quite likely because of its name, and its use in requests. There seems to be a significant belief that a client must never store a document served with this flag, or produce it as a result of clicking on a link. That is not the function of this directive. RFC 2616 sec. 14.9.1 states:

    no-cache

    […] a cache MUST NOT use the response to satisfy a subsequent request without successful revalidation with the origin server. […]

    This only means that when you click on a link, the client must first ask the server "Has this been modified since I loaded it?", and if the server says "No, it is not modified", then it may show it to you.

    This does not apply to history navigation, because "no-cache" defines the "expiration time" of the web page, nothing else.

  • must-revalidate: If "no-cache" is the most misunderstood parameter, "must-revalidate" may well be the most abused. According to RFC 2616 section 14.9.4:

    must-revalidate

    […] When the must-revalidate directive is present in a response received by a cache, that cache MUST NOT use the entry after it becomes stale to respond to a subsequent request without first revalidating it with the origin server. […]

    Opera may not quite follow the letter of this specification, but that is because of how the directive will normally be used. We treat this directive's presence as an indication that 1) the resource is expired (same as with "no-cache") and 2) during history navigation of secure sites a web page is revalidated before it is displayed to the user.

    You probably noticed the "secure site" condition. The reason for this is the aforementioned abuse of the "must-revalidate" directive. Quite a lot of sites (I have personally seen a lot of PHP powered Wikis with this problem) are sending the cache-directive combination "no-cache, no-store, must-revalidate", which to Opera means "check every time the user asks for it, even during history navigation, and do not store it on disk, only in RAM while we have it".

    What this means is that when you visit such a site (and all the directives are obeyed), the browser will ask the server to validate the content of each page, even static content and in some cases even images, every time, which results in very slow navigation. When we introduced "must-revalidate" the forums overflowed with "slow navigation" posts, and long-time Opera supporter non-troppo actually had to patch his Opera Wiki server to keep the problem at bay.

    This abuse is the reason why must-revalidate is only obeyed for secure sites.
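The difference between an ordinary request and history navigation, as described in the list above, can be sketched roughly as follows. This is only an illustration in Python, not Opera's actual code: the StoredResponse class, its fields, and the needs_revalidation function are hypothetical names, and the rules encoded are just the ones stated in the text (both directives mark a page as expired for ordinary requests, while during history navigation only "must-revalidate" on a secure site triggers revalidation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredResponse:
    """A response the browser still holds (hypothetical model, not Opera's code)."""
    directives: frozenset   # Cache-Control response directives, lower-cased
    is_secure: bool         # True if the page was loaded over HTTPS
    is_stale: bool = False  # True once its freshness lifetime has run out


def needs_revalidation(resp: StoredResponse, via_history: bool) -> bool:
    """Return True if the server must be asked before the stored copy is shown."""
    if via_history:
        # History shows what the user saw; expiry alone does not apply here.
        # As described in the text, must-revalidate is obeyed only on secure sites.
        return "must-revalidate" in resp.directives and resp.is_secure
    # Ordinary request (link click): both directives mark the copy as expired.
    if "no-cache" in resp.directives or "must-revalidate" in resp.directives:
        return True
    return resp.is_stale


bank_page = StoredResponse(frozenset({"no-cache"}), is_secure=True)
wiki_page = StoredResponse(frozenset({"no-cache", "must-revalidate"}), is_secure=False)
strict_page = StoredResponse(frozenset({"must-revalidate"}), is_secure=True)

print(needs_revalidation(bank_page, via_history=False))   # True: link click revalidates
print(needs_revalidation(bank_page, via_history=True))    # False: history shows stored copy
print(needs_revalidation(wiki_page, via_history=True))    # False: not a secure site
print(needs_revalidation(strict_page, via_history=True))  # True: secure + must-revalidate
```

Note that nothing in this decision knows whether the user is still logged in, which is exactly the gap discussed below.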

As indicated above, some web service providers consider failure to revalidate after logout to be a potential vulnerability for their system. This has caused them to invest quite a lot of effort into ensuring that the browser behaves as they want it to. They have not just used the above mentioned cache directives, but also various scripting technologies.

The major drawback of all the current systems is that they will most often increase traffic to the website using them, thus increasing the load on the servers, which means that to stay operative they need even more servers, which means higher cost of operating the site.

All of it (trouble, wasted money, complaints, etc.) because they are not able to tell the browser that the user has been logged out.

So, how can this be solved in a more economic and predictable manner?

A solution should meet the following requirements:

  • The user should be able to navigate history as normal while logged in, no revalidation should take place.
  • Already loaded resources should not be refetched unless the proper functioning of the site requires it (for example, the account page should display the current amounts when the user clicks on a link going to the accounts summary page).
  • When the user is logged out, all sensitive documents should be removed, and if they are still in history, they should be revalidated with the server before being displayed to the user.

This set of requirements indicates that the server should have a way to tell the browser

  1. That a page is part of the sensitive document group.
  2. When the user is logged out (or should be logged out).

I've recently written, and submitted to the IETF, a document describing a system that tries to solve these problems: Cache Contexts.

In this new system the server uses a Cache Context directive to tell the browser that the document(s) served are part of a specific group of documents, a "context". When the context has outlived its usefulness, the server can discard it, which also tells the browser that the documents in the context should no longer be displayed to the user unless they have been confirmed by the server.

The server can tell the client how long it should let the documents in the context live, and it may also connect the lifetime of a context to the lifetime of a cookie; when the cookie is deleted, all the documents in the context are deleted, too.
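To make the idea concrete, here is a toy Python model of the browser-side bookkeeping such a scheme implies. It is a sketch under stated assumptions, not the mechanism defined in the draft: the ContextStore class, its method names, and the example URLs and cookie name are all hypothetical, and the actual header syntax is deliberately left out.

```python
class ContextStore:
    """Toy browser-side bookkeeping for cache contexts (all names hypothetical)."""

    def __init__(self):
        self._members = {}       # context name -> set of URLs in that context
        self._cookie_link = {}   # cookie name -> context tied to its lifetime
        self._discarded = set()  # URLs whose context has been discarded

    def add_document(self, context, url):
        """Record that a freshly loaded document belongs to a context."""
        self._members.setdefault(context, set()).add(url)
        self._discarded.discard(url)  # a fresh load is trusted again

    def link_to_cookie(self, context, cookie_name):
        """Tie the lifetime of a context to the lifetime of a cookie."""
        self._cookie_link[cookie_name] = context

    def discard(self, context):
        """The server ended the context: its documents need confirmation."""
        self._discarded |= self._members.pop(context, set())

    def cookie_deleted(self, cookie_name):
        """Deleting a linked cookie discards the whole context."""
        context = self._cookie_link.pop(cookie_name, None)
        if context is not None:
            self.discard(context)

    def may_show_from_history(self, url):
        """False means: revalidate with the server before displaying."""
        return url not in self._discarded


store = ContextStore()
store.add_document("banking", "https://bank.example/accounts")
store.link_to_cookie("banking", "session")

print(store.may_show_from_history("https://bank.example/accounts"))  # True while logged in
store.cookie_deleted("session")                                      # logout removes the cookie
print(store.may_show_from_history("https://bank.example/accounts"))  # False after logout
print(store.may_show_from_history("https://bank.example/help"))      # True: never in a context
```

The point of the design is that history navigation stays fast while the context is alive, and only documents in a discarded context pay the cost of a round trip to the server.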

These are just some of the features offered by Cache Contexts.

If you are interested, it will be available from the IETF's Internet Draft repository in a couple of days. If you cannot wait, a copy of it can be found here.

If you have comments, suggestions, corrections, etc., feel free to discuss them here, or directly with me. General discussion of the proposal should take place on the IETF's HTTP Working Group mailing list.

draft-pettersen-cache-context-00.txt

15 replies on “Introducing Cache Contexts, or: Why the browser does not know you are logged out”

  1. Great write-up.

     I hear this all the time. My two favorite features (among many more) in Opera are 1) mouse gestures and 2) cached/history… When I tell people this, they bring up the security argument to which I usually respond in (more layman’s like) terms you bring up here… namely.. my back button is for my HISTORY. If I wanted the page to be refreshed, I would refresh it myself. No.. I want to go back- back to what I had entered in the last page for my search query. Back to change the edited post entry. Back to what I was looking at before. I don’t want to go back to something new. I don’t want to go back to an empty form. Don’t destroy everything I typed.. all my work….all my data.

     I’m sorry this is considered a security risk by some, to me- it’s just respecting a cardinal rule of usability: “Your data is sacred.” The “risk” is one that I’m aware of and mitigate myself. Frankly, the fact that Opera allows the “secure site condition” is the best of both worlds.

     Anyway- Hopefully your draft is well received- good luck.

     -Eddie

  2. “namely.. my back button is for my HISTORY. If I wanted the page to be refreshed, I would refresh it myself. No.. I want to go back- back to what I had entered in the last page for my search query. Back to change the edited post entry. Back to what I was looking at before. I don’t want to go back to something new. I don’t want to go back to an empty form. Don’t destroy everything I typed.. all my work….all my data.”

     Can’t say better. :up: It’s exactly one of the main reasons why I started using Opera many years ago and still use it. 🙂

  3. I hear lame developers asking questions like “how can I disable back button?” every day, because of this History VS Cache issue (or rather lack of understanding of it).

     This solves another problem too: how to flush cached pages after their content has changed (for example Edit page on a Wiki can’t flush page it actually edits).

     +1!

  4. I’m a web dev, and while I was having a look at one of our internal pages together with a colleague, I saw her using several clicks to navigate to the previous page by using the web site’s navigation menu. So I said, “You know you could’ve just used the back button….” Which started a discussion about this very issue; she believed that, since our internal (login-protected) JSP pages all have

         response.setHeader("Cache-Control", "no-cache");

     and similar lines, history navigation should be disabled like it is in IE (effectively, anyway). That stuff has been in the code since way before I started working there…. So, believing that “Opera can’t be wrong” – as you tend to do when it comes to standards after using it for a while 😉 – I had a look and found this article. (Great idea, and best of luck, by the way.)

     Just one question, though, that does seem like a valid security concern to me: If the login-protected part of the site is served via https and the user browses several pages while logged in, then logs out (back to http), should the back-button allow you back into the https-controlled part? It does in our case – perhaps because we do not use “must-revalidate”? Since the site is for students who might log in from any public terminal on campus, we’d obviously like to prevent that (and we’re assuming not every student knows how to clear caches and cookies or that the browser should be closed after a session, etc.).

     Cheers,
     sandgroper

  5. sandgroper: Logging out does not necessarily mean you are pushed out to the unsecure portion of a site; you might just be directed to the secure login page.

     The real sticking point here is that the client does not know the context of the actions, and what is recommendable in the current context.

     Let’s say the secure page you are on has a link to an unsecure server, and just as the new page starts rendering you notice some content on the secure server (e.g. “you won”) that will disappear if you refresh the page.

     For that matter, going to the unsecure page does not necessarily mean you are logged out; it might just mean that you are taken to an unsecured part of the site (bad design, but it happens).

     Again, the point is what the actual context of the actions is, and that is, in the current system, only known by (at most) two parties: the user and the service.

     However, you do have these configurations

         opera:config#Cache|AlwaysReloadHTTPSInHistory
         opera:config#Cache|CacheHTTPSAfterSessions

     that affect behaviour when navigating HTTPS pages, and leaving a secure server, respectively.

  6. Hi Yngve, I get what you mean – thanks for the reply. Also had a more detailed read of your proposed Cache Contexts. Seems like a well-thought-out and appropriate solution to a problem that should’ve probably been handled ages ago… ;)

     A definite +1 from me!

  7. When something major happens, certainly.

     ATM the IETF HTTPbis WG is rather busy with updating the main HTTP spec, so they have not been able to allocate time to consider this or other proposals.

  8. Hi everyone!

     Sorry for reopening such an old thread. I hope someone is still following and can help me with a peculiar problem:

     Opera’s Caching-Model is giving me some major grief with a web-app that I am developing. I’m trying to do cross-subdomain requests using script-tags as described here: http://code.google.com/webtoolkit/doc/latest/tutorial/Xsite.html.

     I know this isn’t the cleanest way to do this, but I don’t even want to get into the headaches that Opera (and IE) gives me, when I try to do it in an iframe using document.domain while developing the page with Google Web Toolkit, which also runs in another iframe.

     The problem arises when I try to use this model to do long-polling: It works beautifully in ALL browsers (IE, Firefox, Chrome, Safari, Android, …) except Opera. Opera aborts the pending requests after a 30s timeout and then ?!?fulfills the request from Cache instead?!? – even for content marked no-cache, no-store. Result: Magically, old chat-sessions are played back in a randomly interleaved pattern… Not good.

     Considering the RFC2616-quote from above regarding no-cache: “…a cache MUST NOT use the response to satisfy a subsequent request…”, I would consider this a bug, or at least a standards-deviation, as a failed request is still a request.

     Clearly, I could simply add some globally unique dummy-parameter to the request url, but that just generates more traffic and server-load for basically no reason. Same goes for using a reverse-proxy to avoid XSD-requests altogether.

     So, I’m wondering, and maybe one of you has a thought on this, whether it is possible to prevent Opera from defaulting to cache for requests that timed out without requiring the user to change any browser-settings.

     Also, is it possible to prevent Opera from timing the request out after 30s in the first place?

     Thank you guys,
     Markus

  9. Markus: Opera does not time out an HTTP connection; only the server can do that. 30s sounds like a serverside time out. (Although, we time out certain requests in the security related area, like OCSP and CRL requests, but those do not directly affect the displayed content).

     If you use a tool like Wireshark you will be able to tell who cuts the connection. I am pretty sure it is the server, or it sends a 304 response. It could be your client side script, though.

     As for “cache” in that quote, I consider it most applicable to proxy caches, not the browser cache; Opera treats that directive as “store in RAM only”, and it will be kicked out quickly.

  10. Hi Yngve,

      thank you for your reply!

      I ran some more tests using the following super-trivial java “web-server”:

          public static void main(String[] args) throws Exception {
              System.out.println("Listening...");
              Socket socket = new ServerSocket(1080).accept();
              System.out.println("Waiting...");
              Thread.sleep(45000);
              System.out.println("Done.");
              socket.getOutputStream().write("window.alert('Still here!');".getBytes());
              socket.close();
          }

      And the following test-html file:

          <script type="text/javascript" src="http://127.0.0.1:1080/test.js"></script>

      Result:

      Chrome, Firefox, and IE8 patiently wait for the 45s to elapse and pop up a window saying “Still here!”. Opera (11.61) kills the request after 30s (the 30s begin when Opera starts the request, not when the server accepts the connection).

      If I set the sleep timeout to 5s and let Opera grab the script once and then set it back to 45s, it simply reuses the old script (you can check that by changing the text in window.alert). It does so even if you close the tab and reopen it, or open a second new window.

      If I close Opera completely, though, and reopen it later, it has indeed deleted the cache. So, the “store in RAM only” directive works. Opera also seems to clear this in RAM cache on the order of minutes as you said. That’s good. That way I can make the GUID-url-parameter a little shorter.

      Could this be a vulnerability, though? I am thinking of a man-in-the-middle attack where my proxy serves an HTML page containing a script-tag with the same src as a script-tag in the page I want to attack. Then I redirect to https://bankofamerica.com. Now all I need to do is hold up their own script request (which I can do even for https by stopping, for example, the 5th request) and Opera will instead reuse my script and give me access to their cookies, etc?

      Luckily, the in RAM cache does not seem to be shared between regular and “private” tabs. There, if the script doesn’t load, it doesn’t serve up an old version. So, maybe for https requests, it behaves the same. But for regular pages, this might be an issue.

      Greetings,
      ~ Markus

  11. Hi Markus,

      what you’re seeing here is the “loading delayed timeout” for external javascripts. It is configurable, defaults to 30 seconds, and can be set here:

          opera:config#HTTP%20Loading%20Delayed

      The reason it exists is that loading JS traditionally would prevent the rest of the page from parsing and rendering, and an external script on a server that did not respond would sometimes prevent pages from showing up. Especially on mobile phones with high latency this was a really annoying problem.

      These days, improvements like speculative parsing and @async on script tags make this obsolete, and the timeout and preference will be removed from some future Opera version. (With a bit of hindsight, we probably should have limited it to only apply to .js files that are loaded during page parsing, not ones inserted through the DOM.)

      I hope you can work around it. Apologies for the problem.

  12. To make an on-topic comment more related to the blog post above: my suggestion for this is to develop heuristics to detect when users go “back” to a page that was requested in a different session state and marked as must-revalidate. I think it’s possible to develop heuristics that work with existing content so that we don’t need to add the complexity of cache-context and similar proposals to the protocol itself. But it’s a harder problem than it looks, and several possible answers are worth exploring!

  13. I’m sorry to write this, but my beloved Fast History Navigation Mode is not working on HTTPS sites (such as Wikipedia) lately. This is despite having Always Reload HTTPS In History disabled and Cache HTTPS After Sessions enabled. Another issue is that I get “The server attempted to apply security measures, but failed” errors on many HTTPS sites, but I believe it’s not Opera’s fault. Still, this error message should be made clearer, because it sounds like the connection was not secure at all.
