A Brief Look at a Web Request Life Cycle
On occasion I have been asked about features that revolve around user login. This brings an interesting concept into play, one that we, as users, overlook all the time: websites are stateless. From one request to the next, they have no idea what’s going on. I will explain in more detail, but first let me take a moment to mention that this is another high-level look at a very complicated topic. There are several edge cases and little rules that I will just mow right over in order to explain the general concept. If you would like a very detailed explanation, using your project as an example, just give me a call or send me an email.
Before we can get to the real issue, we have to talk a bit about applications, websites, and web apps. Applications, like Word, Mail, and iTunes, are complex programs that run on your local machine. They have access to all kinds of storage and are stateful. This means that they are aware of the state they are in. They can tell what the last button pressed was, and what the next track in the playlist is. Websites are stateless. They have no concept of existence beyond the request life cycle (more on this later). Basically, they have no idea what happened last or what happens next. Web applications are just websites with clever ways of emulating state information. It’s important to realize that it’s only an emulation, and it’s far from perfect.
As for the request life cycle, it’s actually pretty simple. First, a client (browser) asks for some information. Then the server sends back some information. That’s it. Nothing more. The life cycle of the request is complete, and the next time the same client requests info, it starts over at the beginning.
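To make the "ask, answer, done" cycle concrete, here is a minimal sketch in Python. The `handle_request` function and its one-page `pages` table are made up for illustration and are not tied to any real framework. The point is only that the handler builds its answer from the request alone and remembers nothing afterwards.

```python
def handle_request(path):
    """Answer one request using only what the request itself contains."""
    # Hypothetical content table standing in for a real site.
    pages = {"/article": "A Brief Look at a Web Request Life Cycle"}
    body = pages.get(path)
    if body is None:
        return 404, "Not Found"
    return 200, body


# Two identical requests get identical answers, because nothing about
# the first request survives to influence the second.
first = handle_request("/article")
second = handle_request("/article")
```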
A great example of this simple request life cycle is your visit to this very site. You clicked a link, and that started a request. You requested this article. It was delivered. The end. The website has no concept of anything beyond this. The server is ignorant of any other link on this site. Should you click another one, it would start a new request.
Obviously, this has some serious drawbacks, specifically if you want to allow people to log in. You don’t want everyone reading your Gmail just because they happened to guess the correct URL. This is where cookies come in. Cookies are small pieces of data that your browser stores and sends back to the site with every request. A very simple example of this is a website that shows a pop-up on first view and then doesn’t show it again. A cookie could be used to store the fact that this isn’t your first time.
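The pop-up example can be sketched in a few lines of Python using the standard library's `http.cookies` module. The cookie name `seen_popup` and the `handle_visit` function are assumptions made for this illustration. A real site would also set attributes such as an expiry date.

```python
from http.cookies import SimpleCookie


def handle_visit(cookie_header):
    """Return (show_popup, cookie_to_set) for a single request."""
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    if "seen_popup" in cookies:
        # The browser sent the cookie back, so this is not a first visit.
        return False, None
    reply = SimpleCookie()
    reply["seen_popup"] = "1"
    # The server asks the browser to remember this for later requests.
    return True, reply["seen_popup"].OutputString()


# First request: no cookie yet, so the pop-up is shown and a cookie is set.
show_first, cookie_value = handle_visit("")
# Second request: the browser sends the cookie back, so no pop-up.
show_second, _ = handle_visit(cookie_value)
```

Note that the server itself still remembers nothing. All of the "memory" lives in the browser and arrives with the request.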
Cookies are used for very simple storage, and that’s enough for some simple tasks, but nothing very complex. That brings us to sessions. Sessions are basically state information about a client. For example, Gmail sessions contain information about the logged-in user, which helps Gmail display the correct email.
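One common way to build sessions, shown here only as a rough sketch, is to keep the state on the server and put nothing but a random session ID in the cookie. The in-memory `SESSIONS` dictionary and both function names are assumptions for illustration. A real application would normally use a database or cache instead.

```python
import secrets

# Hypothetical server-side store: session ID -> state about that client.
SESSIONS = {}


def start_session(username):
    """Create session state and return the ID that would go in the cookie."""
    session_id = secrets.token_urlsafe(32)  # hard-to-guess random value
    SESSIONS[session_id] = {"user": username}
    return session_id


def load_session(session_id):
    """Look up state for the ID a browser sent; None if it is unknown."""
    return SESSIONS.get(session_id)
```

On each request the server reads the ID from the cookie, calls something like `load_session`, and only then knows which user it is talking to.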
Together, cookies and sessions allow the emulation of a stateful web application, but it’s important to remember that it’s just emulation; the server itself has no real concept of state from one request to the next. I will show a more detailed example a bit later.
There are two other important terms that we must take a look at. The first is authentication. Authentication is simply verification that you are who you say you are. If you claim that you’re Big Bird, and you prove it, then you’re authenticated. Authorization, the second important term, basically means checking that you’re allowed to do something. A simple real-world example can be seen at the airport. First you have to prove you are who you say you are by showing your ID. Then you prove you’re allowed on a plane by showing your boarding pass. You can’t, in theory, use someone else’s boarding pass to get on a flight you don’t belong on.
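The difference between the two checks is easier to see side by side. The sketch below is illustrative only: the `USERS` and `PERMISSIONS` tables and the function names are invented, and the passwords are stored in plain text purely to keep the example short (a real system would store salted hashes).

```python
import hmac

# Hypothetical data: who exists, and what each user may do.
USERS = {"big.bird": "sesame"}
PERMISSIONS = {"big.bird": {"view_orders"}}


def authenticate(username, password):
    """Authentication: are you who you say you are? (showing your ID)"""
    expected = USERS.get(username)
    if expected is None:
        return False
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(expected, password)


def authorize(username, action):
    """Authorization: are you allowed to do this? (showing a boarding pass)"""
    return action in PERMISSIONS.get(username, set())
```

A user can pass the first check and still fail the second: `authenticate("big.bird", "sesame")` succeeds, while `authorize("big.bird", "delete_orders")` does not.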
Putting it all together is going to take a flow chart.
Let’s walk through an example. Let’s say you wanted to view an order on your site. First you start at the top and request the page that lists all orders. A cookie is sent with that request. The cookie is examined for validity, and passes. Then it is examined to see if it’s “stale” (read: too old). It fails this test because you haven’t been on the site in a while, so the site tells you to log in again. (We’re going to skip some info here because it’s not important to this topic, but keep in mind that going to the login page is another request.) You log in, and a new cookie is set. Then you request the orders list page again (remember, every single tiny request is a new request). This time, your cookie is checked: it’s valid, it’s not stale, and any other logic (e.g. making sure the user is not banned) passes. The cookie is updated (so as not to become stale). The server then checks that you are who you say you are. For example, if you are logging in as john.smith, do the cookie and other data match this? At this point you are authenticated. Then the server tries to apply session data, for example previous pages, user details, and favorite colors (again, from a very top level). Then the server checks to see if you’re allowed to see the orders list page. You are (this is the point at which you are authorized), so the page is finally returned to you.
But what you wanted to see was specific data for a single order, the list of orders was just a way to look up the order in question. So you find the order, and click on it. This begins a totally new request. So back to the top of the flow chart we go, and we start again by attempting to log in via a cookie.
The important takeaway from this example is that EVERY SINGLE request requires passing through the entire chain. Every single request is a login attempt that may or may not succeed. Because of this, a user actually logs in every single time they view a page.
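The chain walked through in the orders example can be sketched as a single function that every request must pass through. This is a simplified illustration, not a real framework: the `store` layout, the 30-minute `STALE_AFTER` value, and the return strings are all assumptions.

```python
import time

STALE_AFTER = 30 * 60  # seconds a cookie may sit unused (assumed value)


def handle_request(cookie, page, store, now=None):
    """Run one request through the whole chain, start to finish."""
    now = time.time() if now is None else now

    record = store["cookies"].get(cookie)
    if record is None:
        return "log in"                      # cookie is not valid
    if now - record["updated_at"] > STALE_AFTER:
        return "log in"                      # cookie is stale
    user = record["user"]
    if user in store["banned"]:
        return "denied"                      # other logic, e.g. bans
    record["updated_at"] = now               # keep the cookie fresh
    if user not in store["users"]:
        return "log in"                      # cookie does not match a user
    # Authenticated. Apply session data for this user.
    session = store["sessions"].setdefault(user, {"pages": []})
    if page not in store["permissions"].get(user, set()):
        return "denied"                      # not authorized for this page
    session["pages"].append(page)
    return "page:" + page
```

Requesting the orders list and then a single order means calling `handle_request` twice, and every check runs both times.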
Last activity, last login, and last logout are very common feature requests. However, because of the machinations I just explained, these are often not what people actually want. Last activity can only be determined by the last time a request was made, and updating a database with that information on every single request, for every single user, is very expensive. Keep in mind that a single page from a user’s perspective could consist of anywhere from tens to hundreds of real requests. Last login is almost exactly the same thing. Keep in mind that every single request starts with an attempt to log in. Last logout is a bit different. Logout isn’t a real action as far as the server is concerned. To log out a user, you typically just invalidate their cookie. That makes them fail “login from cookie” the next time, and ensures that anyone else who has that cookie also can’t log in. It is a real request, though, so a timestamp can be set. The catch is that most users do not log out of sites. They just let their cookies expire. There is no request for this, and because websites are stateless, they have no idea that a cookie has expired until they try to check it again.
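Logout by invalidation can be sketched as follows. The `VALID_TOKENS` and `LAST_LOGOUT` tables, the sample token, and the function names are hypothetical; they stand in for whatever storage a real site uses.

```python
from datetime import datetime, timezone

# Hypothetical server-side tables.
VALID_TOKENS = {"token-abc": "john.smith"}   # cookie value -> user
LAST_LOGOUT = {}                             # user -> time of explicit logout


def logout(token, now=None):
    """Invalidate the cookie and record when the user chose to log out."""
    user = VALID_TOKENS.pop(token, None)
    if user is not None:
        if now is None:
            now = datetime.now(timezone.utc)
        LAST_LOGOUT[user] = now
    return user


def login_from_cookie(token):
    """Every request starts here; an invalidated cookie simply fails."""
    return VALID_TOKENS.get(token)
```

Nothing in this sketch runs when a cookie merely expires in the browser, which is why a "last logout" value only covers users who explicitly logged out.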
Some common examples I hear that seem to go against this information include the “number of active users” on some forums. This information is not accurate. They track the last time a user did a specific thing (loaded a page, for example); then, if that was less than 15 minutes ago, the user is “active”. I have also heard about Gmail’s “latest account activity” message in the footer. This is a good example of what I myself do for clients when they decide it’s important. If you look at Google’s example, they show that they only increment the counter “sometimes” in response to “specific events”. It does not track every request, just enough so that you get the information you need.
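A forum-style "active users" count along these lines might look like the sketch below. The `last_seen` mapping and the function name are assumptions; the 15-minute window comes from the description above.

```python
from datetime import datetime, timedelta

ACTIVE_WINDOW = timedelta(minutes=15)


def active_users(last_seen, now):
    """Return users whose last tracked event is inside the window.

    last_seen maps a username to the time of the last tracked event,
    such as a page load. It is an estimate, not a live connection list.
    """
    return sorted(
        user for user, seen in last_seen.items()
        if now - seen < ACTIVE_WINDOW
    )
```

Someone who loaded a page 14 minutes ago and then closed the browser still counts as active here, which is exactly why the number is not accurate.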
An example of what I normally do for clients that need this information is to track the last time the server updated the cookie. The server doesn’t update the cookie on every request, only when it thinks it needs to because the cookie might go stale. Also, updating the cookie, for most of my code, means rotating some security information. In other words, we’re already doing a database write, so storing a timestamp is very inexpensive. This gives us the illusion of last activity without totally bogging down the servers. It’s not very accurate: there are several other things that could trigger a cookie update (though all of them happen in the request life cycle). There are also times that a person may update their cookie, then not come back to the site. However, I find this a good compromise.
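As a rough sketch of that compromise (not the actual production code), the timestamp can ride along with the token rotation so that no extra write is needed. The `record` fields, the 10-minute `REFRESH_AFTER` threshold, and the function name are all assumptions for illustration.

```python
import secrets
from datetime import datetime, timedelta

REFRESH_AFTER = timedelta(minutes=10)  # assumed refresh threshold


def maybe_rotate(record, now):
    """Rotate the cookie token only when it is getting old.

    Returns True when a write happened. The last-activity timestamp is
    stored in that same write, so it costs almost nothing extra.
    """
    if now - record["rotated_at"] < REFRESH_AFTER:
        return False  # cookie is fresh enough: no write, no timestamp
    record["token"] = secrets.token_urlsafe(32)  # rotate security info
    record["rotated_at"] = now
    record["last_activity"] = now
    return True
```

Requests that arrive between rotations leave `last_activity` untouched, so the value can lag behind the true last request by up to the refresh threshold.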
As I stated, this is a very high-level look at a very complex topic. Some of the information was presented at a very high level, and several topics were skipped. I do hope, however, that this gives you, the reader and potential or existing client, a general idea of how web requests work with regard to login. Remember, if you would like a more detailed example using your own project, all you have to do is call or email.