|
This is a windy justification and explanation. If you are already aware of these problems with Joomla and how they work, go to the top post in this thread and download the attached file. The lists near the bottom have instructions on how to proceed, and you can skip over this long explanation.
This goes right to the heart of how Joomla maintains sessions for your guests and users. It has been a problem for some time and needed to be addressed. The core team has been busting their butts trying to get 1.1 ready for public consumption, so they haven't had time to focus on this needed upgrade to the core.
First, understand that Joomla tracks sessions through cookies, authenticates sessions by the user's IP address, sets up sessions for all page requestors (both visitors and registered users), and sets up a new session whenever a page is requested either without a cookie or with a cookie whose data cannot be authenticated against the requestor's IP address.
There are two primary problems with this approach, both of which lead to excessive sessions being maintained in the database and thus increase a server's utilization (slightly, but noticeably): excess sessions created by search engines, and excess sessions created by users who are behind proxy banks.
Most of us have noticed this when we use the "Who's Online" module. It tends to report extraordinarily high numbers of visitors, and often more registered users than are actually online. The "Who's Online" module is accurately reporting the number of sessions found in the Joomla database, so the problem is that we are getting too many registered sessions in the database.
Two things are causing this:
1. Browsers or page readers that do not allow cookies (like search engines).
Joomla stores the key to its sessions in the user's cookie. This is actually the only really secure way to do it, as the only other option is to add the session key to each URL (you may have seen this on other websites). URLs that contain session IDs are not considered secure because if you follow a link from one website to another, that session key ends up in the webserver log of a different server. Not usually a problem, but some webmasters are less than scrupulous... and they now have that key to play around with.
If Joomla does not see this key in the cookie for a page request, it assumes this user has not been on the site and sets up a session for them. The problem lies in the fact that some web users turn their cookies off, and some web readers (like search engines) don't accept them at all. In either of these cases, Joomla ends up creating a new session for every single page they request. If you get spidered by four search engines, and they make a total of five hundred page requests between them, you will have five hundred new sessions in your database when they are done. In fact, this is very common for Joomla websites that are well indexed by the search engines.
2. Users who access your website from behind proxy banks.
As a security measure (to protect their users), many large ISPs (like AOL) and corporations use proxy web access servers. When they have thousands of users going through them, they have to use banks of proxy servers to handle the load. Users who are behind these banks of proxy servers can potentially have a different IP address with every single page request. Since Joomla authenticates its sessions by IP address, these users look like first-time page requests every time their IP address changes.
With Joomla creating a new session for every page request that does not have a valid session (as authenticated by the IP address), this can dramatically increase the number of sessions Joomla creates for that user. If they request 16 pages and their IP address changes with each page request, you end up with 16 sessions in your database even though there is only one user.
You may have noticed that some AOL users cannot stay logged into your Joomla website unless they select the remember me option when logging in. This is why. Joomla cannot authenticate them because their IP address has changed. If they do select the remember me option in order to stay logged in, you end up with an inflated logged in user count and way too many sessions.
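To make the two causes above concrete, here is a small toy model in Python. It is only a sketch of the behaviour described in this post, not Joomla's actual code (Joomla is PHP); the class and function names are made up for illustration, and the IP addresses are documentation-only example addresses.

```python
import secrets


class StrictSessionStore:
    """Toy model: a session is reused only if the cookie key is known
    AND the request comes from the exact IP that created it."""

    def __init__(self):
        self.sessions = {}  # session key -> IP address it was created from

    def handle_request(self, ip, cookie=None):
        # Valid session: cookie presented and IP matches exactly.
        if cookie is not None and self.sessions.get(cookie) == ip:
            return cookie
        # Anything else is treated as a first visit: create a new session.
        key = secrets.token_hex(16)
        self.sessions[key] = ip
        return key


def crawl_without_cookies(store, ip, pages):
    """A spider never sends the cookie back, so every page is a 'new visitor'."""
    for _ in range(pages):
        store.handle_request(ip, cookie=None)


def browse_via_proxy_bank(store, ips):
    """A browser sends the cookie back, but its IP differs on each request."""
    cookie = None
    for ip in ips:
        cookie = store.handle_request(ip, cookie)


if __name__ == "__main__":
    spider_store = StrictSessionStore()
    crawl_without_cookies(spider_store, "203.0.113.10", pages=500)
    print(len(spider_store.sessions))  # 500 sessions for one spider

    proxy_store = StrictSessionStore()
    proxy_ips = [f"198.51.100.{n}" for n in range(1, 17)]
    browse_via_proxy_bank(proxy_store, proxy_ips)
    print(len(proxy_store.sessions))  # 16 sessions for one user
```

In both runs there is really only one visitor, yet the session table grows with every page request.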
What can be done?
Fixing these problems is relatively straightforward. We need to find some way to test for cookies in a user's request and, if we want to accommodate proxy bank users, modify our authentication methodology.
Testing for cookies is relatively easy. Instead of setting up a session automatically, we need to test first. So instead of setting up a session for pages requested without a cookie, we simply set a test cookie. Then on subsequent page loads, if that cookie exists, we proceed with setting up a session. This way, if we never get our test cookie back (a page reader like a search engine, or a user who does not allow cookies), we never set up a session.
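The test-cookie idea can be sketched like this. Again, this is a rough Python illustration under my own assumptions, not the code in the attached file: the cookie names `cookie_test` and `session_key` are hypothetical, and IP authentication is left out to keep the example short.

```python
import secrets

TEST_COOKIE = "cookie_test"     # hypothetical name for the test cookie
SESSION_COOKIE = "session_key"  # hypothetical name for the session cookie


class CookieTestingStore:
    """Toy model: only create a session once the client has proven
    that it sends cookies back."""

    def __init__(self):
        self.sessions = set()

    def handle_request(self, cookies):
        """Take the cookies sent by the client, return the cookies to set."""
        if cookies.get(SESSION_COOKIE) in self.sessions:
            return {}  # existing session, nothing new to set
        if cookies.get(TEST_COOKIE) == "1":
            # The test cookie came back, so this client accepts cookies.
            key = secrets.token_hex(16)
            self.sessions.add(key)
            return {SESSION_COOKIE: key}
        # First contact (or a client that drops cookies): only set the test cookie.
        return {TEST_COOKIE: "1"}


if __name__ == "__main__":
    store = CookieTestingStore()

    # A spider never sends anything back: no sessions are created.
    for _ in range(500):
        store.handle_request({})
    print(len(store.sessions))  # 0

    # A normal browser keeps what it is given in its cookie jar.
    jar = {}
    for _ in range(3):
        jar.update(store.handle_request(jar))
    print(len(store.sessions))  # 1
```

The trade-off is that a real visitor gets a session on the second page request rather than the first.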
Fixing the problem for proxy bank users is a little more complex, but not a lot. First, understand the reason for authenticating sessions based on IP address. By authenticating a user's IP address before allowing them to access an existing session, we make it harder for a session cracker to steal a session. We simply don't allow them access unless they are coming from the same IP address that the session was set up from. Although not foolproof, it's a pretty good way of ensuring that someone halfway across the world cannot access a user's session just because they found a key (remember, your cookie holds the key, and it is sent with each page request).
We can fix this problem by modifying our session authentication very slightly, in a way that allows a small IP range to be authenticated instead of a single IP address. For those who do not understand IP addresses, a typical IP address consists of four values separated by decimal points, e.g. 192.168.2.1. The allowed range for each value in this IP address is 0-255. The total number of IP addresses addressable by the Internet Protocol is around 4.3 billion, but that does not account for reserved address space, internet network segment addressing, or space used by networks that are only transitional.
We don't want to allow our sessions to be acquired by any of billions of internet users, so we need a good compromise. By authenticating only the first three values of an IP address, we force the requesting user to be in a very narrow range of 256 different IP addresses that will typically be concentrated at a single bandwidth provider, thus localizing the range of allowable IP addresses. This is a substantially reduced range of IP addresses and represents, for most of us, an acceptable risk in exchange for access for users of proxy banks (the same risk that has to be mediated by your bank or any other extremely secure website that cannot afford to exclude AOL users).
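For anyone who wants to check the numbers, Python's standard `ipaddress` module can do the arithmetic. This is just a quick illustration; the 192.168.2.x block is the example address from this post. IPv4 addresses are four values of 0-255 each, which gives 2^32 addresses in total, and fixing the first three values leaves a block of 256.

```python
import ipaddress

# Four values of 0-255 each: 256 ** 4 == 2 ** 32 possible IPv4 addresses.
total_ipv4 = 2 ** 32
print(total_ipv4)  # 4294967296

# Fixing the first three values leaves one block of 256 addresses.
block = ipaddress.ip_network("192.168.2.0/24")
print(block.num_addresses)  # 256

# Addresses that share the first three values fall inside the block.
print(ipaddress.ip_address("192.168.2.1") in block)  # True
print(ipaddress.ip_address("192.168.3.1") in block)  # False
```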
So, we need an option that allows us to authenticate based on a small range of IP addresses, which means authenticating only the first three values of the requestor's IP address, i.e. 192.168.2 instead of 192.168.2.1.
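A minimal sketch of that kind of check is below, in Python for illustration only. The function names and the `loose` switch are hypothetical, not what the attached joomla.php uses, and the sketch only handles dotted IPv4 addresses.

```python
def ip_prefix(ip, octets=3):
    """Return the first `octets` values of a dotted IPv4 address."""
    parts = ip.split(".")
    if len(parts) != 4:
        raise ValueError(f"not a dotted IPv4 address: {ip!r}")
    return ".".join(parts[:octets])


def same_session_origin(session_ip, request_ip, loose=True):
    """Decide whether a request may use a session created from session_ip.

    loose=True  -> only the first three values must match
    loose=False -> the full address must match (the original behaviour)
    """
    if loose:
        return ip_prefix(session_ip) == ip_prefix(request_ip)
    return session_ip == request_ip


if __name__ == "__main__":
    print(same_session_origin("192.168.2.1", "192.168.2.200"))               # True
    print(same_session_origin("192.168.2.1", "192.168.3.1"))                 # False
    print(same_session_origin("192.168.2.1", "192.168.2.200", loose=False))  # False
```

Comparing the split values rather than doing a plain string "starts with" test matters: a naive check for the text 192.168.2 would also accept 192.168.20.1, which is a different network.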
I had to fix this problem, so it just made sense to do it in a way that others could use too. Perhaps the core developers will find some value in these methods and provide this in a future release, but in the meantime we have a working solution for those who need it now. It involves changing some core class functions in your joomla.php file, which are attached to the post above for those who want to test this out. If enough of us test this and prove that it works, the core devs may find value in this solution and potentially implement it (or a variant of it) in future releases.
Thanks,
GRAM
_________________ GRAM http://coders.mlshomequest.com/ < -- Developer of samSiteMap component
Last edited by gram on Wed Feb 01, 2006 4:31 pm, edited 1 time in total.
|