$page = $_GET['page']; include 'pages/' . $page;Going back to the overlap with coding standards, we do not know if the page= variable will be set in the query string, nor do we know if the requested file exists. A better method to avoid any error messages would be:
$page = empty($_GET['page']) ? 'index.php' : $_GET['page'];
if ( file_exists('pages/' . $page) )
include 'pages/' . $page;
else
include 'pages/404.php';
You may think that will allow any page in the 'pages' directory to be loaded, using the value of page from the URL. And you would be right to think so, but it will also allow absolutely any file to be loaded. By using "..", you can move up a directory and access any page you want. This has serious implications if the user is able to go up beyond the webroot and execute scripts or read files that are normally inaccessible - such as your password files for http authentication! You may never intend to link to any page that is not in the format myscript.php?page=somepage.php but do not think that that is all $_GET['page'] can contain. Instead you should be aware that a malicious user could call the script as myscript.php?page=../../somewhere/ishouldnot.be and so you must code accordingly. The next improvement would be to run a str_replace() on the incoming GET value and strip out any instances of ".." but let's go one step further for total peace of mind and say the only files it should be including are .php ones, and the only valid values for the page variable are alphanumeric strings:if ( empty($_GET['page']) ) {
$page = 'index';
} else {
$page = preg_replace('/[^a-zA-Z0-9]/', '', $_GET['page']);
}
if ( file_exists('pages/' . $page . '.php') )
include 'pages/' . $page . '.php';
else
include 'pages/404.php';
Note: I have written out this example in full for readability but this entire code snippet could be condensed to two lines using the ternary operator. I mention this because it shows how simple it is to convert a huge vulnerability into a piece of secure, functional code.
The principle shown here applies to every GET, POST or COOKIE value you use in your scripts. Also be aware that the request headers sent by the browser will be available in the SERVER array - this includes the user agent and referrer. Always assume the worst case and create your code to cope with any unexpected input either by throwing an error or resorting to a default value. Do not be tempted to think that using POST data is more secure than GET data. It's not. All (cookies included) can easily be manipulated by a user to contain anything the user wants.
There are two common types of attack that rely on manipulating the data sent to a script. The first of which is effectively the same thing as I've already described, except rather than relating to the filesystem it exploits databases.
(a) SQL injection
SQL injection refers to the method of inserting malicious code into a query. This occurs when you use input data directly in a query and the resulting statement is not as intended. For instance, consider the following query:
SELECT * FROM table WHERE id = $idAssume $id contains the value of id=42 from the URL. The query that is executed becomes:
SELECT * FROM table WHERE id = 42Now let's say we have a malicious user wanting to cause trouble. If they call the script with id=42; DROP DATABAS.E dbname, the query becomes two queries:
SELECT * FROM table WHERE id = 42; DROP DATABAS.E dbnameThe implications of this are obvious: by using unsanitized input in your queries, anyone can execute any query possible. This includes both read and write operations, which could lead to your entire database being deleted as illustrated. However, deletion is not the worst of it - you can have your most sensitive data read, overwritten or even additional false data inserted. For example, if you run a membership site that requires payment, you could have users simply creating their own accounts by directly inserting rows into your database. The problems need no further explanation but what can you do to prevent it? As you may have noticed, I opted for a generic example that only illustrated the queries themselves. The reason I did not show you the corresponding PHP code is because this major problem is not so significant when working with PHP and mySQL. According to the manual, the function mysql_query() does not support multiple queries - the reasoning behind this is, I assume, to help prevent SQL injection attacks of this nature. By only permitting one query per mysql_query() call, the worst attacks can be prevented. As this also hinders developers wishing to send multiple queries, this behaviour may go the same way as magic quotes (which was another safeguard against malicious inputs but has since been removed). Either way, even with this protection you are by no means safe from SQL injections. Let us now look at a basic member system. There is a members table in the database that stores the username and password for each account. Our login page looks up the submitted username and password in the database - if found, the user is logged in. Here is the code we could use (this code is to illustrate the vulnerability only, it is not intended to be a good example of how to code a login system):
// Assume the script has already connected to the database
// Check submitted data exists
if ( empty($_POST['username']) || empty($_POST['password']) ) {
die('Please fill in your username and password.');
}
// Convert to lowercase (username is not case sensitive)
$username = strtolower(trim($_POST['username']));
// Hash the password
$password = md5(trim($_POST['password']));
// Lookup in database
if ( ! $result = mysql_query("SELECT * FROM members WHERE LOWER(username) = '$username' AND password = '$password' LIMIT 1") ) {
die('MySQL error.');
}
// Process login or display failure message
if ( mysql_num_rows($result) == 1 ) {
// Successful login
} else {
// Failed login
}
You should be able to spot the problem with this code immediately. The password is fine, we know the return value for the md5() function is always a 32 character hexadecimal string. Regardless of what is submitted as the password, there can be no problems with that part. However, we do not validate the username at all - simply convert it to lowercase and remove unnecessary spaces. Under normal circumstances, I am sure you can imagine what the query would look like but what if I enter the following as my login credentials?
Username: admin' #
Password: apple
The resulting query that is executed is:
SELECT * FROM members WHERE LOWER(username) = 'admin' #' AND password = '1f3870be274f6c49b3e31a0c6728957f' LIMIT 1And as # symbol marks a comment, everything after it is ignored. In effect, the query we are now running is simply:
SELECT * FROM members WHERE LOWER(username) = 'admin'A malicious user could therefore log in to any account they desire simply by knowing the username! Another variant is injecting a OR 1=1 clause into the query, and since 1 always equals 1, the where conditions are satisfied. Most crucially, the character causing all this fuss is the single quote '. We could protect ourselves from SQL injection by simply replacing the ' with it's HTML entity equivalent, or more commonly, escaping it with a backslash. The mySQL extension actually contains a function to do all this for us - we only need to pass our data through the mysql_real_escape_string() function to ensure it is safe to use in a query. Going back to our example, if we pass $username through this mysql_real_escape_string(), the resulting query is:
SELECT * FROM members WHERE LOWER(username) = 'admin\' #' AND password = '1f3870be274f6c49b3e31a0c6728957f' LIMIT 1It now must look through the database for an account with username admin' # and the correct matching password. There is one other useful function for making data safe for use with queries and that is intval(). Obviously it is only applicable to numeric data but this function will always return a number - if the value passed in is numeric (or begins with a number), it returns that number and otherwise returns 0. If we were to use this on our first example, whatever we added after our "42" will be ignored. As you often will be relying on numeric values, intval() is a fast and easy way of keeping your script secure. (b) Session hijacking Sessions rely on a unique session ID for each visitor. The ID usually corresponds to a temporary file on the webserver used to store all the session data, although the underlying technology is irrelevant as PHP handles that for us. Every time a request is made, the session ID is sent to the server (either in a cookie or as a GET parameter in the URL query string) and the server looks up the corresponding session data for that particular user. With PHP, this data is then available in the $_SESSION array. All this does is allow us to maintain data across multiple page requests for a particular user. Every website that allows you to login will be using sessions. A problem occurs if someone else gets hold of your session ID. All a malicious user needs to become you is your session ID - if they then send it to the server as if it were their own, the server will assign all your session data to the hacker. Effectively, they will be logged in as you. You want to ensure that the session ID you receive is the same ID as you sent out to the same user, and no one else. The bad news is there is not a quick fix for this problem. The entire point of sessions is that they allow you to maintain state across a stateless protocol - that is, each request to a server occurs separately and has no relation to any previous or future request. Since we have no way of identifying the user, we have no way of determining if a session is hijacked. Luckily there are factors that do, for the most part, remain constant. It is highly likely that the IP address of the user and the browser being used (i.e. user agent) will remain constant throughout the session. Therefore if we add to our session code a quick check that compares the values of the current request with previously stored values, we can ensure the user does not have their session stolen. The downside to this method is that the IP address and user agent can legitimately (albeit rarely) change throughout a session and any users experiencing this would then become logged out, or whatever else you choose to do with suspected hijacks. As with all security precautions involving user actions, you need to find the right balance between security and functionality. If there is nothing to be lost if a session does get hijacked, there is no need to add this to your applications. However if you're handling sensitive data you will want to be as safe as possible. The following example illustrates how to protect against session hijacking:
// Activate PHPs session handling
session_start();
// Check for stored IP address / user agent
if ( isset($_SESSION['authenticate']) ) {
if ( md5($_SERVER['REMOTE_ADDR'] . $_SERVER['USER_AGENT']) != $_SESSION['authenticate'] ) {
die('Suspected session hijack.');
unset($_SESSION);
session_destroy();
}
} else {
// Create a store a md5 hash of identifying data
$_SESSION['authenticate'] = md5($_SERVER['REMOTE_ADDR'] . $_SERVER['USER_AGENT']);
}
Despite any additional security you put in place, you can never guarantee it will be enough. For the most important actions, the only way to ensure the current user is a valid user is to require them to input their password again. (Even this is not foolproof but if an account is illegally accessed in this way, it will not be the fault of your coding). Each time you hassle them to re-enter their password, you are negatively affecting their browsing experience. Once again, it is up to you to strike an appropriate balance between security and functionality.
And that should cover most of your common security pitfalls. There is one final point to mention if you are distributing your scripts: always, always release them with a disclaimer or terms in the license that absolves you of any responsibility should your scripts have security holes.
Added by Rodney on July-24-2007, 5:37 pm

2008-04-11 | 09:50 pm
2007-11-13 | 03:15 am
2007-10-28 | 03:17 am
2007-10-28 | 03:16 am