Making MySQL Do More
The standard approach to authentication systems in web development is simple: store your confidential data in a database, pull the needed data from said database and let your chosen language/framework (PHP, Ruby, Python, etc.) do the work. What many programmers don’t realize is that MySQL is capable of so much more than just reading and writing data. And that’s what I hope to show you in this tutorial.
Once we’re done here we’ll have an authentication system that is less expensive on the PHP and more reliant on MySQL’s power. Almost all of the work will be done through MySQL; the PHP code we write will actually do very little in the whole process. In fact, it’s strictly here for a use case scenario. One thing I want to make very clear though is that this approach is not recommended for production-grade systems. Because we’re relying on MySQL we will be limited to the hashing algorithms it supports. And until MySQL 5.5, the only supported hashing algorithms were MD5 and SHA1. Chances are you’re running a pre-5.5 release. We also will not be salting the passwords, a procedure you should perform on production systems. It’s merely outside the scope of this tutorial.
There have been numerous patches and implementations of SHA-256 released for the current MySQL version but they are outside the scope of this tutorial. If the interest is strong enough I’ll write up a guide on it. In the meantime, we’ll be using the SHA1 algorithm. If you have MySQL 5.5 installed then, by all means, use SHA2() instead of SHA1().
Procedures and Functions
These two words represent the very features that many people never learn about nor utilize with MySQL. Let’s change that. When you write application code you often times organize your coding into various constructs; whether it’s a function or a class method. What may surprise you is that MySQL has its own implementation of these. They come in the form of procedures and functions. They are what we call stored routines.
Procedures: A procedure is like a function in that it takes parameters and performs a set of instructions. However, unlike functions, it may not return a value. Procedures may also not be used in a query (for example, a SELECT clause).
Functions: Very similar to procedures, but like their scripting counterpart they may return a value. In fact, that is generally where you’ll determine which of the two to use: do you need a return or do you not need a return? Functions may also be used inside queries, unlike procedures which need to explicitly be called.
Both procedures and functions take parameters. Not surprisingly we give all parameters a name and declare its data type (varchar, double, date, etc.).
But with procedures is we give them an additional flag: the direction the variable will be moving. Procedure parameters are declared as either IN, OUT or INOUT. A parameter declared as IN may only be used as input. An OUT parameter is used as output. And lastly, an INOUT can be used as both. But wait, didn’t I just say procedures can’t return a value? Well, they can’t in that we cannot declare a procedure as a data type and use the RETURN function at the end. But can generate output with procedures through OUT parameters. In fact, this is where procedures have a one-up on functions: a procedure can generate multiple outputs; a function can only return one value of its designated data type.
-- Because we need to use semicolons in the procedure and function, we have to change the statement delimiter to something else -- We'll use || instead delimiter || CREATE PROCEDURE divide_numbers(IN lhv NUMERIC(5,3), IN rhv NUMERIC(5,3), OUT answer NUMERIC(5,3)) BEGIN -- This a pointless variable that's only here to show you what they look like DECLARE tmpAnswer NUMERIC(5,3); -- MySQL supports conditionals! IF(rhv > 0) THEN SELECT (lhv / rhv) INTO answer; ELSE SELECT (0) INTO answer; END IF; END; || -- The ||, since it's the delimiter, terminates the new procedure -- Execute the above to create the procedure and then do the following -- Do not just copy and paste this all into a single query -- If you are doing this from the command line then change the delimiter back delimiter ; -- How do we use a procedure? Well we CALL it -- The @ declares a variable CALL divide_numbers (10, 2, @answer); SELECT @answer; -- Drop the procedure since we won't be needed it anymore DROP PROCEDURE divide_numbers; -- Now on to functions delimiter || CREATE FUNCTION divide_numbers(lhv NUMERIC(5,3), rhv NUMERIC(5,3)) RETURNS NUMERIC(5,3) -- Declare the return type DETERMINISTIC -- DETERMINISTIC means the function will have the same output for the given parameters -- An example of a "NOT DETERMINISTIC" function would be a random number generator -- But 10 / 2 will always be 5, so this is DETERMINISTIC BEGIN DECLARE answer NUMERIC(5,3); IF(rhv > 0) THEN SET answer = lhv / rhv; ELSE SET answer = 0; END IF; RETURN(answer); END; || -- Revert delimiter if necessary (if you're on phpMyAdmin don't worry) delimiter ; -- We don't CALL functions but rather use them in queries (because they return values) SELECT divide_numbers(100, 30); -- Drop the function DROP FUNCTION divide_numbers;
Let’s Begin Coding
Make sure you read through the above code first. The first thing we’re going to do is create our users table. This is simple enough that I won’t explain it; you should understand what’s going on in the following SQL.
CREATE TABLE users ( uid SMALLINT NOT NULL AUTO_INCREMENT, username VARCHAR(10) NOT NULL DEFAULT '', password VARCHAR(30) NOT NULL DEFAULT '', PRIMARY KEY(uid), UNIQUE username(username) ); INSERT INTO users (username, password) VALUES ('user1', SHA1('hispass')), ('another', SHA1('theirpass')), ('you', SHA1('yourpass')), ('me', SHA1('mypass'));
The only thing you may be unfamiliar with is the UNIQUE keyword. MySQL supports a variety of indexes; all are used to make querying more efficient. PRIMARY KEY is one, and you’re bound to be familiar with it. UNIQUE is another, which has the constraint of not allowing duplicate entries with the same value.
Now what we need is a function that will authenticate a user for us. What this function needs to is simple:
- Accept a username and password as parameters.
- Perform a SELECT statement on the users table, looking for a row with the correct username and password
- If found, return true.
- Otherwise, return false.
One small detail we need to be mindful of during this is the output of the SHA1() function. Our password field in the users table is only 30 characters long so we’ll need to make sure we only take 30 characters from SHA1(). We do this with SUBSTR().
DELIMITER || CREATE FUNCTION valid_login(user VARCHAR(10), pass VARCHAR(20)) RETURNS BOOLEAN DETERMINISTIC BEGIN DECLARE valid BOOLEAN; DECLARE retRows INT; SELECT COUNT(*) INTO retRows FROM users WHERE username = user AND password = SUBSTR(SHA1(pass) FROM 1 FOR 30); IF(retRows > 0) THEN SET valid = TRUE; ELSE SET valid = FALSE; END IF; RETURN(valid); END; ||
Read through that and make sure you understand what’s happening. It’s a deterministic function because the same username and password will always have the same response, either true or false. We SELECT into an INT variable the number of rows returned from our validation query. That query is merely a check for a row with the same username and password given in the parameters. If the return is not zero, then we set the valid variable to true. Otherwise, set it to false. And in the end, return our answer.
An example of using the function:
SELECT valid_login('user1', 'hispass');
That will return a result of 1. In other words, it’s valid. That one line right there is all we’ll do in our authentication query within our application code. Pretty damn simple isn’t it? Speaking of the PHP, let’s get it done with.
Here Comes the PHP
I’ve written an example of how this plays out in PHP, and I’ll briefly go over it. But if you’re unfamiliar with OOP and PDO then I recommend you extrapolate the important pieces (which boils down to the actual query) and rewrite it however you feel most comfortable with. As stated, this is merely a use case and it’s applicable (sans the fact that this is PHP) to any scripting language you may be using for your web applications.
<?php class Authenticate { protected $db_; public function __construct(PDO $db) { $this->db_ = $db; } public function validate () { $args = func_get_args(); $args = array_map('trim', $args); if (!empty($args[0]) && !empty($args[1])) { $sql = $this->db_->prepare('SELECT valid_login (:user, :pass)'); $sql->bindParam('user', $args[0]); $sql->bindParam('pass', $args[1]); $sql->execute(); $result = $sql->fetch(PDO::FETCH_NUM); if ($result[0] === '1') { return true; } else { return false; } } throw new Exception('Invalid request.'); } } if (isset($_POST['submit'])) { try { $pdo = new PDO('mysql: host=localhost; dbname=DBNAME', 'USERNAME', 'PASSWORD'); $auth = new Authenticate($pdo); if ($auth->validate($_POST['username'], $_POST['password'])) { echo "Good."; } else { echo "Bad."; } } catch (PDOException $err) { die($err->getMessage ()); } catch (Exception $err) { die($err->getMessage ()); } } ?> <!DOCTYPE html> <html> <head> <title>Authentication Example</title> </head> <body> <form method="post"> <label for="username">Username</label> <input type="text" name="username" maxlen="10" /> <label for="password">Password</label> <input type="password" name="password" maxlen="20" /> <input type="submit" name="submit" value="Authenticate" /> </form> </body> </html>
Direct your attention to line 19 because this is where the magic happens. Remember that query we did earlier, with valid_login ()? Well, that’s all we’re doing here. We’re feeding in the given username and password from the form and looking at the result. In this example, I’m using a PDO prepared statement, but if you use the standard MySQL library then remember to sanitize your input. Procedures and functions do not make your input safe; you’re still vulnerable to SQL injections.
If the credentials are good then our result will be a “1”. If they’re bad then they’ll be “0”. This is because the MySQL BOOLEAN data type is really just an integer that may be either 0 (false) or 1 (true). So we check for that on line 26. Everything else in the above example is just PDO method calls and class definition. What’s important is you see the query and understand what it’s doing.
One thing I didn’t mention before is that stored routines are database specific. Thus if you create a procedure for one database on your server and try to use it when connected to another, you’ll get an error telling you it couldn’t find the procedure.
Let’s Wrap Up
To wrap this up I just want to cover some ground about when and when not to use this methodology. The short answer is that there are a few reasons when it’s good to go this route. What I’ve shown you in this guide has limited good uses. But where it is good, it’s very good.
If your application is being used on multiple mediums (say both a web application and a desktop application) then this is a very good way of handling things. The problem when you are accessing the database from a variety of mediums is that each must implement the login algorithm on their own. And if your database should change and render that algorithm ineffective, you need to rewrite not one but every implementation. So if you have users utilizing your desktop application then you’d effectively give them a headache. It would need to be recompiled and redistributed. Using a stored procedure though means you can modify it in one place and it’ll continue working everywhere.
There is no performance gain. Done correctly, there will not be a performance loss either. You should never be using stored routines to handle expensive processes, to begin with.
Stored routines are great for simplifying the typical back-and-forth communication you do with MySQL. A common occurrence in Web/MySQL applications is: SELECT data, change data, UPDATE data. A routine can be used to simplify this logic because it can handle the SELECT and UPDATE in one pass. As such a good routine has the potential to drastically clean up code that has a tenancy to be repeated.
Stored routines will not solve all of your querying needs though. In all cases but the few times where they’re useful, you should stick to performing the logic inside your application.