Fast MySQL Geo-Referencing IPs with MaxMind GeoIP Database
The fastest way I found when performing “ip to country” lookups using MySQL and MaxMind GeoIP database and others is what I describe on this post. Read on.
The most important thing is the index. Add a primary or unique index to (`ip_to`, `ip_from`, `country`), assuming our table has at least the 3 most needed fields (`ip_from`, `ip_to`, `country`).
This is the CREATE TABLE
for our table:
CREATE TABLE `ip2country` ( `ip_from` int unsigned NOT NULL, `ip_to` int unsigned NOT NULL, `country` char(2) NOT NULL, PRIMARY KEY (`ip_to`, `ip_from`, `country`) ) ENGINE=MYISAM;
Note that the index has to start with `ip_to` and not `ip_from`… and by adding both `ip_from` and `country` together with `ip_to`, MySQL will do the “Using index” optimization. In order to use the “Using index” optimization, the index has to be on all the 3 fields (`ip_to`, `ip_from`, `country`).
And the query to use is the following:
SELECT `country` FROM `ip2country` WHERE 123456789 BETWEEN `ip_from` AND `ip_to` LIMIT 1;
The other trick is to add the LIMIT 1
. This tells MySQL to stop searching once it finds the first match, if we don’t MySQL will keep searching for more matching rows. We know there is only one possible match, the index is even unique, but MySQL doesn’t see it that way so we need to tell MySQL LIMIT 1
. MySQL doesn’t stop on the first match automatically because it assumes there are other possible matches, there could be overlapping ranges. Only we know there can’t be overlapping IP ranges, MySQL doesn’t.
This is what EXPLAIN
tells us when executing our query:
mysql> EXPLAIN SELECT `country` FROM `ip2country` WHERE 123456789 BETWEEN `ip_from` AND `ip_to` LIMIT 1; +----+-------------+------------+-------+---------------+---------+---------+------+--------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+-------+---------------+---------+---------+------+--------+--------------------------+ | 1 | SIMPLE | ip2country | range | PRIMARY | PRIMARY | 4 | NULL | 134807 | Using where; Using index | +----+-------------+------------+-------+---------------+---------+---------+------+--------+--------------------------+
An added bonus is to use SQL_CACHE
assuming we have query_cache_type=2
in the my.cnf file. So the query would look like this:
SELECT SQL_CACHE `country` FROM `ip2country` WHERE 123456789 BETWEEN `ip_from` AND `ip_to` LIMIT 1;
Using MEMORY
tables for even faster queries
For even faster queries we can use the MEMORY
engine. Because MEMORY
doesn’t take advantage of the “Using index” optimization we only need an index on `ip_to`. MEMORY
tables are faster than MYISAM
even without that optimization, but make sure the index is of type BTREE
and not HASH
as that’s the default for MEMORY
tables.
The only disadvantage of MEMORY
tables is that once MySQL is restarted or the server goes down all data in them is lost. Not a problem, the solution is the following:
- To have two tables, one
MYISAM
named `ip2country_disk`, serving as backup, and anotherMEMORY
named `ip2country_mem`. - Create a simple text file with the following content
INSERT INTO `ip2country_mem` SELECT * FROM `ip2country_disk`;
. This is known as a MySQL initialization file. - Add
init-file=/path/to/filename.ini
(use your own path here) to my.cnf under the[mysqld]
section.
This way we never have to worry about losing the data in the MEMORY
table, but both tables will have to maintained when updating the data.
This is the CREATE TABLE
for the MYISAM
table:
CREATE TABLE `ip2country_disk` ( `ip_from` int unsigned NOT NULL, `ip_to` int unsigned NOT NULL, `country` char(2) NOT NULL, PRIMARY KEY (`ip_to`, `ip_from`, `country`) ) ENGINE=MYISAM;
And this for the MEMORY
table:
CREATE TABLE `ip2country_mem` ( `ip_from` int unsigned NOT NULL, `ip_to` int unsigned NOT NULL, `country` char(2) NOT NULL, PRIMARY KEY (`ip_to`) USING BTREE ) ENGINE=MEMORY;
This is what EXPLAIN
tells us:
mysql> EXPLAIN SELECT `country` FROM `ip2country_disk` WHERE 123456789 BETWEEN `ip_from` AND `ip_to` LIMIT 1; +----+-------------+-----------------+-------+---------------+---------+---------+------+--------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-----------------+-------+---------------+---------+---------+------+--------+--------------------------+ | 1 | SIMPLE | ip2country_disk | range | PRIMARY | PRIMARY | 4 | NULL | 134807 | Using where; Using index | +----+-------------+-----------------+-------+---------------+---------+---------+------+--------+--------------------------+ mysql> EXPLAIN SELECT `country` FROM `ip2country_mem` WHERE 123456789 BETWEEN `ip_from` AND `ip_to` LIMIT 1; +----+-------------+----------------+-------+---------------+---------+---------+------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------------+-------+---------------+---------+---------+------+--------+-------------+ | 1 | SIMPLE | ip2country_mem | range | PRIMARY | PRIMARY | 4 | NULL | 134722 | Using where | +----+-------------+----------------+-------+---------------+---------+---------+------+--------+-------------+
You can skip the whole MEMORY
stuff and just use the MYISAM
disk based table together with SQL_CACHE
which is already pretty good.
All queries were tested with MySQL 5.1.47, you may get different results with older versions.
Hope this helps someone seeking the best performance out of MySQL for “ip to country” lookups. Feel free to leave comments below.