Tera-WURFL 2.1.3 Database Roundup

I’ve been working hard on Tera-WURFL 2.1.3 lately.  I’ve been meaning to release it for over a month now, the problem is that I just keep finding improvements to make.  Since Tera-WURFL’s development is now hosted on GitHub, you can always see what I’m doing and grab a copy for yourself.  New in version 2.1.3 is a MongoDB Database Connector.  I keep getting pounded by people about NoSQL and I haven’t given it a fair chance until now.  Simon Harris from www.pointbeing.net was nice enough to contribute the initial MongoDB connector, and I’ve spent a few days working on it and testing it against MySQL5.  At first the performance seems outstanding - almost double that of MySQL5 - but I noticed some inconsistencies with my tests, so I built a benchmarking script and an comparison tool that automatically tests one installation against another and checks for consistency.  I’ve tested three different database connectors against each other and my memcached-based Tera-WURFL Enterprise.

Testing Setup

In order to test the Database Connectors, I brought up a virtual machine on an under-utilized ESXi server (dual 6-core Opterons and 32GB FB-DDR3).

Virtual Machine Specs:

  • 4x Opteron 2.2GHz cores
  • 12GB FB-DDR3 RAM
  • 20GB of storage on a SAS array over MPIO iSCSI
  • Ubuntu Server 10.04.1
  • MySQL 5.1
  • MongoDB 1.7.0

This machine doesn’t have extremely fast CPUs, just a lot of them.  At any rate, the specs were consistent so I was able to get a reasonable test result.  In order to test the performance of the DBs, I ran about 66,000 unique user agents (these are UAs that are not easily matched and create a lot of DB load) through the connectors and measured the rate at which they detected them.

The results

The following results are the average speed at which the different DB Connectors detected the test user agents.  The two graphs show the difference between uncached and cached detections.  For each database, I ran the test with the all the WURFL capabilities loaded (blue) and with only ['product_info']['is_wireless_device'] loaded (red) to get an idea of the performance benefit of using Tera-WURFL’s CAPABILITY_FILTER option.

Uncached Device Detection Performance in Tera-WURFL 2.1.3

Uncached Detections v2.1.3

Note: memcached is not capable of serving uncached detections, so it is left out of this test.

Cached Device Detection Performance in Tera-WURFL 2.1.3

Cached Detections v2.1.3

Conclusion

I was very surprised by these results so I tested them repeatedly just to be sure.  I was so excited that MongoDB would be the clear winner here, but it just isn’t the case for uncached detections.  Mongo did fare well when serving the cached full capabilities - 63% higher than MySQL5.  MS SQL Server 2005 was particularly painful to test.  By comparison, MongoDB took about 8 minutes to churn though all 66,000 user agents, but my initial test with MS SQL Server was 3 hours and 8 minutes.  Since I tested all the other connectors on Ubuntu Server, and the MS SQL Server connector requires Microsoft’s PHP SQL Server Driver, I had to run the tests on a different VM.  On this VM I was getting between 1-3 uncached detections per second, but I moved the testing to my development laptop (2.8GHz Core 2 Duo, 4GB RAM and a Samsung 250GB SSD) and was able to achieve better results.  MySQL fared very well overall and remains at the top of the list.  I have not mentioned the MySQL Nested Set connector before, but it’s been included in Tera-WURFL for a couple versions now.  It just extends the MySQL5 connector and adds right and left values to all of the devices in the WURFL so it can return the complete fallback tree for a given WURFL ID in a single indexed query.  Unfortunately, this seems to provide a negligible improvement in speed.  I would still recommend it for environments where the webserver is on a different host than the database.

Look for Tera-WURFL 2.1.3 to be released within a couple weeks.  I’m trying to get some unit-testing code integrated before I release it.

Update

I’ve retested MongoDB and MySQL5 again on my production server against a typical distribution of 44887 user agents.  The performance was much better, but the MySQL <-> MongoDB performance was proportional to the first results:

Uncached - All Caps Uncached - Min Caps Cached - All Caps Cached - Min Caps
MySQL 303 510 2735 7740
MongoDB 195 255 3182 6154

stevekamerman

COO @scientiamobile