The UUID vs. the Integer
In the world of distributed systems, integer indexes are left behind in favor of UUIDs. Since there’s no shared counters between two unconnected systems, this is a great solution.
I think that the UUID in general is better than integers though for indexing a relational database for a couple reasons:
It might not be traditional, it might have implications on the speed of joins (number comparison vs. string comparison? well, they should just be a bunch of bytes, so no big difference? well, the string is probably many more bytes…), database size, and other performance issues, but I think it’s well worth it.
- There are millions of pages on the internet about autoincrementing of database indexes and Postgres serial vs. MySQL auto_increment and a whole bunch of balogna about how to get your index. If you just create a UUID in your app and use it, you avoid all that bull.
- Security is much improved. I mean this in two ways. There is the classical incrementing of numbers to get to the next blog entry or photo album or whatever that is supposed to be private. UUIDs have long been used here. But let’s say you have a permissions to user table. And someone figures out a sql injection that lets them put in their userid and a permission id into the table. If your permissions are indexed on a UUID, they’re just stabbing in the dark trying to get a valid, useful permission. They’ll probably need to do enough stabbing that you’ll notice the attack and shut them off. (This assumes the sql injection doesn’t let them read from the permissions table).
The speed thing isn’t really an issue since you’d probably store the data in a fixed-width binary format which will be plenty fast. The issue is disk use. 128-bit UUIDs will mean more disks and very quickly more servers. Ted wrote a bit about this the other day. (Pay no attention to the base64 distraction at the end.)