Today I’m going to talk to you about why it’s important to use universally unique identifiers (UUIDs) as keys to your data, and I will show you some of the problems that they can solve.
In the distributed computing world of mobile devices and the Internet of Things, it becomes almost impossible to have one source for key generation, so each device needs to be able to create its own keys while still making sure that those keys are unique. The answer is to use UUIDs, a standard from the Open Software Foundation with industry-wide support. For example, PHP, Java, .NET, Oracle, and MySQL all have ways to generate them, and SQL Server and PostgreSQL even have native UUID data types.
Let’s talk about the benefits of using UUIDs, and I will start by pointing out some of the problems they can solve. When building our data models, we have been taught to normalize our databases and to look for a primary key in one column or a combination of columns. If no such key is found, we usually add a generated id, such as an automatically incremented integer. One problem with a multi-column key is that queries become more complex, because every key column needs to be included in every join, and more complex means lower performance and larger indices. Another problem is that it’s harder for the consumer of the data to know which columns make up the key.
With auto-generated keys, we get a simpler join, but each consumer needs to ask the generator for a valid key, and there will be “holes” in the sequence when rows are deleted. Yet another problem is that any data type used has a fixed limit (even if huge), so the sequence can eventually run out. Also, it becomes tricky for a client to find out which id was generated during an insert.
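To make the auto-increment problems concrete, here is a minimal sketch in Python. It assumes an in-memory SQLite database and a hypothetical `orders` table, neither of which comes from the talk itself. It shows that the client has to ask the database which id it received, and that a deleted row leaves a hole in the sequence.

```python
import sqlite3

# Hypothetical example table; SQLite stands in for any database with auto-increment keys.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, item TEXT)")

generated_ids = []
for item in ("coffee", "tea", "milk"):
    cursor = conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
    # The client only learns the key by asking the database after the insert.
    generated_ids.append(cursor.lastrowid)

# Deleting the second row leaves a hole; the next insert does not reuse id 2.
conn.execute("DELETE FROM orders WHERE id = ?", (generated_ids[1],))
cursor = conn.execute("INSERT INTO orders (item) VALUES (?)", ("juice",))
generated_ids.append(cursor.lastrowid)

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM orders ORDER BY id")]
print(remaining_ids)  # [1, 3, 4]
```

Other databases expose the generated id in other ways, so treat `lastrowid` here as one illustration rather than a general rule.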
With UUIDs, we get the same simpler join as with auto-generated keys, but now each user of the data can generate new keys and rows without access to any central key generator; it can even happen while disconnected. So any device or thing can generate a key, and with 2^128, or about 3.4 x 10^38, possible values, a collision is so unlikely that each key can be treated as unique, and there’s no practical limit to the number of keys.
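As a rough illustration, the sketch below uses Python’s standard `uuid` module to create keys locally, with no database or network call. The `new_order` helper and the row layout are assumptions made for the example. It uses random (version 4) UUIDs, in which 122 of the 128 bits are random, so uniqueness is a matter of overwhelming probability rather than a hard guarantee.

```python
import uuid


def new_order(item):
    """Build a row with a locally generated key; no central generator is contacted."""
    return {"id": uuid.uuid4(), "item": item}


# Two "devices" create rows independently, even while disconnected.
phone_row = new_order("coffee")
sensor_row = new_order("tea")
print(phone_row["id"] != sensor_row["id"])

# A UUID is 128 bits wide, which is where the 3.4 x 10^38 figure comes from.
TOTAL_COMBINATIONS = 2 ** 128
print(f"{TOTAL_COMBINATIONS:.1e}")  # 3.4e+38
```

Both rows can later be sent to a shared database as they are, since neither key had to be reserved in advance.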
But even UUIDs have drawbacks, and the most obvious is that they are hard to remember. Therefore, it’s good practice to provide ways of accessing data using attributes that are easier to remember. For example, provide a service that can get an order by its order number, in addition to getting it by the primary key, the UUID.
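One way such a service could look is sketched below in Python. The `OrderStore` class, its method names, and the order number format are all hypothetical; a real service would sit on top of a database with a unique index on the order number instead of in-memory dictionaries.

```python
import uuid


class OrderStore:
    """Hypothetical in-memory store: UUID primary key plus a friendlier order number."""

    def __init__(self):
        self._orders_by_id = {}    # primary key (UUID) -> order
        self._id_by_number = {}    # order number -> primary key

    def add(self, order_number, item):
        if order_number in self._id_by_number:
            raise ValueError(f"order number {order_number!r} is already in use")
        order_id = uuid.uuid4()
        self._orders_by_id[order_id] = {"id": order_id, "number": order_number, "item": item}
        self._id_by_number[order_number] = order_id
        return order_id

    def get_by_id(self, order_id):
        return self._orders_by_id[order_id]

    def get_by_number(self, order_number):
        # Resolve the memorable attribute to the UUID, then read by primary key.
        return self._orders_by_id[self._id_by_number[order_number]]


store = OrderStore()
order_id = store.add("ORD-1001", "coffee")
same_order = store.get_by_number("ORD-1001")
print(same_order is store.get_by_id(order_id))  # True
```

The UUID stays the key used in joins and references, while the order number is only a lookup convenience for people.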
So make sure that you use UUIDs as keys to your data to allow distributed key generation, simplify joins, and get better performance than with multi-column keys.