My informal poll of fellow developers reveals that by about a 4 to 1 margin, we are all using UNIQUEIDENTIFIER primary keys. Very Bad Idea.
The GUID (UNIQUEIDENTIFIER) datatype is a wide column (16 bytes) and contains a unique combination of 33 uppercase and numeric characters. As the primary key it will be be stored in a clustered index (unless specified otherwise). it will be the page pointer for each leaf page in any non-clustered index. About the only thing you can do to improve the performance of this is to increase the Fill Factor to 100%.
When a GUID is used instead of an integer identity column then the 33 characters need to be matched for each row that is returned using that column in the where clause. If a high volume of inserts are done on these tables then the GUIDs, being large, will contribute to page splits, as will the fact that NEWID() generates a random value, which could place a new record on any of the data pages. The result? Performance problems.
A good clustering key should be sequential. But, a GUID that is not sequential - like one that has it's value generated in the client or generated by the newid() function in SQL Server can be a very bad choice - mostly because of the fragmentation that it creates in the base table but also because of its size. It's unnecessarily wide -- 4 times wider than an int-based identity - which can give you 2 billion unique rows. If you need more than 2 billion you can always go with a bigint (8-byte int) and get 2^63-1 rows. You could use the new NEWSEQUENTIALID() function to improve things, but you're still dealing with 16 bytes of glop.
Here are some estimates from Kimberly Tripp on database size:
Base Table with 1,000,000 rows (3.8MB for int vs. 15.26MB with UNIQUEIDENTIFIER)
6 nonclustered indexes (22.89MB for int vs. 91.55MB with UNIQUEIDENTIFIER)
Read more: PETER BROMBERG UnBlog
The GUID (UNIQUEIDENTIFIER) datatype is a wide column (16 bytes) and contains a unique combination of 33 uppercase and numeric characters. As the primary key it will be be stored in a clustered index (unless specified otherwise). it will be the page pointer for each leaf page in any non-clustered index. About the only thing you can do to improve the performance of this is to increase the Fill Factor to 100%.
When a GUID is used instead of an integer identity column then the 33 characters need to be matched for each row that is returned using that column in the where clause. If a high volume of inserts are done on these tables then the GUIDs, being large, will contribute to page splits, as will the fact that NEWID() generates a random value, which could place a new record on any of the data pages. The result? Performance problems.
A good clustering key should be sequential. But, a GUID that is not sequential - like one that has it's value generated in the client or generated by the newid() function in SQL Server can be a very bad choice - mostly because of the fragmentation that it creates in the base table but also because of its size. It's unnecessarily wide -- 4 times wider than an int-based identity - which can give you 2 billion unique rows. If you need more than 2 billion you can always go with a bigint (8-byte int) and get 2^63-1 rows. You could use the new NEWSEQUENTIALID() function to improve things, but you're still dealing with 16 bytes of glop.
Here are some estimates from Kimberly Tripp on database size:
Base Table with 1,000,000 rows (3.8MB for int vs. 15.26MB with UNIQUEIDENTIFIER)
6 nonclustered indexes (22.89MB for int vs. 91.55MB with UNIQUEIDENTIFIER)
Read more: PETER BROMBERG UnBlog