TomNomNom.com

Blogging since 2008 until 2010

PostgreSQL sequence naming - how it works

I've been dabbling with PostgreSQL for a little while now. There are a lot of differences when compared to MySQL - but I think I like it. One of the big differences is the way auto incrementing fields are handled. In MySQL you can simply use the 'auto_increment' attribute - but PostgreSQL uses sequences instead.

The easiest way to define a sequence is to create a table with a field that uses the 'SERIAL' pseudo-datatype. Such fields are actually of the integer datatype, but let PostgreSQL know to create a sequence associated with that field, named thusly: tablename_fieldname_seq

Having migrated a 250-ish table database from MySQL to PostgreSQL, I found myself in need of a way to update all the sequences that had been automatically created. It was then that I discovered that there is a maximum length of 64 bytes for a sequence name, and that some of my table and field names were long enough to exceed this limit (bad, I know. But I didn't name them :-P )

It took a fair bit of analysis to figure out what PostgreSQL does when such a situation arises, so allow me to save you the leg work of ever having to figure it out. Writing it out in pseudo-code would be way too verbose and look hiddious (with my pseudo-code it would, anyway) - so I've just written it as an AWK script for processing a file in the format tablename fieldname.

#!/bin/awk -f
{
 tablename = $1;
 fieldname = $2;

 # Default to using all of tablename and 
 # all of fieldname
 seqname = tablename "_" fieldname "_seq";

 # If the length exceeds 64 bytes, we need to fix it!
 if (length(seqname) > 64)
 {

  # If tablename AND fieldname are longer than 29 bytes,
  # use the first 29 bytes of each
  if ((length(tablename > 29)) && (length(fieldname) > 29))
  {
    seqname = substr(tablename,1,29) "_" substr(fieldname,1,29) "_seq";
  }

  # If only tablename is longer than 29 bytes, use the first
  # 29 bytes of it and all of fieldname
  if ((length(tablename) > 29) && (length(fieldname) <= 29))
  {
    seqname = substr(tablename,1,(58 - length(fieldname))) "_" substr(fieldname,1,29) "_seq";
  }

  # If only fieldname is longer than 29 bytes, use the first
  # 29 bytes of it and all of tablename
  if ((length(tablename) <= 29) && (length(fieldname) > 29))
  {
    seqname = substr(tablename,1,29) "_" substr(fieldname,1,(58 - length(tablename))) "_seq";
  }
 }

 # Generate SQL to update the sequences
 print "SELECT setval('" seqname "', (SELECT max(" $2 ") + 1 from " $1 "));"
}

I've tried to comment it as best I can - I really need syntax highlighting on my code blocks :-P

You may notice that this particular script generates SQL to update sequences for all given tablename / fieldname pairs - that was the whole reason I wrote it in the first place: after inserting data from my old MySQL tables, the sequences needed to be updated so that any new data would get the correct value from the sequence.

A typical output from the script might be:

SELECT setval('users_uid_seq', (SELECT max(uid) + 1 from users));
SELECT setval('customers_cid_seq', (SELECT max(cid) + 1 from customers));
SELECT setval('products_product_id_seq', (SELECT max(product_id) + 1 from products));

If that's confused the hell out of you, or I've done something stupid: please feel free to email me on mail at tomnomnom dot com.

First posted: Thu, 08 Jan 2009 00:51:02 +0000