We're updating a large dataset at work -- there's about an 18% increase in the number of tuples in the new dataset spread across a highly-normalized 8 or so tables.
I have to port the new data into a (more) efficient table structure -- so I'm de-normalizing the heck out of the data reducing the schema from eight tables to a single table.
In the old architecture, four of the tables have unique key values that were imposed on the data during the original port. So, to maintain application compatibility in the data catalog, these key values have to be maintained. Additionally, new tuples of data have to be added to the data set and new (old) key values assigned.
In porting over one of the updated tables, which uses a string-code as the primary key, I first export the old table columns (pkey, str_code) into a temp table and then add an auto_increment int() column to the new table marking the column as default = null. I then do a simple update-join to bring over the old pkey values based on the native str_code.
This leaves me with a new numeric column that has a variable number of NULL values (representing the delta of the new-data import) interspersed with the legacy data pkey -> str_code values.
The problem is: how do I convert the NULL pkey fields to a meaningful value that maintains the auto-increment without causing mysql to totally freak?
The first thing I do is get the max-value of the pkey:
select max(pkey_field_name) from table_name;+---------------+ | max(pkey_...) | +---------------+ | 4162 | +---------------+ 1 row in set (0.00 sec)
Next, I need to reset the auto_increment value of the column because, since the column is just a numeric column, it currently defaults to zero. Attempting to convert the column on-the-fly to auto-increment will cause mysql to spit and error out about duplicate primary key values...
mysql> alter table table_name auto_increment=4163;Query OK, 3965 rows affected (0.05 sec) Records: 3965 Duplicates: 0 Warnings: 0
Now that I have the auto_increment reset, I can convert the column to auto_increment type and, in the process of converting, mySQL will re-number the NULL column key values using the new auto_increment value so that my numbering scheme is seamless.
When I query the data back, I can see that my new column is completely re-ordered with the legacy data maintained and the new data correctly keyed.
Resetting the auto-increment key is a handy little trick to know -- I also use it when building test datasets and I need a fast way to reset my table values.
Hope this helps!
Note: Here's the complete and full steps to successfully complete this operation.
(DDL for admin1 table)
CREATE TABLE `admin1` ( `id` int(11) unsigned NOT NULL AUTO_INCREMENT, `field1` varchar(255) NOT NULL, `field2` varchar(255) NOT NULL, `field3` varchar(255) NOT NULL, `field4` varchar(255) NOT NULL, `field5` varchar(255) NOT NULL, PRIMARY KEY (`id`) ) ENGINE=MyISAM AUTO_INCREMENT=4163 DEFAULT CHARSET=utf8
(DDL for admin1ll table)
CREATE TABLE `admin1ll` ( `field1` int(11) NOT NULL, `field2` varchar(255) NOT NULL ) ENGINE=MyISAM DEFAULT CHARSET=utf8
Step 1: Create a new integer column in the table
ALTER TABLE `meridian`.`admin1` DROP COLUMN `id`, ADD COLUMN `id` int UNSIGNED FIRST, CHANGE COLUMN `field1` `field1` varchar(255) NOT NULL AFTER `id`, CHANGE COLUMN `field2` `field2` varchar(255) NOT NULL AFTER `field1`, CHANGE COLUMN `field3` `field3` varchar(255) NOT NULL AFTER `field2`, CHANGE COLUMN `field4` `field4` varchar(255) NOT NULL AFTER `field3`, CHANGE COLUMN `field5` `field5` varchar(255) NOT NULL AFTER `field4`;
Step 2: Update the new column by inserting the previous-tables pkey values
update admin1, admin1ll set admin1.id = admin1ll.field1 where admin1.field1 = admin1ll.field2
Step 3: Update the Auto_Increment value:
> select max(id) from admin1;
> alter table admin1 auto_increment=4031;
Note: the row count does not imply or set the auto_increment value!
Step 4: Reformat Column
Set column to auto_increment, unsignent, not null, primary key
Step 5: Validate!
from admin1, admin1ll
where admin1.id <> admin1ll.field1
and admin1.field1 = admin1ll.field2
> Empty set (1.70 sec)