Did you know? Tables in PostgreSQL are limited to 1,600 columns

Did you know a PostgreSQL table can have no more than 1,600 columns? This blog article was inspired by a conversation Pierre Ducroquet and I had.
First, the documentation
Appendix K of the PostgreSQL documentation states that a table can have a maximum of 1,600 columns.
This is a hard coded limit that can be found in the source code at src/include/access/htup_details.h:
#define MaxTupleAttributeNumber 1664
#define MaxHeapAttributeNumber 1600
Reaching the limit the expected way
Let’s fully validate the claim and test accordingly.
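Before playing with tables, note that both constants are multiples of 8. This is an observation about the numbers, not the official rationale from the source code: with one null-bitmap bit per attribute, either limit yields a bitmap that is a whole number of bytes. A quick check in Python:

```python
# Both hard-coded limits are multiples of 8, so a null bitmap with
# one bit per attribute always occupies a whole number of bytes.
MAX_TUPLE_ATTRS = 1664  # MaxTupleAttributeNumber
MAX_HEAP_ATTRS = 1600   # MaxHeapAttributeNumber

for limit in (MAX_TUPLE_ATTRS, MAX_HEAP_ATTRS):
    assert limit % 8 == 0
    print(f"{limit} attributes -> {limit // 8}-byte null bitmap")
# prints:
# 1664 attributes -> 208-byte null bitmap
# 1600 attributes -> 200-byte null bitmap
```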
Playing with table definition
Here, we’ll use a simple PL/pgSQL script because it is easy to adapt while testing.
-- Classic example
DO $$
DECLARE
  i int;
BEGIN
  EXECUTE 'DROP TABLE IF EXISTS tint_1601;';
  EXECUTE 'CREATE TABLE tint_1601(i_1 int);';
  FOR i IN 2..1601 LOOP
    EXECUTE format('ALTER TABLE tint_1601 ADD COLUMN i_%s int;', i);
  END LOOP;
END $$;
The typical output is as follows:
NOTICE: table "tint_1601" does not exist, skipping
ERROR: tables can have at most 1600 columns
CONTEXT: SQL statement "ALTER TABLE tint_1601 ADD COLUMN i_1601 int;"
PL/pgSQL function inline_code_block line 8 at EXECUTE
So far so good (or at least, all is working as expected).
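The same failure can be reproduced without a loop, by generating a single wide CREATE TABLE statement on the client side. A minimal sketch in Python (the table name and column prefix are just illustrative):

```python
# Build a single CREATE TABLE statement with 1,601 int columns.
# Sent to PostgreSQL, it fails immediately with
# "tables can have at most 1600 columns".
cols = ", ".join(f"i_{n} int" for n in range(1, 1602))
ddl = f"CREATE TABLE tint_1601 ({cols});"

print(ddl[:50] + " ...")
print("column definitions:", cols.count(" int"))  # 1601
```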
You might be tempted to replace int4 with int2 to squeeze in more than 1,600 columns. It will not work: the hard-coded limit is on the number of columns, not their total size.
Playing with table content
Let’s build a 1,600 column table with the same code, stopping at 1,600 this time.
DO $$
DECLARE
  i int;
BEGIN
  EXECUTE 'DROP TABLE IF EXISTS tint_1600;';
  EXECUTE 'CREATE TABLE tint_1600(i_1 int);';
  FOR i IN 2..1600 LOOP
    EXECUTE format('ALTER TABLE tint_1600 ADD COLUMN i_%s int;', i);
  END LOOP;
END $$;
Another SQL script can be used to produce a valid 1,600 column tuple:
DO $$
DECLARE
  s TEXT;
  rows_inserted int;
BEGIN
  s := format(
    'INSERT INTO tint_1600 VALUES (1%s);'
    , repeat( ',1' , 1599 )
  );
  EXECUTE s;
  GET DIAGNOSTICS rows_inserted = ROW_COUNT;
  RAISE NOTICE 'Rows inserted: %', rows_inserted;
END $$;
The output is:
NOTICE: Rows inserted: 1
DO
Another success, with no surprise.
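A quick back-of-the-envelope check of why this tuple fits. Assuming the usual 24-byte heap tuple header (no null bitmap is needed since no value is NULL) and 4 bytes per int column, against the 8,160-byte maximum tuple size of a default 8 kB page:

```python
PAGE_LIMIT = 8160    # maximum heap tuple size on a default 8 kB page
TUPLE_HEADER = 24    # heap tuple header, no null bitmap (no NULLs)
INT4_SIZE = 4

row_size = TUPLE_HEADER + 1600 * INT4_SIZE
print("estimated row size:", row_size)  # 6424
assert row_size <= PAGE_LIMIT           # comfortably fits in one page
```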
Testing the limits
Let us continue pushing to the limits.
We now create another 1,600 column table, this time using the char(127) data type for every column but the first.
We reuse our sql script with some modifications:
-- Create a table with 1,600 columns: 1 x int + 1599 x char(127)
DO $$
DECLARE
  i int;
BEGIN
  EXECUTE 'DROP TABLE IF EXISTS tint_1600;';
  EXECUTE 'CREATE TABLE tint_1600(i_1 int);';
  FOR i IN 2..1600 LOOP
    EXECUTE format('ALTER TABLE tint_1600 ADD COLUMN c_%s char(127) NOT NULL;', i);
  END LOOP;
END $$;
-- Insert a tuple - 1 x int + 1599 x char(127)
DO $$
DECLARE
  s TEXT;
BEGIN
  s := format(
    'INSERT INTO tint_1600 VALUES (1%s);'
    , repeat( $q$,'1'::char(127)$q$ , 1599 )
  );
  EXECUTE s;
END $$;
The output is:
ERROR: row is too big: size 25616, maximum size 8160
The table has 1,600 columns, but this time the tuple cannot fit in a single heap page, which explains the error “row is too big: size 25616, maximum size 8160”. If you paid attention to the modified script, you noticed the columns are defined as NOT NULL, so in theory PostgreSQL had enough information at table creation time to prove that no row could ever be inserted.
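The reported size of 25,616 bytes can be roughly reconstructed. The key assumption here (plausible, but an estimate rather than a measurement): each char(127) value, a ‘1’ padded with 126 spaces, is highly compressible, and TOAST compresses it inline to about 16 bytes including its varlena header; pushing values out of line would not help either, since an out-of-line TOAST pointer is 18 bytes. A hedged sketch of the arithmetic:

```python
# Rough reconstruction of "row is too big: size 25616".
# ASSUMPTION: each padded char(127) value compresses inline to
# about 16 bytes (varlena header included); this is an estimate,
# not a figure taken from the PostgreSQL source.
TUPLE_HEADER = 24
INT4_SIZE = 4
COMPRESSED_BPCHAR = 16            # assumed size per compressed value

size = TUPLE_HEADER + INT4_SIZE + 1599 * COMPRESSED_BPCHAR
aligned = -(-size // 8) * 8       # round up to MAXALIGN (8 bytes)
print(aligned)                    # 25616, still far above 8160
```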
What about JOINs?
To keep things simple, let us self-join:
SELECT a.*,b.* FROM tint_1600 a, tint_1600 b;
ERROR: target lists can have at most 1664 entries
Now the SELECT list (a.*, b.*) is hitting its own limit (MaxTupleAttributeNumber = 1664).
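The arithmetic behind this error is simple; note that the 832-column figure below is derived here, not taken from the article:

```python
MAX_TARGET_ENTRIES = 1664  # MaxTupleAttributeNumber

# a.*, b.* on two 1,600-column tables expands to 3,200 entries:
assert 2 * 1600 > MAX_TARGET_ENTRIES

# Widest table that a.*, b.* can still fully self-join:
print(MAX_TARGET_ENTRIES // 2)  # 832
```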
Reaching the column limit the unexpected way
Sometimes you have to modify your application, and that generates schema modifications.
Most of the time these are table modifications, like adding or dropping columns.
Exploring ADD / DROP COLUMN
Let us see what happens from the SQL side when we add, then drop, a column.
=# CREATE TABLE tadc_1600(i_1 int NOT NULL);
CREATE TABLE
=# ALTER TABLE tadc_1600 ADD COLUMN i_2 int NOT NULL;
ALTER TABLE
=# SELECT attname,attnum,attstorage,attnotnull,attisdropped
FROM pg_attribute
WHERE attrelid=(
SELECT oid
FROM pg_class
WHERE relname='tadc_1600'
)
AND attnum > 0 ORDER BY attnum;
attname | attnum | attstorage | attnotnull | attisdropped
---------+--------+------------+------------+--------------
i_1 | 1 | p | t | f
i_2 | 2 | p | t | f
(2 rows)
=# ALTER TABLE tadc_1600 DROP COLUMN i_2;
ALTER TABLE
=# SELECT attname,attnum,attstorage,attnotnull,attisdropped
FROM pg_attribute
WHERE attrelid=(
SELECT oid
FROM pg_class
WHERE relname='tadc_1600'
)
AND attnum > 0 ORDER BY attnum;
attname | attnum | attstorage | attnotnull | attisdropped
------------------------------+--------+------------+------------+--------------
i_1 | 1 | p | t | f
........pg.dropped.2........ | 2 | p | f | t
(2 rows)
When dropping a column,
- the name becomes ........pg.dropped.N........ (N being the attnum),
- the column becomes NULLable,
- the column is marked as dropped.
Iterating ADD / DROP COLUMN
One can wonder if there is a limit to the number of add/drop operations that can be run on a given table.
As usual, let us try:
-- ADD / DROP COLUMN example
DO $$
DECLARE
  i int;
BEGIN
  EXECUTE 'DROP TABLE IF EXISTS tadc;';
  EXECUTE 'CREATE TABLE tadc(i_1 int);';
  FOR i IN 2..1601 LOOP
    EXECUTE format('ALTER TABLE tadc ADD COLUMN i_%s int;', i);
    EXECUTE format('ALTER TABLE tadc DROP COLUMN i_%s;', i);
  END LOOP;
END $$;
The output is:
ERROR: tables can have at most 1600 columns
CONTEXT: SQL statement "ALTER TABLE tadc ADD COLUMN i_1601 int;"
PL/pgSQL function inline_code_block line 8 at EXECUTE
Oh oh! We reached the 1,600 limit here as well.
Let us explore a bit after add/drop column 1,599 times:
=# SELECT attname,attnum,attstorage,attnotnull,attisdropped
FROM pg_attribute
WHERE attrelid=(
SELECT oid
FROM pg_class
WHERE relname='tadc'
)
AND attnum > 0 ORDER BY attnum;
attname | attnum | attstorage | attnotnull | attisdropped
---------------------------------+--------+------------+------------+--------------
i_1 | 1 | p | t | f
........pg.dropped.2........ | 2 | p | f | t
........pg.dropped.3........ | 3 | p | f | t
........pg.dropped.4........ | 4 | p | f | t
........pg.dropped.5........ | 5 | p | f | t
(...)
........pg.dropped.1599........ | 1599 | p | f | t
........pg.dropped.1600........ | 1600 | p | f | t
(1600 rows)
Well, table tadc has 1,600 columns: the catalog changes are append-only, so each dropped column keeps its attnum, which is how PostgreSQL avoids rewriting the table content.
At this point, further column add & drop modifications will fail.
Is there anything I can do to escape this situation?
The VACUUM knight shall save the PostgreSQL princess, right?
The VACUUM command operates at the tuple level: even a VACUUM FULL, which rewrites the whole table, preserves the dropped-column entries in the catalog, so the table structure will not change.
So, the dragon ate the knight, what’s next?
This is not an issue with dead tuples but rather an issue with the catalog.
You’ll need to create a new table definition.
Here are some solutions, from simple to complex:
Build a new table (requires service downtime):
- CREATE TABLE ... (LIKE ... INCLUDING ALL)
- COPY data from old to new table
- Rename tables
- Drop old table
Leverage logical replication (minimizes service downtime):
- CREATE TABLE ... (LIKE ... INCLUDING ALL)
- CREATE local PUBLICATION / SUBSCRIPTION
- Once data is synchronized, stop/pause the application service
- Drop the subscription
- Rename tables
- Restart/resume the application
- Drop old table
What about Foreign Keys?
The above solution works fine for simple cases, but real-life tables often use integrity constraints. Let’s explore a bit using foreign keys.
-- Foreign key case
=# CREATE TABLE colors (id int, name text );
=# CREATE TABLE objects ( id int, color_id int, name text );
=# ALTER TABLE colors ADD PRIMARY KEY (id);
=# ALTER TABLE objects ADD CONSTRAINT fk_color
FOREIGN KEY (color_id) REFERENCES colors (id);
=# INSERT INTO colors
VALUES (1,'red'), (2, 'green'), (3, 'blue' );
=# INSERT INTO objects
VALUES (1,1, 'red object')
,(2,2, 'green object')
,(3,3,'blue object');
Let’s apply the recipe:
-- Duplicate table structure (valid columns only) and copy data
=# CREATE TABLE tmp_colors (LIKE colors INCLUDING ALL);
=# INSERT INTO tmp_colors SELECT * FROM colors;
-- Do the DROP/RENAME trick
=# BEGIN;
=# DROP TABLE colors;
=# ALTER TABLE tmp_colors RENAME TO colors;
=# COMMIT;
The DROP TABLE command issued an error:
ERROR: cannot drop table colors because other objects depend on it
DETAIL: constraint fk_color on table objects depends on table colors
HINT: Use DROP ... CASCADE to drop the dependent objects too.
As we can see, the recipe has to be changed to include dependent tables as well.
Adding CASCADE will drop FK constraints on dependent tables.
Let’s run a modified version of the recipe:
-- Do the DROP/RENAME trick
=# BEGIN;
=# DROP TABLE colors CASCADE; -- DROP related FOREIGN KEY constraints
=# ALTER TABLE tmp_colors RENAME TO colors;
-- Recreate FK constraint
=# ALTER TABLE objects ADD CONSTRAINT fk_color
FOREIGN KEY (color_id) REFERENCES colors (id);
=# COMMIT;
We have to check that the behaviour is the expected one:
=# INSERT INTO objects VALUES (5,5,'ro');
ERROR: insert or update on table "objects" violates foreign key constraint "fk_color"
DETAIL: Key (color_id)=(5) is not present in table "colors".
=# INSERT INTO objects VALUES (5,3,'ro');
INSERT 0 1
Success!
When integrity constraints are too numerous or difficult to track, you may use pg_dump/pg_restore to rebuild everything automatically. If service downtime is an issue, you may use logical replication to achieve the same result with minimal interruption.
Best is to avoid having to deal with this
As you can see, dealing with the 1,600 column limit is not something you would do just for fun (usually). Notably, it can lead to service downtime.
Talk to us
Do you have other ideas of how to address this situation? Have you run into odd ways of reaching this hard-coded limit? Contact us! We always love a good discussion about PostgreSQL.