Sequences简述
说起数据库,我们都首先想到的就是数据表和索引。但是要进行检索的时候,我们都需要一个标识;标识有物理标识,比如块号以及偏移量;也有逻辑标识,通常就是我们常说的主键。而主键的值大部分情况下就是一串序列号。
由于序列号是自增的,业务方如果来维护这个自增的逻辑就太过复杂了。所以,作为一个数据库产品一般都会提供Sequence机制,保证即使很多数据库会话同时使用同一个Sequence,产生的新序列号是唯一的。比如在MySQL中的AUTO_INCREMENT
和在PostgreSQL中的Sequence对象。
PostgreSQL中的Sequence
PostgreSQL中,通过nextval('my_seq')
的方式使用sequences,如下。
CREATE SEQUENCE my_tab_id_seq;
CREATE TABLE my_tab (
id bigint PRIMARY KEY DEFAULT nextval('my_tab_id_seq'::regclass),
...
);
ALTER SEQUENCE my_tab_id_seq OWNED BY my_tab.id;
等价于简化写法
CREATE TABLE my_tab (
id bigserial PRIMARY KEY,
...
);
PostgreSQL 10中,引入了标准的SQL方式GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( sequence_options ) ]
,定义一个表使用自动生成的唯一值。
CREATE TABLE my_tab (
id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
...
);
Sequence操作
Sequence同样是PostgreSQL的一类对象,在pg_class
可以检索到相关信息。并且在PostgreSQL 10中,每个Sequence具体信息,将存储在pg_sequence
系统表中。
putong-test=# select oid,relname from pg_class where relname ~ 'my_tab_id_seq';
-[ RECORD 1 ]----------
oid | 132344
relname | my_tab_id_seq
putong-test=# select pg_relation_filepath(seqrelid), seqtypid, seqstart, seqincrement, seqmax, seqmin, seqcache, seqcycle from pg_sequence where seqrelid = '132344' limit 2;
-[ RECORD 1 ]--------+--------------------
pg_relation_filepath | base/16428/132344
seqtypid | 20
seqstart | 1
seqincrement | 1
seqmax | 9223372036854775807
seqmin | 1
seqcache | 1
seqcycle | f
关于Sequence的操作函数有这四种:nextval currval lastval setval
,其中nextval保证的是唯一性并不保证连续性,并且不是事务安全的,如下例。
putong-test=# begin;
BEGIN
putong-test=# select nextval('my_tab_id_seq');
nextval
---------
7
(1 row)
putong-test=# rollback ;
ROLLBACK
putong-test=# select nextval('my_tab_id_seq');
nextval
---------
8
(1 row)
当一个事务运行的时候,db无法预知这个事务会commit还是rollback。如果很多事务运行,一部分成功一部分失败,无法追查哪些id成了被释放了。记住sequence不能提供连续的id。如果在一些场景中,不允许不连续,那么Sequence不适用。
Sequence默认值冲突
这里有个问题就是:如果用户插入了比当前Sequence值大的数,这样当主键Sequence增长到该值时,它会产生主键约束冲突的错误。因此,用户的提前插入应该是一个错误,可利用GENERATED ALWAYS
的方式避免:
putong-test=# CREATE TABLE my_tab (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
a text
);
CREATE TABLE
基于这种方式,我们不想使用Sequence的默认值时,需要使用OVERRIDING SYSTEM VALUE
,这就很难误操作了,如下例。
putong-test=# insert into my_tab(a) values ('adsf');
INSERT 0 1
putong-test=# insert into my_tab(a) values ('adsf');
INSERT 0 1
putong-test=# insert into my_tab(a) values ('adsf');
INSERT 0 1
putong-test=# select * from my_tab;
id | a
----+------
1 | adsf
2 | adsf
3 | adsf
(3 rows)
putong-test=# insert into my_tab (id , a) values (4, 'asdf');
ERROR: cannot insert into column "id"
DETAIL: Column "id" is an identity column defined as GENERATED ALWAYS.
HINT: Use OVERRIDING SYSTEM VALUE to override.
putong-test=# insert into my_tab (id , a) OVERRIDING SYSTEM VALUE values (4, 'asdf');
INSERT 0 1
putong-test=# insert into my_tab(a) values ('adsf');
ERROR: duplicate key value violates unique constraint "my_tab_pkey"
DETAIL: Key (id)=(4) already exists.
ALTER SEQUENCE
在PostgreSQL 10中,Sequence是一个特殊的表,该表只有一行。之前sequence不是在表中保存,那么ALTER SEQUENCE
的时候只是修改一行数据,不能rollback。现在在系统表中保存,ALTER SQUENCE就是事务安全的了,但是setval() nextval()
仍然不是事务安全的。
commit log
commit 3d79013b970d4cc336c06eb77ed526b44308c03e Author: Andres Freund <andres@anarazel.de> Date: Wed May 31 16:39:27 2017 -0700 Make ALTER SEQUENCE, including RESTART, fully transactional. Previously the changes to the "data" part of the sequence, i.e. the one containing the current value, were not transactional, whereas the definition, including minimum and maximum value were. That leads to odd behaviour if a schema change is rolled back, with the potential that out-of-bound sequence values can be returned. To avoid the issue create a new relfilenode fork whenever ALTER SEQUENCE is executed, similar to how TRUNCATE ... RESTART IDENTITY already is already handled. This commit also makes ALTER SEQUENCE RESTART transactional, as it seems to be too confusing to have some forms of ALTER SEQUENCE behave transactionally, some forms not. This way setval() and nextval() are not transactional, but DDL is, which seems to make sense. This commit also rolls back parts of the changes made in 3d092fe540 and f8dc1985f as they're now not needed anymore. Author: Andres Freund Discussion: https://postgr.es/m/20170522154227.nvafbsm62sjpbxvd@alap3.anarazel.de Backpatch: Bug is in master/v10 only
这意味着ALTER SEQUENCE会创建一个新的数据文件,当commit的时候,老的文件删掉。这和TRUNCATE CLUSTER VACUUM(FULL)以及部分ALTER TABLE操作类似。必然,这样ALTER TABLE 会比之前更慢。但是这个操作不是那么频繁,还是可以接受的。
Sequence管理
当前Sequence的值
CREATE OR REPLACE FUNCTION sequence_values() RETURNS TABLE(name text, value bigint) AS $sequence_values$
DECLARE
nsp_name TEXT;
seq_name TEXT;
BEGIN
FOR nsp_name, seq_name IN
SELECT nspname::text, relname::text
FROM pg_class
JOIN pg_namespace
ON pg_class.relnamespace = pg_namespace.oid WHERE relkind='S'
LOOP
RETURN QUERY EXECUTE 'SELECT ''' || nsp_name || '.' || seq_name || '''::text, last_value FROM "' || nsp_name || '"."' || seq_name || '"';
END LOOP;
END;
$sequence_values$
LANGUAGE plpgsql;
该函数得到库中所有Sequence的当前值。
重置Sequence值
当COPY的时候,sequence没有及时的更新,会因为主键冲突,导致数据插不进去(血泪的教训)。如下SQL来修正Sequence的值,使其从之前数据最大值开始。
SELECT 'SELECT SETVAL(' ||
quote_literal(quote_ident(PGT.schemaname) || '.' || quote_ident(S.relname)) ||
', COALESCE(MAX(' ||quote_ident(C.attname)|| '), 1) ) FROM ' ||
quote_ident(PGT.schemaname)|| '.'||quote_ident(T.relname)|| ';'
FROM pg_class AS S,
pg_depend AS D,
pg_class AS T,
pg_attribute AS C,
pg_tables AS PGT
WHERE S.relkind = 'S'
AND S.oid = D.objid
AND D.refobjid = T.oid
AND D.refobjid = C.attrelid
AND D.refobjsubid = C.attnum
AND T.relname = PGT.tablename
ORDER BY S.relname;
上述SQL会拼接一个 reset.sql
,执行reset.sql
。
修复 sequence的owner
SELECT 'ALTER SEQUENCE '|| quote_ident(MIN(schema_name)) ||'.'|| quote_ident(MIN(seq_name))
||' OWNED BY '|| quote_ident(MIN(TABLE_NAME)) ||'.'|| quote_ident(MIN(column_name)) ||';'
FROM (
SELECT
n.nspname AS schema_name,
c.relname AS TABLE_NAME,
a.attname AS column_name,
SUBSTRING(d.adsrc FROM E'^nextval\\(''([^'']*)''(?:::text|::regclass)?\\)') AS seq_name
FROM pg_class c
JOIN pg_attribute a ON (c.oid=a.attrelid)
JOIN pg_attrdef d ON (a.attrelid=d.adrelid AND a.attnum=d.adnum)
JOIN pg_namespace n ON (c.relnamespace=n.oid)
WHERE has_schema_privilege(n.oid,'USAGE')
AND n.nspname NOT LIKE 'pg!_%' escape '!'
AND has_table_privilege(c.oid,'SELECT')
AND (NOT a.attisdropped)
AND d.adsrc ~ '^nextval'
) seq
GROUP BY seq_name HAVING COUNT(*)=1;
检查孤儿Sequence
SELECT ns.nspname AS schema_name, seq.relname AS seq_name
FROM pg_class AS seq
JOIN pg_namespace ns ON (seq.relnamespace=ns.oid)
WHERE seq.relkind = 'S'
AND NOT EXISTS (SELECT * FROM pg_depend WHERE objid=seq.oid AND deptype='a')
ORDER BY seq.relname;
引用文献
https://wiki.postgresql.org/wiki/Fixing_Sequences