Table of Contents

Database design

Database scripts

For the automatic creation and modification of the database scheme, the database scripts is used.

JPA metamodel

We are using generated jpa metamodel by configuration, where criteria api is used.

...
// id
if (filter.getId() != null) {
  predicates.add(builder.equal(root.get(AbstractEntity_.id), filter.getId()));
}
...

Primary keys (bigint vs. uuid)

Pros and cons of the use of uuid as the primary key for the database.

Listing and searching according to the string guid in PostgreSQL:

SELECT encode(i.id, 'hex')::uuid, i.* FROM idm_identity i WHERE i.id = uuid_send('5cd7ec1c-6b34-45b7-9835-e533e9374ebf'::uuid)

Primary keys testing

The aim of the testing should be to verify the steady time of data record / reading in the chart with an increasing number of records and the size of the charts and indexes. Two charts in the binding 1:N (master/detail) with the primary keys, indexes for the uniqueness of the record name (master name and detail name in one master) and for the index of the relation between the masters - detail for searching all the details of the master in question were created (todo: link to git for the structure, when it appears in develop).

How to read the individual records:

bigint as the primary key in PostgreSQL 9.4
Number of records (master) Number of records (detail) Insert (master) Insert (detail) Load details by master Index Size (master) Index size (detail) PK size (detail)
5000 96995 9ms 10ms 1ms 0,5MB 8MB 2MB
10000 203597 11ms 11ms 2ms 1MB 18MB 4MB
50000 967475 11ms 11ms 2ms 4MB 85MB 19.5MB
100000 1943455 11ms 12ms 2ms 8MB 171MB 39MB
1000000 19505030 11ms 12ms 2.5ms 73MB 1703MB 417MB
binary(16) representing uuid as the primary key in PostgreSQL 9.4
Number of records (master) Number of records (detail) Insert (master) Insert (detail) Load details by master Index Size (master) Index size (detail) PK size (detail)
5000 97652 11ms 12ms 2.5ms 1MB 16M 4.5MB
10000 194804 11.2ms 13,5ms 3ms 1.5MB 31MB 9MB
50000 973836 11.4ms 13.5ms 3ms 5MB 156MB 45.5MB
100000 1947062 11.7ms 14.5ms 3ms 10MB 313MB 95.2MB
1000000 19380690 30ms 30ms 3.5ms 68MB 2653MB 750MB

The testing took place on localhost (16GB RAM, 2 processors, regular HDD, default database configuration, logging turned off), so the individual times are not fully representative - it is the stability that matters the most. With the increasing number of records, the disadvantage of the non-sequential uuid PK starts to come through, when there are more accesses to the disk - work with different pages (the default set-up size of one page is 8 kB - can contain approx. 200 uuid) in which the index itself is saved.

After replacement of HDD by SSD:

binary(16) representing uuid as the primary key in PostgreSQL 9.4 - SSD disk
Number of records (master) Number of records (detail) Insert (master) Insert (detail) Load details by master Index Size (master) Index size (detail) PK size (detail)
1000000 19 380 690 5ms 5ms 3.5ms 68MB 2653MB 750MB
1500000 28 732 177 5ms 5ms 4ms 134MB 4707MB 1130MB
1500000 28 732 177 5ms 5ms 4ms 134MB 4707MB 1130MB
1500000 28 732 177 5ms 5ms 4ms 134MB 4707MB 1130MB

The use of binary(16) identifiers in PostgreSQL 9.4 on the number of records tested is (~40M) stable.