ActiveRecord PostgreSQL Partitioning
ActiveRecord migrations and model helpers for creating and managing PostgreSQL 10+ partitions!
db/schema.rb
— please use the :sql
schema formatThis gem is tested against:
I plan to separate out the model functionality into a new gem and port the migration functionality into pg_ha_migrations (some of which has already been done).
I will continue to maintain this gem (bugfixes / support for new versions of Rails) until that work is complete.
I originally planned to add a feature for automatic partition creation, but I think that functionality would be better served by pg_partman.
Add this line to your application’s Gemfile:
gem 'pg_party'
And then execute:
$ bundle
Or install it yourself as:
$ gem install pg_party
Note that the gemspec does not require pg
, as some model methods may work for other databases.
Migration methods will be unavailable unless pg
is installed.
These values can be accessed and set via PgParty.config
and PgParty.configure
.
caching
true
caching_ttl
-1
(never expire cache entries)schema_exclude_partitions
rake db
dump
true
create_template_tables
template:
option when creating partitioned tables to override this default.true
create_with_primary_key
primary_key: [:id, :created_at], partition_key: :created_at
create_with_primary_key:
option when creating partitioned tablesfalse
include_subpartitions_in_partition_list
YourModelClass.partiton_list
mby default.include_subpartitions:
option to override this.false
(for backward compatibility)Note that caching is done in-memory for each process of an application. Attaching / detaching partitions will clear the cache, but only for the process that initiated the request. For multi-process web servers, it is recommended to use a TTL or disable caching entirely.
# in a Rails initializer
PgParty.configure do |c|
c.caching_ttl = 60
c.schema_exclude_partitions = false
c.include_subpartitions_in_partition_list = true
# Postgres 11+ users starting fresh may consider the below options to rely on Postgres' native features instead of
# this gem's template tables feature.
c.create_template_tables = false
c.create_with_primary_key = true
end
These methods are available in migrations as well as ActiveRecord::Base#connection
objects.
create_range_partition
table_name
, partitition_key:
create_list_partition
table_name
, partition_key:
create_hash_partition
(Postgres 11+)table_name
, partition_key:
create_range_partition_of
table_name
, start_range:
, end_range:
partition_type:
of :range
, :list
, or :hash
and a partition_key:
create_list_partition_of
table_name
, values:
partition_type:
of :range
, :list
, or :hash
and a partition_key:
create_hash_partition_of
(Postgres 11+)table_name
, modulus:
, remainder
partition_type:
of :range
, :list
, or :hash
and a partition_key:
create_default_partition_of
(Postgres 11+)table_name
attach_range_partition
parent_table_name
, child_table_name
, start_range:
, end_range:
attach_list_partition
parent_table_name
, child_table_name
, values:
attach_hash_partition
(Postgres 11+)parent_table_name
, child_table_name
, modulus:
, remainder
attach_default_partition
(Postgres 11+)parent_table_name
, child_table_name
detach_partition
parent_table_name
, child_table_name
create_table_like
table_name
, new_table_name
partitions_for_table_name
table_name
, include_subpartitions:
(true or false)parent_for_table_name
table_name
traverse: true
to return the top-level table in the hierarchy (for subpartitions)nil
if the table is not a partition / has no parenttable_partitioned?
table_name
add_index_on_all_partitions
table_name
using Postgres’s ADD INDEX CONCURRENTLYtable_name
, column_name
(all add_index
arguments are supported)in_threads:
option to add indexes in parallel threads when there are many partitions. A value of 2 to 4disable_ddl_transaction!
in your migration to disable transactions when using this command with in_threads:
algorithm: :concurrently
.Create range partitioned table on created_at::date
with two partitions:
class CreateSomeRangeRecord < ActiveRecord::Migration[5.1]
def up
# proc is used for partition keys containing expressions
create_range_partition :some_range_records, partition_key: ->{ "(created_at::date)" } do |t|
t.text :some_value
t.timestamps
end
# optional name argument is used to specify child table name
create_range_partition_of \
:some_range_records,
name: :some_range_records_a,
start_range: "2019-06-07",
end_range: "2019-06-08"
# optional name argument is used to specify child table name
create_range_partition_of \
:some_range_records,
name: :some_range_records_b,
start_range: "2019-06-08",
end_range: "2019-06-09"
end
end
Create list partitioned table on id
with two partitions:
class CreateSomeListRecord < ActiveRecord::Migration[5.1]
def up
# symbol is used for partition keys referring to individual columns
create_list_partition :some_list_records, partition_key: :id do |t|
t.text :some_value
t.timestamps
end
# without name argument, child partition created as "some_list_records_<hash>"
create_list_partition_of \
:some_list_records,
values: 1..100
# without name argument, child partition created as "some_list_records_<hash>"
create_list_partition_of \
:some_list_records,
values: 101..200
# default partition support is available in Postgres 11 or higher
create_default_partition_of \
:some_list_records
end
end
Create hash partitioned table on account_id
with two partitions (Postgres 11+ required):
modulus:
should always equal the total number of partitions planned for the tableremainder:
is an integer which should be in the range of 0 to modulus-1
class CreateSomeHashRecord < ActiveRecord::Migration[5.1]
def up
# symbol is used for partition keys referring to individual columns
# create_with_primary_key: true, template: false on Postgres 11 will rely on PostgreSQL's native partition schema
# management vs this gem's template tables
# Note composite primary keys will require a workaround in ActiveRecord, such as through the use of the composite_primary_keys gem
create_hash_partition :some_hash_records, partition_key: :account_id, primary_key: [:id, :account_id],
create_with_primary_key: true, template: false do |t|
t.bigserial :id, null: false
t.bigint :account_id, null: false
t.text :some_value
t.timestamps
end
# without name argument, child partition created as "some_list_records_<hash>"
create_hash_partition_of \
:some_hash_records,
modulus: 2,
remainder: 0
# without name argument, child partition created as "some_list_records_<hash>"
create_hash_partition_of \
:some_hash_records,
modulus: 2,
remainder: 1
end
end
Advanced example with subpartitioning: Create list partitioned table on account_id
subpartitioned by range on created_at
with default partitions. This example is for a table with no primary key… perhaps for some analytics use case.
class CreateSomeListSubpartitionedRecord < ActiveRecord::Migration[5.1]
def up
create_list_partition :some_list_subpartitioned_records, partition_key: :account_id, id: false,
template: false do |t|
t.bigint :account_id, null: false
t.text :some_value
t.created_at
end
create_default_partition_of \
:some_list_subpartitioned_records,
name: :some_list_subpartitioned_records_default,
partition_type: :range,
partition_key: :created_at
create_range_partition_of \
:some_list_subpartitioned_records_default,
name: :some_list_subpartitioned_records_default_2019,
start_range: '2019-01-01',
end_range: '2019-12-31T23:59:59'
create_default_partition_of \
:some_list_subpartitioned_records_default
create_list_partition_of \
:some_list_subpartitioned_records,
name: :some_list_subpartitioned_records_1,
values: 1..100,
partition_type: :range,
partition_key: :created_at
create_range_partition_of \
:some_list_subpartitioned_records_1,
name: :some_list_subpartitioned_records_1_2019,
start_range: '2019-01-01',
end_range: '2019-12-31T23:59:59'
create_default_partition_of
:some_list_subpartitioned_records_1
create_list_partition_of \
:some_list_subpartitioned_records,
name: :some_list_subpartitioned_records_2,
values: 101..200,
partition_type: :range,
partition_key: :created_at
create_range_partition_of \
:some_list_subpartitioned_records_2,
name: :some_list_subpartitioned_records_2_2019,
start_range: '2019-01-01',
end_range: '2019-12-31T23:59:59'
create_default_partition_of \
:some_list_subpartitioned_records_2
end
end
Unfortunately, PostgreSQL 10 doesn’t support indexes on partitioned tables.
However, individual partitions can have indexes.
To avoid explicit index creation for every new partition, we’ve introduced the idea of template tables.
For every call to create_list_partition
and create_range_partition
, a clone <table_name>_template
is created.
Indexes, constraints, etc. created on the template table will propagate to new partitions in calls to create_list_partition_of
and create_range_partition_of
:
class CreateSomeListRecord < ActiveRecord::Migration[5.1]
def up
# template table creation is enabled by default - use "template: false" or the config option to opt-out
create_list_partition :some_list_records, partition_key: :id do |t|
t.integer :some_foreign_id
t.text :some_value
t.timestamps
end
# create index on the template table
add_index :some_list_records_template, :some_foreign_id
# create partition with an index on "some_foreign_id"
create_list_partition_of \
:some_list_records,
values: 1..100
# create partition with an index on "some_foreign_id"
create_list_partition_of \
:some_list_records,
values: 101..200
end
end
Attach an existing table to a range partitioned table:
class AttachRangePartition < ActiveRecord::Migration[5.1]
def up
attach_range_partition \
:some_range_records,
:some_existing_table,
start_range: "2019-06-09",
end_range: "2019-06-10"
end
end
Attach an existing table to a list partitioned table:
class AttachListPartition < ActiveRecord::Migration[5.1]
def up
attach_list_partition \
:some_list_records,
:some_existing_table,
values: 200..300
end
end
Attach an existing table to a hash partitioned table:
class AttachHashPartition < ActiveRecord::Migration[5.1]
def up
attach_hash_partition \
:some_hash_records,
:some_existing_table,
modulus: 2,
remainder: 1
end
end
Detach a partition from any partitioned table:
class DetachPartition < ActiveRecord::Migration[5.1]
def up
detach_partition :parent_table, :child_table
end
end
add_index
commandsPostgres 11+ will automatically cascade CREATE INDEX operations to partitions and subpartitions, however
CREATE INDEX CONCURRENTLY is not supported, meaning table locks will be taken on each table while the new index is built.
Postgres 10 provides no way to cascade index creation natively.
add_index_on_all_partitions
method solves for these limitations by recursively creating the specified
class AddSomeValueIndexToSomeListRecord < ActiveRecord::Migration[5.1]
# add_index_on_all_partitions with in_threads option may not be used within a transaction
# (also, algorithm: :concurrently cannot be used within a transaction)
disable_ddl_transaction!
def up
add_index :some_records_template, :some_value # Only if using Postgres 10 with template tables
# Pass the `in_threads:` option to create indices in parallel across multiple Postgres connections
add_index_on_all_partitions :some_records, :some_value, algorithm: :concurrently, in_threads: 4
end
end
For more examples, take a look at the Combustion schema definition and integration specs:
Class methods available to all ActiveRecord models:
partitioned?
range_partition_by
key
(partition key column) or block returning partition key expressionlist_partition_by
key
(partition key column) or block returning partition key expressionhash_partition_by
key
(partition key column) or block returning partition key expressionClass methods available to both range and list partitioned models:
partitions
include_subpartitions:
argument to include all subpartitions in the returned listin_partition
child_table_name
partition_key_eq
value
Class methods available to range partitioned models:
create_partition
start_range:
, end_range:
partition_key_in
start_range
, end_range
Class methods available to list partitioned models:
create_partition
values:
partition_key_in
values
Class methods available to hash partitioned models:
create_partition
modulus:
, remainder:
partition_key_in
values
Configure model backed by a range partitioned table to get access to the methods described above:
class SomeRangeRecord < ApplicationRecord
# block is used for partition keys containing expressions
range_partition_by { "(created_at::date)" }
end
Configure model backed by a list partitioned table to get access to the methods described above:
class SomeListRecord < ApplicationRecord
# symbol is used for partition keys referring to individual columns
list_partition_by :id
end
Configure model backed by a hash partitioned table to get access to the methods described above:
class SomeHashRecord < ApplicationRecord
# symbol is used for partition keys referring to individual columns
hash_partition_by :id
end
Dynamically create new partition from range partitioned model:
# additional options include: "name:" and "primary_key:"
SomeRangeRecord.create_partition(start_range: "2019-06-09", end_range: "2019-06-10")
Dynamically create new partition from list partitioned model:
# additional options include: "name:" and "primary_key:"
SomeListRecord.create_partition(values: 200..300)
Dynamically create new partition from hash partitioned model:
# additional options include: "name:" and "primary_key:"
SomeHashRecord.create_partition(modulus: 2, remainder: 1)
For range partitioned model, query for records where partition key in range of values:
SomeRangeRecord.partition_key_in("2019-06-08", "2019-06-10")
For list and hash partitioned models, query for records where partition key in list of values:
SomeListRecord.partition_key_in(1, 2, 3, 4)
For all partitioned models, query for records matching partition key:
SomeRangeRecord.partition_key_eq(Date.current)
SomeListRecord.partition_key_eq(100)
For all partitioned models, retrieve currently attached partitions:
SomeRangeRecord.partitions
SomeListRecord.partitions(include_subpartitions: true) # Include nested subpartitions
For both all partitioned models, retrieve ActiveRecord model scoped to individual partition:
SomeRangeRecord.in_partition(:some_range_records_partition_name)
SomeListRecord.in_partition(:some_list_records_partition_name)
To create range partitions by month for previous, current and next months it’s possible to use this example. To automate creation of partitions, run Log.maintenance
every day with cron:
class Log < ApplicationRecord
range_partition_by { '(created_at::date)' }
def self.maintenance
partitions = [Date.today.prev_month, Date.today, Date.today.next_month]
partitions.each do |day|
name = Log.partition_name_for(day)
next if ActiveRecord::Base.connection.table_exists?(name)
Log.create_partition(
name: name,
start_range: day.beginning_of_month,
end_range: day.next_month.beginning_of_month
)
end
end
def self.partition_name_for(day)
"logs_y#{day.year}_m#{day.month}"
end
end
For more examples, take a look at the model integration specs:
The development / test environment relies heavily on Docker.
Start the containers in the background:
$ docker-compose up -d
Install dependencies:
$ bin/de bundle
$ bin/de appraisal
Run the tests:
$ bin/de appraisal rake
Open a Pry console to play around with the sample Rails app:
$ bin/de console
Bug reports and pull requests are welcome on GitHub at https://github.com/rkrage/pg_party. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.
The gem is available as open source under the terms of the MIT License.
Everyone interacting in the PgParty project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.