I'm experimenting with using HyperLogLog (HLL) data types in some columns. One problem with these is that they take up quite a lot more space than a BIGINT. pg_shard potentially allays a lot of those issues. The extension that provides the HLL datatype is this one by aggregateknowledge. These two extensions seem complementary for warehousing purposes.
Surprisingly, these data types work with sharded tables for most types of reads, but not for writes (see below). When I attempt an update like so:
update test_hll_shard set users = users||hll_hash_text('foobar') where date = date('2015-01-08');
I get:
ERROR: cannot plan sharded modification containing values which are not constants or constant expressions
I can sort of work around this by setting the literal bytes in this field, which works fine, but adding HLL values then requires a read followed by a write. Since pg_shard (understandably) doesn't allow transactions with more than a single statement, this leaves my use case vulnerable to race conditions in multi-writer environments.
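To make the workaround concrete, here is a minimal sketch of the read-then-write sequence as it might be run from psql. It assumes psql 9.3 or later for \gset, that the date filter matches exactly one row, and that the HLL text representation round-trips as a constant; the variable name new_users is only illustrative.

```sql
-- Step 1 (read): compute the merged HLL value and capture it in a
-- psql variable. new_users is a hypothetical name chosen for this sketch.
SELECT users || hll_hash_text('foobar') AS new_users
FROM test_hll_shard
WHERE date = date('2015-01-08')
\gset

-- Step 2 (write): send the captured value back as a plain constant,
-- which avoids referencing the users column on the right-hand side.
UPDATE test_hll_shard
SET users = :'new_users'
WHERE date = date('2015-01-08');
```

Any write that another client makes between step 1 and step 2 is silently overwritten by step 2, which is the race described above.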
Since this function is available on the workers, and is deterministic based on the value of the existing row and the new HLL value to be added, there shouldn't be any issue with dispatching this expression through to the workers.
Is there a hard limitation preventing pg_shard from dispatching modifications for non-constant expressions?