This is my query:
CREATE TABLE rtl.intermediate AS (
    SELECT
        customer_id,
        MAX(new_to) AS new_to,
        MIN(age) AS age,
        MIN(gender) AS gender,
        MIN(existing) AS existing
    FROM rtl.base
    WHERE country = 'China'
      AND product = 'cereal'
      AND dt BETWEEN '2015-01-01' AND '2016-01-01'
    GROUP BY customer_id
) WITH DATA
UNIQUE PRIMARY INDEX (customer_id, new_to, gender);
It currently takes about 10 seconds to run, and I would like to bring it down to 2 seconds. The rtl.base table is partitioned on date (every 7 days) and has a primary index on customer_id, product, country, date (called dt). I have collected statistics on the partition and the age column.
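The relevant parts of the base table look roughly like this (types simplified and the partition range approximate, just to illustrate the layout described above):

CREATE TABLE rtl.base (
    customer_id INTEGER,
    product     VARCHAR(30),
    country     VARCHAR(30),
    dt          DATE,
    new_to      BYTEINT,
    age         INTEGER,
    gender      CHAR(1),
    existing    BYTEINT
    -- plus other columns not used in this query
)
PRIMARY INDEX (customer_id, product, country, dt)
PARTITION BY RANGE_N(dt BETWEEN DATE '2010-01-01' AND DATE '2016-12-31' EACH INTERVAL '7' DAY);  -- range is approximate

-- statistics collected so far
COLLECT STATISTICS COLUMN (PARTITION), COLUMN (age) ON rtl.base;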
This is the explain:
1) First, we lock a distinct rtl."pseudo table" for read on a
RowHash to prevent global deadlock for
rtl.base.
2) Next, we lock rtl.intermediate for
exclusive use, and we lock rtl.base for read.
3) We lock a distinct DBC."pseudo table" for read on a RowHash for
deadlock prevention.
4) We lock DBC.DBase for read on a RowHash.
5) We do a single-AMP ABORT test from DBC.DBase by way of the unique
primary index "Field_1 = 'rtl'" with a residual condition of (
"'0000BF0A'XB= DBC.DBase.Field_2").
6) We create the table header.
7) We do an all-AMPs SUM step to aggregate from 53 partitions of
rtl.base with a condition of (
"(rtl.base.dt >= DATE '2015-01-01') AND
((rtl.base.dt <= DATE '2016-01-01') AND
((rtl.base.country = 'CHN') AND
(rtl.base.product = 'cereal')))")
, grouping by field1 ( rtl.base.customer_id).
Aggregate Intermediate Results are computed globally, then placed
in Spool 3. The size of Spool 3 is estimated with no confidence
to be 8,142,324 rows (293,123,664 bytes). The estimated time for
this step is 0.28 seconds.
8) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
an all-rows scan into Spool 1 (all_amps), which is redistributed
by the hash code of (rtl.base.customer_id,
rtl.base.new_to,
rtl.base.gender) to all AMPs. Then we do a
SORT to order Spool 1 by row hash. The size of Spool 1 is
estimated with no confidence to be 8,142,324 rows (227,985,072
bytes). The estimated time for this step is 0.15 seconds.
9) We do an all-AMPs MERGE into
rtl.intermediate from Spool 1 (Last Use).
The size is estimated with no confidence to be 8,142,324 rows.
The estimated time for this step is 1 minute and 27 seconds.
10) We lock a distinct DBC."pseudo table" for write on a RowHash for
deadlock prevention, we lock a distinct DBC."pseudo table" for
write on a RowHash for deadlock prevention, and we lock a distinct
DBC."pseudo table" for write on a RowHash for deadlock prevention.
11) We lock DBC.Indexes for write on a RowHash, we lock DBC.TVFields
for write on a RowHash, we lock DBC.TVM for write on a RowHash,
and we lock DBC.AccessRights for write on a RowHash.
12) We execute the following steps in parallel.
1) We do a single-AMP ABORT test from DBC.TVM by way of the
unique primary index "Field_1 = '0000BF0A'XB, Field_2 =
'INTERMEDIATE'".
2) We do an INSERT into DBC.Indexes (no lock required).
3) We do an INSERT into DBC.Indexes (no lock required).
4) We do an INSERT into DBC.Indexes (no lock required).
5) We do an INSERT into DBC.TVFields (no lock required).
6) We do an INSERT into DBC.TVFields (no lock required).
7) We do an INSERT into DBC.TVFields (no lock required).
8) We do an INSERT into DBC.TVFields (no lock required).
9) We do an INSERT into DBC.TVFields (no lock required).
10) We do an INSERT into DBC.TVM (no lock required).
11) We INSERT default rights to DBC.AccessRights for
rtl.intermediate.
13) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.
thanks!
It runs out of CPU time and does not complete? That sounds like a workload CPU limit.
Changing the granularity to '7' DAY will not help; it's still sorting the same number of rows.
Columnar will not help either, as it needs more CPU.
You probably need to change the index order to
PARTITION BY (COLUMN, RANGE_N(dt BETWEEN '2010-01-01' AND '2016-01-01' EACH INTERVAL '7' DAY))
,UNIQUE INDEX (dt, country, product, channel);
But why do you want a USI on those columns? It's just CPU/IO/perm overhead and probably never used. And if this combination is unique, why isn't it defined as a UPI in your first CREATE?
Do you really need to change the PI? Otherwise it should be faster, because there's no redistribution needed.
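E.g. keeping the PI on customer_id alone, something like this (just a sketch, assuming you don't actually need new_to/gender in the PI for later joins):

CREATE TABLE rtl.intermediate AS (
    SELECT
        customer_id,
        MAX(new_to) AS new_to,
        MIN(age) AS age,
        MIN(gender) AS gender,
        MIN(existing) AS existing
    FROM rtl.base
    WHERE country = 'China'
      AND product = 'cereal'
      AND dt BETWEEN '2015-01-01' AND '2016-01-01'
    GROUP BY customer_id
) WITH DATA
UNIQUE PRIMARY INDEX (customer_id);  -- unique anyway, it's the GROUP BY column

The aggregate spool is already hashed by customer_id, so the redistribution and sort in step 8 of your explain should go away.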
Btw, if you run TD 15.10, there are additional options for columnar tables, like a Primary AMP Index...
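E.g. something like this (syntax sketch from memory; the column types are just placeholders, check the 15.10 DDL manual):

CREATE TABLE rtl.intermediate (
    customer_id INTEGER,
    new_to      BYTEINT,
    age         INTEGER,
    gender      CHAR(1),
    existing    BYTEINT
)
PRIMARY AMP INDEX (customer_id)  -- PA index: rows are distributed to AMPs by customer_id hash
PARTITION BY COLUMN;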