Hello All,
I’m experiencing some issues with Merge queries. I've two tables of which structures given below,
Table #1:
C1 INT -- Primary Unique Index -- Account #
C2 DECIMAL(16,0) -- Secondary Unique Index
C3 INT - NULL -- Cell/Bucket
C4 INT - ROW_NUMBR
Points: One-to-one relationship between C1 and C2. C3 will be NULL initially. I'm having issues with updating this field. I'll explain my issue in detail in a while.
Table #2:
C1 INT -- Primary Non-Unique Index
C2 DECIMAL(16,0)
C5 INT -- Segment
Points: One-to-one relationship between C1 and C2. However, there is one-to-many relationship between C1/C2 and C5. We will not have any duplicate rows in this table.
Requirement: For the first segment, I will have to choose Accounts from Table #1 that have a set of Segments AND another set of segments AND another set of segments
AND another set of segments AND NOT IN another set of segments in C5 of Table #2. For example,
Cell/Bucket Include_Set1 Include_Set2 Include_Set3 Include_Set3 Exclude_Set HardCap
1 1000,1001,1002 1010,1011,1012 1020,1021,1022 1030,1031,1033,1034 1040,1041,1042 111111
2 2000,2001,2002 2010,2011,2012 2020,2021,2022 2030,2031,2033,2034 2040,2041,2042 300000
3 3000,3001,3002 3010,3011,3012 3020,3021,3022 3030,3031,3033,3034 3040,3041,3042 400000
4 Take 50% of the population from Cell #3 and assign it to Cell #4
.
.
.
.
.
100 ........... ........... ........... ........... ..............
The number of values in the "include sets" can vary upto 1000 values and We will have upto 100 Cells.
Scripts I'm using to accomplish this task are,
1. Drop tables 1 and 2. Re-populate Table #1 with data and Set C3 which is Cell# to NULL for all records. Repopulate Table #2 with data. Table #2 will be stable until
this process is done. We will keep updating C3 of Table #1 throughout this process.
2. Running an auto-process to create sql queries and store them in a sql file
3. Call the sql file through a ksh script by using the command bteq < Qry.sql > Qry_1.log
Scripts in the sql file:
========================
.logon db-name/userid,pwd;
.SET ERROROUT STDOUT;
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
MERGE INTO <TABLE1>
USING (
SELECT TOP 111111 C1_A -- Selecting only top 111111 accounts from the down-stream output
FROM ( SELECT DISTINCT D.C1 AS C1 -- Selecting Accounts which satisfy all required segments as per the requirement
FROM (
(
(SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (1000,1001,1002))
INTERSECT
(SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (1010,1011,1012))
INTERSECT
(SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (1020,1021,1022))
INTERSECT
(SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (1030,1031,1033,1034))
)
EXCEPT
(
(SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (1040,1041,1042))
)
) D
) E
) Z
ON C1 = Z.C1_A
AND C3 IS NULL
WHEN MATCHED THEN UPDATE SET C3 = 1
;
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
COLLECT STATISTICS ON <TABLE1> COLUMN (C1);
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
COLLECT STATISTICS ON <TABLE1> COLUMN (C3);
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
MERGE INTO <TABLE1>
USING (
SELECT TOP 300000 C1_A -- Selecting only top 300000 accounts from the down-stream output
FROM ( SELECT DISTINCT D.C1 AS C1 -- Selecting Accounts which satisfy all required segments as per the requirement
FROM (
(
(SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (2000,2001,2002))
INTERSECT
(SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (2010,2011,2012))
INTERSECT
(SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (2020,2021,2022))
INTERSECT
(SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (2030,2031,2033,2034))
)
EXCEPT
(
(SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (2040,2041,2042))
)
) D
) E
) Z
ON C1 = Z.C1_A
AND C3 IS NULL
WHEN MATCHED THEN UPDATE SET C3 = 2
;
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
COLLECT STATISTICS ON <TABLE1> COLUMN (C1);
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
COLLECT STATISTICS ON <TABLE1> COLUMN (C3);
-- Cell #4 would NOT be calculated at this point.
...
...
...
After updating all 100 Segments...
-- Assigning row numbers which have fallen into any of the buckets/cells.
MERGE INTO <TABLE1>
USING ( SELECT DISTINCT C3 AS C3_A
, C1 AS C1_A
, ROW_NUMBER() OVER(ORDER BY C3) ROW_NUM
FROM <TABLE1> WHERE C3 IS NOT NULL
) Z
ON C3 = Z.C3_A
AND C3 IS NOT NULL
AND C1 = Z.C1_A
WHEN MATCHED THEN UPDATE SET C4 = Z.ROW_NUM
;
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
-- Splitting Cell #3 into 2
MERGE INTO <TABLE1>
USING ( SELECT DISTINCT C3 AS C3_A
,C1 AS C1_A
,ROW_NUMBER() OVER(PARTITION BY C3 ORDER BY C3) ROW_NUM
FROM <TABLE1>
WHERE C3 = 3
) Z
ON C3 = Z.C3_A
AND C3 = 3
AND C1 = Z.C1_A
WHEN MATCHED THEN UPDATE SET C3 = CASE WHEN Z.ROW_NUM MOD 2 = 1 THEN 3
WHEN Z.ROW_NUM MOD 2 = 0 THEN 4
END
;
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
.LABEL ERROR_BOTTOM
.logoff
.quit
-- End of the sql script...
Issue: This is a most business critical process. I have to run this process at least once a day. Most of the time, I get the correct counts. Sometimes, I get some weird
counts even though the log file shows the correct count. For example, the log file showed the counts given below for Cell#1, last week.
*** Merge completed. 111111 rows changed.
*** Total elapsed time was 37 seconds.
However, when I checked the table, I saw 222222 rows for Cell #1. Whenever I run this process, TABLE#1 gets truncated and re-populated with data and C3 which is
Cell# will be set to NULL. Then only, the KSH script gets called. I got the correct count when I just reran the process without making any changes in the SQL file.
This happens at least once a week! Sometimes, split does not happen correctly too. I just don't know what is happening here. I'm pretty sure that it has nothing to do
with whatever data that I have in both tables. Otherwise, rerunning the process would NOT have given me the correct counts.
Details of Teradata I’m using :
VERSION 13.10.07.03
RELEASE 13.10.07.04
LANGUAGE SUPPORT MODE Standard
Has anyone come across this issue which is driving me crazy? Any suggestions please?
Thank you,
Saravanan
Forums: