Quantcast
Channel: Teradata Forums - Database
Viewing all articles
Browse latest Browse all 14773

Issues with MERGE Query - forum topic by mcsaravanan

$
0
0

Hello All,
 
I’m experiencing some issues with Merge queries. I've two tables of which structures given below,
 
Table #1:
 
C1 INT -- Primary Unique Index -- Account #
C2 DECIMAL(16,0) -- Secondary Unique Index
C3 INT - NULL -- Cell/Bucket
C4 INT - ROW_NUMBR
 
Points: One-to-one relationship between C1 and C2. C3 will be NULL initially. I'm having issues with updating this field. I'll explain my issue in detail in a while.
 
Table #2:
 
C1 INT -- Primary Non-Unique Index
C2 DECIMAL(16,0)
C5 INT -- Segment
 
Points: One-to-one relationship between C1 and C2. However, there is one-to-many relationship between C1/C2 and C5. We will not have any duplicate rows in this table.
 
Requirement: For the first segment, I will have to choose Accounts from Table #1 that have a set of Segments AND another set of segments AND another set of segments
AND another set of segments AND NOT IN another set of segments in C5 of Table #2. For example,
 
Cell/Bucket        Include_Set1                 Include_Set2                 Include_Set3                 Include_Set3                 Exclude_Set                        HardCap
 
1                      1000,1001,1002             1010,1011,1012             1020,1021,1022             1030,1031,1033,1034            1040,1041,1042             111111
 
2                      2000,2001,2002             2010,2011,2012             2020,2021,2022             2030,2031,2033,2034            2040,2041,2042             300000
 
3                      3000,3001,3002             3010,3011,3012             3020,3021,3022             3030,3031,3033,3034            3040,3041,3042             400000
4                      Take 50% of the population from Cell #3 and assign it to Cell #4
.
.
.
.
.
 
100                   ........... ........... ........... ...........             ..............
 
The number of values in the "include sets" can vary upto 1000 values and We will have upto 100 Cells.
 
Scripts I'm using to accomplish this task are,
 
1. Drop tables 1 and 2. Re-populate Table #1 with data and Set C3 which is Cell# to NULL for all records. Repopulate Table #2 with data. Table #2 will be stable until
this process is done. We will keep updating C3 of Table #1 throughout this process.
 
2. Running an auto-process to create sql queries and store them in a sql file
 
3. Call the sql file through a ksh script by using the command bteq < Qry.sql > Qry_1.log
 
Scripts in the sql file:
========================
 
.logon db-name/userid,pwd;
.SET ERROROUT STDOUT;
 
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
 
MERGE INTO <TABLE1> 
USING (       
            SELECT TOP 111111 C1_A -- Selecting only top 111111 accounts from the down-stream output    
          FROM (  SELECT DISTINCT D.C1 AS C1   -- Selecting Accounts which satisfy all required segments as per the requirement            
                            FROM (  
                                           ( 
                                                (SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (1000,1001,1002))
 
                                                INTERSECT
 
                                                (SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (1010,1011,1012))
 
                                                INTERSECT
 
                                                (SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (1020,1021,1022))
 
                                                INTERSECT
 
                                                (SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (1030,1031,1033,1034))
                                          )
           
                                                EXCEPT            
                                         (
                                                (SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (1040,1041,1042))
                                         )
                                    ) D        
             ) E       
) Z   
   ON C1 = Z.C1_A
  AND C3 IS NULL 
 WHEN MATCHED THEN UPDATE   SET C3 = 1
;
 
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
 
COLLECT STATISTICS ON <TABLE1> COLUMN (C1);
 
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
 
COLLECT STATISTICS ON <TABLE1> COLUMN (C3);
 
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
 
MERGE INTO <TABLE1> 
USING (       
            SELECT TOP 300000 C1_A -- Selecting only top 300000 accounts from the down-stream output    
          FROM (  SELECT DISTINCT D.C1 AS C1   -- Selecting Accounts which satisfy all required segments as per the requirement            
                            FROM (  
                                            ( 
                                                (SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (2000,2001,2002))
 
                                                INTERSECT
 
                                                (SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (2010,2011,2012))
 
                                                INTERSECT
 
                                                (SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (2020,2021,2022))
 
                                                INTERSECT
 
                                                (SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (2030,2031,2033,2034))
                                            )
           
                                                EXCEPT            
                                           (
                                                (SELECT A.C1 FROM <TABLE1> A INNER JOIN <TABLE2> B ON A.C1 = B.C1 AND A.C3 IS NULL AND B.C5 IN (2040,2041,2042))
                                           )
                                    ) D        
             ) E       
) Z   
   ON C1 = Z.C1_A
  AND C3 IS NULL 
 WHEN MATCHED THEN UPDATE   SET C3 = 2
;
 
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
 
COLLECT STATISTICS ON <TABLE1> COLUMN (C1);
 
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
 
COLLECT STATISTICS ON <TABLE1> COLUMN (C3);
 
-- Cell #4 would NOT be calculated at this point.
...
...
...
After updating all 100 Segments...
 
 
-- Assigning row numbers which have fallen into any of the buckets/cells.
 
MERGE INTO <TABLE1>
USING (  SELECT DISTINCT C3 AS C3_A
               , C1 AS C1_A
               , ROW_NUMBER() OVER(ORDER BY C3) ROW_NUM
          FROM <TABLE1> WHERE C3 IS NOT NULL
)  Z
ON C3 = Z.C3_A
AND C3 IS NOT NULL
AND C1 = Z.C1_A
WHEN MATCHED THEN UPDATE SET C4 =  Z.ROW_NUM
;
 
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
 
-- Splitting Cell #3 into 2
 
MERGE INTO <TABLE1>
USING (  SELECT DISTINCT C3 AS C3_A
                   ,C1 AS C1_A
                   ,ROW_NUMBER() OVER(PARTITION BY C3 ORDER BY C3) ROW_NUM
           FROM <TABLE1>
          WHERE C3 = 3
)  Z
ON C3 = Z.C3_A
AND C3 = 3
AND C1 = Z.C1_A
WHEN MATCHED THEN UPDATE SET C3 = CASE WHEN Z.ROW_NUM MOD 2 = 1 THEN 3
                                                                             WHEN Z.ROW_NUM MOD 2 = 0 THEN 4
                                                                  END
;
 
.IF ERRORCODE<>0 THEN .GOTO ERROR_BOTTOM;
 
.LABEL ERROR_BOTTOM
.logoff
.quit
 
-- End of the sql script...
 
Issue: This is a most business critical process. I have to run this process at least once a day. Most of the time, I get the correct counts. Sometimes, I get some weird
counts even though the log file shows the correct count. For example, the log file showed the counts given below for Cell#1, last week.
 
*** Merge completed. 111111 rows changed.
 *** Total elapsed time was 37 seconds.
 
However, when I checked the table, I saw 222222 rows for Cell #1. Whenever I run this process, TABLE#1 gets truncated and re-populated with data and C3 which is
Cell# will be set to NULL. Then only, the KSH script gets called. I got the correct count when I just reran the process without making any changes in the SQL file.
 
This happens at least once a week! Sometimes, split does not happen correctly too. I just don't know what is happening here. I'm pretty sure that it has nothing to do
with whatever data that I have in both tables. Otherwise, rerunning the process would NOT have given me the correct counts.
 
Details of Teradata I’m using :
 
VERSION          13.10.07.03
RELEASE         13.10.07.04
LANGUAGE SUPPORT MODE   Standard
 
Has anyone come across this issue which is driving me crazy? Any suggestions please?
 
 
Thank you,
Saravanan
 

Forums: 

Viewing all articles
Browse latest Browse all 14773

Trending Articles