Outer Joins and Optimizer Issues - forum topic by wpitterl

June 25, 2013, 12:09 pm

Hi all,

I'm having some performance issues due to the way certain queries are being processed by the optimizer. I have a set of views that I want to join together, and each of these views has the same JOIN/WHERE clause combination:
SELECT ...
FROM datatable t1
INNER JOIN sectable t2
ON t1.FilterCol = t2.FilterCol
WHERE USER = t2.USERNAME

The query joins together 7 of these views, inner joining between 2 and then left outer joining to 5 more:

SELECT v1.TYPE_CD, v2.TYPE_CD, v3.TYPE_CD, v4.TYPE_CD, v5.TYPE_CD, v6.TYPE_CD, v7.TYPE_CD, COUNT(*)
FROM VIEW1 v1
INNER JOIN VIEW2 v2
  ON (v1.SK = v2.SK AND v1.DT_SK = v2.DT_SK AND v1.TYPE_CD = v2.TYPE_CD AND v1.NUM_SK = v2.NUM_SK)
LEFT OUTER JOIN VIEW3 v3
  ON (v1.SK = v3.SK AND v1.DT_SK = v3.DT_SK AND v1.TYPE_CD = v3.TYPE_CD AND v1.NUM_SK = v3.NUM_SK)
LEFT OUTER JOIN VIEW4 v4
  ON (v1.SK = v4.SK AND v1.DT_SK = v4.DT_SK AND v1.TYPE_CD = v4.TYPE_CD AND v1.NUM_SK = v4.NUM_SK)
LEFT OUTER JOIN VIEW5 v5
  ON (v2.SK = v5.SK AND v2.DT_SK = v5.DT_SK AND v2.TYPE_CD = v5.TYPE_CD AND v2.NUM_SK = v5.NUM_SK AND v2.LINE_NUM_SK = v5.LINE_NUM_SK)
LEFT OUTER JOIN VIEW6 v6
  ON (v2.SK = v6.SK AND v2.DT_SK = v6.DT_SK AND v2.TYPE_CD = v6.TYPE_CD AND v2.NUM_SK = v6.NUM_SK AND v2.LINE_NUM_SK = v6.LINE_NUM_SK)
LEFT OUTER JOIN VIEW7 v7
  ON (v2.SK = v7.SK AND v2.DT_SK = v7.DT_SK AND v2.TYPE_CD = v7.TYPE_CD AND v2.NUM_SK = v7.NUM_SK AND v2.LINE_NUM_SK = v7.LINE_NUM_SK)
WHERE v1.FROM_DT = '2012-12-01'
GROUP BY 1,2,3,4,5,6,7
ORDER BY 1,2,3,4,5,6,7
;

These tables are all quite large (some in the billions of rows). The optimizer executes the query roughly as follows. It filters the first view (v1) on the date in the WHERE clause, then filters it on the join condition from the view definition, and spools that result. Next it joins that spool with the second view (v2), matching on the conditions in the ON clause, which effectively filters that table on the date in the WHERE clause as well. However, for the other five views (the ones that are outer joined), the optimizer first joins each of these tables to the table from its view definition individually (doing an all-rows scan in the process) before joining it to the spool that contains the data already filtered by the WHERE clause.

For some queries this makes almost no difference in execution time, but for a query like this one, in which the WHERE clause should eliminate 99% of the processing, it's quite a problem. The same query executed on the same data set, but with no joins in the view definitions, runs in a couple of minutes versus a couple of hours with the view definition given above.

I'm having trouble tracking down the reason why the optimizer isn't choosing a better way to do this. Removing the JOIN/WHERE clauses from the view definitions results in the query executing as expected: it filters the first table on the WHERE clause from the query, then joins each additional table to those results one by one, effectively filtering every join based on the WHERE clause. I can see from the first part of the execution plan that the optimizer knows it can do the same thing with the new view definitions, because that's exactly what it does when it joins the first two tables, but I can't figure out why it won't do it for the rest of them. I'm guessing it has something to do with the outer joins, but why does the same query work as expected when the view definitions don't contain any joins?

Appreciate any insight on this!

↧

Cross tab - Pivot - forum topic by elvicio

June 25, 2013, 12:41 pm

Hi,
I need to do a row count on all tables in a given database. I can easily do this in SQL using a pivot table.

Can someone help?

This is what I have. Let's say this is TABLE_COUNTS:

TABLENAMES  ROWCOUNTS
TABLE1      500
TABLE2      200
TABLE3      800
TABLE4      499

This is what I want to accomplish. Let's say this is TABLE_INPUTDATA:

TABLE1  TABLE2  TABLE3  TABLE4
500     200     800     499

Can someone help, please? Let me know if you need additional information. Thanks.
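
One common way to get the layout described above without a PIVOT operator is conditional aggregation. This is only a minimal sketch, assuming TABLE_COUNTS has exactly the two columns shown (TABLENAMES, ROWCOUNTS) and that the table names are known in advance; it is not taken from the thread:

```sql
-- Conditional aggregation: each CASE picks out one table's count,
-- and MAX collapses the four input rows into a single output row.
SELECT
   MAX(CASE WHEN TABLENAMES = 'TABLE1' THEN ROWCOUNTS END) AS TABLE1,
   MAX(CASE WHEN TABLENAMES = 'TABLE2' THEN ROWCOUNTS END) AS TABLE2,
   MAX(CASE WHEN TABLENAMES = 'TABLE3' THEN ROWCOUNTS END) AS TABLE3,
   MAX(CASE WHEN TABLENAMES = 'TABLE4' THEN ROWCOUNTS END) AS TABLE4
FROM TABLE_COUNTS;
```

The column list has to be written out by hand (or generated), because a plain SELECT cannot produce a variable number of columns.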

↧

Selecting count(*) from multiple tables - response (4) by dnoeth

June 25, 2013, 2:22 pm

You need to do the aggregates within derived tables and then outer join them:

SELECT
   acct.acc_no,
   acct.sort_code,
   COALESCE(credit.cnt, 0) AS credit_count,
   COALESCE(debit.cnt, 0) AS debit_count
FROM account_details acct
LEFT JOIN
 (
   -- one row per account with its number of credit transactions
   SELECT acc_no, sort_code, COUNT(*) AS cnt
   FROM credit_trans
   GROUP BY acc_no, sort_code
 ) AS credit
ON acct.acc_no = credit.acc_no
AND acct.sort_code = credit.sort_code
LEFT JOIN
 (
   -- one row per account with its number of debit transactions
   SELECT acc_no, sort_code, COUNT(*) AS cnt
   FROM debit_trans
   GROUP BY acc_no, sort_code
 ) AS debit
ON acct.acc_no = debit.acc_no
AND acct.sort_code = debit.sort_code

Dieter

↧

Updating a table through view - response (1) by dnoeth

June 25, 2013, 2:23 pm

Hi Mahesh,
it's the same as SELECTing from this view: the source code is resolved by the parser and you actually access the base table.
When you EXPLAIN the update you'll notice that the base table is updated.

Dieter
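
To illustrate the EXPLAIN check mentioned above, here is a minimal sketch. The view name my_db.product_v and the column product_name are hypothetical placeholders, not objects from the thread; the view is assumed to select from a single base table:

```sql
-- Hypothetical names: my_db.product_v is a view over one base table.
-- EXPLAIN shows the plan without running the update.
EXPLAIN
UPDATE my_db.product_v
SET product_name = 'Widget'
WHERE product_name = 'Wdget';
```

The plan text should refer to the underlying base table rather than to the view, which is what the reply describes.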

↧

One or more values to pass in a prompt - response (3) by dnoeth

June 25, 2013, 2:33 pm

If you're on TD14, you can simply use the STRTOK_SPLIT_TO_TABLE function:

WHERE P_ID IN
 (
   SELECT CAST(token as INT) 
   FROM TABLE (STRTOK_SPLIT_TO_TABLE(1, #sq(prompt(''))#, ',')
        RETURNS (outkey INTEGER,
                 tokennum INTEGER,
                 token VARCHAR(20) CHARACTER SET UNICODE)
              ) AS d 
)

Dieter
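
As a standalone illustration of the STRTOK_SPLIT_TO_TABLE call above, the sketch below replaces the prompt macro with a literal list. The literal '10,20,30' is an assumption made for the example; the RETURNS clause is copied from the reply:

```sql
-- Split a comma-separated literal into one row per token.
-- '10,20,30' stands in for whatever the prompt would supply.
SELECT
   d.tokennum,                       -- position of the token in the list
   CAST(d.token AS INTEGER) AS p_id  -- token converted to a number
FROM TABLE (STRTOK_SPLIT_TO_TABLE(1, '10,20,30', ',')
     RETURNS (outkey INTEGER,
              tokennum INTEGER,
              token VARCHAR(20) CHARACTER SET UNICODE)
           ) AS d
ORDER BY d.tokennum;
```

Running the inner query on its own like this is a quick way to check that the list is split as expected before it goes into the IN clause.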

↧

DBQL Metrics - response (3) by dnoeth

June 25, 2013, 2:45 pm

I/O is the number of logical disk I/Os, not the number of records. The estimated vs. actual number of records is found in QryLogSteps.
In your case the high count might indicate a Full Table Scan reading 542334 datablocks.

Dieter
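
A minimal sketch of the QryLogSteps lookup mentioned above. It assumes step-level logging was enabled (BEGIN QUERY LOGGING ... WITH STEPINFO) when the query ran; the QueryID value is a placeholder, and the column names are given as I recall them from the DBQL step view, so check them against your release:

```sql
-- Compare estimated and actual row counts per step for one logged query.
-- 123456789012345678 is a placeholder QueryID, not a value from the thread.
SELECT
   StepLev1Num,
   StepLev2Num,
   StepName,
   EstRowCount,   -- optimizer's estimate for the step
   RowCount,      -- rows actually produced by the step
   IOCount        -- logical I/Os for the step
FROM DBC.QryLogSteps
WHERE QueryID = 123456789012345678
ORDER BY StepLev1Num, StepLev2Num;
```

A step where IOCount is high but RowCount is small is the pattern you would expect from a full table scan that returns few rows.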

↧

Improve performance of like queries. - response (3) by dnoeth

June 25, 2013, 2:47 pm

WHERE POSITION(V_FIRST_NM IN FIRST_NM) > 0

or

WHERE SUBSTRING(FIRST_NM FROM 1 FOR CHAR_LENGTH(V_FIRST_NM)) = V_FIRST_NM

Dieter

↧

Is there any way to handle single digit's in date? - response (7) by Harpreet Singh

June 25, 2013, 11:30 pm

Not sure what I am doing wrong here
SEL TO_TIMESTAMP('11/04/2011 11:26:35.345''MM/DD/YYYY hh:mi:ss.FF3')
error: SELECT Failed. 9134: YYYY value must be four digits and in the range 1-9999

↧

Is there any way to handle single digit's in date? - response (8) by dnoeth

June 25, 2013, 11:52 pm

Simply add a comma ;-)
SEL TO_TIMESTAMP('11/04/2011 11:26:35.345', 'MM/DD/YYYY hh:mi:ss.FF3')

Dieter

↧

Creating a Soft RI - response (3) by taruntrehan

June 26, 2013, 12:37 am

Too late to respond, but thanks for the input.

↧

qualify rank() over (partition) - question - forum topic by gksenthilkumar

June 20, 2013, 3:35 pm

Hello, I need some help, please.

I have a dataset similar to the following:
inr rd dt    c1 c2
1 1 '2000-01-01 10:10:10 1111' 1 1
1 1 '2000-01-01 10:10:10 2222' 2 2
1 1 '2000-01-01 10:10:10 3333' 3 3
1 1 '2000-01-01 10:10:10 3333' 4 4
1 2 '2000-01-01 10:10:10 1111' 1 1
1 2 '2000-01-01 10:10:10 2222' 2 2
1 2 '2000-01-01 10:10:10 3333' 3 3
1 2 '2000-01-01 10:10:10 4444' 4 4
1 1 '2000-01-01 10:10:10 1111' 1 1
1 1 '2000-01-01 10:10:10 2222' 2 2
1 1 '2000-01-01 10:10:10 3333' 3 3
1 1 '2000-01-01 10:10:10 3333' 3 3

The result set should look like the following:
inr rd dt    c1 c2
1 1 '2000-01-01 10:10:10 3333' 4 4
1 2 '2000-01-01 10:10:10 4444' 4 4
1 1 '2000-01-01 10:10:10 3333' 3 3
Any suggestions would be appreciated.

Forums: Database

↧

Timestamp format on teradata retrieval - response (9) by gtsoccer

June 26, 2013, 6:35 am

Dieter and all, I have a timestamp without seconds that I want to keep as a timestamp and not a VARCHAR, like this: 5/27/2013 3:36
When I try to create the table using the command below, I get an error that Teradata SQL Assistant expected something like a 'CHECK' keyword or a 'UNIQUE' keyword between the word 'trns_dt' and '('.
trns_dt (timestamp(0), format 'MM/DD/YYYYBHH:MI:BB')
or
trns_dt (timestamp(0), format 'MM/DD/YYYYBHH:MI')
Any thoughts on how to fix this?

↧

Teradata Driver, LDAP & Talend? - forum topic by teradatatester

June 26, 2013, 8:01 am

Has anyone successfully used the Teradata driver with LDAP authentication in the ETL tool Talend?

↧

Partition Elimination - response (3) by ilf

June 26, 2013, 8:25 am

Hi,

I have a question. We have a table with a column of TIMESTAMP data type. The table is partitioned on that column by casting it to DATE, as shown below.
Col1 TIMESTAMP(0))
PRIMARY INDEX ( Col2 )
PARTITION BY RANGE_N(CAST((Col1 ) AS DATE AT TIME ZONE INTERVAL '3:00' HOUR TO MINUTE ) BETWEEN DATE '2012-01-01' AND DATE '2014-12-31' EACH INTERVAL '1' DAY );
Do you think casting the partitioning column this way can cause performance issues for inserts and SELECT queries?
Your response will be appreciated.

↧

Outer Joins and Optimizer Issues - response (1) by wpitterl

June 26, 2013, 9:16 am

Anyone have any input? Dieter maybe?

↧

Updating a table through view - response (2) by mayya@teradataforum

June 26, 2013, 9:25 am

Thank you, Dieter. But I have a few questions.
I created a table as below (V2R5):
CREATE SET TABLE zam_product1 ,NO FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT
(product_id INTEGER,
product_name VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC,
sale_date DATE FORMAT 'dd-mm-yyyy',
daily_sales DECIMAL(18,6))
PRIMARY INDEX ( product_id );
I created a view as below:
replace view zam_product_V2 as select
product_name
,sale_date
from jedi_mvn_db.zam_product1;
I granted update rights to user "jedi_cdw_dba" on column "product_name" as below:
grant select,update (product_name) on zam_product_v2 to jedi_cdw_dba;
Then I checked the rights in dbc.allrights as below:
select * from dbc.allrights where tablename='zam_product_v2';
But the result shows that update rights were granted on column "sale_date" (2nd column in the view and 3rd column in the table).
And when I run the query below, it says the sale_date column does not exist, although it does exist in my view:
grant select,update (sale_date) on jedi_mvn_db.zam_product_v2 to jedi_cdw_dba;
How are the view and its table related to each other now? Please explain the reason; I am very confused by this.
Mahesh

↧

Timestamp format on teradata retrieval - response (10) by dnoeth

June 26, 2013, 9:38 am

Your syntax is wrong; within a CREATE TABLE it's:
trns_dt timestamp(0) format 'MM/DD/YYYYBHH:MI'

Dieter
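
For context, the column definition above could sit in a complete statement like the sketch below. The table name trns_demo and the trns_id column are hypothetical; only the trns_dt definition comes from the reply:

```sql
-- Hypothetical table; only the trns_dt column definition is from the thread.
CREATE TABLE trns_demo
(
   trns_id INTEGER,
   trns_dt TIMESTAMP(0) FORMAT 'MM/DD/YYYYBHH:MI'
)
PRIMARY INDEX (trns_id);
```

Note that the FORMAT phrase is written directly after the data type, with no parentheses or comma around it. It controls how the value is displayed and converted to or from text; the column itself is still a TIMESTAMP(0).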

↧

Updating a table through view - response (3) by dnoeth

June 26, 2013, 9:53 am

Hi Mahesh,
I don't know what caused this. Are you 100% sure the create view/grant/select all reference the correct objects in the correct database?
When the base table of a view is dropped, the view will stop working ("table does not exist"), but when a new table with the same name is created, the view will try to access this new table instead. Maybe you dropped/recreated the base table?
Btw, V2R5 is several years old; if this is a TD Express you should definitely use a current release.

Dieter

↧

CPPI question - forum topic by cursavas

June 26, 2013, 12:21 pm

Hello,
I am trying to add a range to my CPPI column but get this error when I run the alter table statement:
'Executed as Single statement. Failed [5731 : HY000] TBL_ADD_PARTITIONS_FOR_PERIOD:DROP RANGE and ADD RANGE for level 1 are not allowed; partitioning expression, if any, at that level is not a RANGE_N function or involves comparison of character or graphic data'
Below are my statements:

CREATE MULTISET TABLE TestTable
(Column1 varchar(50),
Column2 varchar(250))
PRIMARY INDEX TestTable (Column1)
partition by (range_n(Column2 between 'a' and 'b', NO RANGE ));

Alter Table TestTable Modify PRIMARY INDEX ADD RANGE BETWEEN 'c' and 'd'

I am using version 13.10.

Do you know how I can resolve the above error?

Thanks

Cenk

Tags: ppi