Hi Troels,
I doubt that it performed better before the upgrade; all TD releases ever (and probably all other DBMSes, too) show similar behaviour. You simply can't get a good plan for queries like that.
Implicit casting should be performed in the first one
Well, there's an implicit cast, you'll see it when you check explain:
3) We do an all-AMPs RETRIEVE step from Table_1 by
way of an all-rows scan with a condition of (
"(TERADATA_EDUCATION.Table_1.Textfield (FLOAT, FORMAT
'-9.99999999999999E-999'))= 1.00000000000000E 000") into Spool 1
(group_amps), which is built locally on the AMP
When you compare numeric and char values, the char must be converted to a numeric value (using the most flexible numeric format, FLOAT) and not vice versa, because the numeric value 1 might be represented as a string in lots of different ways, e.g. '1', '1.0', ' 1', etc.
Would you expect that the optimizer casts 1 automagically to '00001'?
When a typecast like this is applied to a column, the optimizer can't use the existing statistics anymore, because they were collected on a different datatype. Additionally, it can't use an existing index on that column.
The "high confidence" is due to accessing the PI of the table, otherwise it would be "no confidence".
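If the values in Textfield are known to be stored in one consistent format, a minimal sketch of the alternative is to compare against a character literal, so that no cast is applied to the column. This assumes the value is stored exactly as '1'; rows stored as '00001' or '1.0' would not match:

```sql
-- Assumption: Textfield is a character column and the value is stored exactly as '1'.
-- Comparing char to char leaves the column itself uncast in the condition.
SEL * FROM Table_1
WHERE Textfield = '1';
```

Checking explain for this version should show the condition without the (FLOAT, FORMAT ...) cast on the column.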
This one is going wrong too - probably because of the explicit casting - and again estimates the size of the spool to be 6 rows with high confidence, with very poor performance as a result.
SEL * FROM Table_1
WHERE CAST(Textfield AS INTEGER) = 1
3) We do an all-AMPs RETRIEVE step from Table_1 by
way of an all-rows scan with a condition of (
"(TRANSLATE((TERADATA_EDUCATION.Table_1.Textfield )USING
LATIN_TO_UNICODE)(INTEGER, FORMAT '-(10)9'))= 1") into Spool 1
(group_amps), which is built locally on the AMPs.
Same problem as before.
Bottom line: Know your datatypes and use them accordingly :-)
Btw, I've seen similar problems when the datatype of a column changed in the data model and end users were not notified.
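One way to check the actual datatype before writing the condition, sketched here with the table and column names from the example above:

```sql
-- Shows the DDL of the table, including each column's datatype.
SHOW TABLE Table_1;

-- Shows the attributes (type, length, format) of a single column.
HELP COLUMN Table_1.Textfield;
```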
Dieter