Hello,
We have one complex view (VIEWA) built on TABLE1 (~2 billion rows of transactional data), TABLE2 (~5K rows of dimensional data), TABLE3 (~6K rows of dimensional data), and TABLE4 (~5K rows of dimensional data). It returns ~900 million rows for my data warehouse. After all the filters requested by the business, VIEWA returns data for the past 9 years plus the present year. If I run "SELECT * FROM VIEWA;", it takes about 1 hour and 30 minutes to return all the data. If I run a "SELECT *" on each of the tables individually, each returns results within 2 minutes, since we have collected stats on them and have join indices on them.
Now, users want to check the data for one period (which can also be read as a fiscal month; let's say there are 13 fiscal months in a year) through an ad-hoc BO report that they created. Here, users want to see period "01" of year "2015". I have one more view (VIEWB), which contains 3 columns that give the corresponding period information. VIEWB lists only the 2013, 2014, and 2015 periods.
Columns in VIEWA: FISCAL_END_DATE, DIM_1_ATTR, DIM_2_ATTR, DIM_3_ATTR, ....., DIM_40_ATTR, TOTAL_SALES_AMOUNT
Columns in VIEWB: FISCAL_END_DATE, YR_ID, PRD_ID
When BO pulls both these objects, the underlying query gets created as (QUERY1):
SELECT * FROM VIEWA A, VIEWB B
WHERE A.FISCAL_END_DATE = B.FISCAL_END_DATE
AND B.YR_ID = '2015' AND B.PRD_ID='01';
This query runs for hours. I tried rewriting it in the different ways below, and they also took hours to execute (QUERY2, QUERY3, QUERY4).
SELECT * FROM VIEWA A,
(SELECT FISCAL_END_DATE FROM VIEWB WHERE YR_ID = '2015' AND PRD_ID='01') B
WHERE A.FISCAL_END_DATE = B.FISCAL_END_DATE;
SELECT * FROM VIEWA AS A
WHERE EXISTS (SELECT B.FISCAL_END_DATE FROM VIEWB AS B
WHERE A.FISCAL_END_DATE = B.FISCAL_END_DATE AND B.YR_ID = '2015' AND B.PRD_ID='01');
SELECT * FROM VIEWA A
INNER JOIN VIEWB B
ON A.FISCAL_END_DATE = B.FISCAL_END_DATE
WHERE B.YR_ID = '2015'
AND B.PRD_ID='01';
When I write the query in the way below, it returns data in 8 minutes (QUERY5).
SELECT * FROM VIEWA A
WHERE A.FISCAL_END_DATE = (SELECT B.FISCAL_END_DATE FROM VIEWB B WHERE
B.YR_ID = '2015' AND B.PRD_ID='01');
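For reference, QUERY5 can be read as a two-step lookup. The sketch below assumes the subquery is meant to read from VIEWB (YR_ID and PRD_ID are listed as VIEWB columns), that FISCAL_END_DATE is a DATE column, and that the subquery returns exactly one FISCAL_END_DATE for the period. The date literal is a placeholder, not a real value from our data.

```sql
-- Step 1: resolve the period to its end date (expected: a single row).
SELECT FISCAL_END_DATE
FROM VIEWB
WHERE YR_ID = '2015'
  AND PRD_ID = '01';

-- Step 2: filter VIEWA on that one value.
-- DATE '2015-01-31' is a hypothetical placeholder for the Step 1 result.
SELECT *
FROM VIEWA
WHERE FISCAL_END_DATE = DATE '2015-01-31';
```

If Step 1 ever returned more than one row, the scalar subquery in QUERY5 would fail instead of returning data, so QUERY5 and QUERY1 are only interchangeable while each period maps to a single FISCAL_END_DATE.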
However, ad-hoc reports in BO don't allow custom queries, so our BO team isn't able to customize the query as QUERY5.
The users are anxious to know why QUERY1 takes so long whereas QUERY5 runs in 8 minutes. We are researching this and haven't found an answer yet. Can someone please help us find the answer? Thanks in advance!
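One way to start comparing the two is to look at the plans the optimizer produces. A minimal sketch, using Teradata's EXPLAIN request modifier on QUERY1 and QUERY5 as written above (QUERY5 is assumed to read the period from VIEWB). The plan output is not shown here.

```sql
-- Plan for the BO-generated join (QUERY1).
EXPLAIN
SELECT * FROM VIEWA A, VIEWB B
WHERE A.FISCAL_END_DATE = B.FISCAL_END_DATE
AND B.YR_ID = '2015' AND B.PRD_ID = '01';

-- Plan for the scalar-subquery form (QUERY5).
EXPLAIN
SELECT * FROM VIEWA A
WHERE A.FISCAL_END_DATE = (SELECT B.FISCAL_END_DATE FROM VIEWB B
WHERE B.YR_ID = '2015' AND B.PRD_ID = '01');
```

Two things worth comparing are the step at which the FISCAL_END_DATE condition is applied to TABLE1 in each plan, and the estimated row counts at that step.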
A bit further analysis showed that the actual error is
[5526] SPL1045:E(L15), Invalid or missing INTO clause.
Do I have to create a table in the DB (not volatile) and save the contents there? Does TD not allow a SELECT on a volatile table within a stored procedure?