Sorry for the confusion.
I have data like the sample below.
I want to pick one row for each inr and rd combination: the row with the latest timestamp (dt column).
If several rows share that latest (max) dt for an inr and rd, I want the row with the latest row id among them.
The highlighted rows in the sample below are my expected output for this scenario.
The table contains 5+ billion rows and 100+ columns; a specific process will fetch only 1MM rows.
Any suggestions from a performance standpoint?
inr | rd | dt                       | c1 | c2 | … | c100
1   | 1  | 2000-01-01 10:10:10 1111 | 1  | 1  | … |
1   | 1  | 2000-01-01 10:10:10 2222 | 2  | 2  | … |
1   | 1  | 2000-01-01 10:10:10 3333 | 3  | 3  | … |
1   | 1  | 2000-01-01 10:10:10 4444 | 4  | 4  | … |
1   | 2  | 2000-01-01 10:10:10 1111 | 1  | 1  | … |
1   | 2  | 2000-01-01 10:10:10 4444 | 4  | 4  | … |
1   | 2  | 2000-01-01 10:10:10 4444 | 5  | 5  | … |
1   | 2  | 2000-01-01 10:10:10 4444 | 5  | 5  | … |
1   | 2  | 2000-01-01 10:10:10 2222 | 2  | 2  | … |
1   | 2  | 2000-01-01 10:10:10 3333 | 3  | 3  | … |
1   | 3  | 2000-01-01 10:10:10 1111 | 1  | 1  | … |
1   | 3  | 2000-01-01 10:10:10 3333 | 3  | 3  | … |
1   | 3  | 2000-01-01 10:10:10 3333 | 4  | 4  | … |
1   | 3  | 2000-01-01 10:10:10 3333 | 5  | 5  | … |
1   | 3  | 2000-01-01 10:10:10 2222 | 2  | 2  | … |
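
To make the selection rule concrete, here is a minimal sketch that ranks rows within each (inr, rd) group by dt descending, then by row id descending, and keeps the top row. It runs against an in-memory SQLite database from Python purely for illustration. Several things here are assumptions, not facts from the post: the table name `t`, the `row_id` column (a hypothetical stand-in for "row id", taken here as insertion order), the function name `latest_per_group`, and treating dt as text that sorts correctly as a string. It needs SQLite 3.25 or newer for window functions.

```python
import sqlite3

# Sample rows copied from the data above: (inr, rd, dt, c1, c2).
# Columns beyond c2 are elided in the post and left out here.
SAMPLE = [
    (1, 1, "2000-01-01 10:10:10 1111", 1, 1),
    (1, 1, "2000-01-01 10:10:10 2222", 2, 2),
    (1, 1, "2000-01-01 10:10:10 3333", 3, 3),
    (1, 1, "2000-01-01 10:10:10 4444", 4, 4),
    (1, 2, "2000-01-01 10:10:10 1111", 1, 1),
    (1, 2, "2000-01-01 10:10:10 4444", 4, 4),
    (1, 2, "2000-01-01 10:10:10 4444", 5, 5),
    (1, 2, "2000-01-01 10:10:10 4444", 5, 5),
    (1, 2, "2000-01-01 10:10:10 2222", 2, 2),
    (1, 2, "2000-01-01 10:10:10 3333", 3, 3),
    (1, 3, "2000-01-01 10:10:10 1111", 1, 1),
    (1, 3, "2000-01-01 10:10:10 3333", 3, 3),
    (1, 3, "2000-01-01 10:10:10 3333", 4, 4),
    (1, 3, "2000-01-01 10:10:10 3333", 5, 5),
    (1, 3, "2000-01-01 10:10:10 2222", 2, 2),
]


def latest_per_group(rows):
    """Return one row per (inr, rd): max dt, ties broken by max row_id."""
    conn = sqlite3.connect(":memory:")
    # row_id is hypothetical: an auto-assigned integer key (1, 2, 3, ...)
    # in insertion order, standing in for whatever "row id" means in the
    # real database.
    conn.execute(
        "CREATE TABLE t ("
        "row_id INTEGER PRIMARY KEY, inr INTEGER, rd INTEGER, "
        "dt TEXT, c1 INTEGER, c2 INTEGER)"
    )
    conn.executemany(
        "INSERT INTO t (inr, rd, dt, c1, c2) VALUES (?, ?, ?, ?, ?)", rows
    )
    query = """
        SELECT row_id, inr, rd, dt, c1, c2
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (
                       PARTITION BY inr, rd
                       ORDER BY dt DESC, row_id DESC
                   ) AS rn
            FROM t
        )
        WHERE rn = 1
        ORDER BY inr, rd
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_per_group(SAMPLE):
        print(row)
```

This only shows the logic, not how it behaves at 5+ billion rows. On the real table the ranking would presumably be applied after filtering down to the 1MM rows the process needs, and how well that performs depends on the database engine and its indexing or partitioning, which the post does not specify.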