最近要求写一段程序,现把问题描述一下,以整理思绪。
这张表叫做收寄表TB_EVT_MAIL_CLCT,主要描述收寄的相关信息,为做简化,包含字段有邮件编号,收寄日期,收件人,发件人,加载日期,加载时间
收寄日期(CLCT_DATE)就是指的是你去邮局寄邮件的日期,但收寄日期并不代表邮局人员录入数据到系统的时间,很有可能他是隔了一两天才录入得,据经验过了一周之后,一周之前的收寄日期所在邮件的都会进入系统中。比如收寄日期为,那时,收寄日期为所在的记录都进入系统。
加载日期(LOAD_DATE)就是指加载到数据仓库的日期。一般每隔半小时都会有数据加载到数据库中,也就是说收寄日期为的邮件信息有可能在不同时间段有数据加载入库。
目的是用来分析,比如收寄日期为,大致何时这个日期的邮件都会加载入库。
--------------------------------------------------------------------------------------------------------------------------------------------------
我的设计如下:
设计一张表,用来解决问题。该表的字段有如下,举2条记录
| Begin_Date | Fir_Acc | Sec_Acc | Thi_Acc | Fou_Acc | Fiv_Acc | Six_Acc | Sev_Acc | End_Acc |
| 100 | 200 | 300 | 400 | 500 | 500 | 500 | ||
| 100 | 200 | 300 | 300 | 300 | 300 | 300 |
不再增加便意味着结束。
邮件记录数比如FirDay_Account, 可以如下得到
SEL count(*) From TB_EVT_MAIL_CLCT WHERE CLCT_DATE = ''
讯享网
邮件记录数比如SecDay_Account, 可以如下得到
讯享网SEL count(*) From TB_EVT_MAIL_CLCT WHERE CLCT_DATE = ''
可以看到第二天邮件的数量的是累加的。
程序可以用perl的DBI或者sql来写,程序每天执行一次。
--------------------------------------------------------------------------------------------------------------------------------------------------
之前一直用c语言的单个增加的思想,导致在设计实现上存在困难,具体表现在处理变量变化性上出现问题。仔细一想,用集合的思想便可以解决问题。比如group by便是典型的集合思想,有效解决了时间递进的问题。以下是实现的步骤
之前的改进:为后续做更好的分析,在目标表的末尾添加一字段End_Account。
0、目标表设置为SET,这样可以排重。
1、Load_Date -CLCT_Date作为从源表中插入数据的选择条件,共分7种情况,并将数据加载到临时表TEMP_TB_DAILY_MAIL_CLCT_SUM
对特殊的两个字段处理,End_Date 取TD数据库时间最大值。End_Account先赋值0.
2、临时表的记录按CLCT_Date做Group by,并做sum,加载到目标表中。
3、处理End_Date和End_Account字段,这里做个假设,若是相邻两天的记录数相同且不为0,则表明该天的记录数都已加入入库。这样便可设置字段End_Account和End_Date.
缺点:执行过程效率低
改进方向:1、临时表真正建为VOLATILE 2、对于已全部加载的数据,通过某个逻辑判断,使得不再计算。
启发:陷入思维的沼泽时,不妨以系统地,集合的思想来对待问题,说不定豁然开朗。
贴上核心代码以供参考:
DROP TABLE ${TARGETDB}.TEMP_TB_DAILY_MAIL_CLCT_SUM; .IF ERRORCODE <> 0 THEN QUIT 12; CREATE MUTISET TABLE ${TARGETDB}.TEMP_TB_DAILY_MAIL_CLCT_SUM {BEGIN_Date DATE NOT NULL ,Fir_Account INTEGER ,Sec_Account INTEGER ,Thi_Account INTEGER ,Fou_Account INTEGER ,Fiv_Account INTEGER ,Six_Account INTEGER ,Sev_Account INTEGER ,END_Date DATE ,END_Account INTEGER } PRIMARY INDEX(BEGIN_Date); .IF ERRORCODE <> 0 THEN QUIT 12; 从TB_EVT_MAIL_CLCT加载数据到临时表 INSERT INTO ${TARGETDB}.TEMP_TB_DAILY_MAIL_CLCT_SUM CLCT_Date AS BEGIN_Date,fir_Account,sec_Account,thi_Account,fou_Account,fiv_Account,six_Account,sev_Account,END_Date,END_Account select CLCT_Date,count(*),0,0,0,0,0,0,MAX_Date,0 FROM ${TARGETDB}.TB_EVT_MAIL_CLCT WHERE CLCT_Date = Load_Date and CLCT_Date<date+1 GROUP BY CLCT_Date; .IF ERRORCODE <> 0 THEN QUIT 12; INSERT INTO ${TARGETDB}.TEMP_TB_DAILY_MAIL_CLCT_SUM CLCT_Date AS BEGIN_Date,fir_Account,sec_Account,thi_Account,fou_Account,fiv_Account,six_Account,sev_Account,END_Date,END_Account select CLCT_Date,0,count(*),0,0,0,0,0,MAX_Date,0 FROM ${TARGETDB}.TB_EVT_MAIL_CLCT WHERE CLCT_Date = Load_Date-1 and CLCT_Date<date+1 GROUP BY CLCT_Date; .IF ERRORCODE <> 0 THEN QUIT 12; INSERT INTO ${TARGETDB}.TEMP_TB_DAILY_MAIL_CLCT_SUM CLCT_Date AS BEGIN_Date,fir_Account,sec_Account,thi_Account,fou_Account,fiv_Account,six_Account,sev_Account,END_Date,END_Account select CLCT_Date,0,0,count(*),0,0,0,0,MAX_Date,0 FROM ${TARGETDB}.TB_EVT_MAIL_CLCT WHERE CLCT_Date = Load_Date-2 and CLCT_Date<date+1 GROUP BY CLCT_Date; .IF ERRORCODE <> 0 THEN QUIT 12; INSERT INTO ${TARGETDB}.TEMP_TB_DAILY_MAIL_CLCT_SUM CLCT_Date AS BEGIN_Date,fir_Account,sec_Account,thi_Account,fou_Account,fiv_Account,six_Account,sev_Account,END_Date,END_Account select CLCT_Date,0,0,0,count(*),0,0,0,MAX_Date,0 FROM ${TARGETDB}.TB_EVT_MAIL_CLCT WHERE CLCT_Date = Load_Date-3 and CLCT_Date<date+1 GROUP BY CLCT_Date; .IF ERRORCODE <> 0 THEN QUIT 12; INSERT INTO ${TARGETDB}.TEMP_TB_DAILY_MAIL_CLCT_SUM CLCT_Date AS BEGIN_Date,fir_Account,sec_Account,thi_Account,fou_Account,fiv_Account,six_Account,sev_Account,END_Date,END_Account select CLCT_Date,0,0,0,0,count(*),0,0,MAX_Date,0 FROM ${TARGETDB}.TB_EVT_MAIL_CLCT WHERE CLCT_Date = Load_Date-4 and CLCT_Date<date+1 GROUP BY CLCT_Date; .IF ERRORCODE <> 0 THEN QUIT 12; INSERT INTO ${TARGETDB}.TEMP_TB_DAILY_MAIL_CLCT_SUM CLCT_Date AS BEGIN_Date,fir_Account,sec_Account,thi_Account,fou_Account,fiv_Account,six_Account,sev_Account,END_Date,END_Account select CLCT_Date,0,0,0,0,0,count(*),0,MAX_Date,0 FROM ${TARGETDB}.TB_EVT_MAIL_CLCT WHERE CLCT_Date = Load_Date-5 and CLCT_Date<date+1 GROUP BY CLCT_Date; .IF ERRORCODE <> 0 THEN QUIT 12; INSERT INTO ${TARGETDB}.TEMP_TB_DAILY_MAIL_CLCT_SUM CLCT_Date AS BEGIN_Date,fir_Account,sec_Account,thi_Account,fou_Account,fiv_Account,six_Account,sev_Account,END_Date,END_Account select CLCT_Date,0,0,0,0,0,count(*),0,MAX_Date,0 FROM ${TARGETDB}.TB_EVT_MAIL_CLCT WHERE CLCT_Date = Load_Date-6 and CLCT_Date<date+1 GROUP BY CLCT_Date; .IF ERRORCODE <> 0 THEN QUIT 12; INSERT INTO ${TARGETDB}.TEMP_TB_DAILY_MAIL_CLCT_SUM CLCT_Date AS BEGIN_Date,fir_Account,sec_Account,thi_Account,fou_Account,fiv_Account,six_Account,sev_Account,END_Date,END_Account select CLCT_Date,0,0,0,0,0,0,count(*),MAX_Date,0 FROM ${TARGETDB}.TB_EVT_MAIL_CLCT WHERE CLCT_Date = Load_Date-7 and CLCT_Date<date+1 GROUP BY CLCT_Date; .IF ERRORCODE <> 0 THEN QUIT 12; 从临时表中加载到目标表TB_DAILY_MAIL_CLCT_SUM INSERT INTO ${TARGETDB}.TB_DAILY_MAIL_CLCT_SUM sel BEGIN_Date ,sum(fir_Account) ,sum(sec_Account) ,sum(thi_Account) ,sum(fou_Account) ,sum(fiv_Account) ,sum(six_Account) ,sum(sev_Account) ,END_Date ,END_Account FROM ${TARGETDB}.TB_DAILY_MAIL_CLCT_SUM GROUP BY BEGIN_Date; .IF ERRORCODE <> 0 THEN QUIT 12; 修改结束时间 UPDATE ${TARGETDB}.TB_DAILY_MAIL_CLCT_SUM SET END_Date = CLCT_DATE+5 and END_Account = six_Account WHERE six_Account=sev_Account and six_Account<>0; .IF ERRORCODE <> 0 THEN QUIT 12; UPDATE ${TARGETDB}.TB_DAILY_MAIL_CLCT_SUM SET END_Date = CLCT_DATE+4 and END_Account = fiv_Account and sev_Account = fiv_Account WHERE fiv_Account=six_Account and fiv_Account<>0; .IF ERRORCODE <> 0 THEN QUIT 12; UPDATE ${TARGETDB}.TB_DAILY_MAIL_CLCT_SUM SET END_Date = CLCT_DATE+3 and END_Account = fou_Account and six_Account = fou_Account and sev_Account = fou_Account WHERE fou_Account=fiv_Account and fou_Account<>0; .IF ERRORCODE <> 0 THEN QUIT 12; UPDATE ${TARGETDB}.TB_DAILY_MAIL_CLCT_SUM SET END_Date = CLCT_DATE+2 and END_Account = thi_Account and fiv_Account = thi_Account and six_Account = thi_Account and sev_Account = thi_Account WHERE thi_Account=fou_Account and thi_Account<>0; .IF ERRORCODE <> 0 THEN QUIT 12; UPDATE ${TARGETDB}.TB_DAILY_MAIL_CLCT_SUM SET END_Date = CLCT_DATE+1 and END_Account = sec_Account and fou_Account = sec_Account and fiv_Account = sec_Account and six_Account = sec_Account and sev_Account = sec_Account WHERE sec_Account=thi_Account and sec_Account<>0; .IF ERRORCODE <> 0 THEN QUIT 12; UPDATE ${TARGETDB}.TB_DAILY_MAIL_CLCT_SUM SET END_Date = CLCT_DATE and END_Account = fir_Account and thi_Account = fir_Account and fou_Account = fir_Account and fiv_Account = fir_Account and six_Account = fir_Account and sev_Account = fir_Account WHERE fir_Account=sec_Account and fir_Account<>0; .IF ERRORCODE <> 0 THEN QUIT 12;

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容,请联系我们,一经查实,本站将立刻删除。
如需转载请保留出处:https://51itzy.com/kjqy/122062.html