2025年生产故障:ADG备库通过索引查询数据报ORA-01555问题分析

生产故障:ADG备库通过索引查询数据报ORA-01555问题分析IT 数据库行业小学生 记录日常工作中数据库知识及一些故障案例 如有不对请指正 欢迎关注小编 小编微信 xh CSDN dba notes 一 环境信息及现象 客户一套 oracle11 2 0 4 0 rac 的 adg 环境 操作系统为 AIX 在做完主备切换的时候备库 alert

大家好,我是讯享网,很高兴认识大家。

IT数据库行业小学生,记录日常工作中数据库知识及一些故障案例,如有不对请指正,欢迎关注小编,小编微信xh,CSDN:dba_notes

750a0f51452e65ad2c7f474ae2510c3b.jpeg
讯享网


一、环境信息及现象

      客户一套oracle11.2.0.4.0 rac的adg环境,操作系统为AIX,在做完主备切换的时候备库alert日志经常出现ORA-01555错误,由于备库承担部分查询业务,已经影响到业务的正常进行。

二、故障分析及解决办法

2.2.1 初步分析

       通过alert日志发现,大部分ORA-01555报错时的Query Duration0或者很短的时间,初步感觉不应该是由于UNDO被覆盖导致的(undo_retention值为7小时,undo表空间使用率为60%左右),这里选取一条最简单的语句进行分析,alert日志报错信息如下:

Fri Sep 24 09:42:53 2021 ORA-01555 caused by SQL statement below (SQL ID: 44zgyk79038wn, Query Duration=1 sec, SCN:0x08de.ad): SELECT "PROPOSALNO" FROM "LIS"."LCRNEWSTATEHISTORY""LCR"

讯享网

通过MOS文档IF: ORA-1555 Reported with Query Duration = 0 , or a Few Seconds (Doc ID .1)提示,造成这个问题的原因主要有索引/表不一致,或者主键索引损坏。

08b69cd79dc6aa45decfaed5b16599e4.png

2.2.2 进一步分析

对以上语句分别进行全表扫描查询和通过索引查询,结果如下:

讯享网SQL> select /*+ FULL(t1) */ count(distinct PROPOSALNO) from LIS.LCRNEWSTATEHISTORY t1; COUNT(DISTINCTPROPOSALNO) -------------------------  select /*+ index(LIS.LCRNEWSTATEHISTORYPK_LCRNEWSTATEHISTORY) */ count(distinct PROPOSALNO) from LIS.LCRNEWSTATEHISTORY where PROPOSALNO is not null; select /*+ index(LIS.LCRNEWSTATEHISTORY PK_LCRNEWSTATEHISTORY) */ count(distinct PROPOSALNO) from LIS.LCRNEWSTATEHISTORY where PROPOSALNO is not null * ERROR at line 1: ORA-01555: snapshot too old: rollback segment number 247 with name "_SYSSMU247_$" too small

通过以上测试可以发现,只有通过索引进行查询就会报ORA-01555(错误可复现),这里初步判断为索引的问题,可能会有坏块,于是分析索引结构。

SQL> analyze index LIS.PK_LCRNEWSTATEHISTORY validate structure; analyze index LIS.PK_LCRNEWSTATEHISTORY validate structure * ERROR at line 1: ORA-01555: snapshot too old: rollback segment number 247 with name "_SYSSMU247_$" too small

同样报错,于是使用dbv进行检测(有关dbv用法请参考MOS:DBVERIFY - Database file Verification Utility (Doc ID 35512.1)

讯享网oracle@uchod1053:/oradb>dbv userid=grid/xxxxxx file=+DATA/lispr/datafile/lis_indx.461. DBVERIFY: Release 11.2.0.4.0 - Production on Fri Sep 24 23:59:08 2021 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. DBVERIFY - Verification starting : FILE = +DATA/lispr/datafile/lis_indx.461. itl[8] has higher commit scn(0x08de.b36e6585) than block scn (0x08de.20676e2b) Page  failed with check code 6056 itl[31] has higher commit scn(0x08de.f08d561f) than block scn (0x08de.2604d295) Page  failed with check code 6056 itl[3] has higher commit scn(0x08de.dcbb6062) than block scn (0x08de.21e73996) Page  failed with check code 6056 itl[2] has higher commit scn(0x08de.f10978c4) than block scn (0x08de.19db6530) Page  failed with check code 6056 itl[30] has higher commit scn(0x08de.b394bdd7) than block scn (0x08de.23d3d46f) Page  failed with check code 6056 itl[3] has higher commit scn(0x08de.d9da9fb7) than block scn (0x08de.17dbaf65) Page  failed with check code 6056 itl[105] has higher commit scn(0x08de.f95d825f) than block scn (0x08de.23b26365) Page  failed with check code 6056 itl[15] has higher commit scn(0x08de.fbad5538) than block scn (0x08de.1726f072) Page  failed with check code 6056 itl[7] has higher commit scn(0x08de.b2bb2abc) than block scn (0x08de.210ff5e6) Page  failed with check code 6056 itl[19] has higher commit scn(0x08de.d96db16a) than block scn (0x08de.1860cf5f) Page  failed with check code 6056 itl[7] has higher commit scn(0x08de.a3cce991) than block scn (0x08de.054255e0) Page  failed with check code 6056 itl[3] has higher commit scn(0x08de.c20adcb7) than block scn (0x08de.21e737c2) Page  failed with check code 6056 itl[2] has higher commit scn(0x08de.f35465c3) than block scn (0x08de.17cfa5ff) Page  failed with check code 6056 itl[45] has higher commit scn(0x08de.f9405a45) than block scn (0x08de.19c88446) Page  failed with check code 6056 itl[24] has higher commit scn(0x08de.ef) than block scn (0x08de.25acba10) Page  failed with check code 6056 itl[40] has higher commit scn(0x08de.f2ca7710) than block scn (0x08de.21b26967) Page  failed with check code 6056 itl[68] has higher commit scn(0x08de.e7338b41) than block scn (0x08de.) Page  failed with check code 6056 itl[53] has higher commit scn(0x08de.e20917fa) than block scn (0x08de.1e8d0a9e) Page  failed with check code 6056 itl[3] has higher commit scn(0x08de.fbd92d2a) than block scn (0x08de.00) Page  failed with check code 6056 itl[46] has higher commit scn(0x08de.e169fa87) than block scn (0x08de.1c70daad) Page  failed with check code 6056 itl[56] has higher commit scn(0x08de.f8467e62) than block scn (0x08de.1c5fdfaa) Page  failed with check code 6056 itl[46] has higher commit scn(0x08de.f) than block scn (0x08de.1776ebdd) Page  failed with check code 6056 itl[71] has higher commit scn(0x08de.ecf63db2) than block scn (0x08de.17e3979e) Page  failed with check code 6056 itl[66] has higher commit scn(0x08de.f2a63874) than block scn (0x08de.17aaa322) Page  failed with check code 6056 itl[24] has higher commit scn(0x08de.ef7a9ddb) than block scn (0x08de.1e) Page  failed with check code 6056 itl[85] has higher commit scn(0x08de.f96cb454) than block scn (0x08de.0f4da702) Page  failed with check code 6056 itl[60] has higher commit scn(0x08de.df8eb33d) than block scn (0x08de.1d53f8b6) Page  failed with check code 6056 itl[30] has higher commit scn(0x08de.fb578f9b) than block scn (0x08de.1ffd8cc7) Page  failed with check code 6056 itl[51] has higher commit scn(0x08de.f28da248) than block scn (0x08de.1c) Page  failed with check code 6056 itl[55] has higher commit scn(0x08de.eba5752a) than block scn (0x08de.17e9094d) Page  failed with check code 6056 itl[57] has higher commit scn(0x08de.ef0758d7) than block scn (0x08de.178d1c6f) Page  failed with check code 6056 itl[2] has higher commit scn(0x08de.f87983b0) than block scn (0x08de.1fb8fc7a) Page  failed with check code 6056 itl[126] has higher commit scn(0x08de.f9c8ca4f) than block scn (0x08de.018d26c6) Page  failed with check code 6056 DBVERIFY - Verification complete Total Pages Examined :  Total Pages Processed (Data) : 0 Total Pages Failing (Data) : 0 Total Pages Processed (Index):  Total Pages Failing (Index): 33 Total Pages Processed (Other): 4789 Total Pages Processed (Seg) : 0 Total Pages Failing (Seg) : 0 Total Pages Empty :  Total Pages Marked Corrupt : 0 Total Pages Influx : 0 Total Pages Encrypted : 0 Highest block SCN : 0 (0.0)

这里可以看到,没有标记坏块的出现,但是检测过程中报了6056错误,通过MOS查询该错误,发现以下文档ALERT Bug ORA-600 [kdsgrp1] / ORA-1555 / ORA-600 [ktbdchk1: bad dscn] / ORA-600 [2663] due to Invalid Commit SCN in INDEX (Doc ID .1)内容跟本次问题高度相似,以下为文档摘录。

98c885f15e4e79e031ff8a9f95dd2d41.png

通过该文档的WORKAROUND,首选方法还是打补丁,但是由于客户原因,客户希望在不打补丁的情况下看有没有方法解决。经过和客户讨论,可以在主库将有问题的表的索引进行删除并重建,自动同步到备库,达到备库索引重建的目的(待验证)

2.2.3确认影响范围

在备库采用hint方式扫描所有的索引,发现有问题的索引39个,由于涉及到隐私性,在此就不一一列举。

2.2.4 处理方案验证

主库选取一张问题表,复制该表和所有索引,DG同步到备库,备库查询验证,具体测试如下:

SQL>select a.uniqueness,b.index_name,b.column_name from dba_indexesa,dba_ind_columns b where a.table_name=b.table_name anda.index_name=b.index_name and a.table_owner='LIS' anda.table_name='LCRNEWSTATEHISTORY' order by a.uniqueness desc;UNIQUENES INDEX_NAME               COLUMN_NAME--------- ------------------------------ --------------------UNIQUE PK_LCRNEWSTATEHISTORY PROPOSALNONONUNIQUE INX_RNEWHISTORY_POLNO  POLNONONUNIQUE INX_RNEWHISTORY_CONTNO  CONTNOSQL>create table LIS.LCRNEWSTATEHISTORY_TT as select * from LIS.LCRNEWSTATEHISTORY;Table created.SQL>alter table LIS.LCRNEWSTATEHISTORY_TT add constraint PK_LCRNEWSTATEHISTORY_TT primary key(PROPOSALNO);Table altered.SQL>create index LIS.INX_RNEWHISTORY_POLNO_TT on LIS.LCRNEWSTATEHISTORY_TT(POLNO);Index created.SQL>create index LIS.INX_RNEWHISTORY_CONTNO_TT on LIS.LCRNEWSTATEHISTORY_TT(CONTNO);Index created.

备库通过索引查询,未发现报错。

讯享网SQL>select /*+ index(t PK_LCRNEWSTATEHISTORY_TT) */ count(*) from LIS.LCRNEWSTATEHISTORY_TT t; COUNT(*)---------- SQL>select /*+ index(t INX_RNEWHISTORY_CONTNO_TT) */ count(*) from LIS.LCRNEWSTATEHISTORY_TT t; COUNT(*)---------- SQL>select /*+ index(t INX_RNEWHISTORY_POLNO_TT) */ count(*) from LIS.LCRNEWSTATEHISTORY_TT t; COUNT(*)---------- 

以上方式验证了通过重建索引解决该问题的可行性。

2.2.5 后续处理

后面通过dbms_metadata.get_ddl获取到索引的创建语句,然后在非业务高峰期在主库删除有问题表的索引进行创建。

小讯
上一篇 2025-01-23 17:47
下一篇 2025-01-14 09:53

相关推荐

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容,请联系我们,一经查实,本站将立刻删除。
如需转载请保留出处:https://51itzy.com/kjqy/52317.html