Oracle Diagnostic Case: SGA and Swap, Part 1
Date: 2023-11-17 09:20:58
link: http://www.eygle.com/case/sga1.htm

Case description:
Users reported that some time after the server started, no new database connections could be established. A few minutes after a restart the connections failed again, and the system could not be used normally.

1. Log in to the system

SunOS 5.8
login: root
Password:
Last login: Tue Mar 23 13:56:59 from 172.16.31.41
Sun Microsystems Inc. SunOS 5.8 Generic Patch October 2001
You have new mail.

2. su to the oracle user and check the running Oracle processes

The background processes look normal, and there are a number of user connections.

wapplatform:/>su - oracle
Sun Microsystems Inc. SunOS 5.8 Generic Patch October 2001
You have new mail.
/export/home1/oracle>ls
admin codesyndealt31 exp.sh local.cshrc local.profile oraclebak oui v6_database
app exp.log jre local.login nsmail oradata swan
/export/home1/oracle>cd admin
/export/home1/oracle/admin>ps -ef|grep ora
oracle 25269 25258 0 13:58:36 pts/3 0:00 grep ora
oracle 25257 24906 0 13:58:31 pts/4 0:00 vi alert_HSWAPDB.log
oracle 25267 1 1 13:58:34 ? 0:00 oracleHSWAPDB (LOCAL=NO)
oracle 25184 1 0 13:56:57 ? 0:00 ora_p007_HSWAPDB
oracle 25182 1 0 13:56:57 ? 0:00 ora_p006_HSWAPDB
oracle 25193 1 0 13:57:03 ? 0:01 oracleHSWAPDB (LOCAL=NO)
oracle 25209 1 0 13:57:09 ? 0:00 oracleHSWAPDB (LOCAL=NO)
oracle 25176 1 0 13:56:57 ? 0:00 ora_p003_HSWAPDB
oracle 25180 1 0 13:56:57 ? 0:00 ora_p005_HSWAPDB
oracle 25172 1 0 13:56:56 ? 0:00 ora_p001_HSWAPDB
oracle 25178 1 0 13:56:57 ? 0:00 ora_p004_HSWAPDB
oracle 25170 1 0 13:56:56 ? 0:00 ora_p000_HSWAPDB
oracle 24254 24240 0 12:08:25 pts/2 0:00 -ksh
oracle 25174 1 0 13:56:56 ? 0:00 ora_p002_HSWAPDB
oracle 25244 1 1 13:58:23 ? 0:00 oracleHSWAPDB (LOCAL=NO)
oracle 25218 1 0 13:57:23 ? 0:00 oracleHSWAPDB (LOCAL=NO)
oracle 25159 1 0 13:56:42 ? 0:02 ora_qmn0_HSWAPDB
oracle 25230 1 0 13:57:40 ? 0:01 oracleHSWAPDB (LOCAL=NO)
oracle 25161 1 0 13:56:42 ? 0:00 ora_s000_HSWAPDB
oracle 25149 1 0 13:56:41 ? 0:01 ora_lgwr_HSWAPDB
oracle 25157 1 0 13:56:42 ? 0:00 ora_cjq0_HSWAPDB
oracle 24906 3698 0 13:47:47 pts/4 0:00 -ksh
oracle 25153 1 0 13:56:42 ? 0:01 ora_smon_HSWAPDB
oracle 25058 7464 0 13:55:14 pts/1 0:00 -ksh
oracle 25163 1 0 13:56:42 ? 0:00 ora_d000_HSWAPDB
oracle 25155 1 0 13:56:42 ? 0:00 ora_reco_HSWAPDB
oracle 25151 1 0 13:56:41 ? 0:00 ora_ckpt_HSWAPDB
oracle 25145 1 0 13:56:41 ? 0:00 ora_dbw0_HSWAPDB
oracle 25199 1 15 13:57:04 ? 0:49 ora_j000_HSWAPDB
oracle 4149 4146 0 12:05:11 pts/5 0:00 -ksh
oracle 25232 1 0 13:57:41 ? 0:00 oracleHSWAPDB (LOCAL=NO)
oracle 25119 1 0 13:56:29 ? 0:00 oraclehswapdb (LOCAL=NO)
oracle 25075 1 0 13:55:34 ? 0:00 /export/home1/oracle/app/bin/tnslsnr LISTENER -inherit
oracle 24374 4149 0 12:21:56 pts/5 0:00 sqlplus /nolog
oracle 25143 1 0 13:56:41 ? 0:00 ora_pmon_HSWAPDB
oracle 25258 25242 0 13:58:31 pts/3 0:00 -ksh
/export/home1/oracle/admin>ps -ef|grep ora_
oracle 25275 25258 0 13:58:42 pts/3 0:00 grep ora_
oracle 25184 1 0 13:56:57 ? 0:00 ora_p007_HSWAPDB
oracle 25182 1 0 13:56:57 ? 0:00 ora_p006_HSWAPDB
oracle 25176 1 0 13:56:57 ? 0:00 ora_p003_HSWAPDB
oracle 25180 1 0 13:56:57 ? 0:00 ora_p005_HSWAPDB
oracle 25172 1 0 13:56:56 ? 0:00 ora_p001_HSWAPDB
oracle 25178 1 0 13:56:57 ? 0:00 ora_p004_HSWAPDB
oracle 25170 1 0 13:56:56 ? 0:00 ora_p000_HSWAPDB
oracle 25174 1 0 13:56:56 ? 0:00 ora_p002_HSWAPDB
oracle 25159 1 0 13:56:42 ? 0:02 ora_qmn0_HSWAPDB
oracle 25161 1 0 13:56:42 ? 0:00 ora_s000_HSWAPDB
oracle 25149 1 0 13:56:41 ? 0:01 ora_lgwr_HSWAPDB
oracle 25157 1 0 13:56:42 ? 0:00 ora_cjq0_HSWAPDB
oracle 25153 1 0 13:56:42 ? 0:01 ora_smon_HSWAPDB
oracle 25163 1 0 13:56:42 ? 0:00 ora_d000_HSWAPDB
oracle 25155 1 0 13:56:42 ? 0:00 ora_reco_HSWAPDB
oracle 25151 1 0 13:56:41 ? 0:00 ora_ckpt_HSWAPDB
oracle 25145 1 0 13:56:41 ? 0:00 ora_dbw0_HSWAPDB
oracle 25199 1 13 13:57:04 ? 0:51 ora_j000_HSWAPDB
oracle 25143 1 0 13:56:41 ? 0:00 ora_pmon_HSWAPDB

3. Check the alert log file (alert.log)

/export/home1/oracle/admin>ls
hswapdb
/export/home1/oracle/admin>cd *
/export/home1/oracle/admin/hswapdb>ls
bdump cdump create pfile udump
/export/home1/oracle/admin/hswapdb>cd bdump
/export/home1/oracle/admin/hswapdb/bdump>
/export/home1/oracle/admin/hswapdb/bdump>ls -l *.log
-rw-r--r-- 1 oracle dba 813396 Mar 23 13:57 alert_HSWAPDB.log
/export/home1/oracle/admin/hswapdb/bdump>vi *.log
'alert_HSWAPDB.log' 18888 lines, 813396 characters (115 null)
Tue Jun 24 21:17:14 2003
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
SCN scheme 3
Using log_archive_dest parameter default value
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 9.2.0.3.0.
System parameters with non-default values:
processes = 400
timed_statistics = TRUE
shared_pool_size = 117440512
large_pool_size = 83886080
java_pool_size = 33554432
control_files = /export/home1/oracle/oradata/hswapdb/control01.ctl, /export/home1/oracle/oradata/hswapdb/control02.ctl, /export/home1/oracle/oradata/hswapdb/control03.ctl
db_block_size = 8192
db_cache_size = 352321536
compatible = 9.2.0.0.0
db_file_multiblock_read_count= 16
fast_start_mttr_target = 300
undo_management = AUTO
undo_tablespace = UNDOTBS1
undo_retention = 10800
remote_login_passwordfile= EXCLUSIVE
db_domain = eygle.com
instance_name = hswapdb
dispatchers = (PROTOCOL=TCP) (SERVICE=hswapdbXDB)
job_queue_processes = 10
hash_join_enabled = TRUE
background_dump_dest = /export/home1/oracle/admin/hswapdb/bdump
user_dump_dest = /export/home1/oracle/admin/hswapdb/udump
core_dump_dest = /export/home1/oracle/admin/hswapdb/cdump
sort_area_size = 524288
db_name = hswapdb
open_cursors = 300
star_transformation_enabled= FALSE
query_rewrite_enabled = FALSE
pga_aggregate_target = 154140672
aq_tm_processes = 1
.................
Tue Mar 23 13:40:45 2004
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 11, op = fork, loc = skgpspawn5
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
Tue Mar 23 13:42:02 2004
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
Tue Mar 23 13:55:38 2004
Starting ORACLE instance (normal)
Shutting down instance: further logons disabled
Tue Mar 23 13:56:20 2004
Shutting down instance (abort)
License high water mark = 26
Instance terminated by USER, pid = 25112
Tue Mar 23 13:56:37 2004
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
SCN scheme 3
Using log_archive_dest parameter default value
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 9.2.0.3.0.
System parameters with non-default values:
processes = 400
timed_statistics = TRUE
shared_pool_size = 117440512
large_pool_size = 83886080
java_pool_size = 33554432
control_files = /export/home1/oracle/oradata/hswapdb/control01.ctl, /export/home1/oracle/oradata/hswapdb/control02.ctl, /export/home1/oracle/oradata/hswapdb/control03.ctl
db_block_size = 8192
db_cache_size = 352321536
compatible = 9.2.0.0.0
db_file_multiblock_read_count= 16
fast_start_mttr_target = 300
undo_management = AUTO
undo_tablespace = UNDOTBS1
undo_retention = 10800
remote_login_passwordfile= EXCLUSIVE
db_domain = eygle.com
instance_name = hswapdb
dispatchers = (PROTOCOL=TCP) (SERVICE=hswapdbXDB)
remote_dependencies_mode = SIGNATURE
job_queue_processes = 10
hash_join_enabled = TRUE
background_dump_dest = /export/home1/oracle/admin/hswapdb/bdump
user_dump_dest = /export/home1/oracle/admin/hswapdb/udump
core_dump_dest = /export/home1/oracle/admin/hswapdb/cdump
sort_area_size = 524288
db_name = hswapdb
open_cursors = 300
star_transformation_enabled= FALSE
parallel_automatic_tuning= TRUE
query_rewrite_enabled = FALSE
pga_aggregate_target = 154140672
aq_tm_processes = 1
PMON started with pid=2
DBW0 started with pid=3
LGWR started with pid=4
CKPT started with pid=5
SMON started with pid=6
RECO started with pid=7
CJQ0 started with pid=8
QMN0 started with pid=9
Tue Mar 23 13:56:42 2004
starting up 1 shared server(s) ...
Tue Mar 23 13:56:42 2004
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Tue Mar 23 13:56:43 2004
ALTER DATABASE MOUNT
Tue Mar 23 13:56:47 2004
Successful mount of redo thread 1, with mount id 3253076635.
Tue Mar 23 13:56:47 2004
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE MOUNT
Tue Mar 23 13:56:47 2004
Current log# 2 seq# 2136 mem# 0: /export/home1/oracle/oradata/hswapdb/redo02.log
Successful open of redo thread 1.
Tue Mar 23 12:24:54 2004
SMON: enabling cache recovery
Tue Mar 23 12:24:56 2004
Undo Segment 1 Onlined
Undo Segment 2 Onlined
Undo Segment 3 Onlined
Undo Segment 4 Onlined
Undo Segment 5 Onlined
Undo Segment 6 Onlined
Undo Segment 7 Onlined
Undo Segment 8 Onlined
Undo Segment 9 Onlined
Undo Segment 10 Onlined
Successfully onlined Undo Tablespace 1.
Tue Mar 23 12:24:56 2004
SMON: enabling tx recovery
Tue Mar 23 12:24:56 2004
Database Characterset is ZHS16GBK
Tue Mar 23 12:25:01 2004
SMON: Parallel transaction recovery tried
Tue Mar 23 12:25:01 2004
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: ALTER DATABASE OPEN
Tue Mar 23 12:28:26 2004
/* OracleOEM */ ALTER DATABASE DATAFILE '/export/home1/oracle/oradata/hswapdb/users01.dbf' RESIZE 2501760K
Tue Mar 23 12:28:26 2004
ORA-3297 signalled during: /* OracleOEM */ ALTER DATABASE DATAFILE '/export/h...
Tue Mar 23 12:28:32 2004
/* OracleOEM */ ALTER DATABASE DATAFILE '/export/home1/oracle/oradata/hswapdb/users01.dbf' RESIZE 2501760K
ORA-3297 signalled during: /* OracleOEM */ ALTER DATABASE DATAFILE '/export/h...
Tue Mar 23 12:28:53 2004
/* OracleOEM */ ALTER DATABASE DATAFILE '/export/home1/oracle/oradata/hswapdb/users01.dbf' RESIZE 3501760K
Tue Mar 23 12:28:53 2004
ORA-3297 signalled during: /* OracleOEM */ ALTER DATABASE DATAFILE '/export/h...
Tue Mar 23 13:40:45 2004
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 11, op = fork, loc = skgpspawn5
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
Tue Mar 23 13:42:02 2004
skgpspawn failed:category = 27142, depinfo = 12, op = fork, loc = skgpspawn3
:q

The log shows that the database was restarted several times and recorded a series of errors. The skgpspawn messages indicate that the database could not spawn a new session, that is, the fork of a new server process failed.

Quoting Yong Huang's comment:

The number in 'skgpspawn failed:category = 27142' is probably ORA error:
$ oerr ora 27142
27142, 0000, 'could not create new process'
// *Cause: OS system call
// *Action: check errno and if possible increase the number of processes
OSD (OS-dependent) errors are almost always shown as an skg... error (probably means 'system, kernel generic'). I don't know what 'depinfo = 12' means.

4. Try to connect to the database

The connection fails with an error on every attempt.

$ sqlplus '/ as sysdba'
SQL*Plus: Release 9.2.0.3.0 - Production on Tue Mar 23 14:14:06 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
ERROR:
ORA-12540: TNS:internal limit restriction exceeded
Enter user-name: ERROR:
ORA-12540: TNS:internal limit restriction exceeded
Enter user-name: ERROR:
ORA-12540: TNS:internal limit restriction exceeded
SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

An exceeded internal limit usually means that some system resource is exhausted.

5. Check the listener

Some connections have been refused: the READY handler for hswapdb reports 21 established and 6 refused.

/export/home1/oracle>lsnrctl services
LSNRCTL for Solaris: Version 9.2.0.3.0 - Production on 23-MAR-2004 14:37:23
Copyright (c) 1991, 2002, Oracle Corporation. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC)))
Services Summary...
Service 'PLSExtProc' has 1 instance(s).
  Instance 'PLSExtProc', status UNKNOWN, has 1 handler(s) for this service...
    Handler(s):
      'DEDICATED' established:0 refused:0
         LOCAL SERVER
Service 'hswapdb.eygle.com' has 2 instance(s).
  Instance 'hswapdb', status UNKNOWN, has 1 handler(s) for this service...
    Handler(s):
      'DEDICATED' established:6 refused:0
         LOCAL SERVER
  Instance 'hswapdb', status READY, has 1 handler(s) for this service...
    Handler(s):
      'DEDICATED' established:21 refused:6 state:ready
         LOCAL SERVER
Service 'hswapdbXDB.eygle.com' has 1 instance(s).
  Instance 'hswapdb', status READY, has 1 handler(s) for this service...
    Handler(s):
      'D000' established:0 refused:0 current:0 max:972 state:ready
         DISPATCHER <machine: wapplatform, pid: 25839>
         (ADDRESS=(PROTOCOL=tcp)(HOST=wapplatform)(PORT=32869))
The command completed successfully

The related errors were found in listener.log. The text after each TNS code is the Chinese (GBK) message written as octal byte escapes; the codes correspond to TNS-12500 (listener failed to start a dedicated server process), TNS-12540 (internal limit restriction exceeded), TNS-12560 (protocol adapter error) and TNS-00510 (internal limit restriction exceeded).

23-3\324\302 -2004 12:19:40 * (CONNECT_DATA=(SID=hswapdb)(CID=(PROGRAM=C:\WINNT\Microsoft.NET\Framework\v1.1.4322\aspnet_wp.exe)(HOST=SWAN)(USER=SYSTEM))) * (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.30.125)(PORT=1291)) * establish * hswapdb * 12500
TNS-12500: TNS\243\272\274\340\314\375\306\367\316\264\304\334\306\364\266\257\327\250\323\303\265\304\267\376\316\361\306\367\275\370\263\314
TNS-12540: TNS\243\272\263\254\263\366\304\332\262\277\274\253\317\336\317\336\326\306
TNS-12560: TNS: \320\255\322\351\312\312\305\344\306\367\264\355\316\363
TNS-00510: \263\254\263\366\304\332\262\277\274\253\317\336\317\336\326\306
Solaris Error: 12: Not enough space
23-3\324\302 -2004 12:19:50 * (CONNECT_DATA=(SID=hswapdb)(CID=(PROGRAM=C:\Program Files\PLSQL Developer\PLSQLDev.exe)(HOST=SWAN)(USER=Administrator))) * (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.30.125)(PORT=1292)) * establish * hswapdb * 12500
TNS-12500: TNS\243\272\274\340\314\375\306\367\316\264\304\334\306\364\266\257\327\250\323\303\265\304\267\376\316\361\306\367\275\370\263\314
TNS-12540: TNS\243\272\263\254\263\366\304\332\262\277\274\253\317\336\317\336\326\306
TNS-12560: TNS: \320\255\322\351\312\312\305\344\306\367\264\355\316\363
TNS-00510: \263\254\263\366\304\332\262\277\274\253\317\336\317\336\326\306
Solaris Error: 12: Not enough space
/export/home1/oracle/app/network/log>grep -w 12 /usr/include/sys/errno.h
#define ENOMEM 12 /* Not enough core */

Quoting Yong Huang's comment:

$ grep -w 12 /usr/include/sys/errno.h
#define ENOMEM 12 /* Not enough core */
Here 'core' means memory, including real RAM memory and swap space.

6. Exit the oracle user and check the system log

The log contains many failed su operations and reports that swap space is exhausted.

/export/home1/oracle/admin/hswapdb/bdump>exit
wapplatform:/>dmesg
Tue Mar 23 14:00:32 CST 2004
Mar 22 22:52:36 wapplatform elfexec: [ID 700856 kern.notice] ps: Cannot find ^?ELF^A^B^A
Mar 22 22:53:00 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
Mar 22 22:53:09 wapplatform elfexec: [ID 700856 kern.notice] w: Cannot find ^?ELF^A^B^A
Mar 22 22:53:53 wapplatform last message repeated 4 times
Mar 22 22:56:28 wapplatform elfexec: [ID 700856 kern.notice] ipnat: Cannot find ^?ELF^B^B^A
Mar 22 22:58:00 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
Mar 22 22:59:54 wapplatform elfexec: [ID 700856 kern.notice] ipnat: Cannot find ^?ELF^B^B^A
Mar 22 23:02:26 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
Mar 22 23:03:00 wapplatform last message repeated 1 time
Mar 22 23:08:00 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
Mar 22 23:08:34 wapplatform elfexec: [ID 700856 kern.notice] ipnat: Cannot find ^?ELF^B^B^A
Mar 22 23:10:27 wapplatform last message repeated 3 times
Mar 22 23:11:49 wapplatform elfexec: [ID 700856 kern.notice] ipnat: Cannot find ^?ELF^B^B^A
Mar 22 23:11:52 wapplatform last message repeated 1 time
Mar 22 23:13:01 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
Mar 22 23:18:01 wapplatform last message repeated 1 time
Mar 22 23:23:01 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
Mar 22 23:28:01 wapplatform last message repeated 1 time
Mar 22 23:33:01 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
Mar 22 23:38:01 wapplatform last message repeated 1 time
Mar 22 23:43:01 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
Mar 22 23:48:01 wapplatform last message repeated 1 time
Mar 22 23:53:01 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
Mar 22 23:58:01 wapplatform last message repeated 1 time
Mar 23 00:00:00 wapplatform ufs: [ID 213553 kern.notice] NOTICE: realloccg /export/home1: file system full
Mar 23 00:00:00 wapplatform sendmail[3075]: [ID 702911 mail.crit] My unqualified host name (wapplatform) unknown; sleeping for retry
Mar 23 00:01:00 wapplatform sendmail[3075]: [ID 702911 mail.alert] unable to qualify my own domain name (wapplatform) -- using short name
Mar 23 00:02:36 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
Mar 23 00:03:02 wapplatform last message repeated 1 time
Mar 23 00:08:02 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
....
Mar 23 10:18:15 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
Mar 23 10:20:41 wapplatform ufs: [ID 213553 kern.notice] NOTICE: realloccg /export/home1: file system full
Mar 23 10:20:47 wapplatform last message repeated 1 time
Mar 23 10:23:15 wapplatform ufs: [ID 845546 kern.notice] NOTICE: alloc: /export/home1: file system full
Mar 23 10:24:38 wapplatform ufs: [ID 213553 kern.notice] NOTICE: realloccg /export/home1: file system full
Mar 23 10:24:43 wapplatform last message repeated 1 time
Mar 23 10:24:55 wapplatform ufs: [ID 213553 kern.notice] NOTICE: realloccg /export/home1: file system full
Mar 23 10:25:06 wapplatform last message repeated 2 times
Mar 23 11:09:31 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 3118 (su)
Mar 23 11:09:39 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 3121 (su)
Mar 23 11:10:48 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 3137 (su)
Mar 23 11:18:02 wapplatform sshd[3620]: [ID 800047 auth.error] error: grantpt: Not enough space
Mar 23 11:18:02 wapplatform sshd[3620]: [ID 800047 auth.error] error: session_pty_req: session 0 alloc failed
Mar 23 11:18:43 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 3636 (su)
Mar 23 11:19:47 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 3672 (su)
Mar 23 11:20:20 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 3694 (su)
Mar 23 11:22:23 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 3736 (sshd)
Mar 23 11:23:17 wapplatform tmpfs: [ID 518458 kern.warning] WARNING: /tmp: File system full, swap space limit exceeded
Mar 23 11:23:40 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 3804 (su)
Mar 23 11:23:40 wapplatform last message repeated 8 times
Mar 23 11:23:56 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 3806 (ps)
Mar 23 11:23:56 wapplatform last message repeated 12 times
Mar 23 11:24:01 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 3808 (w)
Mar 23 11:24:01 wapplatform last message repeated 8 times
Mar 23 13:40:56 wapplatform su: [ID 810491 auth.crit] 'su root' failed for root on /dev/pts/2
Mar 23 13:46:26 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 24888 (sqlplus)
Mar 23 13:49:18 wapplatform su: [ID 810491 auth.crit] 'su oracle' failed for root on /dev/pts/6
Mar 23 13:54:03 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 25035 (su)
Mar 23 13:54:08 wapplatform genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 25036 (su)

At this point we can be fairly sure that the problem is swap space, and that it is related to the Oracle SGA settings.

7. Check system memory and swap usage

$ top
last pid: 25456; load averages: 0.67, 0.70, 0.69 14:10:03
93 processes: 91 sleeping, 2 on cpu
CPU states: 72.7% idle, 14.9% user, 2.7% kernel, 9.7% iowait, 0.0% swap
Memory: 1024M real, 34M free, 752M swap in use, 10M swap free
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
25199 oracle 1 40 0 674M 631M cpu/2 8:03 16.32% oracle
25209 oracle 1 30 0 675M 630M sleep 0:03 0.13% oracle
25159 oracle 1 48 0 674M 628M sleep 0:03 0.06% oracle
25384 oracle 1 58 0 2632K 1736K cpu/0 0:01 0.05% top
25145 oracle 143 58 0 682M 630M sleep 0:01 0.03% oracle
25446 oracle 1 58 0 674M 628M sleep 0:00 0.03% oracle
25149 oracle 15 58 0 682M 626M sleep 0:00 0.02% oracle
25075 oracle 1 48 0 17M 7208K sleep 0:00 0.01% tnslsnr
25151 oracle 11 58 0 676M 624M sleep 0:00 0.01% oracle
25366 oracle 1 10 0 674M 628M sleep 0:00 0.00% oracle
25356 oracle 1 18 0 674M 628M sleep 0:00 0.00% oracle
25360 oracle 1 20 0 674M 628M sleep 0:00 0.00% oracle
25364 oracle 1 20 0 674M 628M sleep 0:00 0.00% oracle
25362 oracle 1 20 0 674M 628M sleep 0:00 0.00% oracle
25330 oracle 1 28 0 674M 628M sleep 0:00 0.00% oracle

Physical memory is only 1 GB, of which 34M is free. 752M of swap is in use and only 10M is free. The system is severely short of memory, and swap is nearly exhausted.

8. Check the database SGA settings

The SGA is set to 622299344 bytes, close to 600M.

wapplatform:/>su - oracle
Sun Microsystems Inc. SunOS 5.8 Generic Patch October 2001
You have new mail.
/export/home1/oracle>sqlplus '/ as sysdba'
SQL*Plus: Release 9.2.0.3.0 - Production on Tue Mar 23 14:02:30 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
SQL> show sga
Total System Global Area 622299344 bytes
Fixed Size 731344 bytes
Variable Size 268435456 bytes
Database Buffers 352321536 bytes
Redo Buffers 811008 bytes
SQL>

On a system with less than 1 GB of RAM running in dedicated server mode, the Oracle SGA should generally not exceed 1/2 of physical memory. Here the SGA takes about 593M of the 1024M of RAM, roughly 58%.

9. First adjustment: reduce the SGA so that enough memory is left for the system.

10. Add swap space

wapplatform:/>df -k
Filesystem kbytes used avail capacity Mounted on
/dev/dsk/c0t1d0s0 3099093 105421 2931691 4% /
/dev/dsk/c0t2d0s0 10325760 8359637 1862866 82% /usr
/proc 0 0 0 0% /proc
fd 0 0 0 0% /dev/fd
mnttab 0 0 0 0% /etc/mnttab
/dev/dsk/c0t1d0s3 1018382 285914 671366 30% /var
swap 3904 24 3880 1% /var/run
swap 3936 56 3880 2% /tmp
/dev/dsk/c0t1d0s5 1671823 459202 1162467 29% /opt
/dev/dsk/c0t2d0s7 7087473 6068462 948137 87% /export/home
/dev/dsk/c2t1d0s7 17413250 15900222 1338896 93% /export/home2
/dev/dsk/c0t3d0s7 17413250 13749782 3489336 80% /export/home1
/dev/dsk/c0t1d0s1 771110 382410 334723 54% /usr/openwin
/export/home/wapgw/luke
7087473 6068462 948137 87% /home/wap
wapplatform:/var/swap>cd /export/home1
wapplatform:/export/home1>ls
TT_DB lost+found oracle oracli9
wapplatform:/export/home1>mkdir swap
wapplatform:/export/home1>cd swap
wapplatform:/export/home1/swap>mkfile -v 1g swapfile1
swapfile1 1073741824 bytes
wapplatform:/export/home1/swap>id
uid=0(root) gid=1(other)
wapplatform:/export/home1/swap>swap -a /export/home1/swap/swapfile1
wapplatform:/export/home1/swap>swap -s
total: 623160k bytes allocated + 162704k reserved = 785864k used, 1010936k available

11. Connection test

The system is back to normal and the problem is solved.

wapplatform:/export/home1/swap>su - oracle
Sun Microsystems Inc. SunOS 5.8 Generic Patch October 2001
You have new mail.
/export/home1/oracle>sqlplus '/ as sysdba'
SQL*Plus: Release 9.2.0.3.0 - Production on Thu Mar 25 11:56:28 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
SQL> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
/export/home1/oracle>top
last pid: 5372; load averages: 0.25, 0.22, 0.29 11:57:58
148 processes: 137 sleeping, 9 zombie, 2 on cpu
CPU states: 98.8% idle, 0.2% user, 0.7% kernel, 0.2% iowait, 0.0% swap
Memory: 1024M real, 17M free, 824M swap in use, 934M swap free
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
5363 root 1 58 0 2680K 1736K sleep 0:00 0.24% top
5370 oracle 1 58 0 514M 469M sleep 0:00 0.18% oracle
5366 oracle 1 28 0 514M 469M sleep 0:00 0.11% oracle
5341 oracle 1 58 0 2680K 1736K cpu/2 0:00 0.10% top
5372 oracle 1 48 0 61M 3288K cpu/3 0:00 0.06% oracle
1288 oracle 1 48 0 514M 468M sleep 5:33 0.05% oracle
607 root 12 48 0 2768K 2312K sleep 1:48 0.03% mibiisa
25075 oracle 1 48 0 17M 7208K sleep 0:16 0.02% tnslsnr
1278 oracle 15 58 0 522M 466M sleep 0:49 0.02% oracle
374 root 11 53 0 3504K 2888K sleep 0:16 0.01% nscd
1280 oracle 19 58 0 518M 466M sleep 0:28 0.00% oracle
5361 root 1 46 0 1024K 680K sleep 0:00 0.00% sleep
5362 root 1 46 0 1024K 680K sleep 0:00 0.00% sleep
5469 root 1 36 0 1952K 1176K sleep 30:09 0.00% monithttp
4167 oracle 1 40 0 515M 471M sleep 29:38 0.00% oracle

Summary:

Solving Oracle database problems can never be separated from the operating system. Very often we have to diagnose and fix a problem with OS-level tools.

On the operating system side, the generally recommended swap size is 2 x RAM. If RAM is very large, swap does not have to be 2 x RAM, but it should usually be at least equal to RAM (Swap = RAM). If swap is too small, then during busy periods, when heavy paging occurs and pages can no longer be written out to disk, problems appear. That is exactly what happened in this case.

In addition, if the system has little RAM, the usual setting is SGA < 1/2 RAM, so that enough memory is left for the server processes and the OS.
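
The two rules of thumb from the summary (SGA under 1/2 of RAM on a small machine, swap at least equal to RAM) can be checked with a few lines of arithmetic. The following is a minimal sketch, not an Oracle or Solaris tool: the function names are hypothetical, the SGA figures are copied from the `show sga` output above, and the swap total is assumed to be the sum of the "swap in use" and "swap free" values that `top` reported before the swap file was added, with `top`'s M taken as MiB.

```python
"""Check SGA and swap sizing against the rules of thumb in the summary."""

MIB = 1024 * 1024

# Values copied from the `show sga` output in step 8.
SGA_COMPONENTS = {
    "Fixed Size": 731_344,
    "Variable Size": 268_435_456,
    "Database Buffers": 352_321_536,
    "Redo Buffers": 811_008,
}
RAM_MIB = 1024              # "1024M real" from top
SWAP_TOTAL_MIB = 752 + 10   # assumption: "752M swap in use" + "10M swap free"


def sga_total_bytes(components):
    """Sum the SGA components, as `show sga` does for its first line."""
    return sum(components.values())


def check_sizing(sga_bytes, ram_mib, swap_mib):
    """Return a list of warnings for the two rules of thumb (hypothetical helper)."""
    warnings = []
    sga_mib = sga_bytes / MIB
    if sga_mib > ram_mib / 2:
        warnings.append(
            f"SGA {sga_mib:.0f} MiB exceeds 1/2 RAM ({ram_mib / 2:.0f} MiB)"
        )
    if swap_mib < ram_mib:
        warnings.append(
            f"swap {swap_mib} MiB is smaller than RAM ({ram_mib} MiB)"
        )
    return warnings


if __name__ == "__main__":
    total = sga_total_bytes(SGA_COMPONENTS)
    print("SGA total bytes:", total)
    for warning in check_sizing(total, RAM_MIB, SWAP_TOTAL_MIB):
        print("WARNING:", warning)
```

With the figures from this case both checks fail, which matches the two fixes applied in steps 9 and 10: shrink the SGA and add a swap file.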