Troubleshooting the Scheduler

This section contains the following troubleshooting topics:

A Job Does Not Run
A Program Becomes Disabled
A Window Fails to Take Effect

A Job Does Not Run

A job may fail to run for several reasons. To begin troubleshooting a job that you suspect did not run, check the job state by issuing the following statement:

SELECT JOB_NAME, STATE FROM DBA_SCHEDULER_JOBS;

Typical output will resemble the following:

JOB_NAME                       STATE
------------------------------ ---------
MY_EMP_JOB                     DISABLED
MY_EMP_JOB1                    FAILED
MY_NEW_JOB1                    DISABLED
MY_NEW_JOB2                    BROKEN
MY_NEW_JOB3                    COMPLETED
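On a system with many jobs, it can help to narrow the query to the states that indicate a job is not going to run. The following is a minimal sketch, assuming you have access to the DBA_SCHEDULER_JOBS view; the extra columns are included only to give context for each job.

```sql
-- List only jobs in a state that prevents them from running,
-- with the owner and the last and next run times for context.
SELECT owner, job_name, state, last_start_date, next_run_date
  FROM dba_scheduler_jobs
 WHERE state IN ('FAILED', 'BROKEN', 'DISABLED', 'COMPLETED')
 ORDER BY owner, job_name;
```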

About Job States

There are four states that a job can be in if it does not run:

Failed Jobs
Broken Jobs
Disabled Jobs
Completed Jobs

Failed Jobs

If a job has the status of FAILED in the job table, it was scheduled to run once but the execution has failed. If the job was specified as restartable, all retries have failed.

If a job fails in the middle of execution, only the last transaction of that job is rolled back. If your job executes multiple transactions, be careful about setting restartable to TRUE, because a restarted job runs again from the beginning and transactions that were already committed are not undone. You can find failed jobs by querying the *_SCHEDULER_JOB_RUN_DETAILS views.
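As an illustration, the following query lists the failed runs of one job, most recent first. It is a sketch only: the job name MY_EMP_JOB1 is taken from the sample output earlier in this section, and you may need to substitute the USER_ or ALL_ variant of the view depending on your privileges.

```sql
-- Show failed runs for a single job, newest first.
-- ERROR# holds the error number of the failed run;
-- ADDITIONAL_INFO usually carries the error text.
SELECT log_date, owner, job_name, status, error#, additional_info
  FROM dba_scheduler_job_run_details
 WHERE job_name = 'MY_EMP_JOB1'
   AND status = 'FAILED'
 ORDER BY log_date DESC;
```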

Broken Jobs

A broken job is one that has exceeded a certain number of failures. This number is set in max_failures, and can be altered. In the case of a broken job, the entire job is broken, and it will not be run until it has been fixed. For debugging and testing, you can use the RUN_JOB procedure.

You can find broken jobs by querying the *_SCHEDULER_JOBS and *_SCHEDULER_JOB_LOG views.
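The following sketch shows one way to inspect and test a broken job. The job name MY_NEW_JOB2 comes from the sample output earlier in this section. Running the job in the current session is a choice made here so that any error is raised directly to the caller. The final ENABLE call assumes that the underlying problem has been fixed first.

```sql
-- Find broken jobs and compare the failure count with the limit.
SELECT owner, job_name, state, failure_count, max_failures
  FROM dba_scheduler_jobs
 WHERE state = 'BROKEN';

-- Run the job once in the current session for debugging.
BEGIN
  DBMS_SCHEDULER.RUN_JOB(
    job_name            => 'MY_NEW_JOB2',
    use_current_session => TRUE);
END;
/

-- After fixing the cause, re-enable the job so it can be scheduled again.
BEGIN
  DBMS_SCHEDULER.ENABLE('MY_NEW_JOB2');
END;
/
```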

Disabled Jobs

A job can become disabled for the following reasons:

The job was manually disabled
The job class it belongs to was dropped
The program, chain, or schedule that it points to was dropped
A window or window group is its schedule and the window or window group is dropped
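To see which of these causes might apply, you can list disabled jobs together with the objects they reference. This is a sketch rather than a complete diagnostic; the job name MY_EMP_JOB is taken from the sample output earlier in this section, and the ENABLE call succeeds only if the objects the job depends on exist.

```sql
-- List disabled jobs with the class, program, and schedule they point to.
-- A referenced object that no longer exists is a likely cause.
SELECT owner, job_name, job_class, program_name, schedule_name
  FROM dba_scheduler_jobs
 WHERE state = 'DISABLED';

-- Re-enable a job once its dependencies have been restored.
BEGIN
  DBMS_SCHEDULER.ENABLE('MY_EMP_JOB');
END;
/
```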

Completed Jobs

A job is in the COMPLETED state if its end_date or max_runs has been reached. (If a job recently completed successfully but is scheduled to run again, the job state is SCHEDULED.)
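For example, a query along the following lines shows which limit a completed job reached. It is a minimal sketch that assumes access to DBA_SCHEDULER_JOBS.

```sql
-- Compare RUN_COUNT with MAX_RUNS, and END_DATE with the current time,
-- to see why each job is in the COMPLETED state.
SELECT job_name, state, end_date, max_runs, run_count
  FROM dba_scheduler_jobs
 WHERE state = 'COMPLETED';
```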

Viewing the Job Log

An important troubleshooting tool is the job log. For details and instructions, see "Viewing the Job Log".
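As a quick starting point, a query such as the following returns the log entries for one job, most recent first. The job name MY_EMP_JOB1 is taken from the sample output earlier in this section and is used only for illustration.

```sql
-- Show the job log entries for one job, newest first.
SELECT log_date, owner, job_name, operation, status, additional_info
  FROM dba_scheduler_job_log
 WHERE job_name = 'MY_EMP_JOB1'
 ORDER BY log_date DESC;
```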

Troubleshooting Remote Jobs

Remote jobs must successfully communicate with a Scheduler agent on the remote host. If a remote job does not run, check the DBA_SCHEDULER_JOBS view and the job log first. Then perform the following tasks:

Check that the remote system is reachable over the network with tools such as nslookup and ping.
Check the status of the Scheduler agent on the remote host by calling the DBMS_SCHEDULER.GET_AGENT_VERSION function:
```
-- In SQL*Plus, run SET SERVEROUTPUT ON first so that the version is displayed.
DECLARE
  versionnum VARCHAR2(30);
BEGIN
  versionnum := DBMS_SCHEDULER.GET_AGENT_VERSION('remote_host.example.com');
  DBMS_OUTPUT.PUT_LINE(versionnum);
END;
/
```
If an error is generated, the agent may not be installed or may not be registered with your local database. See "Enabling and Disabling Databases for Remote Jobs" for instructions for installing, registering, and starting the Scheduler agent.

About Job Recovery After a Failure

The Scheduler attempts to recover jobs that are interrupted when:

The database abnormally shuts down
A job slave process is killed or otherwise fails
For an external job, the external job process that starts the executable or script is killed or otherwise fails. (The external job process is extjob on UNIX. On Windows, it is the external job service.)
For an external job, the process that runs the end-user executable or script is killed or otherwise fails.

Job recovery proceeds as follows:

The Scheduler adds an entry to the job log for the instance of the job that was running when the failure occurred. In the log entry, the OPERATION is 'RUN', the STATUS is 'STOPPED', and ADDITIONAL_INFO contains one of the following:
- REASON="Job slave process was terminated"
- REASON="ORA-01014: ORACLE shutdown in progress"
If restartable is set to TRUE for the job, the job is restarted.
If restartable is set to FALSE for the job:
- If the job is a run-once job and auto_drop is set to TRUE, the job run is done and the job is dropped.
- If the job is a run-once job and auto_drop is set to FALSE, the job is disabled and the job state is set to 'STOPPED'.
- If the job is a repeating job, the Scheduler schedules the next job run and the job state is set to 'SCHEDULED'.

When a job is restarted as a result of this recovery process, the new run is entered into the job log with the operation 'RECOVERY_RUN'.
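The recovery steps above can be traced with two queries. This is a sketch under the assumption that you have access to the DBA_ views; the job name MY_EMP_JOB is taken from the sample output earlier in this section.

```sql
-- Check the attributes that determine how the job is recovered.
SELECT job_name, restartable, auto_drop, state
  FROM dba_scheduler_jobs
 WHERE job_name = 'MY_EMP_JOB';

-- Find interrupted runs (RUN with status STOPPED) and any
-- restarts that the recovery process logged (RECOVERY_RUN).
SELECT log_date, job_name, operation, status, additional_info
  FROM dba_scheduler_job_log
 WHERE (operation = 'RUN' AND status = 'STOPPED')
    OR operation = 'RECOVERY_RUN'
 ORDER BY log_date;
```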

A Program Becomes Disabled

A program can become disabled if a program argument is dropped, or if number_of_arguments is changed so that one or more arguments are no longer defined.
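To check for this condition, you can compare the declared argument count with the arguments that are actually defined. The following is a sketch only; the owner HR and the program name MY_PROGRAM are hypothetical placeholders.

```sql
-- List disabled programs and the number of arguments each one declares.
SELECT owner, program_name, enabled, number_of_arguments
  FROM dba_scheduler_programs
 WHERE enabled = 'FALSE';

-- List the arguments defined for one program (hypothetical names).
-- Fewer rows than NUMBER_OF_ARGUMENTS means an argument is missing.
SELECT argument_position, argument_name, argument_type
  FROM dba_scheduler_program_args
 WHERE owner = 'HR'
   AND program_name = 'MY_PROGRAM'
 ORDER BY argument_position;
```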

See "Creating and Managing Programs to Define Jobs" for more information regarding programs.

A Window Fails to Take Effect

A window can fail to take effect for the following reasons:

A window becomes disabled when it is at the end of its schedule
A window that points to a schedule that no longer exists is disabled
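Both conditions can be checked from the window view. The following query is a minimal sketch, assuming access to DBA_SCHEDULER_WINDOWS.

```sql
-- List disabled windows. An END_DATE in the past suggests the window
-- reached the end of its schedule; a SCHEDULE_NAME that no longer
-- exists in DBA_SCHEDULER_SCHEDULES suggests a dropped schedule.
SELECT window_name, enabled, active, schedule_name,
       next_start_date, end_date
  FROM dba_scheduler_windows
 WHERE enabled = 'FALSE';
```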

See "Managing Job Scheduling and Job Priorities with Windows" for more information regarding windows.