Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BackgroundJobServer Dispose doesn't requeue jobs #2452

Open
uler3161 opened this issue Oct 7, 2024 · 1 comment
Open

BackgroundJobServer Dispose doesn't requeue jobs #2452

uler3161 opened this issue Oct 7, 2024 · 1 comment

Comments

@uler3161
Copy link

uler3161 commented Oct 7, 2024

I have a thread on the forum at https://discuss.hangfire.io/t/job-shutdown-vs-delete/10952/3, but not getting any responses, so I decided to post here.

I'm using version 1.8.14 of Hangfire. Per the documentation on cancellation tokens here: https://docs.hangfire.io/en/latest/background-methods/using-cancellation-tokens.html, it sounds like if I shutdown a BackgroundJobServer by disposing it, all running jobs should be re-queued. However, it appears they are going into Succeeded status. I don't understand why.

The application is a Windows service. For testing purposes, I'm running from the command line rather than creating a service in Windows. Here is my code:

First, the Windows service class. My understanding is that stoppingToken is set when the service shuts down. In my case, on the console I'm using Ctrl+C to do this.

public sealed class WindowsBackgroundService(
    IServiceProvider serviceProvider,
    ILogger<WindowsBackgroundService> logger) : BackgroundService
{
    private ILogger<WindowsBackgroundService> Logger
    {
        get;
    } = logger;

    private IServiceProvider ServiceProvider
    {
        get;
    } = serviceProvider;

    protected override async Task ExecuteAsync(
        CancellationToken stoppingToken)
    {
        try
        {
            await ServiceProvider.GetRequiredService<IHangfireProcessor>()
                .RunAsync(stoppingToken);
        }
        catch (Exception ex)
        {
            Logger.LogError(ex, "{Message}", ex.Message);
            Environment.Exit(1);
        }
    }

The service class then executes the RunAsync method of the HangfireProcessor class. This is what runs the BackgroundJobServer. The stoppingToken from the service class gets passed into this method and is checked occasionally to determine if the service is shutting down. When set, it should dispose BackgroundJobServer when it exits the RunAsync method.

public class HangfireProcessor(
    ILogger<HangfireProcessor> logger) : IHangfireProcessor
{
    private ILogger<HangfireProcessor> Logger
    {
        get;
    } = logger;

    public async Task RunAsync(CancellationToken token)
    {
        var options = new BackgroundJobServerOptions
        {
            ServerTimeout = TimeSpan.FromHours(24)
        };

        var server = new BackgroundJobServer(options);
        try
        {
            
            Logger.LogInformation("Hangfire server started.");

            while (!token.IsCancellationRequested)
            {
                await Task.Delay(1000, token);
            }
        }
        catch (OperationCanceledException)
        {
            Logger.LogWarning("Hangfire shutdown requested.");
        }
        catch (Exception ex)
        {
            Logger.LogError("Serious error occurred with Hangfire server.");
            Logger.LogError(ex, ex.Message);
        }
        finally
        {
            Logger.LogWarning("Hangfire server stopped. Disposing");
            server.Dispose();
        }
    }
}

The RecordingTask class is the actual job that's executing. The jobCancellationToken should be the token that Hangfire provides behind the scenes when the job starts. I believe this gets toggled when the BackgroundJobServer gets disposed. When the token is toggled, this should kill the recording process that it started when the job started.

public class RecordingTask(ILogger<RecordingTask> logger, Camera camera, CancellationToken jobCancellationToken)
{
    private Camera? Camera
    {
        get;
    } = camera;

    private CancellationToken JobCancellationToken
    {
        get;
    } = jobCancellationToken;

    private ILogger<RecordingTask> Logger
    {
        get;
    } = logger;

    public void Run()
    {
        try
        {
            if (Camera == null)
            {
                throw new Exception("Camera was null");
            }

            var process = StartRecordingProcess();
            MonitorRecordingProcess(process);
        }
        catch (Exception ex)
        {
            Logger.LogError(ex, "Error running recording task: {message}", ex.Message);
            throw;
        }
    }

    private void MonitorRecordingProcess(
        Process process)
    {
        while (!JobCancellationToken.IsCancellationRequested && !process.HasExited)
        {
            Thread.Sleep(1000);
        }

        if (JobCancellationToken.IsCancellationRequested)
        {
            Logger.LogWarning("Cancellation requested. Killing process.");
            process.Kill(true);
        }

        if (process.ExitCode > 0)
        {
            throw new Exception(
                $"Unexpected end of recording. Hangfire job cancelled? {JobCancellationToken.IsCancellationRequested}. Process result: {process.ExitCode}");
        }
    }

    private Process StartRecordingProcess()
    {
        var startInfo = new ProcessStartInfo
        {
            Arguments = $"--cameraid {Camera!.ID}",
            FileName = "SecurityCameraRecorderProcess.exe"
        };

        var process = Process.Start(startInfo);
        return process
            ?? throw new Exception($"Couldn't start process {startInfo.FileName} with args {startInfo.Arguments}");
    }
}

And finally, this is what starts the job...

BackgroundJob.Enqueue(() => StartRecordingTask(camera, CancellationToken.None))

        [AutomaticRetry(Attempts = int.MaxValue, DelaysInSeconds = new[] { 15 })]
        public void StartRecordingTask(
            Camera camera,
            CancellationToken token)
        {
            ServiceProvider.GetRequiredService<IRecordingTaskFactory>()
                .Create(camera, token)
                .Run();
        }

For some reason, the job ends up in Succeeded status. I've seen some older discussions about async code, but I thought that had to do with async code being used inside the job rather than how I'm using it to start and manage the BackgroundJobServer. I can't spot anything obvious. It's almost like the server doesn't know it's shutting down when the job ends.

One very strange oddity is the logging output. I'm expecting the following:

Hangfire shutdown requested.
Hangfire server stopped. Disposing.
Cancellation requested. Killing process.

I have seen this:

Cancellation requested. Killing process.
Hangfire shutdown requested.
Hangfire server stopped. Disposing.

This:

Hangfire shutdown requested.
Hangfire server stopped. Disposing.
Cancellation requested. Killing process.

And this:

Hangfire shutdown requested.
Hangfire server stopped. Disposing.

Edit: Was wrong in what I was expecting to see, so I fixed it. Apparently sometimes the cancellation message happens in the same order, but sometimes it doesn't, and sometimes it doesn't seem to happen at all.

@uler3161
Copy link
Author

Here is some updated code:

public class RecordingTask(ILogger<RecordingTask> logger, Camera camera, CancellationToken jobCancellationToken, IHostApplicationLifetime hostApplicationLifetime)
{
    private Camera? Camera
    {
        get;
    } = camera;

    private CancellationToken JobCancellationToken
    {
        get;
    } = jobCancellationToken;

    private IHostApplicationLifetime HostApplicationLifetime
    {
        get;
    } = hostApplicationLifetime;

    private ILogger<RecordingTask> Logger
    {
        get;
    } = logger;

    public void Run()
    {
        try
        {
            if (Camera == null)
            {
                throw new Exception("Camera was null");
            }

            var process = StartRecordingProcess();
            MonitorRecordingProcess(process);
        }
        catch (Exception ex)
        {
            Logger.LogError(ex, "Error running recording task: {message}", ex.Message);
            throw;
        }
    }

    private void MonitorRecordingProcess(
        Process process)
    {
        while (!JobCancellationToken.IsCancellationRequested && !process.HasExited)
        {
            Thread.Sleep(1000);
        }

        if (JobCancellationToken.IsCancellationRequested)
        {
            process.Kill(true);
        }

        if (process.ExitCode > 0)
        {
            throw new Exception(
                $"Unexpected end of recording. Hangfire job cancelled? {JobCancellationToken.IsCancellationRequested}. Process result: {process.ExitCode}");
        }

        if (HostApplicationLifetime.ApplicationStopping.IsCancellationRequested)
        {
            throw new Exception("Application shutting down. Throwing exception so Hangfire knows to re-queue");
        }

        Logger.LogWarning("Recording stopped, presumably due to user stopping recording.");
    }

    private Process StartRecordingProcess()
    {
        var startInfo = new ProcessStartInfo
        {
            Arguments = $"--cameraid {Camera!.ID}",
            FileName = "SecurityCameraRecorderProcess.exe"
        };

        var process = Process.Start(startInfo);
        return process
            ?? throw new Exception($"Couldn't start process {startInfo.FileName} with args {startInfo.Arguments}");
    }
}

So, basically I have to check whether the application is shutting down. If it is, I need to throw an exception of some sort in order to convince Hangfire to requeue. Is that really what I have to do?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

1 participant