Bugzilla – Bug 409881
way to stop/drop workers gracefully
Last modified: 2011-04-18 22:10:25 UTC
At the moment the only way to stop a worker is "rcobsworker stop" on the worker host. Ideally this should happen only when the worker(s) have no jobs running; otherwise the jobs need to be deleted manually from the backend's /srv/obs/jobs/*/. This works, but it will likely lose unfinished jobs and needs manual intervention plus triggered rebuilds or restarts of the scheduler.

My proposal for an enhancement is therefore to implement some switches (possibly, long term, on the backend) to e.g. shut down the workers on request _after_ a job has finished.

Quick solution: e.g. "touch /tmp/root_1/SHUTDOWN", evaluate the existence of the file in the worker, stop after the build and don't fetch new data (the backend will still try to assign a new job and fail, as it does now, but there will be no "stale" jobs).

Long term idea: make this possible in the admin backend via the web interface. Checkboxes for activating/deactivating discovered workers?
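A minimal sketch of the quick solution, written in Python purely for illustration (this is not the worker's actual code). Only the flag path /tmp/root_1/SHUTDOWN comes from the proposal; `run_worker`, `fetch_job` and `build` are hypothetical names, and the real worker would wait for jobs rather than exit when none are available.

```python
import os
from typing import Callable, Optional

# Flag path taken from the proposal; everything else here is hypothetical.
SHUTDOWN_FLAG = "/tmp/root_1/SHUTDOWN"


def run_worker(fetch_job: Callable[[], Optional[str]],
               build: Callable[[str], None],
               shutdown_flag: str = SHUTDOWN_FLAG) -> int:
    """Build jobs until the flag file appears; return the number of jobs built."""
    built = 0
    while True:
        # Check before fetching, so no new job is accepted once the flag exists.
        if os.path.exists(shutdown_flag):
            break
        job = fetch_job()
        if job is None:
            break  # nothing left in this sketch; a real worker would wait
        build(job)  # a build that has started is always allowed to finish
        built += 1
    return built
```

Because the flag is only evaluated between builds, a running job is never interrupted, and the worker never holds a job it will not finish.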
Michael, what is a proper API call to discard a job that I can call in the init script? Or do you want to handle this within the worker? We could have two different methods:
* Shut down immediately, discard the job.
* Shut down after this job.
For the init script we need the first one, I think.
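To make the difference between the two methods concrete, here is a small Python sketch. It is an illustration under assumptions, not an existing API: `Shutdown` and `plan_shutdown` are hypothetical names, and how a discarded job is reported back to the backend is left open.

```python
from enum import Enum
from typing import Optional, Tuple


class Shutdown(Enum):
    IMMEDIATE = "immediate"  # stop now, discard the running job
    AFTER_JOB = "after_job"  # let the running job finish first


def plan_shutdown(mode: Shutdown,
                  running_job: Optional[str]) -> Tuple[bool, Optional[str]]:
    """Return (stop_now, job_to_discard) for the requested mode."""
    if running_job is None:
        # An idle worker can stop right away in either mode.
        return True, None
    if mode is Shutdown.IMMEDIATE:
        # The job is given up; something must tell the backend about it.
        return True, running_job
    # AFTER_JOB: keep building, stop once the job is done.
    return False, None
```

In this sketch the init script's "stop" would map to `Shutdown.IMMEDIATE`, while a graceful drain would use `Shutdown.AFTER_JOB`.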
For the first option, the dispatcher somehow needs to be informed so that it reassigns the jobs to another worker, which then fails. Then the worker will also be deleted from the worker list on the server.
Adrian has written a tool, bs_admin. Should bs_admin get a new command like "check workers"?
To #3: the command would be good for checking for stale/no longer existent workers, but exiting gracefully would be the better way.
Is this still being considered? Possibly for the resource-management code?
Yes, but we're currently a bit overloaded with the sles11/11.1 beta1 preparation...
Tnx, Michael! Consider it just a bump/reminder ;).
ping ;).
pong ;)