Let's admit it: shell scripts can be hard to write, especially when you have thousands of shell commands to run and want to limit the resources they use on, say, a 10-core CPU. Whether they are test programs or simulations, it would be nice to have a manager that throttles them so they don't starve each other.
Background
I was once asked to write a Python script to run over 500 simulations on a 40-core server, each taking several hours to complete on a single core. I did not want to run them all at the same time, since each one would get only 0.08 of a core (40 cores shared among 500 simulations), stretching the completion time to several days.
I also did not want my script to be "dumb": it should start the next task as soon as a previous one finishes. Nor did I want it complicated by synchronization between threads. So I turned to the asyncio library in Python, which runs coroutines asynchronously in a comprehensible manner. The result was the parallel-manager package, which is easy to use; even the simplest manager it provides schedules tasks automatically.
parallel-manager to the rescue!
Using the package is quite simple. First, install it from PyPI with:
pip install parallel-manager
Then we initialize the manager and add shell commands to its task queue, all in a single async function:
from parallel_manager.manager import BaseShellManager
from parallel_manager.workerGroup import ShellWorkerGroup
import logging
import asyncio

async def Main():
    # Init the manager
    simpleShellWorkergroup = ShellWorkerGroup("simpleShellWorkergroup",
                                              logging.getLogger(),
                                              "./log",
                                              10)
    simpleShellManager = BaseShellManager("simpleShellManager")
    simpleShellManager.add_workergroup("shell", simpleShellWorkergroup)
    await simpleShellManager.init()

    # Adding tasks
    # Here we just run 100 echo
    for i in range(100):
        simpleShellManager.add_shell_request(f"echoing loop-{i}", f"echo {i}")

    # Wait for the manager to finish
    await simpleShellManager.done()

# That's it! We can run the manager now for these commands!
asyncio.run(Main())
You can add any shell commands to the manager; just replace the for loop with your own.
We also add an await simpleShellManager.done() at the end to wait for the requests to finish.
Then we run the manager with asyncio.run(Main()), which creates the manager, adds the shell commands, and waits for them to finish. All requests have their stdout and stderr redirected under the ./log folder.
Explanation
The above script creates 10 workers, each of which starts a single process for the shell command it has acquired. Thus, even though we have 100 echo commands to run, at most 10 of them run at any given time.
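To make the idea of a fixed pool of workers concrete, here is a minimal sketch using only the standard asyncio library. It is not the implementation used by parallel-manager; the function name run_all and the peak-concurrency counter are hypothetical and exist only for illustration. Each worker pulls the next command from a shared queue as soon as its previous command ends, so the number of running processes never exceeds the worker count.

```python
import asyncio


async def run_all(commands, worker_count):
    """Run shell commands with at most `worker_count` running at once.

    Illustrative sketch only; not how parallel-manager is implemented.
    """
    queue = asyncio.Queue()
    for index, command in enumerate(commands):
        queue.put_nowait((index, command))

    results = [None] * len(commands)
    stats = {"running": 0, "peak": 0}  # tracks observed concurrency

    async def worker():
        # Each worker takes the next command as soon as its previous one ends.
        while True:
            try:
                index, command = queue.get_nowait()
            except asyncio.QueueEmpty:
                return  # nothing left to do
            stats["running"] += 1
            stats["peak"] = max(stats["peak"], stats["running"])
            try:
                process = await asyncio.create_subprocess_shell(
                    command,
                    stdout=asyncio.subprocess.PIPE,
                    stderr=asyncio.subprocess.PIPE,
                )
                stdout, _stderr = await process.communicate()
                results[index] = (process.returncode, stdout.decode().strip())
            finally:
                stats["running"] -= 1

    # Start exactly `worker_count` workers and wait until the queue is drained.
    await asyncio.gather(*(worker() for _ in range(worker_count)))
    return results, stats["peak"]


if __name__ == "__main__":
    outputs, peak = asyncio.run(run_all([f"echo {i}" for i in range(100)], 10))
    print(f"finished {len(outputs)} commands, peak concurrency {peak}")
```

Because everything runs in a single event loop, the counters need no locks, which is the same reason asyncio avoids the thread synchronization mentioned earlier.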
You could also increase the worker count based on your computer configuration:
# Creating 40 workers in a worker group
simpleShellWorkergroup = ShellWorkerGroup("simpleShellWorkergroup",
                                          logging.getLogger(),
                                          "./log",
                                          40)
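If you would rather not hard-code the number, one option is to derive it from the machine the script runs on. The helper below is a hypothetical sketch (pick_worker_count and its reserved_cores parameter are not part of parallel-manager); it assumes each task keeps roughly one core busy and leaves a core free for the rest of the system.

```python
import os


def pick_worker_count(reserved_cores=1):
    """Suggest a worker count: all detected cores minus a few kept free.

    Hypothetical helper for illustration; assumes one busy core per task.
    """
    total = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, total - reserved_cores)


if __name__ == "__main__":
    print(f"suggested worker count: {pick_worker_count()}")
```

The returned value could then be passed as the last argument to ShellWorkerGroup in place of the literal 40.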
Links
You can visit the package's GitHub page to report bugs or find detailed instructions: https://github.com/William-An/parallel-manager