Home
Entangle aims to be a lightweight, multi-functional parallel workflow framework. It is a good starting point for adding functionality specific to your needs, yet it comes with a set of usable decorators out of the box. Unlike heavyweight frameworks, Entangle is designed to be extended. Workflows often involve a variety of task types and infrastructure destinations, and mixing and matching them to accomplish complex or specific workflows is the point of Entangle.
Entangle tries to find a niche that isn't already occupied by an established framework like Dask or Parsl. So let's look at some of the key differences.
Dask
Dask is a great parallel compute framework for Python. It is designed to run on established infrastructure where you are able to set up Dask servers. Dask uses a client/server approach to distributed computation, which requires open ports, firewall rules, etc.
Entangle takes a different approach. It does not require any pre-running services (clients or servers) on remote machines, and thus no open ports or firewall rules. It uses only SSH on port 22, authenticated with mutually trusted certificates. Entangle promotes the notion of declarative infrastructure, so you can design your workflows against dynamic infrastructure destinations. Dask does not do this.
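To make the "no pre-running services" idea concrete, here is a minimal sketch of how work can be sent to a remote machine over plain SSH on port 22. It is an illustration only and not Entangle's internal implementation: the helper `build_ssh_command`, the user `alice`, and the host `worker1.example.com` are hypothetical names. The only real interface used is the standard `ssh` client, with `BatchMode=yes` so that only key-based authentication is attempted.

```python
import shlex


def build_ssh_command(user, host, python_code, port=22):
    """Hypothetical helper: build an ssh invocation that runs Python code remotely.

    Nothing has to be running on the remote machine except sshd.
    """
    # Quote the code so the remote shell receives it as a single argument.
    remote_command = "python3 -c " + shlex.quote(python_code)
    return [
        "ssh",
        "-p", str(port),          # SSH only, port 22 by default
        "-o", "BatchMode=yes",    # never prompt for a password; rely on trusted keys
        f"{user}@{host}",
        remote_command,
    ]


# Assumed user and host names, for illustration only.
cmd = build_ssh_command("alice", "worker1.example.com", "print(1 + 1)")
print(cmd)
```

The resulting list could be passed to `subprocess.run(cmd, capture_output=True)` on a machine that has key-based access to the host.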
Dask uses invocation idioms like dask.compute() to evaluate delayed objects. Entangle dispenses with most idiomatic usage like this, and each workflow or task behaves just as a normal Python function would: as a callable. This is important because you can pass Entangle tasks or workflows through third-party libraries that operate on Python callables. Dask delayed objects would need extra wrappers and code to do the same.
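The "tasks are just callables" idea can be sketched with the standard library alone. The decorator `parallel_task` below is a hypothetical stand-in, not Entangle's actual API. It only shows the shape of the design: a decorated function resolves its callable arguments in parallel and is still an ordinary callable, so no separate compute step is needed.

```python
import functools
from concurrent.futures import ThreadPoolExecutor


def parallel_task(func):
    """Hypothetical decorator (not Entangle's API).

    Callable arguments are run concurrently in a thread pool, then their
    results are passed to the wrapped function.
    """
    @functools.wraps(func)
    def wrapper(*args):
        with ThreadPoolExecutor() as pool:
            # Submit callable arguments to the pool; plain values pass through.
            futures = [pool.submit(a) if callable(a) else None for a in args]
            resolved = [
                f.result() if f is not None else a
                for f, a in zip(futures, args)
            ]
        return func(*resolved)
    return wrapper


@parallel_task
def one():
    return 1


@parallel_task
def two():
    return 2


@parallel_task
def add(a, b):
    return a + b


# Called like any function: one and two are evaluated in parallel threads.
total = add(one, two)

# Because add is a plain callable, it works with tools that expect callables.
add_ten = functools.partial(add, 10)
shifted = list(map(add_ten, [1, 2, 3]))

print(total, shifted)
```

Here `functools.partial` and `map` stand in for any third-party library that accepts Python callables. Nothing in the calling code needs to know that `add` runs its inputs in parallel.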
Entangle computations can crawl CPUs across machines and be modified along the way. Dask prevents modifying delayeds after creation (see Delayed Best Practices), as well as embedding new delayed invocations within a function that is already being invoked by the Dask scheduler. Entangle has no current limitation like this.