Re: Beowulf gone?




Boiled down to its basics, rendering video is nothing more than a
simple map-reduce: partitioning a workload into a bunch of identical bits of
processing.  That could be done with N machines and a few simple shell
scripts.  Not really any need for anything fancy.  What the fancier
software gives you is stuff like automatic retries, fault tolerance,
fancy ways of tracking the progress of a job, and so on.  But those
tools likely require more management to maintain.
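
As a rough sketch of the "N machines and a few simple scripts" approach
(written in Python rather than shell; the host names and the `render`
command with its flags are hypothetical placeholders, not a real tool),
the map step is little more than splitting the frame range into
near-equal chunks and handing one chunk to each machine:

```python
def partition_frames(total_frames, n_workers):
    """Split frames 0..total_frames-1 into n_workers contiguous ranges.

    Returns a list of (start, end) pairs, end exclusive. Range sizes
    differ by at most one frame.
    """
    if n_workers <= 0:
        raise ValueError("n_workers must be positive")
    base, extra = divmod(total_frames, n_workers)
    ranges = []
    start = 0
    for i in range(n_workers):
        # The first `extra` workers take one additional frame each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def build_commands(hosts, total_frames):
    """Build one ssh command line per host (the 'map' step).

    `render` and its flags are made up for illustration.
    """
    commands = []
    chunks = partition_frames(total_frames, len(hosts))
    for host, (start, end) in zip(hosts, chunks):
        if start == end:
            continue  # more hosts than frames: nothing for this host
        commands.append(f"ssh {host} render --start {start} --end {end}")
    return commands


if __name__ == "__main__":
    for cmd in build_commands(["node1", "node2", "node3"], 1000):
        print(cmd)
```

The reduce step would be stitching the rendered chunks back together in
order.  What this leaves out (retries, noticing a dead node, progress
reporting) is exactly what the fancier software adds.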

Another reason for a cluster might be web serving.  You want to be able
to handle a certain number of queries per second with reasonable latency
for most of those queries.  That may require you to scale in different
ways.  You need more machines to handle serving up the web content, a
few more to handle the various types of processing that need to be
done, and some sort of scaling solution for your data store.  It's still
parallelism, but each thing happening in parallel is likely a
different type of task.  A simple tool like Puppet or Chef could help
you keep your machines in order.

Maybe you don't need the computational power of a bunch of CPUs, but
you need all of the memory for something?  Maybe you need a bunch of
machines with 128G of RAM to look like a single machine with 1P of
RAM.  Then you need something specialized for moving bits around
efficiently (either code to the data, or data to the code).
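
To give a feel for the scale involved, here is a quick back-of-the-envelope
calculation.  It assumes "G" and "P" mean binary units (GiB and PiB) and
ignores any memory lost to the OS or to the clustering layer itself:

```python
GIB = 2**30  # bytes in one gibibyte
PIB = 2**50  # bytes in one pebibyte


def machines_needed(total_bytes, per_machine_bytes):
    """Smallest number of machines whose combined RAM reaches total_bytes."""
    # Ceiling division using integers only.
    return -(-total_bytes // per_machine_bytes)


print(machines_needed(1 * PIB, 128 * GIB))  # 8192
```

That is thousands of machines, which is why moving bits around
efficiently becomes the main problem.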

Maybe you want to have different types of pipelines that do completely
different things, and you want to make efficient use of the machinery
you have.  Say you're doing both the rendering and the web serving
described above.  During some hours of the day you need more processing
power to go towards web serving, so you take resources away from your
map-reduce.  During off hours, you can let the MR have more processing
power.  Now you're dynamically trading off resources, and this is where
some sort of cluster management software might become useful.
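
A toy version of that trade-off might look like the sketch below.  The
peak hours and the 75/25 split are invented numbers for illustration,
and a real cluster manager would react to measured load rather than
the clock:

```python
def split_machines(total, hour, peak_hours=range(9, 21),
                   peak_web_fraction=0.75, offpeak_web_fraction=0.25):
    """Return (web, batch) machine counts for a given hour of day (0-23).

    During peak hours most machines go to web serving; off peak, most
    go to the batch (map-reduce) work. All defaults are hypothetical.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    fraction = peak_web_fraction if hour in peak_hours else offpeak_web_fraction
    web = int(total * fraction)
    return web, total - web


if __name__ == "__main__":
    for hour in (3, 12, 22):
        web, batch = split_machines(100, hour)
        print(f"{hour:02d}:00  web={web}  batch={batch}")
```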

There is not likely to be any one solution for everyone.  So no longer
having anything exactly like Beowulf, or some sort of direct
replacement, is not surprising.  Likely lessons were learned from Beowulf
and friends.  Some things worked well, some things didn't, some things
were never used, some things were needed.  Folks take that experience
and build new tools and systems.  Old ones languish.  And no direct
replacement exists.

Likely you need to break down what you really want to do, then look at
which solutions might work for you, experiment with a few of them, and
see which one you like best.

mrc