A cautionary tale


A cautionary tale

Douglas Roberts-2
A group of people I associate with found a SW development environment that
they fell in love with.  After a year or so they outgrew the environment
(performance-wise; not, unfortunately, experience-wise).  They rushed
headlong into a campaign to expand the computational capabilities of their
"happy place".  The resulting undirected activities produced a year's worth
of mostly wasted effort, after having pursued an ill-advised parallelization
scheme that was intended to increase the size of problem they could run with
a "New HPC-Happy Place".

So here's the cautionary bit:  what the above-mentioned group *should* have
done once they bumped up against the computational wall that was an
intrinsic feature of their selected development environment was to stop.
Regroup.  Determine what the next performance level was intended or desired
to be.  Then, they should have developed a set of requirements that bounded
the desired performance characteristics of the new, improved system.  Then,
they should have researched which existing and near-term parallel
technologies had a good chance of satisfying those requirements.  Instead of
doing this, unfortunately, they rushed headlong into the first
attractive-sounding opportunity that surfaced (not a multi-core box, in
this case; rather, one of the team members decided to attempt a Java-based
re-implementation of MPI).  These guys have wasted a year so far with little
to show and few near-term prospects.

Making the transition from a serial application to a parallel version is a
process that requires a fair degree of formalism, if the goal of the project
is to produce a production-ready version that can handle the larger
problems.  On the other hand, if an on-going research project is all that is
intended  (Ohh! Let's see where this goes!  Ooh, I spot another Holy Grail
over there!), than a different approach entirely is suggested.

--Doug
--
Doug Roberts, RTI International
droberts at rti.org
doug at parrot-farm.net
505-455-7333 - Office
505-670-8195 - Cell

On 10/8/06, Marcus G. Daniels <mgd at santafe.edu> wrote:

>
> Carl Tollander wrote:
> >     Buy one or two reasonably well tricked out multi-core machines
> > early, don't go nuts on the HPC
> >     requirements until we have a better handle on how to get one or two
> > machines to make use of those
> >     cores for the kinds of problems we expect to want to address.
> >
> Or something like this that takes Socket F Opterons...
>
> http://www.microway.com/navion8.html
>
> An upgrade to a quad-core Opteron would make that a 32-processor system.
> And you'll need to think about air conditioning once you've got this many
> processors.
> A couple of tons at least.
>

A cautionary tale

Marcus G. Daniels-3
Douglas Roberts wrote:
> Making the transition from a serial application to a parallel version
> is a process that requires a fair degree of formalism, if the goal of
> the project is to produce a production-ready version that can handle
> the larger problems.
Hmm, there may be a disconnect here between a culture of software
developers and computer scientists, who consider the engineering of
toolkits to be that formalism, conducted at some distance from the activity
of use, and a culture of scientists who have no interest in software
systems but want certain experiments performed.

To me, there is an interesting and useful problem of finding the
nicest-to-use, highest-performance, most versatile hardware/software
system for ABM (and other scientific/analytical applications).  Using
such candidate programs for a while gives ideas for shortcomings in the
design and implementation, but when I use such an instrument I don't
really think about changing it except to fix bugs.   The tool
development itself is open-ended and both a means and an end.  No ABM
system should expose MPI to model code, yet the system should have a
clear model for concurrency.   Many Big Science codes directly expose
these low-level mechanisms to users.   Hence the culture of
technological curmudgeons in that community!
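
To make that concrete, here is a minimal C++ sketch of what "a clear model
for concurrency without exposing MPI" could look like.  The Agent/Scheduler
names are hypothetical, not drawn from any particular ABM toolkit: model
code only defines per-tick behavior, and any MPI would live entirely inside
the framework's scheduler.

    #include <cstddef>
    #include <vector>

    class Agent {
    public:
        virtual ~Agent() {}
        virtual void step() = 0;        // one model tick; no MPI visible here
    };

    class Scheduler {                   // framework-owned; serial or parallel
    public:
        void add(Agent* a) { agents.push_back(a); }
        void run(int ticks) {
            for (int t = 0; t < ticks; ++t)
                for (std::size_t i = 0; i < agents.size(); ++i)
                    agents[i]->step();  // a distributed runtime could
                                        // partition agents across ranks here
        }
    private:
        std::vector<Agent*> agents;
    };

    class Walker : public Agent {       // toy agent for illustration
    public:
        Walker() : x(0) {}
        void step() { ++x; }            // trivial state update
        int x;
    };

    int main() {
        Scheduler s;
        Walker w;
        s.add(&w);
        s.run(10);                      // w.x == 10 afterwards
        return 0;
    }

The model author writes Walker; swapping the Scheduler for a distributed
one would not change a line of model code.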



A cautionary tale

Douglas Roberts-2
I'm always looking for areas where I can agree with others, while sometimes
simultaneously disagreeing with other bits of what is being said.  I agree
with this bit:

On 10/8/06, Marcus G. Daniels <mgd at santafe.edu> wrote:
>
>
> >  No ABM system should [...] expose MPI to model code, [...]


To this end, the TRANSIMS and EpiSIMS developers produced an
object-oriented API to MPI that hid the ugliness of the nuts and bolts of
the MPI toolkit from the ABM developers.  We've been using it for about 10
years now.  Objects that needed to travel between CPUs inherited the
necessary functionality from an MPI-aware C++ class.  We also developed the
practice of hiding the internals of some of the functional object
representations of agents in the models.  Serialization, for example, which
is necessary when packing an object into a message for sending to another
CPU, was hidden behind inherited C++ methods and specializations.
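
Roughly, the pattern looks like the following sketch.  The class and method
names (Movable, sendTo, Traveler) are invented here for illustration; this
is not the actual TRANSIMS/EpiSIMS API.

    #include <mpi.h>
    #include <vector>

    class Movable {                          // the MPI-aware base class
    public:
        virtual ~Movable() {}
        // Model code calls this; buffer packing and the raw MPI_Send
        // stay hidden inside the base class.
        void sendTo(int destRank, int tag) const {
            std::vector<char> buf;
            serialize(buf);                  // subclass supplies the packing
            MPI_Send(&buf[0], static_cast<int>(buf.size()), MPI_CHAR,
                     destRank, tag, MPI_COMM_WORLD);
        }
    protected:
        virtual void serialize(std::vector<char>& buf) const = 0;
    };

    class Traveler : public Movable {        // an agent that migrates
    public:                                  // between CPUs
        explicit Traveler(int id) : id(id) {}
    protected:
        void serialize(std::vector<char>& buf) const {
            const char* p = reinterpret_cast<const char*>(&id);
            buf.insert(buf.end(), p, p + sizeof(id));  // pack agent state
        }
    private:
        int id;
    };

The receiving side would mirror this with MPI_Recv and a deserialize
specialization, also hidden in the base class, so model developers never
touch MPI directly.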

--Doug

--
Doug Roberts, RTI International
droberts at rti.org
doug at parrot-farm.net
505-455-7333 - Office
505-670-8195 - Cell