
Saturday, August 3, 2013

What is it with this linear scaling stuff anyway?

Enormous amounts of research time have gone into developing computational methods that scale linearly with system size. That is, double the size of your system and you only double the computation time. If all methods behaved like that, it would be easier to guess when machines on your local supercomputer cluster would free up, instead of staring at a wall of 200+ hour jobs just sitting there because people don't give a crap.



Inspired by +Jan Jensen and a recent blog post of his (which I was reminded of when I wrote another blog post on the subject of many-body expansions), I set out to do the timing calculations myself, albeit with a different goal in mind.

2-body calculations
Even if you use the many-body expansion of the energy, I showed previously that the accumulated number of calculations increases dramatically for large N-body expansions. If we focus only on one- and two-body calculations, the effect is barely visible in the previous plot, but when we compute the timings for Jan's linear model (nearest neighbors only) alongside a model where we do all pairs, we see that even at the two-body level there is no linear scaling unless you make some approximations.

Here, I have assumed a computational scaling of $\alpha=2.8$ and uniform monomer sizes. I've also assumed that a monomer calculation takes 1 s and that there is no overhead and no interactions at the monomer level.
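To make this concrete, here is a minimal sketch of the cost model in Python. The function names are mine, and the only assumptions are the ones just stated: a 1 s monomer calculation, $\alpha=2.8$, and a dimer (twice the monomer size) therefore costing $2^{\alpha}$ times as much as a monomer.

```python
from math import comb

ALPHA = 2.8      # assumed computational scaling exponent
T_MONO = 1.0     # assumed cost of one monomer calculation, in seconds

def two_body_time(n_monomers, n_pairs):
    """Total time: n monomer calculations plus n_pairs dimer
    calculations, a dimer being twice the monomer size."""
    return n_monomers * T_MONO + n_pairs * T_MONO * 2 ** ALPHA

def all_pairs_time(n):
    """Full two-body expansion: every unique pair of monomers."""
    return two_body_time(n, comb(n, 2))   # O(n^2) pairs

def linear_model_time(n):
    """Nearest-neighbour-only model: just the n - 1 adjacent pairs."""
    return two_body_time(n, n - 1)        # O(n) pairs

for n in (16, 32, 64):
    print(n, round(all_pairs_time(n)), round(linear_model_time(n)))
```

Doubling n roughly quadruples the all-pairs time but only doubles the nearest-neighbour time, which is exactly the gap between the two curves in the plot.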

Admittedly, the linear model is crude, but it shows the best scaling you could hope for by including the minimum amount of two-body calculations. In a more realistic case, you would end up somewhere between the red and the black line, but that is the subject for a future post.

This is why we need linear scaling!

3-body calculations
Just for the fun of it, here is the 3-body scaling,
and I dare not think what the time would be for even higher n-body calculations without approximations.
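Extending the same toy cost model one order up shows why. This is again a sketch under the same assumptions (1 s monomer, $\alpha=2.8$), with a k-mer of k monomers assumed to cost $k^{\alpha}$ seconds:

```python
from math import comb

ALPHA = 2.8      # assumed scaling exponent, as before
T_MONO = 1.0     # assumed 1 s per monomer calculation

def full_expansion_time(n, order):
    """Time for a full many-body expansion truncated at `order`,
    with no cutoffs: all C(n, k) k-mers, each costing k**ALPHA seconds."""
    return sum(comb(n, k) * k ** ALPHA * T_MONO for k in range(1, order + 1))

# full 2-body vs full 3-body for a few system sizes
for n in (16, 32, 64):
    print(n, round(full_expansion_time(n, 2)), round(full_expansion_time(n, 3)))
```

The trimer term alone, $\binom{n}{3} \cdot 3^{\alpha}$, quickly dwarfs everything below it as n grows.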

I think we can all agree that approximations must be made, or else we are doomed.

We need linear scaling!

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.

Wednesday, July 31, 2013

would you do FMO-10 if you could?

The many-body expansion of the energy is a popular way to reduce the cost of computing the energy of a system too large to fit into your computer. The strategy is quite simple: chop your large system into several small pieces and do your quantum mechanics on each piece. In practice you can stop here (1-body), but often people do pair-corrections (2-body) to get good results, and triples-corrections (3-body) if they want to be fancy. There are many implementations of this, including the eXplicit-POLarization method, the Fragment Molecular Orbital method and the electrostatically-embedded many-body method, just to name a few.

My question is: would you do a 10-body calculation if you could?

Here is an argument against it, and it does not involve anything to do with accuracy, but rather computational cost.

The number of n-mers (1-mers, 2-mers, 3-mers and so on) increases quite drastically if one does not include approximations, but I can hear you ask: how bad is it?

For 16 water molecules, where each water molecule is a 1-mer, the total number of calculations one would need to perform for an n-body calculation is presented in the figure below


where we see that a 10-body calculation would require a total of 58650 unique calculations, whereas the 3-body calculation requires a "mere" 696.
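You can reproduce these counts yourself: a full n-body expansion over N monomers needs every k-mer for k = 1, …, n, i.e. a sum of binomial coefficients. A minimal sketch (the function name is mine):

```python
from math import comb

def n_body_calculations(n_monomers, order):
    """Number of unique fragment calculations in a full many-body
    expansion truncated at `order`, with no approximations:
    all 1-mers, 2-mers, ..., order-mers."""
    return sum(comb(n_monomers, k) for k in range(1, order + 1))

print(n_body_calculations(16, 3))   # -> 696
print(n_body_calculations(16, 10))  # -> 58650
```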

Edit: For a discussion about timings, +Jan Jensen wrote a blog post on the subject.

That's at least one argument against dreaming of large n-body calculations, unless you sit down and think seriously about eliminating some of these calculations.