Friday, August 23, 2013

Be careful when you extrapolate to the complete basis set limit!

In our soon-to-be-finised manuscript on shielding constants obtained through the QM/MM with the MM-region described by a high-quality polarizable embedding potential (PE), I've done a thing I should have done a while ago. But, as it is always the case with these things, you end up doing it in the wrong order.

I read the blogpost on basis set extrapolation by +Jan Jensen and thought I should try. Here is what I got by calculating the shielding constants for oxygen-17 in acrolein using KT3/pcS-n//B3LYP/aug-cc-pVDZ solvated by a 12 Å sphere of explicit water molecules described by the PE potential.

pcS-0 pcS-1 pcS-2 pcS-3
-309.2 -216.9 -191.6 -188.2

How do you continue from these values to an estimate of the basis set limit at infinity? According to Jans blog post, you should fit your data according to
$$Y(l_{max}) = Y_{CBS} + a \cdot e^{b \cdot l_{max}}$$
to get the most accurate result. I have done so in using WolframAlpha and I have obtained the following plot (blue dashed line) with $Y_{CBS}=-176.5$ ppm.

"That is a terrible fit!", I can hear you say. And indeed it is. What is going on? It turns out (and please give me some references if you know them!) that the pcS-0 results are actually too bad to be taken seriously. The single zeta basis set is not enough to even get a qualitatively correct description of that wave function, i.e. what you get is just wrong. If you remove that point you get the solid blue curve which is a really good fit with $Y_{CBS}=-187.7$ ppm.

If you want to use the alternative extrapolation scheme that Jan provides, i.e.
$$Y(l_{max}) = Y_{CBS} + b\cdot l_{max}^{-3}$$
one obtains the red solid curve with $Y{CBS}=-182.9$ ppm which 5 ppm off from the exponential fit.

As Frank (and Grant) are commenting below, one should not trust numbers from SZ basis sets and +Anders Steen Christensen noted that even the DZ results could/should be disregarded. The only problem is that the size of my calculations are increasing, and a 5Z calculation is pretty much out of reach beyond the TZ basis set. Jans post does mention that one should use a DZ quality basis set, shame on me I guess for even trying with pcS-0.

Just be careful out there and remember to extrapolate!

edit1: fixed the last formula

edit2: Frank Jensen has given a lengthy comment on the matter which gives a lot of insigth. Read it below. I've added some clarifying text here and there based on his comments and will likely follow up on it in a later blog post.