Showing posts with label extrapolation. Show all posts
Showing posts with label extrapolation. Show all posts

Friday, August 23, 2013

Be careful when you extrapolate to the complete basis set limit!

In our soon-to-be-finised manuscript on shielding constants obtained through the QM/MM with the MM-region described by a high-quality polarizable embedding potential (PE), I've done a thing I should have done a while ago. But, as it is always the case with these things, you end up doing it in the wrong order.

I read the blogpost on basis set extrapolation by +Jan Jensen and thought I should try. Here is what I got by calculating the shielding constants for oxygen-17 in acrolein using KT3/pcS-n//B3LYP/aug-cc-pVDZ solvated by a 12 Å sphere of explicit water molecules described by the PE potential.

pcS-0 pcS-1 pcS-2 pcS-3
-309.2 -216.9 -191.6 -188.2

How do you continue from these values to an estimate of the basis set limit at infinity? According to Jans blog post, you should fit your data according to
$$
Y(l_{max}) = Y_{CBS} + a \cdot e^{b \cdot l_{max}}
$$
to get the most accurate result. I have done so in using WolframAlpha and I have obtained the following plot (blue dashed line) with $Y_{CBS}=-176.5$ ppm.



"That is a terrible fit!", I can hear you say. And indeed it is. What is going on? It turns out (and please give me some references if you know them!) that the pcS-0 results are actually too bad to be taken seriously. The single zeta basis set is not enough to even get a qualitatively correct description of that wave function, i.e. what you get is just wrong. If you remove that point you get the solid blue curve which is a really good fit with $Y_{CBS}=-187.7$ ppm.

If you want to use the alternative extrapolation scheme that Jan provides, i.e.
$$
Y(l_{max}) = Y_{CBS} + b\cdot l_{max}^{-3}
$$
one obtains the red solid curve with $Y{CBS}=-182.9$ ppm which 5 ppm off from the exponential fit.

As Frank (and Grant) are commenting below, one should not trust numbers from SZ basis sets and +Anders Steen Christensen noted that even the DZ results could/should be disregarded. The only problem is that the size of my calculations are increasing, and a 5Z calculation is pretty much out of reach beyond the TZ basis set. Jans post does mention that one should use a DZ quality basis set, shame on me I guess for even trying with pcS-0.

Just be careful out there and remember to extrapolate!

edit1: fixed the last formula

edit2: Frank Jensen has given a lengthy comment on the matter which gives a lot of insigth. Read it below. I've added some clarifying text here and there based on his comments and will likely follow up on it in a later blog post.