Science. Shared. Data.

Friday, June 3, 2016

Generate CUBE files in GAMESS

GAMESS can generate CUBE files of various computable properties so you can visualize the results in programs like Avogadro or GaussView.

Here, I'll show how to compute the density of a methanol molecule since I struggled to generate the grid in the right way (the INPUT description was not exactly helpful).

The input file is

if run through GAMESS you'll get the usual output file, but more importantly in your user scratch space you'll get a .dat file too. This file contains the CUBE data we requested. Just search for START to locate where the CUBE file starts, extract that and change the file extension to .cube. This file can automatically be loaded into Avogadro where you can create a surface of the data to get something like the following

EDIT: A note on the $GRID section in the input file. One selects an origin through the ORIGIN(1) keyword. When selecting the vectors for the box, one must provide the coordinate of the X, Y and Z corners and not just the change in vector. You can see that XVEC(1)=6,5,4 which tells GAMESS that the vector that takes us from ORIGIN(1) to XVEC(1) is 8,0,0. I spent a good 20 minutes trying to figure out what the problem was since I thought that one should specify 8,0,0 and not 6,5,4 in XVEC. Oh well.

Tuesday, August 4, 2015

The basis set project continues

In the last blog post, we were discussing both the saturation and the contraction. In the end, I've settled on a (26s16p12d2f) basis set for the aug-cc-pVTZ-Juc basis set (uc is for uncontracted) and now I turn to the contraction of the basis set.

So why do we contract the basis set? - well – for every primitive s-function have in our system, we increase the size of the Fock matrix by one. For p-functions, we have $p_x$, $p_y$ and $p_z$ and thus for every primitive p-function we increase the size of the Fock matrix by three. For d-fucntions it's five and for f-functions is 10. So, by just including one atom using our newly constructed aug-cc-pVTZ-Juc basis set increases the size of the Fock matix by

\[ 26 \times 1 + 3 \times 16 + 12 \times 5 + 2 \times 10 = 156 \]

and remember that the methods you want to use scales really bad with the number of basis functions.

So a contraction of the basis set can be seen as a reasonable reduction of the basis set size without compromising its quality (too much).

Above you see a figure I 've made with the error in percent to aug-cc-pVTZ-Juc as the contraction level is increased. The further you go to the right, the more s-functions (see the number on the x-axis) you have in your contracted basis set. Obviously, the fewer the better because of computational speed but as you go further to the left, the error starts increasing. Initially not so much, but below 10 some errors are > 100 %.

I've decided that an error below 1 % in the contraction when compared to the uncontracted basis set is acceptable (at least for the s-functions), but as observed from the figure above, this threshold is reached at quite different points depending on the which element is investigated.

And this is where it stops being: there is one answer only and it turns into voodoo. So personally, I think that a contraction level of 14 would be quite nice, but since that is the first point where all five elements are below or very close to 1 % (Bromine is not quite there) I don't feel that this should be the contraction level. Going to a contraction level of 15, all calculations give results that are below 1 % but somehow it saddens me that the error for Gallium shoots up to 1 % while Bromine goes down. Obviously I should not concern myself with Germanium, Arsenic or Selenium as their errors are well below 0.2 % for the same contraction level. I guess for s-functions I could probably settle with a contraction level of 17 since reduces the error to 0.6 % in the worst case (Gallium) without adding a lot of computational expense, but then again – a difference of 0.4 % is not really a whole lot.

Tuesday, July 21, 2015

Revival of an old basis set project

During the stillness in the summer months I've revived an old project regarding the construction of a new basis set optimized for calculating spin-spin coupling constants. It is the aug-cc-pVTZ-J basis set family I hope to contribute to and I am deriving basis sets for the elements Gallium, Germanium, Arsenic, Selenium and Bromine.

I've already constructed the uncontracted version of the basis set (aptly named aug-cc-pVTZ-Juc) by saturating it with additional primitive s, p and d functions following the even-tempered approach were the ratio between exponents are kept constant. It is a pretty straightforward and standard way of increasing the size of a basis set.

If you do this for the simplest hydride of Bromine, I.e. HBr, you end up with something that looks like the following figure when calculating spin-spin coupling constants as the basis is saturated with tight s-functions (black), five tight s-functions and tight p-functions (medium grey) and five tight s-functions, two tight p-functions and tight d-functions (light gray):

It can be seen that there is a nice convergence as the basis set is saturated with s-functions (owing to a convergence of the Fermi contact term). Adding tight p- and d-functions are observed to yield a small increase and a bit of wobbling back and forth as convergence is reached.

I've settled on the the following primitive set of basis functions for the basis sets across all tested elements: [26s16p14d2f] which means I've added 5s, 2p and 4d primitive basis functions to the original uncontracted aug-cc-pVTZ basis set.

However, this is much too large to be useful in any sort of real calculation except for the very basic hydrides I'm using. So we need to contract the basis set. So far, the approach with the J-family of the aug-cc-pVTZ basis sets has been to use the molecular obital coefficients from the simple hydrides as contraction coefficients up to some level (I should rather use down to I suppose but it feels wrong).

Contracting the 26 primitive s-functions first into ns contracted s-functions (while letting the p and d-functions remain uncontracted) gives the solid black line with cirles in the following figure where the error compared to the fully uncontracted basis set is given in percent.

Here we observe that we must use a rather loose form of contraction of at least 15 (I.e. the 26 primitives are fixed in such a way that there is only 15 contracted basis functions left) to obtain an error below 1 % compared to the uncontracted level.

I've furthermore shown data for using a contraction of either 17 or 18 for the s-functions (to get a really low error) and contracted the p-functions on top of that (orange for 17 and blue for 18, respectively). Here we observe that saturation in both cases is obtained by contracting the 16 primitive p-functions into 10 contracted p-functions. Notice that the difference between the orange and blue curve is equal to the difference between the error we make in the contraction of the s-functions. Going above contraction level 10 for the p-functions is not giving us much improvement overall.

Finally, the dashed gray lines represent the contraction of d-functions on top of either the 17s10p or 18s10p results. We observe that a contraction level of 10 is needed for convergence (although it could be argued that a contraction level 9 is more than enough). Again, the difference between 17s10p10d and 18s10p10d is equal to the error from the contraction of the s-functions and p-functions.

As a final note, we see from the 17s11p and 17s12p that using either of these as a base for the contraction of the d-functions, the errors amounts to the difference betwen the contraction level of the p-functions.

Monday, December 2, 2013

Release of FragIt 1.3

I'm proud to announce the immediate availability of FragIt 1.3!

From the release notes:

* Stabilization of the API for 3rd party uses

* API has support for a QM-region in the XYZ-MFCC
   writer. This QM-region can be expanded by including
   covalently bound or hydrogen bound neighbours.

   This support is slated to be included more generally
   in a future fragit 1.4 release series.

* A visually more pleasing color scheme. We are no
   longer living in an 8-bit world.

* FragIt now exits more cleanly if it cannot find itself.

The visually more pleasing color scheme was discussed in a previous post.

Friday, November 8, 2013

Calculating Shieldings for Proteins

I'm just going to show some recent work on extending the Polarizable Embedding from being a focused QM/MM model into an approach to generate parameters for entire proteins or DNA. Here is a snapshot of the human protein GB3 (PDB: 2OED) with the central QM-fragment we are interested in - here LYS4 - shown in vdw-spheres. Also in the QM-region are neighbouring residues and stuff (side-chains and water) that binds to the central fragment of interest) all highlighted in cyan in sticks. The rest of the protein along with a lot of water is the MM-region. Only the protein is shown in a classical cartoon rendering.

Enjoy

Wednesday, October 30, 2013

FragIt has a better look!

Active development with FragIt is still going strong as I work my way through interfacing it with other software that we are building in-house to allow you to run these high-quality QM/MM calculations with relative ease.

I've just added a visually more pleasing color scheme to the development version which you can see in the image below

The old colorscheme is to the left and the new is to the right.

it just became a lot cooler to visually inspect your molecule of interest.

Friday, October 18, 2013

Release of FragIt 1.2

After a ferocious development spree, I am proud to announce the release of FragIt 1.2.

Under the hood, it has primarily been reworking some of the finer details of the API to allow for better integration with programs wishing to use it.

Directly from the release-notes:

This release provides Molcular Fragmentation with Conjugate Caps (MFCC) support through FragIt. Currently, it dumps all capped fragments and all caps as separate .xyz files so you can process them further on in the quantum chemistry code of your choice. You can access it through the writer = XYZ-MFCC option in the [fragmentation] group.
This release also includes some updated API for better integration with other codes that wants to interface with FragIt.

FragAway!