28 August 2015

623. Comparing frequency calculations in G09 and NWChem -- the importance of grid density

The vibrational entropy term will differ between calculations done in gaussian and nwchem unless you use "grid xfine" in nwchem. That the grid density is important is nothing new, but the magnitude of the effect on the entropy surprised me.

The difference can be large -- close to 30 cal/(mol K) for [PPh4]+ at PBE0/cc-pVDZ -- which becomes quite significant (roughly 8-9 kcal/mol) when multiplied by T (e.g. 298.15 K).

Note that the rotational entropy term also may differ, but that this would be due to different uses of symmetry in the calculations: http://molecularmodelingbasics.blogspot.com.au/2012/12/conformational-and-rotational-entropy.html

If you turn off symmetry (noautosym) in NWChem, the rotational entropy will not be corrected. I've noticed that Gaussian, on the other hand, will sneakily apply a correction if it finds an acceptable symmetry even if you request NoSymm, so make sure that you scan through the output carefully.

Either way, vibrational entropy is not symmetry dependent. Instead you will have to worry about the grid fineness when comparing outputs.

If your molecule is very small, such as benzene or tetramethylphosphonium, it seems that you don't have to worry about this. However, even fairly small molecules such as [PPh4]+ will be affected.

Conv. Dens. = Convergence Density


Code  Symm  Grid  Conv. Dens.  DFT Energy      ZPE       HCorr     S(tot)   S(trans)  S(rot)  S(vib)
G09   N     F     1E-8         -1266.58424152  0.370516  0.389751  147.076  43.358    35.009  68.708
G09   N     U     1E-8         -1266.58430374  0.370464  0.389691  146.829  43.358    35.009  68.461
NW    N     X     1E-8         -1266.58455223  0.370348  0.389549  146.697  43.339    34.994  68.365
NW    N     X     1E-5         -1266.58455222  0.370348  0.389549  146.704  43.339    34.994  68.371
NW    Y     F     1E-5         -1266.58453684  0.370034  0.385683  118.023  43.339    33.617  41.067
NW    Y     F     1E-8         -1266.58453684  0.370034  0.385683  118.023  43.339    33.617  41.067
NW    N     F     1E-5         -1266.58454928  0.370034  0.385683  119.394  43.339    34.994  41.062
NW    N     F     1E-8         -1266.58454929  0.370034  0.385683  119.394  43.339    34.994  41.061
NW    Y     X     1E-5         -1266.58455274  0.370348  0.389549  145.331  43.339    33.617  68.376
NW    Y     X     1E-8         -1266.58455275  0.370348  0.389549  145.337  43.339    33.617  68.382

F=fine. X=Extrafine. U=Ultrafine.
S(rot) values for the runs with symmetry enabled (Symm = Y; shown in blue in the original post) are symmetry corrected. That's completely normal.
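
As a quick sanity check on the symmetry correction, here is a minimal sketch that assumes the usual rigid-rotor expression, in which the corrected S(rot) is lower than the uncorrected one by R ln(sigma). The function name is my own, and the two S(rot) values are the NWChem numbers from the table (34.994 without symmetry, 33.617 with symmetry).

```python
import math

R_CAL = 1.98720  # gas constant in cal/(mol K)

def implied_symmetry_number(s_rot_uncorrected, s_rot_corrected):
    """Return the sigma implied by S_rot(uncorrected) - S_rot(corrected) = R ln(sigma)."""
    return math.exp((s_rot_uncorrected - s_rot_corrected) / R_CAL)

# NWChem S(rot) values from the table, in cal/(mol K)
sigma = implied_symmetry_number(34.994, 33.617)
print(f"implied symmetry number: {sigma:.2f}")
```

Under that assumption the difference of 1.377 cal/(mol K) corresponds to a symmetry number very close to 2.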

With "grid fine" NWChem gives a very different result to G09.
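
To put a number on it, here is a small sketch of the arithmetic, using the S(vib) values from the table for G09 (fine grid) and NWChem ("grid fine", no symmetry, 1E-8) and assuming T = 298.15 K. The variable names are mine.

```python
T = 298.15  # temperature in K

# S(vib) values from the table, in cal/(mol K)
s_vib_g09_fine = 68.708
s_vib_nw_fine = 41.061

delta_s = s_vib_g09_fine - s_vib_nw_fine   # cal/(mol K)
t_delta_s_kcal = T * delta_s / 1000.0      # kcal/mol

print(f"delta S(vib) = {delta_s:.3f} cal/(mol K)")
print(f"T * delta S  = {t_delta_s_kcal:.2f} kcal/mol")
```

That is a bit over 8 kcal/mol in the free energy at room temperature, from the grid setting alone.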

You can see the difference in the predicted IR spectra as well.

[Figure: predicted IR spectrum, NWChem "grid fine" (blue rings) vs G09 (red circles)]
[Figure: predicted IR spectrum, NWChem "grid xfine" (blue rings) vs G09 (red circles)]


21 August 2015

622. Crude comparison of a simple frequency calculation in different computational packages

I've given the input files at the bottom of the post.

The job is a simple benzene frequency calculation at PBE0/def2-TZVP. The jobs were run on an AMD FX 8150 with 32 GB RAM (Debian Jessie, 64-bit Linux).

Orca 3.0.3 and G09 (AM64L rev D.01) were supplied as precompiled binaries.

Nwchem 6.5 and Gamess US 5 Dec 2014 (R1) were compiled by me, and the poor performance of either code may thus be due solely to something that I've done. Both codes were linked against ACML 5.3.1 and compiled with gfortran.

I'm also not familiar with orca and gamess, and so I don't know how to get the best performance from either code. I defined the basis set explicitly in all codes.

Also note that for other frequency calculations Orca seems orders of magnitude slower than NWChem -- in fact the Orca jobs are so slow that I haven't let them finish after waiting for a week, when in NWChem the same jobs take a day.

Basically, my empirical but poorly-supported-by-data view is that G09 is by far the fastest, then Nwchem, then Orca and finally Gamess. In the case of Gamess this may well be due to how I compiled it.

Nwchem was compiled roughly as was shown here: http://verahill.blogspot.com.au/2014/09/593-nwchem-65-on-debian-jessie-and.html

I'll post the compilation of Gamess US 5/12/2014 R1 later.

Results
Note: Orca used a symmetry number of 3 in the calculation of S(rot). I've 'corrected' it back to the non-symmetry corrected value so that it can be compared with the output from the other codes. Likewise, I've divided the entropy terms from Orca by 298.15 K.
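
The two adjustments can be sketched as follows. This assumes that the Orca entropy terms were taken as T*S in Hartree and that the symmetry correction has the usual rigid-rotor form, -R ln(sigma). The function names are mine, and the input value is illustrative only; it is not copied from the Orca output.

```python
import math

HARTREE_TO_CAL_PER_MOL = 627509.474  # 1 Hartree in cal/mol
R_CAL = 1.98720                      # gas constant in cal/(mol K)
T = 298.15                           # temperature in K

def ts_hartree_to_entropy(ts_hartree, temperature=T):
    """Convert a T*S term in Hartree to an entropy in cal/(mol K)."""
    return ts_hartree * HARTREE_TO_CAL_PER_MOL / temperature

def remove_symmetry_correction(s_rot_corrected, sigma):
    """Undo the -R ln(sigma) symmetry correction on S(rot)."""
    return s_rot_corrected + R_CAL * math.log(sigma)

# Illustrative T*S(rot) value in Hartree (assumed, not from the Orca output)
ts_rot = 0.011140
s_rot_corrected = ts_hartree_to_entropy(ts_rot)
s_rot_uncorrected = remove_symmetry_correction(s_rot_corrected, sigma=3)

print(f"S(rot), symmetry corrected:   {s_rot_corrected:.3f} cal/(mol K)")
print(f"S(rot), correction removed:   {s_rot_uncorrected:.3f} cal/(mol K)")
```

With sigma = 3 the correction amounts to about 2.18 cal/(mol K), so it is not negligible when comparing S(rot) across codes.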

Overall the results agree very well across the codes, although the electronic energy in Orca is noticeably different from the others.

Energies are in Hartree (H); entropies are in cal/(mol K).

Code    Runtime  'DFT' energy (H)  ZPE (H)   Hcorr (H)  S(tot)  S(trans)  S(rot)  S(vib)
G09     12 min   -232.04511372     0.100682  0.106028   69.033  38.979    25.625  4.428
Orca    17 min   -232.04530127     0.100605  0.105960   69.062  38.974    25.628  4.461
Nwchem  41 min   -232.04516624     0.100836  0.106162   68.950  38.962    25.614  4.375
Gamess  76 min   -232.04517728     0.100701  0.10604    69.011  38.979    25.625  4.407
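
To see how far apart the electronic energies in the table are, here is a short sketch that expresses each one relative to G09 in kcal/mol. The conversion factor and the variable names are my additions; the energies are the ones listed above.

```python
HARTREE_TO_KCAL = 627.509474  # 1 Hartree in kcal/mol

# 'DFT' energies from the table, in Hartree
energies = {
    "G09": -232.04511372,
    "Orca": -232.04530127,
    "Nwchem": -232.04516624,
    "Gamess": -232.04517728,
}

reference = energies["G09"]
diffs = {code: (e - reference) * HARTREE_TO_KCAL for code, e in energies.items()}

for code, diff in diffs.items():
    print(f"{code:7s} {diff:+.3f} kcal/mol relative to G09")
```

Orca ends up about 0.12 kcal/mol below G09, while NWChem and Gamess are within roughly 0.04 kcal/mol of it.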

G09 input:
%nprocshared=8
%Mem=3500000000
%Chk=benzene_freq.chk
#P rPBE1PBE/GEN 5D 7F Freq=() NoSymm Integral(UltraFine )  Punch=(MO) Pop=() 

benzene freq

0 1 ! charge and multiplicity
 C     1.20188     0.693923     0.00000
 C     9.00000e-06     1.38776     0.00000
 C     -1.20188     0.693842     0.00000
 C     -1.20188     -0.693933     0.00000
 C     -1.60000e-05     -1.38777     0.00000
 C     1.20188     -0.693829     0.00000
 H     2.14078     1.23611     0.00000
 H     5.10000e-05     2.47197     0.00000
 H     -2.14085     1.23591     0.00000
 H     -2.14082     -1.23604     0.00000
 H     -2.60000e-05     -2.47197     0.00000
 H     2.14082     -1.23594     0.00000

 C  0
 S   6  1.00
   13575.34968200      0.00022246
    2035.23336800      0.00172327
     463.22562359      0.00892557
     131.20019598      0.03572798
      42.85301589      0.11076260
      15.58418577      0.24295628
 S   2  1.00
       6.20671385      0.41440263
       2.57648965      0.23744969
 S   1  1.00
       0.57696339      1.00000000
 S   1  1.00
       0.22972831      1.00000000
 S   1  1.00
       0.09516444      1.00000000
 P   4  1.00
      34.69723224      0.00533337
       7.95826228      0.03586411
       2.37808269      0.14215873
       0.81433208      0.34270472
 P   1  1.00
       0.28887547      1.00000000
 P   1  1.00
       0.10056824      1.00000000
 D   1  1.00
       1.09700000      1.00000000
 D   1  1.00
       0.31800000      1.00000000
 F   1  1.00
       0.76100000      1.00000000
 ****
 H  0
 S   3  1.00
      34.06134100      0.00602520
       5.12357460      0.04502109
       1.16466260      0.20189726
 S   1  1.00
       0.32723041      1.00000000
 S   1  1.00
       0.10307241      1.00000000
 P   1  1.00
       0.80000000      1.00000000
 ****



Orca input:
%pal nprocs 8 end
! DFT pbe0 def2-tzvp printbasis
! freq
%basis
newgto H
S   3
  1     34.0613410              0.60251978E-02   
  2      5.1235746              0.45021094E-01   
  3      1.1646626              0.20189726       
S   1
  1      0.32723041             1.0000000        
S   1
  1      0.10307241             1.0000000        
P   1
  1      0.8000000              1.0000000        
end
newgto C
S   6
  1  13575.3496820              0.22245814352E-03      
  2   2035.2333680              0.17232738252E-02      
  3    463.22562359             0.89255715314E-02      
  4    131.20019598             0.35727984502E-01      
  5     42.853015891            0.11076259931    
  6     15.584185766            0.24295627626    
S   2
  1      6.2067138508           0.41440263448    
  2      2.5764896527           0.23744968655    
S   1
  1      0.57696339419          1.0000000        
S   1
  1      0.22972831358          1.0000000        
S   1
  1      0.95164440028E-01            1.0000000        
P   4
  1     34.697232244            0.53333657805E-02      
  2      7.9582622826           0.35864109092E-01      
  3      2.3780826883           0.14215873329    
  4      0.81433208183          0.34270471845    
P   1
  1      0.28887547253          1.0000000        
P   1
  1      0.10056823671          1.0000000        
D   1
  1      1.09700000             1.0000000        
D   1
  1      0.31800000             1.0000000        
F   1
  1      0.76100000             1.0000000      
end
end
* xyz 0 1
C             0.03998          -1.38721           0.00000
C             1.22135          -0.65898           0.00000
C             1.18137           0.72823           0.00000
C            -0.03998           1.38721           0.00000
C            -1.22135           0.65898           0.00000
C            -1.18137          -0.72823           0.00000
H             0.07121          -2.47097           0.00000
H             2.17552          -1.17381           0.00000
H             2.10431           1.29715           0.00000
H            -0.07121           2.47097           0.00000
H            -2.17552           1.17381           0.00000
H            -2.10431          -1.29715           0.00000
*

NWChem input:
scratch_dir /home/me/scratch
Title "benzene freq"

Start  freq

echo

charge 0

geometry noautosym units angstrom
 C     1.20188     0.693923     0.00000
 C     9.00000e-06     1.38776     0.00000
 C     -1.20188     0.693842     0.00000
 C     -1.20188     -0.693933     0.00000
 C     -1.60000e-05     -1.38777     0.00000
 C     1.20188     -0.693829     0.00000
 H     2.14078     1.23611     0.00000
 H     5.10000e-05     2.47197     0.00000
 H     -2.14085     1.23591     0.00000
 H     -2.14082     -1.23604     0.00000
 H     -2.60000e-05     -2.47197     0.00000
 H     2.14082     -1.23594     0.00000
end

basis "ao basis" spherical print
H    S
    34.061341000000     0.006025197800
     5.123574600000     0.045021094000
     1.164662600000     0.201897260000
H    S
     0.327230410000     1.000000000000
H    S
     0.103072410000     1.000000000000
H    P
     0.800000000000     1.000000000000
C    S
 13575.349682000000     0.000222458144
  2035.233368000000     0.001723273825
   463.225623590000     0.008925571531
   131.200195980000     0.035727984502
    42.853015891000     0.110762599310
    15.584185766000     0.242956276260
C    S
     6.206713850800     0.414402634480
     2.576489652700     0.237449686550
C    S
     0.576963394190     1.000000000000
C    S
     0.229728313580     1.000000000000
C    S
     0.095164440028     1.000000000000
C    P
    34.697232244000     0.005333365781
     7.958262282600     0.035864109092
     2.378082688300     0.142158733290
     0.814332081830     0.342704718450
C    P
     0.288875472530     1.000000000000
C    P
     0.100568236710     1.000000000000
C    D
     1.097000000000     1.000000000000
C    D
     0.318000000000     1.000000000000
C    F
     0.761000000000     1.000000000000
END

dft
  mult 1
  direct
  XC pbe0
  grid xfine
  convergence density 1e-08
  mulliken
end

task dft energy
task dft freq

GAMESS US input:
 $SYSTEM MWORDS=3500 $END
 $CONTRL RUNTYP=Hessian $END
 $CONTRL SCFTYP=RHF $END
 $DFT  DFTTYP=PBE0 $END
 $CONTRL ICHARG=0  MULT=1 $END
 $CONTRL ISPHER=1 $END
 $SCF DIRSCF=.TRUE. $END
 $BASIS EXTFIL=.TRUE. GBASIS=DEF2TZVP $END
 $DATA
 Benzene
 C1
 C  6.000000 0.039980 -1.387210 0.000000
 C  6.000000 1.221350 -0.658980 0.000000
 C  6.000000 1.181370 0.728230 0.000000
 C  6.000000 -0.039980 1.387210 0.000000
 C  6.000000 -1.221350 0.658980 0.000000
 C  6.000000 -1.181370 -0.728230 0.000000
 H  1.000000 0.071210 -2.470970 0.000000
 H  1.000000 2.175520 -1.173810 0.000000
 H  1.000000 2.104310 1.297150 0.000000
 H  1.000000 -0.071210 2.470970 0.000000
 H  1.000000 -2.175520 1.173810 0.000000
 H  1.000000 -2.104310 -1.297150 0.000000
 $END

17 August 2015

621. Very briefly: comparison of different hardware for a single G09 calculation

Update: I spotted a few mistakes
* the L5430 job ran on a dual-socket machine, so I've multiplied the passmark by two and have replotted
* the X3480 job used the EM64T version of Gaussian, not AMD64. I don't have a license to test that system using AMD64.

Original post:
There are lots of potential flaws when comparing the performance of a computational package on different hardware. Thus, it can be difficult to find examples online comparing different hardware using computational chemistry packages which makes it challenging to decide on what hardware to budget for.

So here's a simple comparison of a few different types of hardware for a combined geometry optimisation and frequency calculation in Gaussian.

All systems have spinning (7200 rpm) disks and use debian jessie (64). The systems haven't been optimised in any way.

All systems used the AMD64 version of G09 rev D.01 unless noted otherwise. The amount of time the geometry optimisation took is given within [].

Performance:
2h 15 min. [1h 14 min.] i7-4930K/ 32 Gb ram/ 12 threads
3h 49 min. [2h 12 min.] AMD FX 8350/ 8 Gb ram/ 8 threads
4h 12 min. [2h 19 min.] i5-2400/ 16 Gb ram/ 4 threads
4h 28 min. [2h 16 min.] dual-socket Xeon L5430/ 16 Gb ram/ 8 threads
4h 43 min. [2h 47 min.] AMD FX 8150/ 32 Gb ram/ 8 threads

I also tried the EM64T version of G09 rev D.01 on some of the systems and got:
1h 43 min. [57 min.] i7-4930K/ 32 Gb ram/ 12 threads
3h 03 min. [1h 44 min.] i5-2400/16 Gb ram/ 4 threads
4h 21 min. [2h 27 min.] Xeon X3480/ 8 Gb ram/ 8 threads

Just by switching from the AMD64 to the EM64T version we thus cut the run time to about three quarters of the original (roughly 76% for the i7 and 73% for the i5-2400).
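
The ratios can be checked with a few lines of Python. This is just a sketch of the arithmetic using the total run times listed above; the helper and dictionary names are my own.

```python
def to_minutes(hours, minutes):
    """Convert an (hours, minutes) run time to minutes."""
    return 60 * hours + minutes

# Total run times as (hours, minutes): (AMD64 version, EM64T version)
runs = {
    "i7-4930K": ((2, 15), (1, 43)),
    "i5-2400": ((4, 12), (3, 3)),
}

ratios = {
    cpu: to_minutes(*em64t) / to_minutes(*amd64)
    for cpu, (amd64, em64t) in runs.items()
}

for cpu, ratio in ratios.items():
    print(f"{cpu}: EM64T run time is {ratio:.0%} of the AMD64 run time")
```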

Here's a plot of the run times vs Passmark benchmarks:

The performance of the i5-2400 is much better than it should be for some reason.


So what's the benchmarking job that I used? I actually prefer not to reveal it, as it'd eventually point towards my identity (and you're not supposed to publish Gaussian benchmarks...)

Suffice to say that it uses:
rpbe0/def2-svp and 459 functions/759 primitives (46 atoms)