05 September 2013

513. Extracting data from a PES scan with gaussian

There are a few reasons to like gaussian, and many reasons not to. Gaussian is fast, and their whitepapers are great resources for learning computational techniques.

Without going into discussions about the commercial behaviour of Wavefunction inc., the things I don't like about gaussian is the clunky input format (nwchem has a much more readable syntax), the inscrutable error messages, and the unreadable output. Well, it's not unreadable in a literal sense, but it could certainly be clearer. On the other hand, I've having issues with running some of my PES scans in nwchem -- and I can't find a solution (more about that in a later post)

Anyway, here's a python script for extracting optimized structures and energies from a relaxed PES scan in Gaussian 09.

First, an example of a simple scan:
%nprocshared=2 %Chk=methanol.chk #P rB3LYP/6-31g 6D 10F Opt=(modredundant) NoSymm Punch=(MO) Pop=() methanol 0 1 ! charge and multiplicity C 0.0351714 0.00548884 0.0351714 H -0.617781 -0.634073 0.667983 H 0.667983 -0.634073 -0.617781 H -0.605139 0.646470 -0.605139 O 0.839603 0.818768 0.839603 H 1.38912 0.201564 1.38912 1 5 S 10 0.1
And here's the script, pes_parse_g09:
#!/usr/bin/python
import sys

def getrawdata(infile):
        f=open(infile,'r')
        opt=0
        geo=0
        struct=[]
        structure=[]
        energies=[]
        energy=[]
        for line in f:
                
                if opt==1 and geo==1 and not ("---" in line):
                        structure+=[line.rstrip()]
                
                if 'Coordinates (Angstroms)' in line:
                        if opt==0:
                                opt=1
                                structure=[]
                        
                if opt==1 and "--------------------------" in line:
                        if geo==0:
                                geo=1
                        elif geo==1:
                                geo=0
                                opt=0
                if 'SCF Done' in line:
                        energy=filter(None,line.rstrip('\n').split(' '))
                if      'Optimization completed' in line and (opt==0 and geo==0):
                        energies+=[float(energy[4])]
                        opt=0
                        geo=0
                        struct+=[structure]
                        structure=[]
        
        return struct, energies

def periodictable(elementnumber):
        ptable={1:'H',2:'He',\
        3:'Li', 4:'Be',5:'B',6:'C',7:'N',8:'O',9:'F',10:'Ne',\
        11:'Na',12:'Mg',13:'Al',14:'Si',15:'P',16:'S',17:'Cl',18:'Ar',\
        19:'K',20:'Ca',\
        21:'Sc',22:'Ti',23:'V',24:'Cr',25:'Mn',26:'Fe',27:'Co',28:'Ni',29:'Cu',30:'Zn',\
        31:'Ga',32:'Ge',33:'As',34:'Se',35:'Br',36:'Kr',\
        37:'Rb',38:'Sr',\
        39:'Y',40:'Zr',41:'Nb',42:'Mo',43:'Tc',44:'Ru',45:'Rh',46:'Pd',47:'Ag',48:'Cd',\
        49:'In',50:'Sn',51:'Sb',52:'Te',53:'I',54:'Xe',\
        55:'Cs',56:'Ba',\
        57:'La',58:'Ce',59:'Pr',60:'Nd',61:'Pm',62:'Sm',63:'Eu',64:'Gd',65:'Tb',66:'Dy',67:'Ho',68:'Er',69:'Tm',70:'Yb',71:'Lu',\
        72:'Hf', 73:'Ta', 74:'W',75:'Re', 76:'Os', 77:'Ir',78:'Pt', 79:'Au', 80:'Hg',\
        81:'Tl', 82:'Pb', 83:'Bi',84:'Po',85:'At',86:'Rn',\
        87:'Fr',88:'Ra',\
        89:'Ac',90:'Th',91:'Pa',92:'U',93:'Np',94:'Pu',95:'Am',96:'Cm',97:'Bk',98:'Cf',99:'Es',100:'Fm',101:'Md',102:'No',\
        103:'Lr',104:'Rf',105:'Db',106:'Sg',107:'Bh',108:'Hs',109:'Mt',110:'Ds',111:'Rg',112:'Cn',\
        113:'Uut',114:'Fl',115:'Uup',116:'Lv',117:'Uus',118:'Uuo'}
        element=ptable[elementnumber]
        return element

def genxyzstring(coords,elementnumber):
        x_str='%10.5f'% coords[0]
        y_str='%10.5f'% coords[1]
        z_str='%10.5f'% coords[2]
        element=periodictable(int(elementnumber))
        xyz_string=element+(3-len(element))*' '+10*' '+\
        (8-len(x_str))*' '+x_str+10*' '+(8-len(y_str))*' '+y_str+10*' '+(8-len(z_str))*' '+z_str+'\n'
 
        return xyz_string

def getstructures(rawdata):
        
        n=0
        for structure in rawdata:
                
                n=n+1
                num="%03d" % (n,)
                g=open('structure_'+num+'.xyz','w')
                itson=False
                cartesian=[]
                        
                for item in structure:
                        
                        coords=filter(None,item.split(' '))
                        coordinates=[float(coords[3]),float(coords[4]),float(coords[5])]
                        element=coords[1]
                        cartesian+=[genxyzstring(coordinates,element)]
                g.write(str(len(cartesian))+'\n')
                g.write('Structure '+str(n)+'\n')
                for line in cartesian:
                        g.write(line)
                g.close()
                cartesian=[]
        return 0
        
if __name__ == "__main__":
        infile=sys.argv[1]
        rawdata,energies=getrawdata(infile)
        structures=getstructures(rawdata)
        g=open('energies.dat','w')
        for n in range(0,len(energies)):
                g.write(str(n)+'\t'+str(energies[n])+'\n')
        g.close()

And here's what we get from the output:
g09 methanol.in |tee methanol.out
pes_parse_g09 methanol.log
cat structure* > meoh_traj.xyz



And here's a plot of energies.dat:

14 comments:

  1. you are a god among men. thank you!

    ReplyDelete
    Replies
    1. I shall direct the head of the department and the dean of the faculty to this Comment, so that my exulted status will be recognised and tenure unconditionally granted.

      Either way, happy it helped!

      Delete
  2. Thanks for posting this, it's really helpful. One mistake, easily corrected: line 54 should terminate with ,\

    ReplyDelete
    Replies
    1. Thanks for spotting that and letting me know -- fixed now.

      Delete
  3. Thanks, very useful!

    ReplyDelete
  4. Very useful, thanks a lot for sharing.

    ReplyDelete
  5. As an undergrad researcher, I wholeheartedly thank you! Now I don't have to spend lonely nights sifting through .log's with vi. You are a god amongst men!

    ReplyDelete
  6. This code and more can now be found at http://moleculargeo.chem.umu.se/codes/

    Might be of help!

    ReplyDelete
  7. oh my god, thank you so much

    ReplyDelete
  8. oh my god, thank you so much

    ReplyDelete
  9. Thank you, it is really useful.

    ReplyDelete
  10. The 'filter' keyword does not work for me. I had to remove it and replace the phrase "filter(None,line.rstrip('\n').split(' '))" by "line.rstrip('\n').split()". Same for the coordinates also.

    ReplyDelete