[Openchemistry-developers] Computational Chemistry Web Repository discussion

Nitish Garg nitish.garg.6174 at gmail.com
Wed Apr 12 19:29:08 EDT 2017


Hi all,

This mail continues the discussion of my GSoC proposal for
the "Computational Chemistry Web Repository" project.
(The groundwork for this project can be found at:
https://github.com/nitish6174/cclib-web)

Please note the issue in the MOPAC parser mentioned at the bottom of this mail.

> The schema looks like a good start. Where are you getting the IUPAC names?
> If you haven't found an online resource or library that automatically
> determines them, I don't know how often they'd be used. Out of the hundreds
> of calculations I've ever run, I never figured out the IUPAC name for any.


I am focusing on detecting the InChI (or InChIKey) of the molecule, since a
lot of information about the molecule can be found from it (using
PubChemPy to get the compound from the InChI/InChIKey
<http://pubchempy.readthedocs.io/en/latest/api.html#pubchempy.get_compounds>
).
As many log files may not contain enough data to generate an InChI, we can
fall back to manual input of the InChI or a common name for the molecule (using
PubChemPy to find the compound from the common name
<http://pubchempy.readthedocs.io/en/latest/api.html#pubchempy.get_compounds>
).
The PubChemPy compound that is found can be used to get the IUPAC name and
other details. Anyway, I won't say much more about IUPAC names from now on,
as they are not important. Also, I guess showing molecule-specific properties
is not that important for this project.
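
A minimal sketch of the lookup order described above, assuming the InChIKey
is tried first and the common name is the fallback. The helper name
`find_compound` and its `get_compounds` parameter are hypothetical (the
parameter only exists so the logic can be tried without network access);
`pubchempy.get_compounds` is the function linked above.

```python
def find_compound(inchikey=None, name=None, get_compounds=None):
    """Return the first PubChem match, preferring InChIKey over common name.

    `get_compounds` is a hypothetical injection point; by default the real
    pubchempy.get_compounds is used, which sends requests to PubChem.
    """
    if get_compounds is None:
        import pubchempy  # third-party package, needs network access
        get_compounds = pubchempy.get_compounds

    # Try the most specific identifier first, then the manual fallback.
    for identifier, namespace in ((inchikey, "inchikey"), (name, "name")):
        if identifier:
            matches = get_compounds(identifier, namespace)
            if matches:
                return matches[0]
    return None  # nothing found for any identifier that was supplied
```

Calling `find_compound(name="water")` would query PubChem over the network;
the returned compound object exposes `iupac_name`, `inchi` and `inchikey`.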

> I think you should assume that any attribute may potentially be different
> for each log file. For your 'atommasses' example, one could be interested
> in frequency calculations (IR or Raman) that are part of a study using
> isotopic shifts (e.g. 1H -> 2H or 16O -> 18O) so there could be two
> calculations that are essentially identical except for atommasses (and the
> corresponding changes to the vib* attributes).


Yes, I believe almost all the result fields might differ from one log file
to the next. Actually, instead of 'atommasses', I should have used atom
numbers as the example: if I have 10 log files for H2O, the atom numbers
should be stored just once and not 10 times. Will fix that.
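
A rough sketch of what "stored just once" could look like, using plain
dictionaries in place of the real database. The function name
`add_calculation`, the `molecules`/`calculations` layout and the `source`
field are assumptions for illustration only; `atomnos` is the cclib name for
the atom numbers.

```python
def add_calculation(store, molecule_key, atomnos, results):
    """Record one parsed log file, keeping atomnos once per molecule."""
    molecules = store.setdefault("molecules", {})
    calculations = store.setdefault("calculations", [])

    # The first log file for a molecule creates the shared record.
    molecule = molecules.setdefault(molecule_key, {"atomnos": list(atomnos)})
    if molecule["atomnos"] != list(atomnos):
        raise ValueError("atomnos differ from the stored molecule record")

    # Per-calculation results only reference the molecule by its key.
    calculations.append({"molecule": molecule_key, **results})
    return store


store = {}
add_calculation(store, "H2O", [8, 1, 1], {"source": "run1.log"})
add_calculation(store, "H2O", [8, 1, 1], {"source": "run2.log"})
```

After both calls `store["molecules"]` holds a single H2O entry, while
`store["calculations"]` has two entries that point to it. This simple check
assumes every log file lists the atoms in the same order.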

> We'd appreciate if you could let us know which files are having problems,
> especially if they are from the cclib or cclib-data repos. Make sure you're
> using an up-to-date version of cclib, and if so, please create an Issue for
> cclib.


The ccread() function failed on these two files: "MOPAC/h2o-force.out" and
"regression/Gaussian/Gaussian09/coeffs.log"
The MOPAC failure is because the MOPAC parser has a typo, `line.split[2]`,
on line 194. Also, the import of the `math` module is missing.
I had fixed these in my fork and was about to submit a PR, but saw that an
issue is already open for this (#346) and not yet resolved (the PR referenced
on that issue, #347, has a failing Travis build). I have opened PR #365, which
might fix this issue, but if #347 can be corrected, my PR will be redundant.
I hope "coeffs.log" was meant for the run_regressions test and not to be
parsed.
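
To show why `line.split[2]` fails, here is a small stand-alone
reproduction. The sample line and the two helper functions are made up for
illustration and are not taken from the MOPAC parser itself.

```python
sample = "1  O  0.9584  0.0000"  # hypothetical output line, not from MOPAC


def third_field_buggy(line):
    # Missing call parentheses: this indexes the bound method itself.
    return line.split[2]


def third_field_fixed(line):
    # Call split() first, then index the resulting list of fields.
    return line.split()[2]


try:
    third_field_buggy(sample)
    error = None
except TypeError as exc:
    error = type(exc).__name__  # method objects are not subscriptable

value = third_field_fixed(sample)
```

The buggy form raises a TypeError as soon as the line is reached, which is
why the parser only breaks on files that exercise that branch.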

Regards
Nitish Garg
GitHub : https://github.com/nitish6174