[Openchemistry-developers] Computational Chemistry Web Repository discussion

Thu Apr 13 18:48:51 EDT 2017

Hi Nitish,

> I am focusing on detecting the InChI (or InChIKey) of the molecule as from
> that, a lot of information can be found about that molecule (Using
> PubChemPy to get compound from InChI/InChIKey
> <http://pubchempy.readthedocs.io/en/latest/api.html#pubchempy.get_compounds>
> ).
> As many log files may not contain enough data to generate InChI, we can
> resort to manual input of InChI or common name for molecule (Using
> PubChemPy to find compound from common name
> <http://pubchempy.readthedocs.io/en/latest/api.html#pubchempy.get_compounds>
> ).
> The found PubChemPy compound can be used to get IUPAC name and other
> details. Anyway, won't talk much about IUPAC from now on as it's not
> important. Also, I guess showing molecule specific properties is not that
> important for this project.
>

I didn't mean to suggest that IUPAC wouldn't be useful at all, just that it
might not be something all researchers need. I do think detecting the InChI
is a good idea though. Are you using Open Babel and/or RDKit for this?

> I think you should assume that any attribute may potentially be different
> for each log file. For your 'atommasses' example, one could be interested
> in frequency calculations (IR or Raman) that are part of a study using
> isotopic shifts (e.g. 1H -> 2H or 16O -> 18O) so there could be two
> calculations that are essentially identical except for atommasses (and the
> corresponding changes to the vib* attributes).
>
>
> Yes, I believe almost all the result fields might differ with each log
> file. Actually, instead of 'atommasses', I should have used atom numbers as
> the example: if I have 10 log files for H2O, atom numbers should be stored
> just once and not 10 times. Will fix that.
>

What happens if the order of the atoms isn't the same between the 10 log
files? E.g., atomnos might be [1, 8, 1], [1, 1, 8], or [8, 1, 1] depending
on how the calculation was set up.
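
To make the ordering concern concrete, here is a minimal sketch in plain
Python. The helper names `composition_key` and `canonical_order` are
hypothetical (they are not part of cclib), and the approach is only one
possible assumption about how deduplication could work: group files by an
order-independent composition key, and keep a per-file permutation so that
per-atom arrays can be brought into a common order.

```python
from collections import Counter


def composition_key(atomnos):
    """Order-independent key: sorted (atomic number, count) pairs."""
    return tuple(sorted(Counter(int(z) for z in atomnos).items()))


def canonical_order(atomnos):
    """Indices that sort the atoms by atomic number (stable for ties)."""
    return sorted(range(len(atomnos)), key=lambda i: atomnos[i])


# The three H2O orderings mentioned above.
orderings = [[1, 8, 1], [1, 1, 8], [8, 1, 1]]

# All three orderings collapse to a single composition key.
keys = {composition_key(a) for a in orderings}

# Applying each file's permutation gives the same canonical atomnos.
canonical = [[a[i] for i in canonical_order(a)] for a in orderings]
```

The same permutation would have to be applied to every per-atom attribute
(coordinates, masses, and so on) for the stored data to stay consistent.
Note that a composition key alone cannot tell isomers apart, so it would
not replace the InChI for identifying the molecule.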

> We'd appreciate if you could let us know which files are having problems,
> especially if they are from the cclib or cclib-data repos. Make sure you're
> using an up-to-date version of cclib, and if so, please create an Issue for
> cclib.
>
>
> The ccread() function failed on these 2 files : "MOPAC/h2o-force.out" and
> "regression/Gaussian/Gaussian09/coeffs.log"
> This is because the MOPAC parser has a typo `line.split[2]` in line 194.
> Also, the `math` module is missing.
> I had fixed these in my fork and was about to submit a PR but saw that
> this issue is already opened (#346) but not yet resolved (the referenced PR
> (#347) on that issue has a failing Travis build). I have made a PR #365 which
> might fix this issue but if #347 can be corrected, my PR will be redundant.
> I hope the "coeffs.log" was meant for run_regressions test and not to be
> parsed.
>
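
For anyone following along, the `line.split[2]` typo quoted above fails
because it subscripts the method object instead of calling it. A minimal
sketch follows; the sample line and the function names are made up for
illustration and are not taken from the MOPAC parser or from a real MOPAC
output file.

```python
def buggy_third_field(line):
    # Typo: the parentheses are missing, so this indexes the bound method
    # itself and raises TypeError.
    return line.split[2]


def third_field(line):
    # Fixed: call split() first, then index the resulting list.
    return line.split()[2]


sample = "field0 field1 15.9994"  # hypothetical line, not real MOPAC output

try:
    buggy_third_field(sample)
    raised = False
except TypeError:
    raised = True

value = third_field(sample)
```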

Do you know why the Gaussian09/coeffs.log fails?

Best regards,

Adam