[Openchemistry-developers] Google Summer of Code 2018 - cclib projects

Karol Langner karol.langner at gmail.com
Thu Mar 8 14:33:53 EST 2018


Hi Tony,

Nice to hear from you. CC'ing two relevant mailing lists.

The ML project might be a bit of a stretch for you, unless you know you can
handle a lot of learning on the fly along with the coding. This particular
project is going to be research-y since we don't really know what will work
(if anything). We expect the student to suggest and explore possible
approaches that are worth trying in this area. So, I would say it's doable;
it just depends on how much effort you're willing to put in. The application
needs to convince us there are enough options to try during the summer that
something useful could come out of it.

For the crawler project, the idea is to discover as many compchem logfiles
online as possible, and parse the data out of them. The proposal should at
a minimum cover data discovery, classifying documents as potential compchem
logfiles, and, finally, parsing them. I think this project needs a
solid design of the process, since there will be challenges with scale. It
would also be natural to connect this with a system that provides access to
the search results, maybe an existing repository/database of compchem
results.
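
To make the three stages concrete, here is a minimal sketch of the
classification step sitting between discovery and parsing. It is not an
existing cclib feature: the names SIGNATURES, Candidate, classify and
pipeline are hypothetical, and the marker strings are illustrative guesses
at text that tends to appear near the top of each program's output, not a
vetted list.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical marker strings, for illustration only. A real classifier
# would need a vetted and much larger set of markers per program.
SIGNATURES = {
    "Gaussian": ("Entering Gaussian System",),
    "ORCA": ("O   R   C   A",),
    "NWChem": ("Northwest Computational Chemistry Package",),
}


@dataclass
class Candidate:
    """A discovered document that looks like a compchem logfile."""
    url: str
    program: str


def classify(text: str, head_lines: int = 200) -> Optional[str]:
    """Guess which program wrote `text`, or return None if no marker matches.

    Only the first `head_lines` lines are scanned, to keep the check cheap
    when running over many documents.
    """
    head = "\n".join(text.splitlines()[:head_lines])
    for program, markers in SIGNATURES.items():
        if any(marker in head for marker in markers):
            return program
    return None


def pipeline(documents: Iterable[Tuple[str, str]]) -> Iterator[Candidate]:
    """Take discovered (url, text) pairs and keep only likely logfiles."""
    for url, text in documents:
        program = classify(text)
        if program is not None:
            yield Candidate(url=url, program=program)
```

Each Candidate would then go to the parsing stage, for example by saving
the file and passing its path to cclib.io.ccread. At scale, the discovery
and classification stages would probably need to run as separate queued
jobs, which is where the design work comes in.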

HTH
- Karol


On Thu, Mar 8, 2018 at 9:47 AM, Yang, Tony <zeyu.yang14 at imperial.ac.uk>
wrote:

> Dear Karol,
>
>
>
> Greetings!
>
>
>
> I am a final year chemistry student at Imperial College London, UK. After
> learning some Python in my first two years of undergraduate study, I became
> interested in programming and taught myself more advanced programming
> topics.
>
>
>
> Last summer, I did a computational organic chemistry research project in
> Prof. Kendall Houk’s lab at UCLA. During this project, I used Gaussian for
> geometry optimisations and energy calculations, and I remember needing
> additional Python scripts to extract thermodynamic data from Gaussian’s
> output files. This was a tedious task, and I really appreciate cclib’s
> effort in interpreting the output of a range of computational chemistry
> software packages.
>
>
>
> I am interested in the ‘Machine learning applied to parsing computational
> chemistry output’ project you proposed, but I have only minimal machine
> learning experience (I have tried TensorFlow’s MNIST tutorial). Would you
> say this project is still suitable for me?
>
>
>
> I am also interested in the ‘Discovering computational chemistry content
> online’ project. I think it’s very important that computational resources
> are not wasted on repeating calculations that have already been done. Would
> you kindly give me a few more details on the crawler to help with my
> proposal?
>
>
>
> Hope to hear back from you soon!
>
>
>
> Best wishes,
>
> Tony
>
>
>
> - *Name:* Zeyu Tony Yang
>
> - *Email:* zy2414 at ic.ac.uk
>
> - *Country & timezone:* UK, GMT +0
>
> - *School Name & Study:* Imperial College London, Chemistry, Year 4
>   (Final year)
>
>
>