[Openchemistry-developers] Sample CSV data for mongochem?

Eric E. Monson emonson at cs.duke.edu
Wed Feb 6 08:22:47 EST 2013


Hey Kyle,

How do I "import the molecule from the SDF file"? All I see as an import option is CSV.

Thanks,
-Eric


On Feb 5, 2013, at 4:17 PM, Kyle Lutz <kyle.lutz at kitware.com> wrote:

> On Tue, Feb 5, 2013 at 2:52 PM, Eric E. Monson <emonson at cs.duke.edu> wrote:
>> Hey Kyle,
>> 
>> I'm trying to get this to work, but I'm stuck getting data into the DB. When I do a CSV import, I end up with only the final molecule in the database, as if it's not generating new IDs as it goes and is just overwriting the same data over and over for each molecule in the CSV file…
>> 
>> I'll attach my CSV in case there were any conversion problems from the SDF. I used the exact same string ("mass tpsa vabc rotatable-bonds") for the descriptor names as you suggested, so if other names should have been listed, please tell me. (And, BTW, what is the File->Add New Data menu option supposed to do?)
> 
> I see. The problem is that your CSV file lists the molecules by their
> InChIKeys. When MongoChem sees this it attempts to look up the
> molecule in the database using that InChIKey. But, as this is the
> initial import, it can't find the molecule and (for now) silently
> fails. To get around this you can first import the molecule from the
> SDF file so that MongoChem will have a record of it and then be able
> to load more descriptors for it. Also, you could change the identifier
> to InChI or SMILES which would allow MongoChem to construct and insert
> the molecular structure automatically. This process will be
> streamlined in the future.
> 
> Here's a sample CSV file with InChIs that you could look at:
> https://github.com/OpenChemistry/avogadrodata/blob/master/data/boiling_and_melting_points.csv
> 
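
A minimal Python sketch of the distinction described above: an InChIKey is a fixed-length hash (27 characters) that cannot be turned back into a structure, while an InChI string (prefix "InChI=") encodes the structure itself. The function names, the identifier being in the first column, and the presence of a header row are all assumptions for illustration, not MongoChem's actual CSV handling.

```python
import csv
import io
import re

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def classify_identifier(value):
    """Return 'inchi', 'inchikey', or 'other' for one identifier string."""
    value = value.strip()
    if value.startswith("InChI="):
        return "inchi"
    if INCHIKEY_RE.match(value):
        return "inchikey"
    # SMILES and anything else land here; this sketch does not validate them.
    return "other"


def count_identifier_kinds(csv_text, column=0):
    """Count identifier kinds in one column (assumed layout: header row first)."""
    counts = {}
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the assumed header row
    for row in reader:
        if not row:
            continue
        kind = classify_identifier(row[column])
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

If every row comes back as "inchikey", the file is in the situation described above, where the molecules would need to exist in the database before the import.
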
>> ========================
>> 
>> Just for your notes, here is my experience trying to build the chemkit Python bindings (FYI: I'm on Mac OS X 10.8.2):
>> 
>> I don't know how to force the chemkit python bindings to be generated in the superbuild, so I tried to build from a separate source tree.
> 
> I think we just need to enable the "CHEMKIT_BUILD_BINDINGS_PYTHON"
> variable in the configuration file. I'll look into this.
> 
>> 
>> First I tried the 0.1 stable release. I already had Qt and CMake, and I installed Eigen and Boost (1.52) with Homebrew. This built fine, but when I turned on the Python bindings I had to add ${PYTHON_LIBRARY} to the target_link_libraries() command (like I see you fixed in the git master). Even then, I couldn't do "import chemkit" with the library that was generated.
>> 
>> Next, I tried the git master, but I ran into some Boost linking errors. (BTW, I run into a huge number of these trying to do the openchemistry superbuild, which for now I got around by manually modifying linker commands…) I fixed this with the following change:
>> 
>> --- a/src/chemkit/CMakeLists.txt
>> +++ b/src/chemkit/CMakeLists.txt
>> @@ -8,7 +8,7 @@ find_package(Boost COMPONENTS system filesystem thread REQUIRED)
>> 
>> # boost.thread in versions 1.50 and later require boost.chrono
>> if(${Boost_VERSION} GREATER 104999)
>> -  find_package(Boost COMPONENTS chrono REQUIRED)
>> +  find_package(Boost COMPONENTS chrono system filesystem thread REQUIRED)
>> endif()
>> 
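
The 104999 threshold in the hunk above can be read as follows, assuming Boost_VERSION holds the same integer as Boost's BOOST_VERSION macro (major * 100000 + minor * 100 + patch). The helper names are hypothetical; this is only a sketch of the arithmetic, not part of the chemkit build.

```python
def boost_version_number(major, minor, patch=0):
    """Encode a Boost release the way the BOOST_VERSION macro does."""
    return major * 100000 + minor * 100 + patch


def needs_chrono(version_number):
    """Mirror the CMake test `${Boost_VERSION} GREATER 104999`."""
    return version_number > 104999
```

Under that assumption, Boost 1.52 encodes to a value above the threshold, so the chrono branch is taken for the Homebrew install mentioned earlier.
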
>> But I was still having trouble getting the Python bindings to work. Unfortunately, I'm pretty ignorant in this area. The VTK Python wrapping generates .so files (which I think are static libraries), so I tried changing SHARED to MODULE in the chemkit-python library generation command:
>> 
>> --- a/bindings/python/CMakeLists.txt
>> +++ b/bindings/python/CMakeLists.txt
>> @@ -66,7 +66,7 @@ add_custom_command(OUTPUT chemkit.cpp
>>                    DEPENDS ${SOURCES})
>> 
>> # build library
>> -add_library(chemkit-python SHARED chemkit.cpp)
>> +add_library(chemkit-python MODULE chemkit.cpp)
>> set_target_properties(chemkit-python PROPERTIES OUTPUT_NAME "chemkit" PREFIX "")
>> target_link_libraries(chemkit-python ${CHEMKIT_LIBRARIES} ${PYTHON_LIBRARIES})
>> 
>> And this worked, so after a "make install" I can do "import chemkit" from Python.
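
One quick way to check whether an installed module is visible to the interpreter, sketched with the standard library only. It assumes Python 3.4 or later (for importlib.util.find_spec), the helper name is hypothetical, and "chemkit" will only be found if its install location is on sys.path.

```python
import importlib.util


def find_module_origin(name):
    """Return the file a top-level module would load from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Prints a path to the built library if the bindings are importable.
    print(find_module_origin("chemkit"))
```
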
> 
> Thanks! I'll get these patches pushed.
> 
> I'm not the best with CMake, so I'm not really sure of the difference
> between SHARED and MODULE. Marcus, can you comment on this?
> 
> Cheers,
> Kyle



