[Openchemistry-developers] Sample CSV data for mongochem?

Eric E. Monson emonson at cs.duke.edu
Sat Feb 2 17:04:15 EST 2013


Hey Kyle,

Thanks a lot for the guidance. I'll give that a shot on Monday and let you know how it goes, and I'll pass along a couple of notes on building and the wiki at the same time. I'm not in the chemistry field right now, so I don't have my own data in any format. MongoChem is just a nice example of a flexible vis GUI built around VTK, so I wanted to look at how it is architected. I do data vis, organization and analysis, so I'm fine with individual visualizations, but I'm not strong in application design.

Best,
-Eric


On Feb 2, 2013, at 9:14 AM, Kyle Lutz <kyle.lutz at kitware.com> wrote:

> On Fri, Feb 1, 2013 at 2:45 PM, Eric E. Monson <emonson at cs.duke.edu> wrote:
>> Hey guys,
>> 
>> I've been using VTK charts for a GUI I built a while ago, and Marcus mentioned to me recently (on the VTK-dev list) that MongoChem had some good example code, so I've been trying to check it out. I was able to build it okay (I have some notes on that experience if you're interested), but I was wondering if there is someplace I can get some sample data to populate my database? I'm not totally ignorant of chemistry, but it's not my field, and it's not clear where I could get a decent-sized CSV file with the right format and data.
> 
> Hi Eric,
> 
> Thanks for trying out MongoChem. Any feedback on building and/or
> running MongoChem would be greatly appreciated!
> 
> Currently, MongoChem supports loading molecular data from SDF files
> and CSV files. To initially set up our database I used one of the
> PubChem SDF files, which can be downloaded from
> ftp://ftp.ncbi.nlm.nih.gov/pubchem/Compound/CURRENT-Full/SDF/. Each
> one contains about 25,000 molecular structures along with a few
> descriptors. Once those were loaded, I used the
> "import-obabel-diagrams.py" script in the descriptors directory to
> import the 2D images (soon the image import will be done
> automatically).
> 
> To generate CSV files I use the "make-descriptors-csv-file.py" script,
> which, when given a molecule file (e.g. one of the PubChem SDF files)
> and a list of descriptor names (e.g. "mass tpsa vabc rotatable-bonds"),
> will output a file that can then be used with MongoChem's CSV
> importer.
> 
> Also, we are currently working on adding more data importers. Can you
> let us know what type of data files you have? Furthermore, if you
> could provide any sample data sets, that would be a great help in
> getting them to work smoothly with MongoChem.
> 
> Let me know if you have any other questions or feedback.
> 
> Cheers,
> Kyle
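
To make the SDF-to-CSV step above concrete, here is a minimal stdlib-only Python sketch of the general idea: pull the data fields already stored in each SDF record and write them out as CSV columns. This is not the "make-descriptors-csv-file.py" script, and it computes no descriptors itself. The function names are hypothetical, the PubChem tag names are only examples, and the column layout that MongoChem's CSV importer expects is not stated in this thread, so treat the header row as an assumption.

```python
import csv
import io


def iter_sdf_records(lines):
    """Yield one dict of data fields per SDF record.

    Records end with a '$$$$' line; each data item is a '> <TAG>' header
    line followed by its value lines and a terminating blank line.
    """
    fields, tag, value = {}, None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("$$$$"):
            if tag is not None:  # data item not closed by a blank line
                fields[tag] = "\n".join(value).strip()
            yield fields
            fields, tag, value = {}, None, []
        elif line.startswith(">") and "<" in line:
            # Header line of a data item, e.g. "> <PUBCHEM_COMPOUND_CID>"
            tag = line[line.index("<") + 1:line.rindex(">")]
            value = []
        elif tag is not None:
            if line == "":  # blank line closes the current data item
                fields[tag] = "\n".join(value).strip()
                tag = None
            else:
                value.append(line)


def write_fields_csv(records, columns, out):
    """Write the chosen fields as CSV; missing fields become empty cells.

    Assumption: one header row of tag names, then one row per molecule.
    """
    writer = csv.writer(out)
    writer.writerow(columns)
    for fields in records:
        writer.writerow([fields.get(name, "") for name in columns])


if __name__ == "__main__":
    # Tiny made-up SDF text with two records, just to exercise the code.
    sample = (
        "mol1\n  test\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n"
        "> <PUBCHEM_COMPOUND_CID>\n1\n\n"
        "> <PUBCHEM_MOLECULAR_WEIGHT>\n203.24\n\n"
        "$$$$\n"
        "mol2\n  test\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n"
        "> <PUBCHEM_COMPOUND_CID>\n2\n\n"
        "$$$$\n"
    )
    buf = io.StringIO()
    write_fields_csv(
        iter_sdf_records(sample.splitlines()),
        ["PUBCHEM_COMPOUND_CID", "PUBCHEM_MOLECULAR_WEIGHT"],
        buf,
    )
    print(buf.getvalue())
```

For a real PubChem file, the same two functions could be fed an open file handle instead of the sample string; descriptors that are not already stored as fields in the SDF would still need a cheminformatics toolkit to compute.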



