[Openchemistry-developers] cjson: nested lists instead of 1D lists for multidimensional data?

Marcus D. Hanwell marcus.hanwell at kitware.com
Fri Jun 22 13:50:39 EDT 2012


Hi,

On Wed, Jun 13, 2012 at 5:11 PM, Ian Daniher <ian at nonolithlabs.com> wrote:
> Hi All,
>
> Right now, Chemical JSON uses a 1D list for all of the multidimensional
> information including 3d coordinates for atoms and bonds.
>
> When parsing the format, this results in the following:
>
>> ethane["atoms"]["coords"]["3d"]
>
>
>>
>> [1.18508, -0.003838, 0.987524, 0.751621, -0.022441, -0.020839, 1.166929,
>> 0.833015, -0.569312, 1.115519, -0.932892, -0.514525, -0.751587, 0.022496,
>> 0.020891, -1.166882, -0.833372, 0.568699, -1.115691, 0.932608, 0.515082,
>> -1.184988, 0.004424, -0.987522]
>
>
> Getting the x, y, z coordinates of the first atom is overcomplicated in both
> JavaScript and Python. A reasonable use case might be looping through
> ethane["atoms"] and drawing based on the location of each.
>
> In Python, this would look like:
>
> for i in range(len(ethane["atoms"]["elements"]["number"])):
>     start = 3*i
>     end = 3*(1+i)
>     x, y, z = ethane["atoms"]["coords"]["3d"][start:end]
>     draw(x,y,z)
>
> If the list was nested, it would look like the following:
>
> for i in range(len(ethane["atoms"]["elements"]["number"])):
>     x, y, z = ethane["atoms"]["coords"]["3d"][i]
>     draw(x, y, z)
>
Can't you just write:

for i in range(len(ethane["atoms"]["elements"]["number"])):
    x, y, z = ethane["atoms"]["coords"]["3d"][3*i:3*i+3]
    draw(x, y, z)

> Fewer things to keep track of, fewer places to screw up, more implicit
> information. Everyone wins.
>
This seems like a place where we could help by adding in some utility
functions in a few languages such as Python and JavaScript.
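
As a rough idea of what such a utility might look like in Python, here is
a minimal sketch. The name chunked is hypothetical and not part of
Chemical JSON; it only assumes the flat layout quoted above, where every
three consecutive numbers are one atom's x, y, z.

```python
def chunked(flat, size=3):
    """Split a flat list into consecutive tuples of `size` items.

    Hypothetical helper, not part of any Chemical JSON library.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if len(flat) % size != 0:
        raise ValueError(
            "length %d is not a multiple of %d" % (len(flat), size)
        )
    return [tuple(flat[i:i + size]) for i in range(0, len(flat), size)]


# First two atoms from the ethane coordinates quoted above.
coords = [1.18508, -0.003838, 0.987524, 0.751621, -0.022441, -0.020839]
for x, y, z in chunked(coords):
    print(x, y, z)
```

The same helper with size=2 could be used for any flat list that stores
pairs, if the layout turns out to be the same.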

> Not sure what sort of performance hit you get with nested array
> serialization, but this might let you remove the "3d" subobject as
> dimensionality of coordinates would be explicit.

Then you would have to verify each item in a list has the
dimensionality you expect, or check the first and assume the rest are
the same? We should do some benchmarking, and it would be great to
write some simple code around these structures to make them easier to
deal with (and liberally licensed so you can include them inline
without worrying).
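
To make the validation point concrete, here is a small sketch of the two
checks side by side. Both function names are made up for illustration. It
says nothing about real performance, which is what the benchmarking would
have to show; it only illustrates that the nested form needs a pass over
every item while the flat form needs one length check.

```python
def nested_is_valid(coords, dim=3):
    """Check that every entry of a nested list holds exactly `dim` numbers.

    Hypothetical helper: walks the whole list, one check per atom.
    """
    return all(
        isinstance(point, (list, tuple))
        and len(point) == dim
        and all(isinstance(v, (int, float)) for v in point)
        for point in coords
    )


def flat_is_valid(coords, dim=3):
    """Check that a flat list can be split into whole `dim`-sized groups.

    Hypothetical helper: a single length check, independent of atom count.
    """
    return len(coords) % dim == 0


print(nested_is_valid([[0.0, 0.0, 0.0], [1.0, 2.0]]))  # second item is short
print(flat_is_valid([0.0, 0.0, 0.0, 1.0, 2.0, 3.0]))
```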
>
> The same argument applies to the "bonds" subobject.
>
If this were preferable, couldn't we just write a little client code
to give higher-level abstractions like this too? I don't use
JavaScript much at all, and tend not to write much rendering code in
Python either, but it seems like it would be easy enough to write a
reader that would offer a higher-level view.
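
Something along these lines is what I have in mind, as a sketch only. The
Atom tuple and iter_atoms function are hypothetical names; the only parts
taken from the format are the ["atoms"]["elements"]["number"] and
["atoms"]["coords"]["3d"] keys used earlier in this thread. The example
molecule is made-up data for illustration, not real output.

```python
from collections import namedtuple

# Hypothetical per-atom view; not defined by Chemical JSON itself.
Atom = namedtuple("Atom", ["number", "x", "y", "z"])


def iter_atoms(molecule):
    """Yield one Atom per element, reading the flat 3d coordinate list."""
    numbers = molecule["atoms"]["elements"]["number"]
    flat = molecule["atoms"]["coords"]["3d"]
    if len(flat) != 3 * len(numbers):
        raise ValueError(
            "expected %d coordinates, found %d" % (3 * len(numbers), len(flat))
        )
    for i, number in enumerate(numbers):
        x, y, z = flat[3 * i:3 * i + 3]
        yield Atom(number, x, y, z)


# Made-up two-atom molecule, for illustration only.
molecule = {
    "atoms": {
        "elements": {"number": [1, 1]},
        "coords": {"3d": [0.0, 0.0, 0.0, 0.0, 0.0, 0.74]},
    }
}
for atom in iter_atoms(molecule):
    print(atom.number, atom.x, atom.y, atom.z)
```

The file stays flat on disk, and the drawing loop only ever sees one atom
at a time.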

Marcus


