[Girder-users] Querying by metadata

Andrés Fortier andres at ekumenlabs.com
Thu Jun 29 17:24:47 EDT 2017


Great, thanks again Zach and Jonathan for your input. So, to summarize:

- While not supported out of the box, it shouldn't be hard to implement a
plugin to do metadata-based queries, whether we use one or multiple Girder
collections to store the assets.
- As an alternative, we may consider treating the user like a collection
and implementing metadata-based search there as well.
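
To make the first point concrete, here is a rough sketch of the query side,
written in plain Python so it runs without Girder or MongoDB. It is only an
illustration under assumptions: build_type_query and matches are hypothetical
helper names, 'meta.type' is the metadata key discussed in this thread, and
'creatorId' is assumed to be the field that identifies the owning user. The
matches function is a simplified in-memory stand-in for MongoDB's equality
matching, where a filter on a field also matches an array containing the value.

```python
# Sketch only, not Girder's API. 'meta.type' is the metadata key proposed in
# this thread; 'creatorId' is assumed to identify the owning user.
_MISSING = object()


def build_type_query(asset_type, creator_id=None):
    """Build a MongoDB-style filter for assets tagged with asset_type."""
    query = {'meta.type': asset_type}
    if creator_id is not None:
        query['creatorId'] = creator_id
    return query


def _lookup(doc, dotted_key):
    """Follow a dotted key such as 'meta.type' through nested dicts."""
    value = doc
    for part in dotted_key.split('.'):
        if not isinstance(value, dict) or part not in value:
            return _MISSING
        value = value[part]
    return value


def matches(doc, query):
    """In-memory stand-in for MongoDB equality matching (simplified)."""
    for key, wanted in query.items():
        value = _lookup(doc, key)
        if isinstance(value, list):
            # An array field matches when it contains the wanted value.
            if wanted not in value:
                return False
        elif value is _MISSING or value != wanted:
            return False
    return True


# Made-up sample documents, shaped loosely like item records.
items = [
    {'name': 'wall.png', 'creatorId': 'u1', 'meta': {'type': 'texture'}},
    {'name': 'car.xml', 'creatorId': 'u1', 'meta': {'type': ['model', 'texture']}},
    {'name': 'notes.txt', 'creatorId': 'u2', 'meta': {}},
    {'name': 'sky.png', 'creatorId': 'u2', 'meta': {'type': 'texture'}},
]

query = build_type_query('texture', creator_id='u1')
found = [item['name'] for item in items if matches(item, query)]
print(found)
```

In an actual plugin the same filter dict would presumably be passed to the
item model's query method and the results filtered by the caller's read
access, as Zach describes below; that wiring is not shown here.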

Thanks!
Andy

On Thu, Jun 29, 2017 at 3:30 PM, Jonathan Beezley <
jonathan.beezley at kitware.com> wrote:

> I would also mention that users can be treated like collections in that
> they can contain folders.  (See my user on data.kitware.com for
> example[1])  You may be able to get away with just using this rather than
> creating collections for each user.
>
> [1] https://data.kitware.com/#user/55087c098d777f5ff55d5e3a
>
> On Thu, Jun 29, 2017 at 2:26 PM, Zach Mullen <zach.mullen at kitware.com>
> wrote:
>
>> Ah, that is quite simple then: you only have to query a fixed set of
>> mongo collections, and data can be pulled from every girder
>> collection. If your assets are represented by items, you just need to do
>> one query against the item (mongo) collection. If they can also be
>> represented by folders, you'd look up folders as well. Exposing this via a
>> REST endpoint that respects access control policies is also trivial, so
>> this definitely sounds like a good fit for Girder.
>>
>> Thanks,
>>
>> -Zach
>>
>>
>> On Thu, Jun 29, 2017 at 2:17 PM, Andrés Fortier <andres at ekumenlabs.com>
>> wrote:
>>
>>> Thanks Zach and Jonathan for your replies! To answer Jonathan's
>>> question, I was referring to Girder collections and not mongo (sorry for
>>> the confusion here, it is unfortunate that the term "collection" is
>>> overloaded; I will disambiguate from now on).
>>>
>>> So, maybe stepping back a little and sharing the high-level requirements
>>> would be a better approach :). We are trying to build an application to
>>> manage assets (xml, png, maybe source code, dlls, etc.). An asset can also
>>> be composed of many files, so basically an asset can be either a girder
>>> file or folder. The application must provide both an administrative back
>>> end (we think the girder web client would be a fit) and a REST API (again,
>>> the girder REST API seems a fit). We will later on roll out a front end
>>> (most likely a Javascript SPA), backed by girder's REST API.
>>>
>>> Now, the typical workflow we envision is:
>>>
>>> - Someone creates an account and adds assets to the app. Assets have an
>>> associated type, which is independent of the file extension or mime-type,
>>> so this is configured by the user. The user can then decide to keep the
>>> asset private, public or shared with some users / group.
>>> - It should be easy / performant to query all the assets of a given type
>>> (plus some variations when it applies, like "all assets of a given type
>>> that belong to user X", "all private assets of a given type that belong to
>>> user X", etc).
>>> - It should be easy for a user to download all of their content (e.g. for
>>> personal backup).
>>> - It should support asset versioning (I was going to discuss this in
>>> another thread, just mentioning here to give a broad view).
>>>
>>> The approach that we currently have in mind is to have a girder
>>> collection per user (so that all the user's assets are stored there) and
>>> tag the assets with metadata to specify their type (hence the need to query
>>> by metadata across different girder collections). Another approach could be
>>> to have a folder per user in a single "assets" collection, pretty much like
>>> a linux /home directory (which at least would remove the
>>> cross-girder-collection query). But again, I'm *very* new to Girder, so it
>>> is most likely that I don't know what the best use for girder collections
>>> is, so any hints are much appreciated.
>>>
>>> Thanks!
>>> Andy
>>>
>>>
>>>
>>>
>>> On Thu, Jun 29, 2017 at 1:40 PM, Jonathan Beezley <
>>> jonathan.beezley at kitware.com> wrote:
>>>
>>>> For clarification, do you mean collections in the sense of mongodb
>>>> collections, or girder collections?  It shouldn't be any problem, for
>>>> example, searching all items regardless of the parent collection they
>>>> belong in.
>>>>
>>>> On Thu, Jun 29, 2017 at 11:27 AM, Zach Mullen <zach.mullen at kitware.com>
>>>> wrote:
>>>>
>>>>> One of the main limitations (or perhaps features?) of mongodb is that
>>>>> a query is applied to only one collection at a time, so your search
>>>>> function would make one query per collection that you wish to search. So,
>>>>> in the case you describe, you'd need to loop over some dynamic set of
>>>>> collections and query each one. Or, create some secondary collection that
>>>>> aggregates all of the data with this metadata field, and contains
>>>>> references to other collections in its documents.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -Zach
>>>>>
>>>>> On Thu, Jun 29, 2017 at 11:22 AM, Andrés Fortier <
>>>>> andres at ekumenlabs.com> wrote:
>>>>>
>>>>>> Hi Zach,
>>>>>> thanks for the quick reply. What you say makes a lot of sense, my
>>>>>> next question on this was going to be about performance and indexes for
>>>>>> this kind of search :).
>>>>>>
>>>>>> Just to clarify, if we go down this road, is there a way to query all
>>>>>> collections at once, or do we need to run the query for each collection? I'm
>>>>>> asking because we may need to create collections on the fly as the system
>>>>>> runs, so it would be great to be able to query all elements regardless of
>>>>>> the collection they belong to.
>>>>>>
>>>>>> Thanks again,
>>>>>> Andy
>>>>>>
>>>>>> On Thu, Jun 29, 2017 at 11:48 AM, Zach Mullen <
>>>>>> zach.mullen at kitware.com> wrote:
>>>>>>
>>>>>>> Hi Andy,
>>>>>>>
>>>>>>> Thanks for reaching out! This is actually a common sort of use case,
>>>>>>> but Girder out-of-the-box does not support it (yet). The main reason is
>>>>>>> that these sorts of queries should typically be performed
>>>>>>> against a database index so that they can scale up to large numbers of
>>>>>>> folders/items. So, the recommended route for this case is to create a small
>>>>>>> plugin that makes sure your desired search field is indexed. To achieve
>>>>>>> that, you'd add a line like:
>>>>>>>
>>>>>>>     ModelImporter.model('item').ensureIndex(['meta.type', {'sparse': True}])
>>>>>>>
>>>>>>> Then, you'd probably want to add a small API endpoint to search by
>>>>>>> this field, and perhaps even some UI augmentation to expose it somewhere.
>>>>>>> Let me know if you need help with those other steps.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Zach Mullen
>>>>>>> Kitware, Inc.
>>>>>>> 919-869-8858
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jun 29, 2017 at 9:18 AM, Andrés Fortier <
>>>>>>> andres at ekumenlabs.com> wrote:
>>>>>>> >
>>>>>>> > Hi all,
>>>>>>> > first of all, sorry if this is a trivial question, just getting
>>>>>>> > started with Girder. We are currently evaluating using Girder as a backend
>>>>>>> > to store resources (either a folder or a file). One of the requirements we
>>>>>>> > have is that a resource may have 0, 1 or more "types" (although it will
>>>>>>> > most likely be 1 to start with), which can be plain strings. Also, we want
>>>>>>> > to be able to search resources by type, even if they are in different
>>>>>>> > collections.
>>>>>>> >
>>>>>>> > Initially we thought of attaching a type property to a resource's
>>>>>>> > metadata, which could be an array of strings. However, the search bar in the
>>>>>>> > web front-end doesn't seem to support search by metadata, so I was
>>>>>>> > wondering: is metadata search supported on a collection? If yes, how about
>>>>>>> > cross-collection search? If no, should I write a plugin to do that?
>>>>>>> >
>>>>>>> > Any hints / pointers are much appreciated.
>>>>>>> >
>>>>>>> > Thanks!
>>>>>>> > Andy
>>>>>>> >
>>>>>>> > _______________________________________________
>>>>>>> > Girder-users mailing list
>>>>>>> > Girder-users at public.kitware.com
>>>>>>> > http://public.kitware.com/mailman/listinfo/girder-users
>>>>>>> >
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

