Crow

Computational Representation Of Whatever

Crow is a Java library for the manipulation and mining of complex, globally distributed data. It automatically translates cross-linked data residing in different documents into a network of software objects (elements) and offers easy ways to edit these elements and to group, filter, and sort them by their properties (and properties of properties, and so on) using plain Java syntax. The library focuses on semantic web documents, but software interfaces to other data sources can be plugged in at run time. Referenced documents are opened on demand and processed in parallel background threads. Information that is dispersed over several documents automatically collapses into single elements wherever possible and necessary. Crow objects communicate via events that are managed decentrally, so that outside objects can dynamically register for events (or classes of events) directly at their origin. There is no graphical user interface yet, but the user can interactively query and manipulate Crow data with the Python programming language (via a very basic Jython shell). The only data source plugin available so far reads and writes N3-formatted semantic web documents. The library should be considered to be in early development. It is freely available under the GPL. We currently use it to select targets for protein-protein docking from sets of predicted protein interactions and various other information.
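
To make the idea of decentrally managed events more concrete, here is a minimal sketch in plain Python. It is not Crow code: all class and method names (Event, PropertyChanged, EventSource, register, fire) are invented for illustration, and the real library may be organised quite differently. The sketch only shows the general pattern in which each object keeps its own listener list and listeners subscribe either to one event class or to a whole family of events.

```python
# Hypothetical illustration of decentral event handling; none of these
# names are taken from the Crow API.

class Event(object):
    """Base class of all events; remembers the object that fired it."""

    def __init__(self, source):
        self.source = source


class PropertyChanged(Event):
    """A more specific event class."""

    def __init__(self, source, prop, value):
        Event.__init__(self, source)
        self.prop = prop
        self.value = value


class EventSource(object):
    """Keeps its own listener list; there is no central dispatcher."""

    def __init__(self):
        self._listeners = []  # (event class, callback) pairs

    def register(self, event_class, callback):
        self._listeners.append((event_class, callback))

    def fire(self, event):
        # A listener registered for a base class also receives subclasses.
        for event_class, callback in list(self._listeners):
            if isinstance(event, event_class):
                callback(event)


class Element(EventSource):
    def __init__(self, uri):
        EventSource.__init__(self)
        self.uri = uri
        self.props = {}

    def set(self, prop, value):
        self.props[prop] = value
        self.fire(PropertyChanged(self, prop, value))


log = []
protein = Element("bio:P1")  # made-up identifier
protein.register(Event, lambda e: log.append(("any", e.source.uri)))
protein.register(PropertyChanged, lambda e: log.append((e.prop, e.value)))
protein.set("bio:score", 0.9)
```

After the last line, both listeners have been called once, because the PropertyChanged event also counts as an instance of the base class Event.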

The Crow data model

The Crow library translates complex cross-linked data from various sources into a net of cross-referenced Java objects (Element). The data integration task (i.e. combining and updating information from different sources) is automated as much as possible.
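
The integration step can be pictured with a small sketch in plain Python. It is only an illustration of the general idea and not Crow's implementation: the Element class, the integrate function, and the identifiers such as "bio:P1" are all assumptions made up for this example. Each document is represented as a list of (subject, property, value) statements; statements about the same identifier collapse into one object, and values that name another known identifier become object references.

```python
# Hypothetical illustration; none of these names belong to the Crow API.

class Element(object):
    """A node in the object network, identified by its URI."""

    def __init__(self, uri):
        self.uri = uri
        self.props = {}  # property name -> list of values

    def add(self, prop, value):
        values = self.props.setdefault(prop, [])
        if value not in values:
            values.append(value)


def integrate(documents):
    """Merge (subject, property, value) statements from several documents.

    Statements about the same URI end up in a single Element, and values
    that are themselves known subject URIs are replaced by references to
    the corresponding Element.
    """
    elements = {}

    def get(uri):
        if uri not in elements:
            elements[uri] = Element(uri)
        return elements[uri]

    # First pass: create one Element per subject URI across all documents.
    for doc in documents:
        for subject, prop, value in doc:
            get(subject)

    # Second pass: attach properties, linking to Elements where possible.
    for doc in documents:
        for subject, prop, value in doc:
            target = elements.get(value, value)
            get(subject).add(prop, target)
    return elements


# Two made-up documents that describe overlapping things.
doc_a = [("bio:P1", "bio:name", "protein one"),
         ("bio:P1", "bio:interaction", "bio:I1")]
doc_b = [("bio:I1", "bio:score", 0.9),
         ("bio:P1", "bio:length", 320)]

net = integrate([doc_a, doc_b])
p1 = net["bio:P1"]
```

Here p1 carries the name and interaction from the first document and the length from the second, and its interaction value is the same object as net["bio:I1"] rather than a string.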

Application example

We have used Crow to integrate different sets of predicted interactions between proteins of Mycobacterium tuberculosis with sequence-, structure- and various other information. From this knowledge source we selected several (suggested) protein pairs for which we can model the structure of both binding partners. These interactions are now, in a first step, validated experimentally and their structure will then be modelled with our flexible protein-protein docking strategy and additional experimental data.

The following example shows how the Jython interface to Crow can be used for interactive work. We load a list of proteins from a semantic web document and select all interactions between proteins with a close homologue in the PDB database of protein structures. Application programmers would instead use other, event-based strategies.

## define URL-shortcut
SpaceMap.put( "bio", URI ("file:///home/user/protein_properties.n3#") )
## get some property types from central repository
pt_homology = PropMap.getType("bio:homology")
pt_identity = PropMap.getType("bio:identity")
pt_coverage = PropMap.getType("bio:alnCoverage")
pt_interaction = PropMap.getType("bio:interaction")
pt_score = PropMap.getType("bio:score")
pt_from, pt_to = PropMap.pt_from, PropMap.pt_to

## load list of proteins
prot_store = UriMap.getStore("file:///home/user/mtb_proteins.n3#")
prot_store.update()
all_prot = prot_store.getAll()

## select homologies to template structures in the PDB, wait until the data
## are loaded from a cross-linked file (at most 1000s)
homologies = all_prot.valuesOf(pt_homology, 1000 )
homologies = homologies.filterBy(pt_identity, 0.3, 1.0)
homologies = homologies.filterBy(pt_coverage, 0.75, 1.0)

## select modelable proteins
model_prot = all_prot.filterBy( pt_homology, homologies )
## get interactions between modelable proteins
interactions = model_prot.valuesOf( pt_interaction )
model_interactions = interactions.filterBy(pt_from, model_prot).filterBy(pt_to, model_prot)

## save target interactions into new semantic web document
new_store = UriMap.getStore("http://.../target_interactions_current.n3#")
new_store.addAll( model_interactions )
new_store.setLocal( "/home/user/target_interactions.n3")
new_store.save()
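
For readers without a Crow installation, the selection logic of the session above can be imitated in plain Python. The following is one possible reading of what the valuesOf and filterBy calls do, not Crow code: the Element class, the helper functions (values_of, filter_by_range, filter_by_members), and the toy data are all invented for this sketch. The property names "source" and "target" stand in for pt_from and pt_to.

```python
# Plain-Python reading of the example above; this is NOT Crow code and
# all names and data are hypothetical.

class Element(object):
    def __init__(self, name, **props):
        self.name = name
        self.props = props  # property name -> list of values


def values_of(elements, prop):
    """Collect all values of one property over a group of elements."""
    result = []
    for element in elements:
        for value in element.props.get(prop, []):
            if value not in result:
                result.append(value)
    return result


def filter_by_range(elements, prop, low, high):
    """Keep elements with at least one value of prop inside [low, high]."""
    return [e for e in elements
            if any(low <= v <= high for v in e.props.get(prop, []))]


def filter_by_members(elements, prop, allowed):
    """Keep elements with at least one value of prop contained in allowed."""
    return [e for e in elements
            if any(v in allowed for v in e.props.get(prop, []))]


# Toy data: three proteins, each with one homology record.
h1 = Element("h1", identity=[0.45], coverage=[0.9])
h2 = Element("h2", identity=[0.2], coverage=[0.95])
h3 = Element("h3", identity=[0.6], coverage=[0.8])
p1 = Element("p1", homology=[h1])
p2 = Element("p2", homology=[h2])
p3 = Element("p3", homology=[h3])
i12 = Element("i12", source=[p1], target=[p2])
i13 = Element("i13", source=[p1], target=[p3])
p1.props["interaction"] = [i12, i13]
p2.props["interaction"] = [i12]
p3.props["interaction"] = [i13]
all_prot = [p1, p2, p3]

# Same sequence of steps as in the Jython session.
homologies = values_of(all_prot, "homology")
homologies = filter_by_range(homologies, "identity", 0.3, 1.0)
homologies = filter_by_range(homologies, "coverage", 0.75, 1.0)
model_prot = filter_by_members(all_prot, "homology", homologies)
interactions = values_of(model_prot, "interaction")
model_interactions = filter_by_members(
    filter_by_members(interactions, "source", model_prot),
    "target", model_prot)
names = [i.name for i in model_interactions]
```

In this toy data set only the interaction between p1 and p3 survives, because p2 has no homologue above the identity threshold.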

Availability and status

The library is freely available under the GNU General Public License (GPL) at

http://sf.net/projects/crow

It is in an early prototype stage and neither stable nor complete enough for production use! I have postponed the release of a first alpha version until I've finished my thesis (around the end of the year). Right now, it's just one parallel project too many. The release will come with a short tutorial and some example files and will be targeted at developers, not users. Feel free to have a look at the current code snapshot (it's well documented), but you probably won't get it running without effort. Better to wait for the first release (or contact me if you can't)!

last updated: 10/2004

Author