Hi All,
Just as a clarification for the less informed - myself included -
we're discussing the subtle and extremely difficult aspects of
creating knowledge maps/annotation repositories/KBs/KR repositories
(what have you) ultimately capable of supporting reasoning (simple
classification through more complex reasoning) for both UNIVERSALS
and INSTANCES.
Some DEFINITIONS:
CLASSes represent UNIVERSALs or TYPEs. The TBox is the set of
CLASSes and the ASSERTIONs associated with CLASSes.
INSTANCEs represent EXISTENTIALs or INDIVIDUALs instantiating a CLASS
in the real world. The ABox is the set of INSTANCEs and the
ASSERTIONs associated with those INSTANCEs.
Properly specified CLASSes are defined in the context of the
INSTANCEs whose PROPERTIES and RELATIONs they formally represent.
Properly specified INSTANCEs are defined via their reference to an
appropriate set of CLASSes.
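To make the TBox/ABox split above concrete, here is a minimal sketch in plain Python. It is a toy only: the class and individual names are hypothetical, the TBox holds nothing but direct subclass axioms, and the ABox holds nothing but class membership. A real DL reasoner handles far richer assertions.

```python
# Toy illustration only: all class and individual names are hypothetical.

# TBox: ASSERTIONs about CLASSes (here, only direct subclass axioms).
TBOX = {
    "GFAP": {"Protein"},
    "Protein": {"Continuant"},
}

# ABox: ASSERTIONs about INSTANCEs (here, only class membership).
ABOX = {
    "gfap_sample_1": {"GFAP"},
}


def superclasses(cls, tbox):
    """TBox-only reasoning: every class that subsumes cls (transitively)."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in tbox.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def types_of(individual, tbox, abox):
    """TBox + ABox reasoning: asserted types plus everything they imply."""
    inferred = set()
    for cls in abox.get(individual, ()):
        inferred.add(cls)
        inferred |= superclasses(cls, tbox)
    return inferred
```

Here `superclasses("GFAP", TBOX)` touches only the TBox, while `types_of("gfap_sample_1", TBOX, ABOX)` has to cross from the ABox into the TBox, which is the border discussed below.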
Reasoners (RacerPro, Pellet, FaCT++) generally have optimizations
specific to either reasoning on the TBox or reasoning on the ABox,
but it's difficult (i.e., experts such as Phil and others can cite
no existing examples) to optimize for reasoning on the TBox, the
ABox AND - most importantly - TBox + ABox (across these sets).
All of us trying to apply ontology-based formalisms to create
machine-parsable representations of real-world biomedical continuants
and occurrents have banged our heads bloody against this
UNIVERSAL-EXISTENTIAL border. Even determining which of the many
biomedical informatics resources to employ when you seek to reference
relevant UNIVERSALs can be a very difficult task. We're in the midst
of an extended debate within the BIRN Task Force on how best to do
this for proteins, such as Glial Fibrillary Acidic Protein (GFAP),
that are relevant to cross-species representation of
neurodegenerative disease.
I strongly encourage the experts to please clarify, embellish, or
correct the above definitions as they see fit for the edification of
all us disciples. :-)
Cheers,
Bill
On Sep 15, 2006, at 8:30 AM, Phillip Lord wrote:
"KV" == Kashyap, Vipul <VKASHYAP1@PARTNERS.ORG> writes:
KV> ... if mapping into instances gives better performance
KV> for a given set of inferences, that might be the basis of
KV> choosing the instance-of relationship. Towards this end I have
KV> the following questions for Phil:
KV> 1. What is the set of ABox inferences implemented in the GO
KV> example?
In that example, there aren't any. At that stage, the instance store
was not doing ABox reasoning at all, just TBox, made to look like
ABox.
The system is richer now, and you can express some relationships
between individuals in the ABox (as well as any expressivity you like
in the TBox). But I don't have details, I am afraid.
--
KV> 2. What would be the corresponding set of TBox inferences
KV> implemented if the design choice proposed by Chris was adopted,
KV> i.e., p53 is a subclass of Gene (assuming a general "Gene" class)?
I am presuming that by "set of inferences" you mean: what can you
express? The TBox supports OWL-DL in full. Actually, as the
InstanceStore punts much of the work to the reasoner, any limits are
set by the reasoner, not the InstanceStore per se. So it does whatever
your reasoner does.
KV> 3. What are the performance and scalability implications of (1)
KV> vs (2)?
ABox reasoning is harder than TBox reasoning. As is the way with DL,
exactly what the implications are depends on exactly what you express,
and I am not really an expert.
--
KV> 4. What are the expressiveness implications of (1) vs (2), i.e.,
KV> can we express some statements using subclass-of based modeling
KV> which are not possible using instance-of modeling, or vice versa?
KV> Look forward to a good use case illustrating the above and
KV> discussing its possible consequences.
--
The limitation is that if your entities are in the ABox in this
case, there are a very limited number of things that you can say about
their relationships to other entities in the ABox, although you have
the full expressivity of OWL to relate them to the TBox. The flip side
is that if you put everything into the TBox, then you get nothing from
the relational backend of the InstanceStore. In the GO example, for
instance, you could put all the associations into a reasoner, modelled
as OWL classes, but the reasoner will probably not scale to 6 million
instances.
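To illustrate the trade-off described above, here is a rough sketch of the general idea: consult the reasoner once per distinct class, and let a relational store carry the individuals. This is not the InstanceStore's actual code or schema. The table names, the `subsumers` stand-in for a DL reasoner, and the example individuals are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical TBox: direct subclass axioms between named classes.
TBOX = {"p53": {"Gene"}, "BRCA1": {"Gene"}, "Gene": {"Continuant"}}


def subsumers(cls):
    """Stand-in for a DL reasoner: cls plus all its transitive superclasses."""
    result, stack = {cls}, [cls]
    while stack:
        for parent in TBOX.get(stack.pop(), ()):
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


# Hypothetical relational backend (in-memory SQLite).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE assertion (individual TEXT, cls TEXT)")
db.execute("CREATE TABLE subsumption (cls TEXT, ancestor TEXT)")


def add_individual(name, cls):
    """Store the assertion; call the 'reasoner' only for classes not yet seen."""
    seen = db.execute(
        "SELECT 1 FROM subsumption WHERE cls = ?", (cls,)
    ).fetchone()
    if seen is None:
        db.executemany(
            "INSERT INTO subsumption VALUES (?, ?)",
            [(cls, ancestor) for ancestor in subsumers(cls)],
        )
    db.execute("INSERT INTO assertion VALUES (?, ?)", (name, cls))


def instances_of(cls):
    """Retrieval is a relational join, with no reasoner call per individual."""
    rows = db.execute(
        "SELECT a.individual FROM assertion a "
        "JOIN subsumption s ON a.cls = s.cls "
        "WHERE s.ancestor = ? ORDER BY a.individual",
        (cls,),
    )
    return [row[0] for row in rows]


for i in range(3):
    add_individual(f"annotation_{i}", "p53")
add_individual("annotation_3", "BRCA1")
```

In this sketch the cost of reasoning grows with the number of distinct classes rather than the number of individuals, which is the scalability argument for keeping numerous entities out of the reasoner. The price is the one named above: nothing here can express a relationship between two individuals.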
Separating entities into ABox and TBox depending on how many of them
there are is, of course, unsatisfying from an ontological perspective,
but as you are asking about scalability of computational reasoning I
don't think you have any choice but to be pragmatic.
Phil
Bill Bug
Senior Research Analyst/ Engineer
Laboratory for Bioimaging & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA 19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)
Please Note: I now have a new email - William.Bug (AT) DrexelMed (DOT) edu