Friam - [ SPAM ] Re: Fwd: Share Your Knowledge: Taxonomy Boot Camp

Posted by Steve Smith on
URL: http://friam.383.s1.nabble.com/Fwd-Share-Your-Knowledge-Taxonomy-Boot-Camp-tp7586090p7586092.html

>> Interesting topic here, at least to me. Has anyone ever attended
>> this?
> Have not. Some folks, like catalogers & librarians are good at this
> sort of thing, it seems very tedious and hard to scale.
>
>
From my limited experience/observation, it is a sticky and subtle problem.

SpindleViz: Over 10 years ago, I worked with a team doing ontology
modeling to help them visualize ontologies. We produced a prototype
dynamic 3D visualizer (SpindleViz) which gave some traction on actually
understanding the structure of a given ontology, but more importantly
the project gave me an understanding of how ontologies are used and
built in some communities. In this case we worked with the Gene
Ontology, which at the time was perhaps the largest and most mature
ontology and represented a very broad collaborative effort. The effort
of building a shared ontology appeared to me to be the ultimate in
compromise.

NSF Scientific Collaboration: Later I found myself working with Dr.
Deana Pennington at UNM on an NSF project for developing formal tools
for Scientific Collaboration called SciDesign. This project included
a study of the problem of normalizing terminologies across a diverse
team of scientists working on a common problem, in this case climate
change. Contrary to some assumptions, the language across seemingly
related disciplines such as, say, Atmospheric and Ocean Science or
Biology and Ecology is not simply aligned; it is perhaps insidiously
counter-aligned, or maybe more to the point, in some sense "dissonant".
Science, in its pursuit of both understanding and precision, draws its
language from existing disciplines for its "similarity" to the topic
or idea at hand, but then, in the pursuit of precision, changes the
meaning of the terms in often fundamental if subtle ways which are
often not obvious to the discipline from which the terms were adopted.
More often, two related disciplines derive terms from a root source and
neither understands how the *other* uses them differently.

In pursuit of a methodology to improve Scientific Collaboration in
general, one of the fundamental problems was to come up with a fairly
simple methodology to normalize these differences in lexicons. Of
course, underneath these lexicons were implicit ontologies, the complex
relationships between the terms. We discussed adapting a technique
developed by Dr. Tim Goldsmith (also UNM) to help with this. The basic
concept was to interview each individual on a collaborative team, first
for a set of "most common terms" used in their domain. Once these
terms were acquired for, say, 6 individuals with related but different
domains, the pool of terms would be reduced to the subset of those
which recurred in two or more individuals' lexicons. Each individual
would then be presented with a matrix of these terms registered against
each other and asked to provide a measure of correlation between each
pair of terms. The idea, of course, was to build a very rough model of
their model, as it were, to get a handle on how closely aligned each
practitioner's model of the implicit domain they were studying was.
The result was to be a set of weighted graphs of overlapping terms used
in their domains when applied to the common problem. While this is not
a formal ontology, one might think of it as a proto-ontology of sorts,
a place to begin to build an ontology from.
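
To make the steps above concrete, here is a minimal sketch in Python.
It is only an illustration, not the technique as Dr. Goldsmith
developed it or anything we actually built. The function names, the
example terms, the 0-to-1 rating scale, and the use of Pearson
correlation to compare two people's ratings are all my assumptions.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def shared_terms(lexicons, min_people=2):
    """Keep terms that recur in at least `min_people` individual lexicons."""
    counts = Counter(t for terms in lexicons.values() for t in set(terms))
    return sorted(t for t, n in counts.items() if n >= min_people)


def term_pairs(terms):
    """All unordered pairs of shared terms (upper triangle of the matrix)."""
    return list(combinations(terms, 2))


def weighted_graph(ratings, threshold=0.0):
    """One person's ratings as a weighted edge list, dropping weak edges."""
    return {pair: w for pair, w in ratings.items() if w > threshold}


def alignment(ratings_a, ratings_b):
    """Pearson correlation of two people's ratings over commonly rated pairs.

    Assumption: this comparison metric is a stand-in chosen for the sketch.
    """
    pairs = sorted(set(ratings_a) & set(ratings_b))
    n = len(pairs)
    if n < 2:
        return 0.0
    xs = [ratings_a[p] for p in pairs]
    ys = [ratings_b[p] for p in pairs]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


# Hypothetical interview results: each person's "most common terms".
lexicons = {
    "atmospheric": ["flux", "forcing", "model", "aerosol"],
    "ocean": ["flux", "forcing", "model", "salinity"],
    "ecology": ["flux", "model", "community", "forcing"],
}
terms = shared_terms(lexicons)   # terms used by two or more people
pairs = term_pairs(terms)        # the cells each person is asked to rate

# Hypothetical ratings (0 = unrelated, 1 = strongly related), one per pair.
ratings = {
    "atmospheric": dict(zip(pairs, [0.9, 0.2, 0.5])),
    "ocean": dict(zip(pairs, [0.8, 0.3, 0.5])),
    "ecology": dict(zip(pairs, [0.1, 0.9, 0.4])),
}
graphs = {who: weighted_graph(r) for who, r in ratings.items()}
for a, b in combinations(ratings, 2):
    print(a, b, round(alignment(ratings[a], ratings[b]), 2))
```

A strongly positive score suggests two people relate the shared terms
in similar ways; a negative one is the kind of "dissonance" described
earlier, where the same words sit in quite different structures.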

The point of this was a methodology for "just in time" proto-ontology
building. Of course, the funding for this work ran out, Dr. Pennington
moved to UTEP, and as far as I know things in this area have been on
hold since then.

Most recently, I worked with other UNM researchers, Drs. Caudell,
Gilfeather, Lugar, Taha, et al., on a project ultimately entitled
"Faceted Ontologies", which was primarily about building knowledge
structures from open source intelligence, developing a normalized model
for them, and providing tools for extracting specific aggregate
knowledge *from* those sources, very specifically presented *as* a
structure, not simply a list of factoids or a simple linear report.
The tools from my two earlier projects were to be developed further to
support the visualization, as it were, from multiple conceptual
viewpoints (aka "facets" of the ontology). This was a *very* ambitious
project and the basic underpinnings (building formal models of
ontologies on top of Category Theory) were done.

I still believe that there is good work to be done in this area, but the
level of sophistication required to develop the mechanisms underlying my
own part is pretty daunting. I occasionally scan the literature and
SBIR solicitations for new developments and funding sources for this
work... It would be very welcome if anyone here happened to have some
traction in this domain... I can provide a few references;
unfortunately, most of the results out of the latter two projects were
merely internal reports to the customers and very preliminary
white papers.

The domain I find this work most interesting *for*, perhaps, is
Journalism... but the problem is exacerbated by there being far less
formal language developed (to my knowledge) across journalism...
perhaps that is changing, or perhaps the demands of scientific
journalism, at least, lead journalists, as "outsiders" and "laymen" to
the fields, not only to do this same task intuitively but to have some
of their own formal methodologies and tools?

- Steve

============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com