NBII: General Structure and Understanding
NBII (as seen in the repgen.xml document sent by Lisa Zolly on March 10, 2009) shows only a three-level hierarchy that makes up the NBII thesaurus XML document.
Level 1: Thesaurus
Level 2: Concept
Level 3:
-- Descriptor or Non-Descriptor
-- UF, USE
-- BT, RT, NT
-- SC, SN
-- STA
-- TYP
-- INP, UPD
Understanding NBII
Level 1:
Thesaurus – root element/wrapper for entire “thesaurus”. Contains many Concept elements
Level 2:
Concept – multiple Concept elements occur within one Thesaurus.
Level 3:
The rest of the tags occur within the Concept tag and are flat (all on the same level). A Concept takes one of two forms: a descriptor (preferred term) or a non-descriptor (non-preferred term). Although the Level 3 elements all sit on the same level, there seem to be relationships and patterns among them, and they appear to occur in a fixed order. Below is an example of how these tags can be divided:
Example 1:              Example 2:
Preferred concepts      Non-preferred concepts
Concept                 Concept
Descriptor              Non-Descriptor
UF                      USE
BT                      SC
RT                      SN
NT                      STA
SC                      TYP
SN                      INP
STA                     UPD
TYP
INP
UPD
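The two forms above can be told apart by checking which element names the Concept. Here is a minimal sketch in Python, assuming the upper-case tag names seen in the xml excerpts. The root tag name THESAURUS, the function name split_concepts and the shortened inline sample are assumptions for illustration, not taken from repgen.xml.

```python
import xml.etree.ElementTree as ET

# Shortened inline sample; the root tag name THESAURUS is an assumption.
SAMPLE = """<THESAURUS>
<CONCEPT>
<DESCRIPTOR>Zygotes</DESCRIPTOR>
<UF>Ookinetes</UF>
<BT>Ova</BT>
<TYP>Descriptor</TYP>
</CONCEPT>
<CONCEPT>
<NON-DESCRIPTOR>Ookinetes</NON-DESCRIPTOR>
<USE>Zygotes</USE>
<TYP>Non-descriptor</TYP>
</CONCEPT>
</THESAURUS>"""


def split_concepts(root):
    """Return (preferred, non_preferred) lists of concept dicts."""
    preferred, non_preferred = [], []
    for concept in root.iter("CONCEPT"):
        # Level 3 is flat, so repeated tags (RT, SC, ...) are gathered into lists.
        fields = {}
        for child in concept:
            fields.setdefault(child.tag, []).append((child.text or "").strip())
        if "DESCRIPTOR" in fields:
            preferred.append(fields)
        elif "NON-DESCRIPTOR" in fields:
            non_preferred.append(fields)
    return preferred, non_preferred


preferred, non_preferred = split_concepts(ET.fromstring(SAMPLE))
print([c["DESCRIPTOR"][0] for c in preferred])
print([c["NON-DESCRIPTOR"][0] for c in non_preferred])
```

Each field is stored as a list because elements such as RT and SC can repeat within one Concept.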
BT – Broader Term
Descriptor – Used for approved terms that
INP – Date (expressed numerically, YEAR-MO-DA) when concept was input into the system
Non-Descriptor
NT – Narrower Term
RT – Related Term
SC – Source (organization, controlled vocabulary) of term?
SN – ? (see SN note in next section)
STA – Status
TYP – Type of Concept
UF – Use for
UPD – Date (expressed numerically, YEAR-MO-DA) when concept was updated
USE – Use this concept instead
HIVE and NBII: Relationships to Consider and Questions
· Descriptor and Non-Descriptor show the relationship between preferred and non-preferred terms. These tags are paired with the UF and USE elements, respectively.
· UF is the counterpart of USE. Any non-preferred term whose USE element points to a preferred concept should also appear in a UF element of that preferred concept. For example, Ookinetes has USE Zygotes, and Zygotes has UF Ookinetes (see example below).
Example from the xml:
Preferred Term: Zygotes
<CONCEPT>
<DESCRIPTOR>Zygotes</DESCRIPTOR>
<UF>Ookinetes</UF>
<BT>Ova</BT>
<NT>Oocysts</NT>
<RT>Hemizygosity</RT>
<RT>Reproduction</RT>
<RT>Zygosity</RT>
<SC>ASF Aquatic Sciences and Fisheries</SC>
<SC>LSC Life Sciences</SC>
<STA>Approved</STA>
<TYP>Descriptor</TYP>
<INP>2007-08-14</INP>
<UPD>2007-08-14</UPD>
</CONCEPT>
Non-preferred Term: Ookinetes
<CONCEPT>
<NON-DESCRIPTOR>Ookinetes</NON-DESCRIPTOR>
<USE>Zygotes</USE>
<SC>LSC Life Sciences</SC>
<STA>Approved</STA>
<TYP>Non-descriptor</TYP>
<INP>2007-08-14</INP>
<UPD>2007-08-14</UPD>
</CONCEPT>
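The UF/USE pairing shown in this example could be checked across the whole file. The following is a rough sketch in Python, assuming the tag names shown above. The root tag THESAURUS, the function name unmatched_use_links and the "Orphan term" entry are made up for illustration; the last exists only to show what a missing UF would look like.

```python
import xml.etree.ElementTree as ET

# "Orphan term" is a made-up entry; THESAURUS as the root tag is an assumption.
SAMPLE = """<THESAURUS>
<CONCEPT>
<DESCRIPTOR>Zygotes</DESCRIPTOR>
<UF>Ookinetes</UF>
</CONCEPT>
<CONCEPT>
<NON-DESCRIPTOR>Ookinetes</NON-DESCRIPTOR>
<USE>Zygotes</USE>
</CONCEPT>
<CONCEPT>
<NON-DESCRIPTOR>Orphan term</NON-DESCRIPTOR>
<USE>Zygotes</USE>
</CONCEPT>
</THESAURUS>"""


def unmatched_use_links(root):
    """List (non_preferred, preferred) pairs whose USE has no matching UF."""
    # Map each preferred term to the set of terms listed in its UF elements.
    uf_by_term = {}
    for concept in root.iter("CONCEPT"):
        name = concept.findtext("DESCRIPTOR")
        if name is not None:
            uf_by_term[name.strip()] = {
                (uf.text or "").strip() for uf in concept.findall("UF")
            }
    missing = []
    for concept in root.iter("CONCEPT"):
        name = concept.findtext("NON-DESCRIPTOR")
        if name is None:
            continue
        for use in concept.findall("USE"):
            target = (use.text or "").strip()
            # The non-preferred term should be listed under the target's UF.
            if name.strip() not in uf_by_term.get(target, set()):
                missing.append((name.strip(), target))
    return missing


print(unmatched_use_links(ET.fromstring(SAMPLE)))
```

An empty result would mean every USE link has its reciprocal UF. Whether repgen.xml actually satisfies this has not been verified here.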
· BT, NT, RT are all present in NBII and show relationships between preferred terms.
· SN (which serves a definitional purpose) does not occur in every Concept. It appears to be an optional element and may be linked in some way to the information in SC.
· In the document I have, I could not find an STA element with any status other than “Approved”. I would assume that other status values are available?
· Relationship between TYP (Type) and Descriptor/Non-Descriptor: this relationship seems redundant. The TYP element is present in every Concept and simply repeats whether the first Level 3 element is “Descriptor” or “Non-Descriptor”. Is there something in this relationship that I am missing?
· While INP and UPD are used internally by USGS for tracking and updating, these elements are not completely irrelevant to HIVE. Since USGS has offered to send us updates, these elements (and the data in them) could be used to update terms in HIVE through scripting, etc. UPD especially could be very valuable.
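The STA and TYP questions, and the idea of using UPD for updates, could each be explored with a short script. The following is only a sketch in Python, assuming the tag names and the YEAR-MO-DA date format shown in the examples. The root tag THESAURUS and the helper names concept_name, audit and updated_since are hypothetical, and nothing here reflects how HIVE or USGS actually process the file.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Shortened inline sample; THESAURUS as the root tag is an assumption.
SAMPLE = """<THESAURUS>
<CONCEPT>
<DESCRIPTOR>Zygotes</DESCRIPTOR>
<STA>Approved</STA>
<TYP>Descriptor</TYP>
<INP>2007-08-14</INP>
<UPD>2007-08-14</UPD>
</CONCEPT>
<CONCEPT>
<NON-DESCRIPTOR>Ookinetes</NON-DESCRIPTOR>
<STA>Approved</STA>
<TYP>Non-descriptor</TYP>
<INP>2007-08-14</INP>
<UPD>2007-08-14</UPD>
</CONCEPT>
</THESAURUS>"""


def concept_name(concept):
    """Return (tag, term) from DESCRIPTOR or NON-DESCRIPTOR, whichever exists."""
    for tag in ("DESCRIPTOR", "NON-DESCRIPTOR"):
        text = concept.findtext(tag)
        if text is not None:
            return tag, text.strip()
    return None, None


def audit(root):
    """Collect distinct STA values and terms whose TYP disagrees with the naming tag."""
    statuses, typ_mismatches = set(), []
    for concept in root.iter("CONCEPT"):
        tag, name = concept_name(concept)
        statuses.update((s.text or "").strip() for s in concept.findall("STA"))
        typ = (concept.findtext("TYP") or "").strip()
        # Compare in upper case: TYP reads "Non-descriptor", the tag NON-DESCRIPTOR.
        if tag is not None and typ.upper() != tag:
            typ_mismatches.append(name)
    return statuses, typ_mismatches


def updated_since(root, cutoff):
    """Terms whose UPD date (YEAR-MO-DA) is on or after the cutoff date."""
    names = []
    for concept in root.iter("CONCEPT"):
        upd = (concept.findtext("UPD") or "").strip()
        if upd and date.fromisoformat(upd) >= cutoff:
            names.append(concept_name(concept)[1])
    return names


root = ET.fromstring(SAMPLE)
print(audit(root))
print(updated_since(root, date(2007, 1, 1)))
```

Running audit over the full file would show whether any STA value other than “Approved” exists and whether TYP ever differs from the naming element. updated_since could select the concepts to refresh when a new file arrives.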