International Atomic Energy Agency
INIS Training Seminar, October 2013

Subject Analysis: Computer Assisted Indexing

07 – 11 October 2013
Vienna, Austria

Bekele Negeri
INIS Unit
Nuclear Information Specialist
(Adapted from A. Nevyjel’s presentation)



Subject Indexing Tools

There are two main INIS products used for indexing: WinFibre and CAI
• WinFibre – for input preparation, both bibliographic and subject indexing
• CAI (Computer Assisted Indexing) – for subject classification and indexing

INIS/ETDE Thesaurus and INIS Subject Category Codes are incorporated in both.

Indexing with FIBRE

Computer-assisted Indexing - CAI

• Kick-off Meeting: Jan 2004
• Implementation and Customisation: Jun 2004
• Production Indexing: from Jun 2004, ongoing
• CAI version 1.0 final acceptance: Aug 2004
• Tuning of the system: from Aug 2004, ongoing
• CAI batch processing for Member States: Dec 2004
• CAI online from remote for Member States (MS): Nov 2007

CAI Thesaurus Extension

• Thesaurus
  • Valid Descriptors: 22,051
  • Forbidden Terms: 8,675
  • Total: 30,726
• CAI
  • Hidden Terms: ~35,000

Terminological Knowledge Base

CAI Thesaurus Extension

“Hidden terms” are character patterns representing the different ways a concept can appear in free text, where that concept is indexed by one or more descriptors.
• handled similarly to “forbidden terms”, with one or more USE relations
• CAI internal only
• not exported to INIS production system
• not exported to FIBRE
• not printed in any appearance of the thesaurus
• support identification of descriptors in the free text
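The USE relation can be pictured as a lookup from character patterns to descriptors. The following is a minimal Python sketch, not the CAI implementation: the `HIDDEN_TERMS` table, the `suggest_descriptors` function and the whole-word matching rule are assumptions made for illustration. The term pairs are taken from the hidden-term examples given in this presentation.

```python
import re

# Hypothetical hidden-term table: pattern (lower case) -> descriptors it USEs.
HIDDEN_TERMS = {
    "kampuchea": ["CAMBODIA"],
    "ivory coast": ["COTE D'IVOIRE"],
    "behaviour": ["BEHAVIOR"],
    "analogue computers": ["ANALOG COMPUTERS"],
}

def suggest_descriptors(text):
    """Return the sorted descriptors whose hidden terms occur as whole words."""
    lowered = text.lower()
    found = set()
    for pattern, descriptors in HIDDEN_TERMS.items():
        # \b keeps a pattern from matching inside a longer word
        if re.search(r"\b" + re.escape(pattern) + r"\b", lowered):
            found.update(descriptors)
    return sorted(found)

print(suggest_descriptors("Behaviour of analogue computers in Kampuchea"))
```

Because the table is kept separate from the published thesaurus, such patterns can be added or tuned without changing what is exported or printed, which matches the "CAI internal only" property listed above.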

Hidden Terms: Compounds and Isotopes

Descriptor          hidden term     free text
MAGNESIUM BORIDES   MgB_2           MgB2
ACETIC ACID         C_2H_4O_2       C2H4O2
CESIUM 137                          Cesium 137, Cesium-137
                    "1"3"7cs        137Cs
                    137 caesium     137 Caesium, 137-Caesium
                    caesium 137     Caesium 137, Caesium-137
                    137 cesium      137 Cesium, 137-Cesium
                    137 cs          137 Cs, 137-Cs
                    cs 137          Cs 137, Cs-137
                    cs"1"3"7        Cs137
                    cs137           Cs137
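Many of the CESIUM 137 spellings differ only in word order, separator and element name, so they can be covered by one pattern instead of one table row each. The sketch below is an illustration under assumptions: the regular expression and the name `mentions_cesium_137` are hypothetical, and nothing in the slides says CAI stores its hidden terms as regular expressions. The superscript notations (`"1"3"7cs`) are not handled here.

```python
import re

# Element name variants listed on the slide (US spelling, UK spelling, symbol).
NAME = r"(?:cesium|caesium|cs)"

# Mass number before or after the name, with an optional space or hyphen.
ISOTOPE_137 = re.compile(
    rf"\b(?:{NAME}[ -]?137|137[ -]?{NAME})\b", re.IGNORECASE
)

def mentions_cesium_137(text):
    """True if the text contains one of the plain-text CESIUM 137 spellings."""
    return ISOTOPE_137.search(text) is not None

for sample in ["Cs-137", "137Cs", "137-Caesium", "Cesium 137", "at 1370 K"]:
    print(sample, "->", mentions_cesium_137(sample))
```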

Hidden Terms: Elementary Particles and Countries

Descriptor          hidden term     free text
ELECTRON NEUTRINOS  #nu#_e          νe
MUON NEUTRINOS      #nu#_#mu#       νμ
TAU NEUTRINOS       #nu#_#tau#      ντ
RHO-770 MESONS      #rho#-770       ρ-770
OMEGA-782 MESONS    #omega#-782     ω-782

Country Names:
CAMBODIA            kampuchea
COTE D'IVOIRE       ivory coast
GREECE              hellas
MYANMAR             burma
THAILAND            siam

Hidden Terms: UK/US Spellings

Descriptor                  hidden term
A CENTERS                   a centres
ACTIVITY METERS             activity metres
ANALOG COMPUTERS            analogue computers
ANESTHESIA                  anaesthesia
ARCHAEOLOGY                 archeology
AUSTRIAN ORGANIZATIONS      austrian organisations
BALLISTIC MISSILE DEFENSE   ballistic missile defence
BAYARD-ALPERT GAGES         bayard-alpert gauges
BEAM ANALYZERS              beam analysers
BEHAVIOR                    behaviour
CATALOGS                    catalogues

Hidden Terms: Other Spellings

Descriptor                  hidden term

Singular/Plural
FUNGI                       fungus
FUNGI                       funguses
G MATRIX                    g matrices
G MATRIX                    g matrixes

Reverse Sequence
ATOM-MOLECULE COLLISIONS    atom-molecule scattering
ATOM-MOLECULE COLLISIONS    molecule-atom scattering
ATOM-MOLECULE COLLISIONS    atom-molecule reactions
ATOM-MOLECULE COLLISIONS    molecule-atom reactions
ATOM-MOLECULE COLLISIONS    atom-molecule interactions
ATOM-MOLECULE COLLISIONS    molecule-atom interactions
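The six reverse-sequence hidden terms follow a regular scheme: both orders of the hyphenated pair, combined with each head noun. As an illustration only, the sketch below enumerates them; the function name `reverse_sequence_variants` is hypothetical, and the slides do not say whether such terms were generated or entered by hand.

```python
def reverse_sequence_variants(first, second, heads):
    """Build 'first-second head' and 'second-first head' for every head noun."""
    variants = []
    for head in heads:
        variants.append(f"{first}-{second} {head}")
        variants.append(f"{second}-{first} {head}")
    return variants

# All of these would carry a USE relation to ATOM-MOLECULE COLLISIONS.
for term in reverse_sequence_variants(
    "atom", "molecule", ["scattering", "reactions", "interactions"]
):
    print(term)
```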

Further Improvements Necessary

• “+” and “-” signs
  • K+ → KAONS PLUS, KAONS MINUS, POTASSIUM IONS
• Case sensitivity
  • TiN → TIN (instead of TITANIUM NITRIDES)
  • gas → GALLIUM SULFIDES
  • “… who is the …” → WHO (World Health Organization)
• Verbs versus nouns
  • “… this leads us to …” → LEAD
  • “… this leaves it …” → LEAVES
• Homographic terms
  • Solutions → SOLUTIONS or MATHEMATICAL SOLUTIONS
• Nuclear reactions, e.g. 14N(γ,α)10B
  • Targets
  • Beams
  • Reactions
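The case-sensitivity errors arise when chemical formulae and acronyms are matched without regard to case. One possible mitigation, sketched here under assumptions, is to consult an exact-case table first and fall back to case-insensitive matching only for ordinary words. The two tables and the `lookup` function are hypothetical; the descriptor names are the ones quoted on the slide.

```python
# Tokens whose meaning depends on exact capitalisation (formulae, acronyms).
CASE_SENSITIVE = {
    "TiN": "TITANIUM NITRIDES",
    "GaS": "GALLIUM SULFIDES",
    "WHO": "WHO",
}

# Ordinary words, matched regardless of case.
CASE_INSENSITIVE = {
    "tin": "TIN",
}

def lookup(token):
    """Try the exact-case table first, then fall back to lower-case matching."""
    if token in CASE_SENSITIVE:
        return CASE_SENSITIVE[token]
    return CASE_INSENSITIVE.get(token.lower())

for token in ["TiN", "tin", "GaS", "gas", "WHO", "who"]:
    print(token, "->", lookup(token))
```

With this ordering, "gas" and "who" produce no suggestion at all, while "TiN" resolves to TITANIUM NITRIDES and "tin" to TIN.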

INDEXING PROBLEMS

• General terms (energy, physics, materials, uses, etc.)
• Misleading CAI suggestions:

Thesaurus terms:
PRODUCTION and PARTICLE PRODUCTION
SOLUTION and MATHEMATICAL SOLUTION
IGNITION and THERMONUCLEAR IGNITION
WALLS and THERMONUCLEAR REACTOR WALLS
PLANTS and NUCLEAR POWER PLANTS
MEMBRANES (classic) and membrane (in brane theory)
COLOR and COLOR MODEL (elementary particle characteristics)
TRANSPORT, etc.

INDEXING PROBLEMS

Chemical compounds / case sensitivity / homonyms:
INDIUM IONS for “in ions”
ASTATINE 200 for “at 200 °C”
VISIBLE RADIATION for “light weight”
HELIUM 6 for “consisting of 6 He 3 tubes”
temperature, pressure, etc. range

Abbreviations:
TNA for Thermal Neutron Analysis and TRINONYLAMINE
MPA for Maximum Permissible Activity and MPa (megapascal)

CAI online for Member States
introduced in July 2007

• CAI Batch used by
  • China
  • Czech Republic (seldom)
  • Georgia (only in 2012)
  • Germany
  • Iran
  • Uzbekistan
  • Vietnam
• CAI Online in use by
  • Austria
  • Bulgaria
  • Cuba
  • Israel (registering)
  • Japan
  • Mexico
  • Netherlands (seldom)
  • Uruguay

CAI online and CAI batch are now regular services for Member States

CAI Batch and Online Processing

• Input: MemSt-CC-yymmdd-xxxxxxxxxxx
  • MemSt is a standard prefix (meaning “member state”)
  • CC is the country code
  • yymmdd is the date when the file was generated
  • xxxxxxxxxxx is any additional identification
• Examples
  • MemSt-AR-041203-thisismytestfile
  • MemSt-FR-041212-fileidentification
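The naming convention can be checked mechanically before a file is submitted. The sketch below is an assumption-laden illustration: the `NAME_RE` pattern and `parse_input_name` function are hypothetical, the country code is assumed to be two upper-case letters (as in the AR and FR examples), and the identification part is treated as free-form text.

```python
import re
from datetime import datetime

# Assumed reading of the convention: MemSt, two-letter country code,
# six-digit date (yymmdd), then any additional identification.
NAME_RE = re.compile(r"^MemSt-([A-Z]{2})-(\d{6})-(.+)$")

def parse_input_name(name):
    """Split a CAI input file name into country code, date and identification."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a MemSt file name: {name!r}")
    country, yymmdd, ident = match.groups()
    # strptime raises ValueError for impossible dates such as 041340
    generated = datetime.strptime(yymmdd, "%y%m%d").date()
    return country, generated, ident

print(parse_input_name("MemSt-AR-041203-thisismytestfile"))
```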

CAI Batch Processing

• Output: _MemSt-CC-yymmdd-xxxxxxxxxxx
• These files will carry the CAI suggested descriptors in tag 800, preceded by the string ##CAI suggestions##;
• Example:
  • 800^##CAI suggestions##; DESCRIPTOR1; DESCRIPTOR2; DESCRIPTOR3; …
• sent back to the member state for reviewing
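A reviewer's tooling could pull the suggestions out of tag 800 as follows. This is a sketch under assumptions: `parse_tag_800` is a hypothetical helper, and it assumes that "^" separates the tag number from its value and that descriptors are separated by semicolons, as in the example line.

```python
MARKER = "##CAI suggestions##;"

def parse_tag_800(line):
    """Return the suggested descriptors from a tag 800 line, or [] if unmarked."""
    tag, sep, value = line.partition("^")
    if not sep or tag.strip() != "800":
        return []
    value = value.strip()
    if not value.startswith(MARKER):
        return []
    rest = value[len(MARKER):]
    # Drop empty pieces left by a trailing semicolon
    return [d.strip() for d in rest.split(";") if d.strip()]

print(parse_tag_800("800^##CAI suggestions##; DESCRIPTOR1; DESCRIPTOR2; DESCRIPTOR3"))
```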

CAI Batch and Online Processing
Reviewing Process

• Delete all suggested descriptors which are too general
• Add relevant descriptors which were not found
  • numerical values, e.g. pressure ranges, temperature ranges, ...
  • nuclear reactions
  • chemical compounds, alloys, etc.
• CAI is cleaning up BT/NTs (broader/narrower terms); clean up BT/NTs from manual additions
• Clean up suggestions from homographic terms

CAI Batch and Online Processing
Finalisation Process

CAI batch
• When reviewing of the record is completed: delete “##CAI suggestions##”
• When reviewing of all records is completed: submit file to “INIS Input Box”

CAI online
• When reaching the last record: press “export and exit” button
• File goes directly to INIS production system or, if required, is sent back to Member State for reviewing
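For CAI batch, deleting the marker from a reviewed record can be scripted. The following is a hypothetical sketch, not an INIS tool: `finalise_tag_800` is an invented name, and the assumption that the remaining descriptors stay in tag 800 separated by "; " is not confirmed by the slides.

```python
MARKER = "##CAI suggestions##"

def finalise_tag_800(line):
    """Remove the CAI marker from a reviewed tag 800 line, keeping descriptors."""
    tag, sep, value = line.partition("^")
    if not sep or MARKER not in value:
        return line  # nothing to clean up
    pieces = value.replace(MARKER, "").split(";")
    descriptors = [p.strip() for p in pieces if p.strip()]
    return tag + "^" + "; ".join(descriptors)

print(finalise_tag_800("800^##CAI suggestions##; FUNGI; CESIUM 137"))
```

Lines without the marker are returned unchanged, so running the function twice on the same record is harmless.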

Thank you!