Meccano, molecules, and the organization of knowledge: the continuing contribution of S.R. Ranganathan
The impact of facet analysis:
- facet analysis has become pervasive over recent years
- today there are few formal knowledge organization systems that do not display some elements of faceted structure
- there is an evident faceted approach to product information in many commercial websites
- the idea of facet analysis may be very differently understood by various communities
Example from DDC:
782.6   Women's voices
   .66  Soprano voices (Treble voices)
   .67  Mezzo-soprano voices
   .68  Contralto voices (Alto voices)
   .7   Children's voices
   .76  Soprano voices (Treble voices)
   .77  Mezzo-soprano voices
   .78  Contralto voices (Alto voices)
   .79  Changing voices
   .8   Men's voices
   .86  Treble and alto voices
   .87  Tenor voices
   .88  Baritone voices
   .89  Bass voices
Here the age or gender of the singer is subdivided by the pitch of the voice, and this subdivision of the one by the other is carried out absolutely consistently and predictably. Although there is no number building here, and the numbers are not arrived at synthetically, the citation order is very evidently age/gender, then pitch, and it is applied without exception. Simple faceted structures of this type are very common in DDC and in LCC, although we don't usually think of them as faceted schemes in any theoretical sense.
Example from UDC:
Examples of research projects:
Facet analysis in the e-commercial environment:
- many of these tools fail to employ facet analysis in other than a top-level manner
- effectively, it's used to create a taxonomy based on a variety of attributes of entities
- such a structure is logical, predictable and well modelled
- it provides a good mechanism for searching by successive filtering
- for the most part, concepts other than entities are not involved
- hence, only a partial view of a domain is provided
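Searching by successive filtering can be sketched as follows. This is a minimal illustration only: the catalogue records, the attribute names and the helper functions `filter_by` and `facet_counts` are all hypothetical and are not taken from any particular website or tool.

```python
from typing import Dict, List

Product = Dict[str, str]

# Hypothetical catalogue: every record is an entity described by attribute facets.
CATALOGUE: List[Product] = [
    {"name": "trail shoe", "category": "footwear", "colour": "blue", "material": "mesh"},
    {"name": "court shoe", "category": "footwear", "colour": "white", "material": "leather"},
    {"name": "rain jacket", "category": "outerwear", "colour": "blue", "material": "nylon"},
]


def filter_by(items: List[Product], facet: str, value: str) -> List[Product]:
    """Keep only the items whose attribute `facet` has the given value."""
    return [item for item in items if item.get(facet) == value]


def facet_counts(items: List[Product], facet: str) -> Dict[str, int]:
    """Count the remaining items under each value of a facet (the usual sidebar figures)."""
    counts: Dict[str, int] = {}
    for item in items:
        value = item.get(facet)
        if value is not None:
            counts[value] = counts.get(value, 0) + 1
    return counts


# Each filter narrows the result of the previous one.
step1 = filter_by(CATALOGUE, "colour", "blue")
step2 = filter_by(step1, "category", "footwear")
```

Every facet here is an attribute of an entity; there is no place for processes, operations or agents, which is the partial view of the domain described above.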
What is classical facet analysis:
- a means of organizing the concepts in a subject domain
- involves grouping concepts on the basis of shared characteristics
- uses standard categories as receptacles for concepts
- in classical facet analysis these are linguistic/functional categories
Types of categories used:
- earliest set of categories was that of Kaiser (1911)
- these were used in alphabetical subject indexing to generate pre-coordinated headings
  - concretes (= things/entities/substances)
  - processes (= activities/actions)
  - place
Ranganathan's categories:
- personality (= entities or systems)
- matter (= substances)
- energy (= actions or activities)
- space
- time
CRG categories: an expansion of Ranganathan's PMEST:
thing, kind, part, property, material, process, operation, patient, product, by-product, agent, space, time
These categories have been used in all classes in BC2:
- they work well for concepts in most subjects
- they work best with science and technology
- some additional categories are needed in the arts, e.g. form and genre
Modelling a subject domain:
- categorization alone does not make a faceted system
- it must also deal with the relationships between concepts
- attention should also be paid to sequence or order of concepts in combination
- the last is less vital in a digital context, but still important where any sort of linear order or display is needed
Molecular models as a pattern:
- molecular modelling provides a useful modern equivalent to Ranganathan's Meccano analogy
- molecular systems also try to represent:
  - the nature of the components
  - the relationships between them
  - their relative positions
  - internal and external relationships between particles
  - a syntax or rules for combination
Ontology with concepts and relationships:
- dog --does--> tracking
- dog --lives in--> kennel
- dog --wears--> a collar
Faceted approach with categorized concepts (concepts grouped in categories):

            (does)      (lives in)   (wears)
animals     actions     homes        equipment
dog         tracking    kennel       collar
pig         rooting     sty          nose-ring
horse       jumping     stable       saddle
parrot      flying      cage         leg-ring
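The difference from the ontology approach is that the relationship is declared once, between categories, and every concept inherits it from its category. A minimal sketch, assuming the members of each category line up by position in the way the diagram lays them out (dog with tracking, kennel and collar, and so on); the names `FACETS`, `CATEGORY_RELATIONS` and `statements_for` are illustrative only:

```python
from typing import Dict, List, Tuple

# Concepts grouped in categories. The position within each list is assumed to
# pair the concepts up, which is an assumption read off the diagram.
FACETS: Dict[str, List[str]] = {
    "animals": ["dog", "pig", "horse", "parrot"],
    "actions": ["tracking", "rooting", "jumping", "flying"],
    "homes": ["kennel", "sty", "stable", "cage"],
    "equipment": ["collar", "nose-ring", "saddle", "leg-ring"],
}

# One relationship per pair of categories, stated once rather than once per concept.
CATEGORY_RELATIONS: Dict[str, str] = {
    "actions": "does",
    "homes": "lives in",
    "equipment": "wears",
}


def statements_for(animal: str) -> List[Tuple[str, str, str]]:
    """Generate the concept-level statements for one animal from the category-level rules."""
    row = FACETS["animals"].index(animal)
    return [
        (animal, relation, FACETS[category][row])
        for category, relation in CATEGORY_RELATIONS.items()
    ]


horse_statements = statements_for("horse")
```

Adding a new animal means adding one entry to each category; no new relationships have to be written.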
Relationships between concepts can be expressed in different ways:
- through facet indicators
- through relationship indicators
- through the sequence of concepts, or citation order
- different faceted languages utilise all of these methods
The Colon Classification:
- uses the fundamental categories (P M E S T)
- uses facet indicators in the form of punctuation symbols to denote the categorical status of a concept:
    .T   .S   :E   ;M   ,P
- uses a facet formula to combine and express complex content:
    λ [P] [P2] [P3] : [E] 2P : [2E]
    Y [P] : [E] : [2E]. S. T
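The role of the facet indicators can be sketched in a few lines. This is a simplified illustration, not the Colon Classification's actual rules: it uses only the five indicators listed above, applies a plain P M E S T order, and the main class letter and isolate numbers in the example are hypothetical placeholders.

```python
from typing import Dict

# Punctuation that announces the category of the isolate that follows it,
# as listed on the slide (space and time both take the full stop there).
FACET_INDICATORS: Dict[str, str] = {"P": ",", "M": ";", "E": ":", "S": ".", "T": "."}
CITATION_ORDER = "PMEST"


def build_class_number(main_class: str, isolates: Dict[str, str]) -> str:
    """Join isolates in P M E S T order, each preceded by its facet indicator."""
    parts = [main_class]
    for category in CITATION_ORDER:
        if category in isolates:
            parts.append(FACET_INDICATORS[category] + isolates[category])
    return "".join(parts)


# Hypothetical isolate numbers, supplied in arbitrary order.
number = build_class_number("L", {"E": "4", "P": "45", "S": "44"})
```

Because each isolate carries its own indicator, the categorical status of every component can still be read off the finished number.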
Rules for joining concepts together in language (syntax):
- sometimes meaning is achieved by inflection:
    homo mordet canem (the man bites the dog)
    canis mordet hominem (the dog bites the man)
    ο ανθρωπος εσθιει τον κυνα (the man eats the dog)
    ο κυων εσθιει τον ανθρωπον (the dog eats the man)
Sometimes meaning is achieved by word order:
- man bites dog
- dog bites man
- man eating sausage
Indexing languages can function in both of these ways:
- some (like Colon, PRECIS, or UDC) use role operators or facet indicators, i.e. symbols which indicate the status of the components
- others (like BC2) rely on order in the schedule to give meaning to the components
Relational operators in indexing systems: Farradane's system
- example: Clamping of hardened steel plates
    steel /: plates /- clamping /; hardening
- operators used:
    /:  causation or dependence
    /-  reaction
    /;  association
In many modern faceted systems the means of combining terms is controlled by the citation order:
- citation order is the order of categories with which we're familiar, i.e. thing - kind - part - etc.
- this is the so-called standard citation order
- facet status determines the combination, but this is implicit in the notation
- it's a good guide to the best default order of combination, but isn't immutable
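Combination by citation order can be sketched as a sort on category. The category list below is the CRG sequence quoted earlier in this talk; the function name `combine`, the " - " separator and the assignment of the example terms to categories are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# The CRG categories in standard citation order.
STANDARD_CITATION_ORDER: List[str] = [
    "thing", "kind", "part", "property", "material", "process", "operation",
    "patient", "product", "by-product", "agent", "space", "time",
]


def combine(
    concepts: Sequence[Tuple[str, str]],
    order: Sequence[str] = STANDARD_CITATION_ORDER,
) -> str:
    """Arrange (term, category) pairs by the citation order and join them into a string."""
    rank = {category: position for position, category in enumerate(order)}
    ordered = sorted(concepts, key=lambda concept: rank[concept[1]])
    return " - ".join(term for term, _category in ordered)


# Terms as an indexer might pick them out, in no particular order.
indexed = [("combine harvesters", "agent"), ("wheat", "thing"), ("harvesting", "operation")]
default_string = combine(indexed)

# The order is a default, not a law: a different order can be supplied where needed.
process_first = combine(indexed, order=["operation", "thing", "agent"])
```

The facet status of each term decides where it falls in the string, so no explicit indicator has to be stored with the term.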
Facet analysis as a fundamental theory for structuring subject organization tools:
- facet analysis provides us with a sufficiently rigorous model
- we can convert this to alternative formats
- work at UCL has looked at using modelling in combination with markup to create an all-purpose terminology held as a database
- this can be output as:
  - a conventional classification
  - an alphabetical subject index
  - a thesaurus
Facet analysis as a basis for classificatory structures:
- is a well-established methodology
- organizes concepts in a domain into facets, and then into sub-facets (or arrays)
- within a facet, relationships of hierarchy are identified and visually displayed
- synonyms (or near synonyms) are collocated, and controlled by means of the notation
Basic classification structure:
[Annotated schedule extract. Callouts: hierarchical relationships, facet label, array labels, collocation of synonyms]
Notation shown, in display order: HKH PO, HKH PP, HKH PS, HKH PY, HKH QD, HKH QE, HKH QF, HKH QK, HKH QS
Captions shown, in display order:
[Foods]
  (By physical state)
    Essences
    Extracts
    Pastes
  (By operation/process used)
  (By utility, etc.)
    Convenience foods
      Partly prepared foods
        Instant foods
    Artificial foods, synthetic foods
  (By purpose)
  (By physiological function)
    Roughage
Sometimes content becomes much more complex:
Complex repeating structure can be accurately constructed from syntax rules in a faceted system:
Notation shown, in display order: HUQ W, HUQ WH, HUQ WMD V, HUQ WME, HUQ X, HUQ XS
Captions shown, in display order:
Thymus gland
  (Physiology)
  (Pathology)
    (Hyperplasia)
      Lymphatism, status lymphaticus
        (Causal agents)
        (Symptoms)
        (Treatment)
    (Neoplasms)
      Thymomas
  (Products)
    Thymus hormones
      (Molecular structure)
      Thymopoietins
Key:
[Compound terms pre-synthesized and added to published schedule]
[Examples of potential synthesized compounds]
Conversion to thesaurus format:
- all of the conceptual elements required to generate a thesaurus are implicit in the schedule
- BT/NT or intra-facet (paradigmatic) relationships (and some RTs) can be determined from the hierarchy
- other RTs can be identified from inter-facet (syntagmatic) relationships
- equivalence relationships are present in the synonym collocations
Facet analysis aids the accurate identification of paradigmatic and syntagmatic relationships in the thesaurus:

facet name:   entities facet   operations facet   agents facet
              cereals          plant husbandry    farm machinery
              wheat            harvesting         combine harvesters

- paradigmatic relationships, between terms contained in the vocabulary, run down each facet (e.g. cereals / wheat)
- syntagmatic relationships, between terms assigned to a document, run across the facets (e.g. wheat / harvesting / combine harvesters)
Intra-facet (paradigmatic) relationships in a basic schedule:
Notation shown, in display order: HKH PY, HKH QD, HKH QE, HKH QF, HKH QK
Captions shown, in display order:
  (By operation/process used)
  (By utility, etc.)
    Convenience foods
      Partly prepared foods
        Instant foods
    Artificial foods, synthetic foods
Thesaurus entries:
Convenience foods
  NT Partly prepared foods
Partly prepared foods
  BT Convenience foods
  NT Instant foods
Convenience foods
  RT Artificial foods
Artificial foods
  UF Synthetic foods
Synthetic foods
  USE Artificial foods
Automatic conversion from classification to thesaurus:
- BC2 has a suite of programs to generate schedule display and the A/Z index
- these have been extended to allow for automatic thesaurus generation
- each term is marked up to show its hierarchical position and position in the sequence, and its class status
- some difficulties occur as a result of the schedule not having been written with the thesaurus in mind
BC2 source file markup for schedule display and indexing:
CLG        06Aluminium, aluminum
CLGLNM     07)Compounds with silicon & oxygen(
CLGLNMIFN  08Aluminium silicate
CLGM       07)Compounds with oxygen(
@          08)Salts( ]IT
CLGMIFN    09Aluminates
CLGMJHN    08Aluminium oxide, alumina
@          07)Compounds with oxygen & hydrogen( ]IT
CLGMKJHN   08Aluminium hydroxide, alumina trihydrate, hydrated aluminium oxide
BT/NTs inferred from the source file:
Schedule:
WWH P    Male voices
WWH R      Tenor
WWH S      Baritone
WWH T      Bass
WWL ET   Dance forms
WWL ETM    Bourrée
WWL ETP    Chaconne
WWL ETS    Czardas
Thesaurus entries:
Male voices
  NT Baritone
  NT Bass
  NT Tenor
Dance forms
  NT Bourrée
  NT Chaconne
  NT Czardas
Baritone
  BT Male voices
Bass
  BT Male voices
Tenor
  BT Male voices
Bourrée
  BT Dance forms
Chaconne
  BT Dance forms
Czardas
  BT Dance forms
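Inferring BT/NT pairs from hierarchical position can be sketched as below. This is not the BC2 software: it assumes only that each schedule line carries a numeric level, with a deeper level meaning a narrower term, and the function name `infer_bt_nt` and the level values are illustrative.

```python
from typing import Dict, List, Tuple


def infer_bt_nt(
    schedule: List[Tuple[int, str]],
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Derive BT and NT links from (level, term) pairs given in schedule order."""
    broader: Dict[str, str] = {}
    narrower: Dict[str, List[str]] = {}
    ancestors: List[Tuple[int, str]] = []  # the current chain of superordinate terms
    for level, term in schedule:
        # Drop anything at the same or a deeper level: it cannot be the parent.
        while ancestors and ancestors[-1][0] >= level:
            ancestors.pop()
        if ancestors:
            parent = ancestors[-1][1]
            broader[term] = parent
            narrower.setdefault(parent, []).append(term)
        ancestors.append((level, term))
    # Thesaurus display lists narrower terms alphabetically, not in schedule order.
    return broader, {term: sorted(children) for term, children in narrower.items()}


schedule = [
    (1, "Male voices"), (2, "Tenor"), (2, "Baritone"), (2, "Bass"),
    (1, "Dance forms"), (2, "Bourrée"), (2, "Chaconne"), (2, "Czardas"),
]
bt, nt = infer_bt_nt(schedule)
```

Note that the schedule order (Tenor, Baritone, Bass) is a meaningful sequence by pitch, whereas the generated NT list is alphabetical.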
RTs derived from source file (but not inferred by software):
Schedule:
WWD G    Tonality
WWD GF   Overtones, intervals
WWD H    Scales
WWD J    Diatonic, octave, 7 tone
WWD K    Major scale
WWD L    Minor scale
WWD M    Chromatic scale
WWD N    12 tone
WWD O    Whole-tone scale
WWD PD   Pentatonic scale
WWD R    Modes
Thesaurus entries:
Major scale
  RT Minor scale
Minor scale
  RT Major scale
Scales
  RT Modes
Modes
  RT Scales
Equivalence relationships inferred from source file:
Schedule:
WWF O    Opera
WWF R    Operetta
WWF S    Opera comique, opera buffa
WWF T    Musical plays, musicals
Thesaurus entries:
Musical plays
  UF Musicals
Musicals
  USE Musical plays

Schedule:
WWG      Choral music
WWG C    Religious choral music
WWG F    Liturgical music, service music
Thesaurus entries:
Liturgical music
  UF Service music
Service music
  USE Liturgical music

Schedule:
WWQ Z    Bowed instruments
WWS B    Cello, violoncello
Thesaurus entries:
Cello
  UF Violoncello
Violoncello
  USE Cello
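The step from a synonym collocation to a UF/USE pair can be sketched as follows. It assumes that the first term in a caption is the preferred term and that every comma separates an equivalent term; the function name `equivalence_entries` is illustrative and this is not the BC2 program itself.

```python
from typing import List, Tuple


def equivalence_entries(caption: str) -> List[Tuple[str, str, str]]:
    """Turn a caption such as 'Cello, violoncello' into reciprocal UF and USE entries."""
    terms = [part.strip() for part in caption.split(",") if part.strip()]
    if not terms:
        return []
    preferred = terms[0]  # assumption: the first collocated term is the preferred one
    entries: List[Tuple[str, str, str]] = []
    for variant in terms[1:]:
        variant = variant[0].upper() + variant[1:]  # lead terms are capitalized
        entries.append((preferred, "UF", variant))
        entries.append((variant, "USE", preferred))
    return entries


cello = equivalence_entries("Cello, violoncello")
musicals = equivalence_entries("Musical plays, musicals")
```

The comma assumption is exactly where a schedule not written with the thesaurus in mind causes trouble: a caption in which the comma joins near-synonyms or a loose list of related terms would yield UF/USE pairs that need editorial review.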
Inter-facet (syntagmatic) relationships in a complex schedule:
Notation shown, in display order: HUQ W, HUQ WH, HUQ WMD V, HUQ WME, HUQ X, HUQ XS
Captions shown, in display order:
Thymus gland
  (Pathology)
    (Hyperplasia)
      Lymphatism, status lymphaticus
    (Neoplasms)
      Thymomas
  (Products)
    Thymus hormones
      Thymopoietins
Thesaurus entries:
Thymus gland
  RT Lymphatism
  RT Thymomas
  RT Thymus hormones
Syntagmatic relationships:
- in theory, the relationship between a class and a subclass created by combination with another facet is an RT (associative) relationship
- these can theoretically be identified by the presence of non-classes in the BC2 terminologies
- these relationships are more precise than in current thesaurus practice, e.g. entity-process, entity-product, agent-operation, etc.
- currently these cannot be inferred
FATKS: Facet analytical theory in knowledge structures:
- a project carried out at SLAIS
- an attempt to design a classification that could be managed automatically
- a database was built to hold the classification data
- this included the hierarchical position of each class, and its containing category
- this would enable us to create compound classmarks from extracted terms or keywords
FATKS macrostructure:
Building compound and complex numbers

a) Combining concepts within the same facet

Notation       Description                                          Facet
J15            Marriage and family                                  J Religious activities. Practice
J1477          Abstinence. Celibacy                                 J Religious activities. Practice
J15J1477       Abstinence in marriage

b) Combining concepts between facets

Notation       Description                                          Facet
5904           Buddhism                                             590 Religions and Faiths
E31            Originator, founder                                  E Agents. Subfacet: Persons as agents
A443           Physical form, appearance                            A: Theory and Philosophy. Subfacet of God.
5904E31A443    Trikaya. Doctrine of the three bodies in Buddhism

Notation       Description                                          Facet
59033          Hinduism                                             590 Religion and faiths
5904           Buddhism                                             590 Religions and Faiths
5907           Christianity                                         590 Religion and faiths
J14247         Abstinence. Fasting. Prohibition                     J Religious activities. Practice
59033J14247    Upavasa. Fasting in Hinduism
5904J14247     Abstinence. Fasting in Buddhism
5907J14247     Fasting in Christianity
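In these examples the compound number is simply the component notations joined end to end. A minimal sketch of that step, using the notations from the tables above; the lookup table `CLASSES` and the function `synthesize` are illustrative names, not part of the FATKS database, and the order of the components is left to the caller because the ordering rule is not stated here.

```python
from typing import Dict

# Simple classes taken from the examples above: notation -> description.
CLASSES: Dict[str, str] = {
    "J15": "Marriage and family",
    "J1477": "Abstinence. Celibacy",
    "J14247": "Abstinence. Fasting. Prohibition",
    "5904": "Buddhism",
    "59033": "Hinduism",
    "5907": "Christianity",
    "E31": "Originator, founder",
    "A443": "Physical form, appearance",
}


def synthesize(*notations: str) -> str:
    """Build a compound classmark by concatenating known notations in the order given."""
    unknown = [notation for notation in notations if notation not in CLASSES]
    if unknown:
        raise KeyError(f"not in the classification: {unknown}")
    return "".join(notations)


same_facet = synthesize("J15", "J1477")
between_facets = synthesize("5904", "E31", "A443")
fasting = [synthesize(religion, "J14247") for religion in ("59033", "5904", "5907")]
```

The last line shows the regularity of the structure: one practice combined with three religions gives three parallel compounds without any of them having to be enumerated in advance.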
What FATKS can do:
- represent hierarchical position
- represent categorical status
- support search and navigation of the vocabulary
- allow automatic synthesis through inbuilt syntax
- has potential to do more than this
Conclusion:
- faceted terminologies built on the classical model address:
  - functional status of concepts
  - paradigmatic and syntagmatic relationships
  - ordering
  - rules for combination
- all the structural elements of a faceted thesaurus are implicit in a faceted classification
- many of the elements and relationships can be inferred automatically
- others have the potential to be recognized but need further identification in the source data
- automatic classification can be supported, and the automatic generation of populated structures
- currently it is not possible to represent great complexity of structure even though this is regular and predictable