Cascading Citation Indexing in Action *


T. Folias 1, D. Dervos 2, G. Evangelidis 1, N. Samaras 1
1 Dept. of Applied Informatics, University of Macedonia, Thessaloniki, Greece. Tel: +30 2310891844, Fax: +30 2310891800, E-mail: {folias,samaras,gevan}@uom.gr
2 Dept. of Information Technology, Alexander Technology Educational Institute (ATEI), Thessaloniki, Greece. Tel: +30 2310791295, Fax: +30 2310791290, E-mail: dad@it.teithe.gr

* This work has been funded by the Research Committees of the Alexander Technology Educational Institute (ATEI), and the University of Macedonia, Thessaloniki, Greece, along the lines of the C-CAP project.

International Scientific Conference era-2, pp. 290-298

Abstract
In this paper we apply the Cascading Citation Indexing Framework (c²IF) algorithm to real-world data and discuss the results obtained. Given a collection of articles and their bibliographic associations (references), the algorithm considers citations at the article level. A relational database management system (RDBMS) is used both to represent the citation graph and to store the results. For a given positive integer k, the algorithm computes for each article all the 1-gen, 2-gen, ..., k-gen citations (Dervos et al., 2006). In addition, the algorithm identifies the self-citations and the cycles present in the citation graph. Finally, it constructs a citation standings table with one row per article considered. To test our approach, we use six years of citation data (1999-2005) from the ISI Science Citation Index Expanded (ISI SCIE), made available by Thomson Scientific along the lines of the Cascading Citations Analysis Project (C-CAP, http://www.ccapnet.org/ccap/).

1. Introduction
Today, developments such as the evolving scholarly communication environment, the open access movement, and the globalization of research advance at a rapid pace. As a result, the need grows for an improved scheme to assess the contribution that research publications, authors, and scientific collections make in promoting science and technology. The impact factor metric quantifies the rate at which a journal receives citations to its articles over time (Garfield 1955, 1999 and 2005). Two more metrics of this type are the immediacy index (Tomer, 1986) and the cited half-life (Glänzel and Moed, 2002). Although the impact factor is a useful indicator of scholarly status, concerns have been expressed over the usefulness and the fairness of its implementation (Coleman, 2006; Moed, 2005).

When citations are considered at the published article level, the article's scholarly value is measured by two major metrics: (a) the number of direct citations received, and (b) the impact factor of the hosting conference/journal. The first metric reflects the popularity of the particular article. The second metric quantifies the scholarly credibility of the article in question, since acceptance by a widely recognized conference or journal most probably signifies a pioneering character and expert recognition of what is being reported. Consequently, articles published in high impact factor journals reach a broader audience, and they are likely to receive a larger number of citations.

In order to enrich and extend the citation indexing paradigm, indirect citations were originally introduced by Dervos and Kalkanis (2005) and are further elaborated on in Dervos et al. (2006). The approach is analogous to that of the weighted PageRank algorithm (Brin and Page, 1998) in that citation paths of length greater than one are exploited. In this respect, scholarly status is assessed not just in terms of the popularity of the cited item, but also in terms of the prestige of the citing item(s), expressed by the number of indirect citations received.

Given a collection of articles and their citation graph, our algorithm considers citations at the article level. Each article is uniquely identified by its Digital Object Identifier (DOI). In addition to the citations made directly to a given article, citation paths that target each citing article are also considered. The c²IF algorithm uses an RDBMS both to represent the citation graph and to store the results (citation paths). For a given positive integer k, the algorithm computes for each DOI all the 1-gen, 2-gen, ..., k-gen citations and identifies self-citations based on simple author name comparison (in the absence of a Universal Author Identifier System). Along the way, cycles present in the citation graph are identified, and a citation standings tabular output is produced, registering one row per article.

To test the algorithm, six years of citation data were used. The dataset (ISI Science Citation Index Expanded: 1999-2005) has been made available by Thomson Scientific (http://scientific.thomson.com/) to be used along the lines of the C-CAP project. The dataset registers 7,364,211 research article records involving 165,822,522 citation instances. Following the data cleaning/preparation stage, 35,503,513 citation instances were identified as satisfying the requirement that the cited articles are present in the dataset considered. Here we present the c²IF algorithm we developed to calculate all the direct and indirect citations for the above dataset, taking into consideration citation results up to level 3, i.e. up to 3-gen (self-) citations.

The paper is organized as follows. Section 2 provides the necessary notation and definitions and explains the basic concepts of the proposed cascading citation indexing framework. Section 3 gives a detailed description of the proposed c²IF algorithm. Section 4 presents an experimental study on real-world data obtained from ISI Thomson. Finally, the last section concludes and discusses future work.

2. Cascading Citations
Let us consider a small hypothetical collection of five articles labeled, for simplicity, with the integers 1, 2, 3, 4, and 5. Furthermore, let {A, B} be the two authors who have coauthored article 1, A be the author of 2, {B, C} the authors of 3, D the author of 4, and {B, E, F} the authors of 5. A citation graph is a directed graph that represents relationships between articles in terms of citation references. Figure 1 presents the citation graph for the hypothetical collection. Each node corresponds to one article. The letter(s) in the box(es) around each node represent the author(s) of the article. References from one article to another are represented by directed arcs. Citations are considered at the article level. For example, article 1 is cited by 3 along the 3→1 citation path, with article 3 being the source and article 1 being the target of the citation. The latter is said to comprise a 1-gen (direct) citation. In the same manner, 2-gen, 3-gen, ..., k-gen citations are defined to be those that target a given article indirectly. For example, article 1 is cited by 4 via a 2-gen citation, along the 4→2→1 citation path.

Figure 1: Citation graph of the hypothetical collection.

Table 1 lists all the citations present in the citation graph of the hypothetical article collection considered.

| Article | Citation path | Citation type |
| --- | --- | --- |
| 1 | 2→1 | 1-gen |
| 1 | 3→1 | 1-gen |
| 1 | 4→2→1 | 2-gen |
| 1 | 5→4→2→1 | 3-gen |
| 2 | 3→2 | 1-gen |
| 2 | 4→2 | 1-gen |
| 4 | 5→4 | 1-gen |

Table 1: Citations, paths, and types present in the hypothetical collection

For each article N, the list of its co-authors is denoted by AL_N. For example, in the hypothetical collection considered, AL_5 = {B, E, F}. Table 2 summarizes the notation used throughout this paper.

| Notation | Meaning |
| --- | --- |
| N | Article |
| N[A,B] | Article N is co-authored by authors A and B |
| AL_N | A given article's authors list. Thus, for N[A,B]: AL_N = {A,B} |
| S | The source article of a given k-gen citation path (k=1,2,...) |
| T | The target article of a given k-gen citation path (k=1,2,...) |
| S→T | k-gen citation path: S cites T (k=1,2,...) |

Table 2: Notation used

2.1 Self Citations
Today's practice is to consider citations at the (cited) article level. In this respect, a self-citation is said to occur when the sets of co-authors of the cited and citing papers are not disjoint (Snyder and Bonzi, 1998).

Definition: A k-gen (k=1,2,...) citation path S→T represents a self-citation for a given article T when some author A appears in the authors lists of both the target and the source articles of the citation path considered (i.e. when $A \in AL_T \cap AL_S$).

For example, considering the citation graph of the hypothetical collection shown in Figure 1, 2→1 represents a 1-gen self-citation on article 1, 3→2→1 represents a 2-gen self-citation on article 1, and 5→4→2→1 represents a 3-gen self-citation on article 1.

2.2 Chords
To increase the granularity of the citation indexing paradigm, the concept of the chord is introduced and defined as follows:

Definition: A k-gen (k=2,...) citation path S→T represents a chord for a given article T when the path co-exists with a 1-gen citation path involving the same source (S) and target (T) articles.

A chord is considered important in the citation indexing paradigm because it indicates an increased probability that the target article is one of increased impact in promoting science and technology: the source article cites the target article both indirectly and directly.

2.3 Citations Standings Type Output
Considering the above, each row of the citation standings table produced by the c²IF algorithm lists the following: (a) the article in question, (b) the number of 1-gen, ..., k-gen citations received, (c) the number of 1-gen, ..., k-gen self-citations received (s-citations), and (d) the number of 2-chords and 3-chords received. In this respect, the c²IF output for the article collection shown in Figure 1 is presented in Table 3.

| Article | Citations 1-gen | Citations 2-gen | Citations 3-gen | Self-citations 1-gen | Self-citations 2-gen | Self-citations 3-gen | 2-chord | 3-chord |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 2 | 2 | 1 | 1 | 0 | 0 | 1 | 1 |
| 2 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Table 3: Citation Standings Output (for the hypothetical collection)

2.4 Cycles along the citation paths
One would ideally expect a citation graph not to involve any cycles, since each citing article is expected to be posterior to the one(s) it cites. Yet this is not always the case; for example, a journal preprint may receive a citation from an article that is published at an earlier date than the cited article.
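The definitions of this section can be exercised on the hypothetical collection with a few lines of Python. This is a minimal sketch, not the implementation evaluated in this paper: the function names `citation_paths` and `standings` are hypothetical, and the edge list is an assumption inferred from the citation paths quoted in the text, since Figure 1 itself is not reproduced here. Self-citations and chords are counted by a literal reading of the two definitions, so the counts need not coincide with every cell of Table 3.

```python
from collections import defaultdict

# Hypothetical collection of Section 2. The edge list (citing, cited) is an
# assumption inferred from the citation paths quoted in the text.
AUTHORS = {1: {"A", "B"}, 2: {"A"}, 3: {"B", "C"}, 4: {"D"}, 5: {"B", "E", "F"}}
CITES = [(2, 1), (3, 1), (3, 2), (4, 2), (5, 4)]


def citation_paths(cites, k):
    """Return every citation path of length 1..k as a tuple (source, ..., target)."""
    cited_by = defaultdict(list)  # cited article -> articles that cite it
    for citing, cited in cites:
        cited_by[cited].append(citing)
    paths = [(citing, cited) for citing, cited in cites]  # the 1-gen paths
    frontier = list(paths)
    for _ in range(k - 1):
        # Extend each path backwards by one citing article; skipping articles
        # already on the path discards cycles.
        frontier = [(c,) + p for p in frontier for c in cited_by[p[0]] if c not in p]
        paths.extend(frontier)
    return paths


def standings(cites, authors, k):
    """Count k-gen citations, self-citations and chords per target article."""
    direct = set(cites)
    table = defaultdict(lambda: {"gen": [0] * k, "self": [0] * k, "chord": [0] * k})
    for path in citation_paths(cites, k):
        source, target, gen = path[0], path[-1], len(path) - 1
        row = table[target]
        row["gen"][gen - 1] += 1
        if authors[source] & authors[target]:  # author lists are not disjoint
            row["self"][gen - 1] += 1
        if gen >= 2 and (source, target) in direct:  # path co-exists with S->T
            row["chord"][gen - 1] += 1  # index 0 (1-gen) is never incremented
    return table


if __name__ == "__main__":
    for article, row in sorted(standings(CITES, AUTHORS, 3).items()):
        print(article, row)
```

Paths are grown backwards from the target, one citing article at a time, which mirrors the way a k-gen path extends a (k-1)-gen path and keeps the cycle check local to each path.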

3. The c²IF algorithm
The c²IF algorithm runs on top of an RDBMS and consists of three distinct modules: (a) the Citation Paths Module (CPM), (b) the Self-Citations Filter (SCF), and (c) the Citation Standings Module (CSM). Consequently, the computation of the final citation standings output involves three distinct stages. In stage 1, for a given article, the CPM calculates all the citation paths encountered on the way to each and every citing article. In doing so, the algorithm includes paths that involve cycles. The citation paths are then stored in the RDBMS for further processing during stage 2. In stage 2, the SCF operates on the citation paths and identifies potential self-citations up to level k. In stage 3, the CSM calculates the total number of 1-gen, 2-gen, and 3-gen (self-) citations along with the 2-chords and 3-chords.

Figure 2: c²IF algorithm components

    c2IF {
        for each article a_i (i = 1..N) in a citations database {
            // Step 1
            create all citation paths c_j (j = 1..a_i[m]) with a_i as target up to depth = k;
            // Step 2: calculate filtered citation paths (fc_j)
            for each citation path c_j {
                if (cycle) remove it
                else if (self-citation) tag it
            }
            // Step 3
            for each filtered citation path fc_j {
                compute citation standings for a_i
            }
        }
    }

Figure 3: c²IF algorithm

3.1 Citation Paths Module (CPM)
During the first step of the algorithm, the CPM creates the citation paths up to the pre-specified depth k (in our case k=3). We assume that the given database includes direct citation data and that the citing articles are also present in the dataset (i.e., we have a closed system). Citation paths that include articles not present in the given dataset cannot be traced and are omitted from the output of our algorithm. The CPM assumes that citation data are given as a relational table in which each row corresponds to a direct (1-gen) citation from one article to another. We call this table the Citations Table. In the simplest case the Citations Table consists of two fields, CitingArticle and CitedArticle, both referring to a unique article identifier (i.e. the DOI). The CPM takes the Citations Table as input and produces a new table containing the citation paths up to the specified depth k. This is the most critical part of our algorithm in terms of performance and computation time, since the amount of data grows exponentially as k increases. For optimization purposes, and because of the enormous amount of data involved, we implemented this module entirely in SQL. After this initial step, our algorithm produces the Citation Paths table (Table 1), which is a relational representation of the citation graph up to depth k.

3.2 Self Citation Filter (SCF)
The SCF is applied to the Citation Paths table during step 2, aiming to identify potential self-citations up to level k. Paths that constitute self-citations for the article examined are marked as such in the Citation Paths table using an additional binary field in each row. Since citations are considered at the article level, simple author name comparison is used. For each k-gen citation in the Citation Paths table, the SCF queries the database and checks whether the author list of the cited article and the author list of the citing article are disjoint. If they are not, the citation path examined is considered a k-gen self-citation and is marked as such in the Citation Paths table. The SCF module is also able to identify and remove cycles present in the citation paths.

3.3 Citation Standings Module (CSM)
After having identified the self-citations, the CSM constructs the Citation Standings table in the form presented in Table 4.
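The three stages can be sketched in SQL, driven here from Python's built-in sqlite3 module. This is an illustration under stated assumptions rather than the code used in the experiments: SQLite stands in for the RDBMS, and only the Citations table with its CitingArticle and CitedArticle fields comes from the description above. The Authors and CitationPaths layouts (Source, Via1, Via2, Target, Gen, IsSelf) are hypothetical, the data are the hypothetical collection of Section 2 with an inferred edge list, and chord counting is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Citations (CitingArticle INTEGER, CitedArticle INTEGER);
CREATE TABLE Authors (Article INTEGER, Author TEXT);          -- hypothetical layout
CREATE TABLE CitationPaths (Source INTEGER, Via1 INTEGER, Via2 INTEGER,
                            Target INTEGER, Gen INTEGER,
                            IsSelf INTEGER DEFAULT 0);        -- hypothetical layout
""")
conn.executemany("INSERT INTO Citations VALUES (?, ?)",
                 [(2, 1), (3, 1), (3, 2), (4, 2), (5, 4)])
conn.executemany("INSERT INTO Authors VALUES (?, ?)",
                 [(1, "A"), (1, "B"), (2, "A"), (3, "B"), (3, "C"),
                  (4, "D"), (5, "B"), (5, "E"), (5, "F")])

# Stage 1 (CPM): one self-join of the Citations table per extra generation.
# The WHERE clauses drop paths that revisit an article (cycles); the sketch
# assumes that no article cites itself.
conn.executescript("""
INSERT INTO CitationPaths (Source, Target, Gen)
  SELECT CitingArticle, CitedArticle, 1 FROM Citations;
INSERT INTO CitationPaths (Source, Via1, Target, Gen)
  SELECT c2.CitingArticle, c1.CitingArticle, c1.CitedArticle, 2
  FROM Citations c1
  JOIN Citations c2 ON c2.CitedArticle = c1.CitingArticle
  WHERE c2.CitingArticle <> c1.CitedArticle;
INSERT INTO CitationPaths (Source, Via1, Via2, Target, Gen)
  SELECT c3.CitingArticle, c2.CitingArticle, c1.CitingArticle, c1.CitedArticle, 3
  FROM Citations c1
  JOIN Citations c2 ON c2.CitedArticle = c1.CitingArticle
  JOIN Citations c3 ON c3.CitedArticle = c2.CitingArticle
  WHERE c2.CitingArticle <> c1.CitedArticle
    AND c3.CitingArticle <> c1.CitedArticle
    AND c3.CitingArticle <> c1.CitingArticle;
""")

# Stage 2 (SCF): flag a path when source and target share an author name.
conn.execute("""
UPDATE CitationPaths SET IsSelf = 1
WHERE EXISTS (SELECT 1 FROM Authors s JOIN Authors t ON s.Author = t.Author
              WHERE s.Article = CitationPaths.Source
                AND t.Article = CitationPaths.Target)
""")

# Stage 3 (CSM): citations and self-citations per target article and generation.
rows = conn.execute("""
SELECT Target, Gen, COUNT(*), SUM(IsSelf)
FROM CitationPaths GROUP BY Target, Gen ORDER BY Target, Gen
""").fetchall()
for row in rows:
    print(row)
```

Each additional generation costs one more self-join, which is consistent with the observation in Section 3.1 that the amount of path data grows rapidly with k.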
The CSM counts, for each article in the database, the total number of 1-gen, 2-gen, ..., k-gen (self-) citations recorded in the Citation Paths table and adds a single row per article to the Citation Standings table.

4. Experimental results
To test our algorithm, we used six years of citation data (1999-2005) from the ISI Science Citation Index Expanded (ISI SCIE), made available by Thomson Scientific along the lines of the Cascading Citations Analysis Project (C-CAP). The dataset included 7,364,211 research article records involving 165,822,522 citation instances. Following the data cleaning/preparation stage, 35,503,513 citation instances were identified as satisfying the requirement that the cited articles are present in the dataset considered. We decided to calculate all the direct and indirect citations for the above dataset, taking into consideration citation results up to level k=3, i.e. up to 3-gen (self-) citations. The experiment was conducted on a 2.4 GHz quad-core Intel workstation with 4 GB of RAM and a 750 GB hard drive, running Windows XP Professional. The RDBMS used was IBM DB2 v.8.1.9.710. For the provided dataset the algorithm initially identified 35,503,513 1-gen citations, 291,238,196 2-gen citations and 1,164,952,784 3-gen citations, including cycles and self-citations. After tagging/removing the self-citations and cycles, the algorithm constructed the Citation Standings table, from which the top-20 articles are shown in Table 4. (Because of the small number of self-citations, the 1-gen, 2-gen and 3-gen self-citation columns of the Citation Standings table are omitted.)

| Rank# | Article | 1-GEN | 2-GEN | 3-GEN | 2-CHORD | 3-CHORD | YEAR |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Mechanisms of disease | 4894 | 69820 | 451133 | 13884 | 66552 | 1999 |
| 2 | Initial sequencing and | 4192 | 63156 | 335778 | 9417 | 48973 | 2001 |
| 3 | The Protein Data Ban | 3572 | 30439 | 148106 | 4521 | 12337 | 2000 |
| 4 | Executive summary o | 3368 | 28030 | 54060 | 7100 | 10034 | 2001 |
| 5 | The sequence of the | 3274 | 53920 | 320298 | 4824 | 25792 | 2001 |
| 6 | The hallmarks of can | 2600 | 44145 | 272552 | 4057 | 15728 | 2000 |
| 7 | Risks and benefits of | 2504 | 14517 | 25073 | 4899 | 7002 | 2002 |
| 8 | Review of Particle Ph | 2322 | 28691 | 270274 | 3579 | 14191 | 2000 |
| 9 | Duplexes of 21-nucle | 2180 | 44114 | 383049 | 13783 | 122172 | 2001 |
| 10 | The genome sequen | 2081 | 49262 | 513304 | 4325 | 31816 | 2000 |
| 11 | Effects of an angiote | 2026 | 26918 | 98521 | 4311 | 8906 | 2000 |
| 12 | Molecular classificati | 2024 | 41169 | 305329 | 8637 | 35014 | 1999 |
| 13 | Large mass hierarch | 1993 | 29041 | 282900 | 14569 | 117633 | 1999 |
| 14 | Review of particle ph | 1979 | 14559 | 34221 | 4579 | 9552 | 2002 |
| 15 | MEGA2: molecular e | 1976 | 7590 | 11130 | 705 | 519 | 2002 |
| 16 | Distinct types of diffu | 1942 | 39094 | 271815 | 6975 | 24730 | 2000 |
| 17 | First-year Wilkinson | 1932 | 12024 | 29991 | 6051 | 18793 | 2003 |
| 18 | An alternative to co | 1923 | 27891 | 249184 | 14313 | 111961 | 1999 |
| 19 | SIR97: a new tool f | 1922 | 6321 | 20041 | 808 | 870 | 1999 |
| 20 | Measurements of O | 1886 | 30993 | 287378 | 11380 | 84383 | 1999 |

Table 4: Citation Standings table for the top-20 highly cited articles sorted by 1-gen

The entries in Table 4 are sorted first by the 1-gen, then by the 2-gen, and then by the 3-gen value. The Year column values were entered manually, while the Rank column is used to simplify our reference to the results. Commenting on the data of Table 4, one easily notes that the article ranked 5th with regard to 1-gen citations has almost twice as many 2-gen and 3-gen citations as the article ranked 3rd (53,920 vs. 30,439 and 320,298 vs. 148,106, respectively), despite the fact that the article ranked 5th was published one year later.
Also, the article ranked 10th appears to have received about 62,000 more 3-gen citations than the article ranked 1st. As citations are currently rated, the authors of the 1st article are a clear winner over the authors of the 10th article for having obtained 4894 next to 2081 1-gen citations.

Figure 4: Top 20 2-gen and 3-gen citations (series: 2-GEN, 3-GEN; horizontal axis: article rank 1-20; vertical axis: number of citations)

Figure 4 provides a concise view of how 2-gen and 3-gen citations are distributed over the top 20 articles in the Thomson ISI database. On the horizontal axis, the 20 articles are encoded in accordance with their ranked position in the output. The vertical axis lists the number of citations made. Observing the graph in Fig. 4, one can easily identify the articles that fall behind in the number of 2-gen and 3-gen citations obtained: they are the ones labelled as numbers 3, 4, 7, 11, 14, 15, 17, 19. Although this alone is not a reason for reaching a conclusion on the impact a publication has made in promoting science, the information obtained may be used to give more credit to the publications that represent the peaks in Fig. 4 for the research interest they have triggered.

Figure 5: Top 20 2-chord and 3-chord citations (series: 2-CHORDS, 3-CHORDS; horizontal axis: article rank 1-20; vertical axis: number of chords)

In an analogous approach, Figure 5 provides a concise view of how 2-chord and 3-chord citations are distributed over the top 20 articles. Observing the graph in Fig. 5, one can easily identify the articles that seem to have spurred scientific interest in their field, judging by the number of 2-chord and 3-chord citations obtained: they are the ones labelled as numbers 1, 9, 13, 18, 20. The fact that articles 9, 13, 18 and 20 have received a significant number of 3-chords implies that these articles have contributed a lot to their scientific field, because many scientists cite them both directly and indirectly. This suggests that the 2-chord/3-chord values comprise an even more accurate metric indicating high quality in scientific work.

5. Conclusions
The c²IF results obtained on real data indicate the usefulness of the information produced by the new cascading citation indexing framework. One possible future improvement is the design and development of a weighted variation of the c²IF algorithm. The scheme is expected to make possible the calculation of a single value reflecting the impact/contribution each actor represents in the context of the citation data space, an actor being an (article, author) pair, an individual article, an author, or a hosting journal. In addition to the calculation of a single impact factor metric, the granularity of the cascading citation indexing data facilitates analytical processing of the data mining type, in order to identify regions of increased research activity as well as interesting trends in the citation data space.

References
[1] Brin, S., and Page, L. (1998). The Anatomy of a Large-Scale Hypertextual Web Search Engine.
[2] Coleman, A. (2006). Assessing the Value of a Journal Beyond the Impact Factor. Journal of Education for Library and Information Science. Submitted to Journal of the American Society for Information Science & Technology.
[3] Dervos, D., Samaras, N., Evangelidis, G., and Folias, T. (2006). A New Framework for the Citation Indexing Paradigm. Proceedings, 2006 Annual Meeting of the American Society of Information Science and Technology (ASIS&T).
[4] Dervos, D.A., and Kalkanis, T. (2005). cc-iff: A Cascading Citations Impact Factor Framework for the Automatic Ranking of Research Publications. Proc. of the 3rd IEEE Int. Workshop on Intelligent Data Acquisition and Advanced Computer Systems: Technology and Applications (IDAACS), p. 668-673, Sofia, Bulgaria, 5-7 September, 2005.
[5] Garfield, E. (1955). Citation Indexes to Science: a New Dimension in Documentation through Association of Ideas. Science, 122(3159), 108-111.
[6] Garfield, E. (2005). The Agony and the Ecstasy - The History and the Meaning of the Journal Impact Factor. Presented at the International Congress on Peer Review and Biomedical Publication, Chicago, USA, September 16, 2005.
[7] Garfield, E. (1999). Journal Impact Factor: a brief review. Canadian Medical Association Journal, 161(8), 979-980.
[8] Glänzel, W., and Moed, H.F. (2002). Journal impact measures in bibliometric research. Scientometrics, 53(2), 171-193.
[9] Moed, H.F. (2005). Citation Analysis of scientific journals and journal impact measures. Current Science, 89(12), 1990-1996.
[10] Snyder, H., and Bonzi, S. (1998). Patterns of Self-citations across disciplines (1980-1989). Journal of Information Science, 24(6), 431-435.
[11] Tomer, C. (1986). A statistical assessment of two measures of citation: the impact factor and the immediacy index. Information Processing and Management, 22(3), 251-258.