The integrated impact indicator revisited
We propose the I3* indicator as a non-parametric alternative to the journal impact factor (JIF) and the h-index. We apply I3* to more than 10,000 journals. The results can be compared with other journal metrics. I3* is a promising variant within the general scheme of non-parametric I3 indicators introduced previously: I3* provides a single metric which correlates with both impact in terms of citations (c) and output in terms of publications (p). We argue for weighting using four percentile classes: the top-1% and top-10% as excellence indicators, and the top-50% and bottom-50% as output indicators. Like the h-index, which also incorporates both c and p, I3*-values are size-dependent; however, division of I3* by the number of publications (I3*/N) provides a size-independent indicator which correlates strongly with the 2- and 5-year journal impact factors (JIF2 and JIF5). Unlike the h-index, I3* correlates significantly with both the total number of citations and the total number of publications. The values of I3* and I3*/N can be statistically tested against expectation or against one another using chi-squared tests or effect sizes. A template (in Excel) is provided online for the relevant tests.
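As an illustration of such a test, the following sketch (in Python, with hypothetical counts) compares a journal's observed distribution of papers over four mutually exclusive percentile classes with the distribution expected from the class widths (1%, 9%, 40%, and 50%), using a hand-computed chi-squared statistic:

```python
def chi_squared(observed, expected_shares):
    """Chi-squared statistic of observed class counts against expected shares."""
    n = sum(observed)
    expected = [share * n for share in expected_shares]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [30, 180, 700, 1090]       # hypothetical journal, N = 2000 papers
shares = [0.01, 0.09, 0.40, 0.50]     # expected shares from the class widths
stat = chi_squared(observed, shares)  # compare against a chi2 distribution, 3 df
```

The resulting statistic can be compared against the critical values of a chi-squared distribution with three degrees of freedom (four classes minus one).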
Citations create links between publications; but to relate citations to publications as two different things, one needs a model (for example, an equation). The journal impact factor (JIF) indexes only one aspect of this relationship: citation impact. Using the h-index, papers with at least h citations are counted. One can also count papers with h2 or h/2 citations (Egghe 2008). This paper is based on a different and, in our opinion, more informative model: the Integrated Impact Indicator I3.
The 2-year JIF was outlined by Garfield and Sher (1963; cf. Garfield 1955; Sher and Garfield 1965) at the time of establishing the Institute for Scientific Information (ISI). JIF2 is defined as the number of citations in the current year (t) to any of a journal’s publications of the two previous years (t−1 and t−2), divided by the number of citable items (substantive articles, reviews, and proceedings) published in the same journal in those two previous years. Although not strictly a mathematical average, JIF2 provides a functional approximation of the mean early citation rate per citable item. A JIF2 of 2.5 implies that, on average, the citable items published one or two years ago were cited two and a half times. Other JIF variants are also available; for example, JIF5 covers a 5-year window.
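The JIF2 definition above amounts to a one-line computation; the counts in the usage example are hypothetical:

```python
def jif2(citations_in_t, citable_items_t1, citable_items_t2):
    """2-year journal impact factor: citations received in year t to the journal's
    publications of years t-1 and t-2, divided by the number of citable items
    published in t-1 and t-2."""
    return citations_in_t / (citable_items_t1 + citable_items_t2)

# Hypothetical journal: 500 citations in year t to the 120 + 80 citable items
# published in the two preceding years
jif2(500, 120, 80)  # -> 2.5
```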
The central problem that led Garfield (1972, 1979) to use the JIF when developing the Science Citation Index was the selection of journals for inclusion in this database. He argued that citation analysis provides an excellent source of information for evaluating journals. The choice of a 2-year time window was based on experiments with the Genetics Citation Index and the early Science Citation Index (Garfield 2003, at p. 364; Martyn and Gilchrist 1968). However, one possible disadvantage of the short term (2 years) could be that “the journal impact factors enter the picture when an individual’s most recent papers have not yet had time to be cited” (Garfield 2003, p. 365; cf. Archambault and Larivière 2009). Bio-medical fields have a fast-moving research front with a short citation cycle, and JIF2 may be an appropriate measure for such fields but less so for others (Price 1970). In the 2007 edition of the Journal Citation Reports (reissued for this reason in 2009), a 5-year JIF (JIF5, considering five instead of only two publication years) was added to balance the focus on short-term citations provided by JIF2 (Jacsó 2009; cf. Frandsen and Rousseau 2005).
The skew in citation distributions provides another challenge to evaluation (Seglen 1992, 1997). The mean of a skewed distribution provides less information than the median as a measure of central tendency. To address this problem, McAllister et al. (1983, at p. 207) proposed the use of percentiles or percentile classes as non-parametric alternatives (Narin 1987; see also Bornmann and Mutz 2011; Tijssen et al. 2002). Using this non-parametric approach, and on the basis of a list of criteria provided by Leydesdorff et al. (2011), two of us first developed the Integrated Impact Indicator (I3) based on the integration of the quantile values attributed to each element in a distribution (Leydesdorff and Bornmann 2011).
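A minimal sketch of the percentile-class approach, assuming a simple percentile convention (the share of reference-set papers cited less often; ties are ignored here for simplicity, and the reference set is hypothetical):

```python
def percentile_class(citations, reference):
    """Assign a paper to one of four percentile classes, given the citation
    counts of the papers in its reference set. Percentile = share of reference
    papers with fewer citations (ties ignored for simplicity)."""
    p = 100 * sum(1 for c in reference if c < citations) / len(reference)
    if p >= 99:
        return "top-1%"
    if p >= 90:
        return "top-10%"
    if p >= 50:
        return "top-50%"
    return "bottom-50%"

reference = list(range(100))        # hypothetical reference set: 100 papers
percentile_class(99, reference)     # -> 'top-1%'
percentile_class(60, reference)     # -> 'top-50%'
```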
Since I3 is based on integration, the development of I3 presents citation analysts with a construct fundamentally different from a methodology based on averages. An analogy that demonstrates the difference between integration and averaging is given by basic mechanics: the impact of two colliding bodies is determined by their combined mass and velocity, and not by the average of their velocities. By the same argument, the gross impact of a journal as an entity is the combined volume and citation counts of its contents (articles and other items), not an average. Journals differ both in size (the number of published items) and in the skew and kurtosis of the distribution of citations across items. A useful and informative indicator for the comparison of journal influences should respond to these differences. A citation average cannot reflect the variation in both publications and citations, but an indicator based on integration can do so.
One route to indexing both performance and impact via a single number has been provided by the h-index (Hirsch 2005) and its variants (e.g., Bornmann et al. 2011a, b; Egghe 2008). However, the h-index has many drawbacks, not least mathematical inconsistency (Marchant 2009; Waltman and Van Eck 2012). Furthermore, Bornmann et al. (2008) showed that the h-index is mainly determined by the number of papers (and not by citation impact). In other words, the impact dimension of a publication set may not be properly measured using the h-index. One aspect that I3 has in common with the h-index is that the focus is no longer on impact as an attribute but on the information production process (Egghe and Rousseau 1990; Ye et al. 2017). This approach could be applied not only to journals but also to other sets of documents with citations such as the research portfolios of departments or universities. In this study, however, we focus on journal indicators.
At the time of our previous paper about I3 (Leydesdorff and Bornmann 2011), we were unable to demonstrate the generic value of the non-parametric approach because of limited data access. Recently, however, the complete Web of Science became accessible under license to the Max Planck Society (Germany). This enables us to compare I3-values across the database with other journal indicators such as JIF2 and JIF5, total citations (NCit), and numbers of publications (NPub). The choice of journals as units of analysis provides us with a rich and well-studied domain.
Our approach based on percentiles can be considered as the development of “second-generation indicators” for two reasons. First, we build on the first-generation approach that Garfield (1979, 2003, 2006) developed for the selection of journals. Second, the original objective of journal selection is very different from the purposes of research evaluation to which the JIF has erroneously been applied (e.g., Alberts 2013). The relevant indicators should accordingly be appropriately sophisticated.
Data were harvested at the Max Planck Digital Library (MPDL) in-house database of the Max Planck Society during the period October 15–29, 2018. This database contains an analytically enriched copy of the Sciences Citation Index-Expanded (SCI-E), the Social Sciences Citation Index (SSCI), and the Arts and Humanities Citation Index (AHCI). Citation count data can be normalized for the Clarivate Web of Science Subject Categories (WoS Categories) and theoretically could be based on whole-number counting or fractional counting in the case of more than a single co-author. The unit of analysis in this study, however, is the individual paper to which citation counts are attributed irrespective of whether the paper is single- or multi-authored.
The (current) citation window in the in-house database was the period to the end of 2017, at the time of the data collection. We collected substantive items (articles and reviews) using the publication year 2014 with a 3-year citation window to the end of 2017. The results were checked against a similar download for the publication year 2009, that is, 5 years earlier. The year 2014 was chosen as the last year with a complete 3-year citation window at the time of this research (October–November, 2018); furthermore, the year 2009 is the first year after the update of WoS to its current version 5.
The in-house database contains many more journals than the Journal Citation Reports (JCR, which form the basis for the computation of JIFs). In order to be able to compare I3*-values with other indicators, we use only the subset of publications in the 11,761 journals contained in the JCR 2014. These journals all have JIFs and other indicators.
Of these journals, 11,149 are unique in the SCI-E and SSCI, and the overlap between SSCI and SCI-E is 612 journals. Another 207 journals could not be matched unequivocally on the basis of journal name abbreviations in the in-house database and JCR, so that our sample is 10,942 journals. Note that we are using individual-journal attributes so that the inclusion or exclusion of a specific journal does not affect the values for the other journals under study.
Citation counts are also field-normalized in the in-house database using the WoS Categories, because citation rates differ between fields. These field-normalized scores are available at the individual document level for all publications since 1980. The I3* indicator calculated with field-normalized data will be denoted as I3*F—pragmatically abbreviating I3*F(99-4, 90-3, 50-2, 0-1) in this case. Some journals are assigned to more than a single WoS Category: in these instances, the journal items and their citation counts are fractionally attributed. In the case of ties at the thresholds of a top-x% class of papers (see above), the field-normalized indicators have been calculated following Waltman and Schreiber (2013). Thus, the in-house database shows whether a paper belongs to the top-1%, top-10%, or top-50% of papers in the corresponding WoS Categories. Papers at the threshold separating the top from the bottom are fractionally assigned to the top paper set.
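The fractional assignment of tied papers can be sketched as follows (a simplified reading of the Waltman and Schreiber (2013) approach; the function name and example data are ours):

```python
import math

def top_fractions(citations, top_share):
    """For each paper, the fraction with which it counts in the top-x% class.
    Papers strictly above the threshold count fully; papers tied at the
    threshold share the remaining slots equally (in the spirit of
    Waltman and Schreiber 2013; simplified)."""
    target = top_share * len(citations)        # size of the top class
    threshold = sorted(citations, reverse=True)[math.ceil(target) - 1]
    above = sum(1 for c in citations if c > threshold)
    tied = sum(1 for c in citations if c == threshold)
    share = (target - above) / tied
    return [1.0 if c > threshold else (share if c == threshold else 0.0)
            for c in citations]

# Ten papers, four tied at the top; the top-30% class has three "slots",
# so each of the four tied papers is counted with fraction 3/4
top_fractions([10, 10, 10, 10, 5, 5, 5, 5, 5, 5], 0.30)
```

The fractions sum exactly to x% of the number of papers, so the size of the top class is preserved despite the ties.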
Table 2 shows how to calculate I3* based on publication numbers using PLOS One as an example. The publication numbers in the first columns (a and b) are obtained from the in-house database of the Max Planck Society. These are the numbers of papers in the different top-x%-classes. Since the publication numbers in the higher classes are subsets of the numbers in the lower classes, the percentile classes are corrected (by subtraction) to avoid double counting.
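The subtraction-and-weighting step described above can be sketched in a few lines of Python; the counts are hypothetical and the 4-3-2-1 weights are used for illustration:

```python
def i3_star(cumulative_counts, weights):
    """I3*-style score from cumulative percentile-class counts.
    cumulative_counts: papers in the top-1%, top-10%, top-50%, and all papers
    (each class is a subset of the next). Higher classes are subtracted from
    the lower ones to avoid double counting before weighting."""
    exclusive = [cumulative_counts[0]] + [
        cumulative_counts[i] - cumulative_counts[i - 1]
        for i in range(1, len(cumulative_counts))
    ]
    return sum(w * n for w, n in zip(weights, exclusive))

# Hypothetical journal: 30 papers in the top-1%, 250 in the top-10%,
# 1200 in the top-50%, 2000 papers in total; illustrative weights 4-3-2-1
i3_star([30, 250, 1200, 2000], [4, 3, 2, 1])  # -> 3480
```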
Previous research has shown that citation data from different types of Web sources can potentially be used for research evaluation. Here we introduce a new combined Integrated Online Impact (IOI) indicator. For a case study, we selected research articles published in the Journal of the American Society for Information Science & Technology (JASIST) and Scientometrics in 2003. We compared the citation counts from the Web of Science (WoS) and Scopus with five online sources of citation data: Google Scholar, Google Books, Google Blogs, PowerPoint presentations, and course reading lists. The mean and median IOI were nearly twice as high as those of both WoS and Scopus, confirming that online citations are sufficiently numerous to be useful for the impact assessment of research. We also found significant correlations between conventional and online impact indicators, confirming that both assess something similar in scholarly communication. Further analysis showed that the overall percentages of unique Google Scholar citations outside the WoS were 73% and 60% for the articles published in JASIST and Scientometrics, respectively. An important conclusion is that in subject areas where wider types of intellectual impact indicators outside the WoS and Scopus databases are needed for research evaluation, IOI can be used to help monitor research performance.