The Last of the Litter : "Netometrics"

Marcia J. BOSSY


Given the development of networks and the growing differentiation of the space-times of research, of intellectual practices and of the arrangements of desire that accompany them, we set out to show the need, beyond infometric approaches, to think about new metrics aimed at capturing these emergent behaviours. We show, in passing, both the difficulty and the value of trying to think together the complex coexistence of research practices, which both express and are expressed by heterogeneous information-communication systems, and the coexistence of widely varied practices of Writing and Oral practices.




  1. Internet : an extended scientific community

  2. On Internet we can observe "science in action"

  3. The new face of scientometrics


          
For the last two years, I have been working intensively with (and on) the Internet - the largest existing electronic network. I am particularly interested in its repercussions on scientific communities and, consequently, on scientometrics. Internet-mediated scientific interaction has not, up to now, been the object of systematic observation by scientometricians. I will try to show that we should consider it our main field of observation and source of data, since it is the agent of qualitative changes in the way that science is made.


          

1. Internet : an extended scientific community

Because of the Internet's distributed organization, it is hard to assess its actual magnitude. The more conservative estimates give 2 million hosts (each serving one or, more typically, up to hundreds of users), not counting modem-equipped individual machines. These numbers have been doubling every year for the past five years and are not expected to slow down soon. The construction of high-capacity infrastructure in the U.S.A. and in Europe (the "information superhighway") suggests that this trend will continue. The Internet was originally put together for the use of the American academic community and later extended to the international academic community. Academic users still represent about 50% of all users.

Researchers go on the Internet because of what they find there : abundant, easily accessed information and prompt, far-reaching communication with their peers. The former may take the more traditional forms of reference and factual databases, full-text directories, software code, ... Among the communication features we find mailing lists, electronic conferences and bulletin boards, virtual conferencing and many more. More recent developments are the "user-produced" applications that address the resource discovery problem [1] : Gopher, WAIS and WWW. These patterns of international circulation have been devised by the users themselves, sometimes by trial and error, and they define the network as the site of a new model of self-organization by scientific communities.


          

2. On Internet we can observe "science in action"

Bibliometrics is still widely used as a generic term for the related fields of sciento-, info- and technometrics, in which publications are considered the elementary units of scientific information and the main source of indicators. This points to one of our biggest problems today. Michel Callon [2] has stressed the fact that the origins of bibliometrics are linked to a positivistic model of science. Let us go a little further. Using Bruno Latour's [3] frame of analysis, we may say that published material offers us only one face of science. Before assuming its published form, information has been the object of intense socio-cognitive activity that is entirely left out of our conclusions. Also absent from our field of observation is the interaction of science production with technology, industry and the economic and political establishment. The causes of the crisis in our discipline(s) are not to be found in ethics but in the inadequacy of our model. It is the limited definition of our object of study that leads to some of the problems mentioned : over-simplification, misuse or misinterpretation of the results ...

Efforts have been made to integrate patent literature and/or grey literature, of course. They are in fact at the origin of the techno- and scientometrics sub-fields. That they remain separate sub-fields is due in part to the fact that, on paper, these sources are the object of dissimilar sets of professional practices and take different paths of production and circulation.

The diversity of new patterns of communication on the electronic network sometimes blurs the frontiers between formal and informal circulation, and between activities taking place inside and outside laboratories. The consequences of this inevitable situation for scientific communities and their relationship with the "outside world" have been developed by William Turner [4]. Here, I would like to suggest that this state of affairs offers us the possibility of incorporating the different faces of the continuous process of scientific production into our field of study.

The term "collaboratory" is used to describe this situation. It evokes researchers working in cooperation at distant sites via an electronic network. Collaboratories are new assembling grounds for scientific communities. Some examples :

  1. Stevan Harnad, of the Cognitive Science Laboratory/Princeton University, is co-editor of the refereed electronic journal Psycoloquy (psychology and related fields) : ..."PSYCOLOQUY is explicitly devoted to scholarly skywriting, the radically new form of communication made possible by the Net, in which authors post a brief account of current ideas and findings on which they wish to elicit feedback from fellow-specialists [...] The refereeing of each original posting and each item of peer feedback on it is to be done very quickly, [...] so as to maintain the momentum and interactiveness of this unique medium [...] Skywriting [is] conducted through the discipline of the written medium, monitored by peer review and permanently archived for future reference [It] is intended for the pre-publication "pilot" stage of scientific inquiry in which peer communication and feedback are still critically shaping the final intellectual outcome." [5] [6]

  2. Users on the Internet can subscribe to an electronic conference, sometimes called a discussion list. Postings from subscribers are sent by e-mail-based software to all its members, who can in turn react by posting their answers. Most scholarly conferences are moderated ; the moderator may have different degrees of editorial authority, depending on the consensus of the members. There are 1155 scholarly electronic conferences (on subjects ranging from Anthropology to Computer Sciences to Geography to Zoology and more...) on the latest list compiled by Diane K. Kovacs and the Directory Team of Kent State University [7].

  3. The World-Wide-Web initiative is a CERN project that merges the techniques of networked information and hypertext to make an easy but powerful global information system. The first version was put on the Net about July 1993. Soon, users around the world plugged in, proposing improvements, establishing gateways to other information servers and creating user-friendly graphic interfaces. These contributions were incorporated into the original project. The WWW team explicitly encourages these collaborative efforts by asking users to suggest new features - or even better - to produce them [8].


          

3. The new face of scientometrics

The development of a global academic information/communication system also suggests new ways of measuring the impact of scientific contributions that take into account the cooperative aspect of science. The American Physical Society's Task Force's Report on Electronic Information Systems, cited by Harnad, notes that : ..."Unlike inert publication counts or even citation counts, sensitive measures of "air-time" and "flight-route" for new ideas and findings (how often they are accessed, by whom, and where they lead in subsequent electronic and paper literature) would be helpful not only to those who are trying to evaluate the importance of a given scholar's contribution but also to historians of ideas trying to make sense of the evolution of knowledge." [9].
Candidates for indicators could be counts of remote file retrievals (since archiving published or pre-print work on ftp sites is now common practice) or counts of hypertext links. These links can be "classical" ones such as citations or authors' addresses, but they can also assume new forms such as the "annotation" or the "is-interested-in" links available on WWW. Similar indicators can be devised in order to assess the impact of information servers or of the services they offer.
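
As a rough illustration of what such candidate indicators might look like in practice, the sketch below counts retrievals per archived file and inbound hypertext links per target. It is only a sketch under stated assumptions : the record layout, the host names, the file paths and the function names `retrieval_indicators` and `inbound_link_counts` are all hypothetical and are not taken from any existing server or service.

```python
from collections import Counter, defaultdict

# Hypothetical retrieval log: (requesting host, file retrieved).
# The record layout and every value here are invented for illustration.
RETRIEVALS = [
    ("host-a.example.edu", "/pub/papers/preprint-01.ps"),
    ("host-b.example.org", "/pub/papers/preprint-01.ps"),
    ("host-a.example.edu", "/pub/papers/preprint-01.ps"),
    ("host-c.example.fr", "/pub/papers/preprint-02.ps"),
]


def retrieval_indicators(records):
    """Return {file: (total retrievals, number of distinct requesting hosts)}."""
    totals = Counter()
    hosts = defaultdict(set)
    for host, path in records:
        totals[path] += 1
        hosts[path].add(host)
    return {path: (totals[path], len(hosts[path])) for path in totals}


def inbound_link_counts(links):
    """Count hypertext links pointing at each target, given (source, target) pairs."""
    return Counter(target for _source, target in links)


if __name__ == "__main__":
    for path, (total, distinct) in retrieval_indicators(RETRIEVALS).items():
        print(path, total, distinct)
```

Reporting the number of distinct requesting hosts next to the raw total is a deliberate choice : a file fetched many times by one host and a file fetched once by many hosts would otherwise look identical.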

For now, this is still wishful thinking. Moreover, if our work is to have scientific validity, we must realise that there is much work to be done. Here are some of the problems we will have to confront :

We will not be counting personal signatures but "electronic addresses". There is no assurance that an address represents only one person, or indeed always the same person. Some information services offer only "host access counts" and, as we have seen, a host may serve up to hundreds of users.
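
The gap between accesses, addresses and hosts can be made concrete with a small sketch. It assumes, purely for illustration, that a service records a `user@host` address for every access ; the sample addresses and the function names `split_address` and `access_summary` are hypothetical. None of the three figures it returns is a count of persons, which is exactly the difficulty described above.

```python
# Hypothetical access records, one "user@host" address per access.
ADDRESSES = [
    "alice@lab.example.edu",
    "bob@lab.example.edu",
    "alice@LAB.example.edu",      # same address, host written in another case
    "library@site.example.org",   # a shared address, possibly many readers
]


def split_address(address):
    """Split 'user@host' into (user, host), lowercasing the host part."""
    user, _, host = address.rpartition("@")
    return user, host.lower()


def access_summary(addresses):
    """Return raw accesses, distinct addresses and distinct hosts."""
    parsed = [split_address(a) for a in addresses]
    return {
        "accesses": len(parsed),
        "distinct_addresses": len(set(parsed)),
        "distinct_hosts": len({host for _user, host in parsed}),
    }


if __name__ == "__main__":
    print(access_summary(ADDRESSES))
```

A service that reports only host access counts would see two sources here, while the address level shows three ; how many people stand behind either figure cannot be read from the log.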

Even after this problem is solved, we will have to decide which counts can be used as indicators (and "of what?"). Publication and citation counts are used as indicators because of our understanding of a set of practices of scientific communities. The changes brought about by the network are profound and must be integrated into a new model of the behavior of scientific communities. This should be our first task - the foundation on which we will be able to build means of observing the flow and circulation of academic information.

One problem we will not have is that of gathering data. All this activity is taking place inside electronic networks, and computers produce the data automatically. But what do the data mean ? How do we make sense of them in the absence of any knowledge about the social practices underlying them ? Most sociological methods of gathering information presuppose a site, in our case the laboratory. Our collaboratory is, by definition, a geographically distributed "site" (somewhere in cyberspace, in Internet parlance). New methods of gathering and validating information about the people behind the machines must be devised.

In its present state, the flow of information on the network (academic or otherwise) is unstructured, even chaotic. One of the most frequently voiced causes of dissatisfaction is the glut of irrelevant information ("info-junk"). Our results can be of invaluable help in structuring and filtering this information flow. Our goals must not be limited to inert observation but must instead be expressed in useful tools that will help users/producers to make sense of the information universe they work in. This is the sense of CERESI's work on the "intelligent mediator".


          

Conclusion

Scientometrics now has a whole new field of study, one that offers us the possibility of integrating different aspects of the socio-cognitive activity of science and its relationship with technology transfer. We must integrate the reality of the electronic network into our model of information flow among scientists, detect the links that consolidate collaboratories, devise means of understanding the practices of these global scientific communities, establish relevant indicators for the socio-cognitive activity taking place on the Net and, finally, embody our results in information flow management tools aimed at the new scientific communities.


          

References

[1]
Danzig P. B., Obraczka K., Li S-H, Internet Resource Discovery Services, University of Southern California, 1993. Retrievable as Katia_Obraczka_paper.ps by anonymous ftp on info.cern.ch at /pub/www/doc.

[2]
Callon M., "Teething trouble or premature senility ? Scientometrics is dying, long live to scientometrics" (forthcoming).

[3]
Latour B., La science en action, Éditions La Découverte, 1989.

[4]
Turner W.A., "What's in an R : InfoRmetrics or Infometrics?" (forthcoming).

[5]
Harnad S., "Post-Gutenberg Galaxy : the Fourth Revolution in the Means of Production of Knowledge". Public-Access Computer Systems Review, 2(1), 39-53, 1991. Retrievable as harnad91.postgutemberg by anonymous ftp on princeton.edu at /pub/harnad/Harnad.

[6]
PSYCOLOQUY (ISSN 1055 - 0143) subscription address :
listserv@pucc.bitnet
sub psyc <firstname> <lastname>.

[7]
Kovacs Diane K., Directory of Scholarly Electronic Conferences, The Directory Team, Kent State University. Retrievable as ACADLIST.FILE[1-8] by anonymous ftp on KSUVXA.KENT.EDU at /library.

[8]
Berners-Lee T., Cailliau R., Groff J.-F., Pollermann B., World-Wide-Web : the Information Universe. Electronic Networking : Research, Application and Policy, 2(1), 52-58, Spring 1992, Meckler, USA. Retrievable as ENRAP_9202.ps by anonymous ftp on info.cern.ch at /pub/www/doc.

[9]
Harnad S., "Interactive Publication : Extending the American Physical Society's Discipline Specific Model for Electronic Publishing". Serials Review, Special Issue on Economics Models for Electronic Publishing, 58-61, 1992. Retrievable as harnad92.interactivpub by anonymous ftp on princeton.edu at /pub/harnad/Harnad.
See also : "Implementing peer review on the net : Scientific quality control in scholarly electronic journals". In : Peek R. and Newby G. (eds), Electronic Publishing Confronts Academia. Cambridge MA : MIT Press. On the Internet :


© "Les sciences de l'information : bibliométrie, scientométrie, infométrie". In Solaris, nº 2, Presses Universitaires de Rennes, 1995