The First China-United States Library Conference

Standards for the Global Information Infrastructure (GII) - A Review of Recent Developments, Ongoing Efforts, Future Directions and Issues

Mark H. Needleman

University of California
Office of the President
Kaiser Center Room 854
300 Lakeside Drive
Oakland, CA 94612-3550
USA
(510) 987-0530
(510) 839-3573 (fax)
email: Mark.Needleman@ucop.edu

Introduction

This paper provides a review of activities in the area of standards and standardization that are of importance to librarians and information professionals. It is not intended to be comprehensive coverage of all such standards and standardization activities, but rather to focus on some recent developments and activities in the area of electronic and network-based information. It will cover work going on in the traditional library and information communities as well as activities occurring in the Internet community that have implications for the library and information world.

Attention will be focused both on the standards themselves and on implementation efforts that are taking those standards, building real-world working applications based on them, and promoting the development of the infrastructure needed to support the world of on-line networked information.

Space constraints preclude, however, any detailed technical description of how these protocols operate. Pointers to more information on them can be found in both the references and bibliography at the end of the paper.

Finally, the paper will conclude with a discussion of some other infrastructure issues, beyond standards themselves, that are important building blocks for the development of a world of electronic and networked information. The context for that discussion is a set of lessons learned from a research project at the University of California that made a large amount of full-text electronic content available to its users. While there is an attempt in this paper to cover international developments, much of the paper covers work going on in the United States, simply because that is where the bulk of the author's own knowledge and involvement has been.

Z39.50 Developments

One of the most important areas in which development activities have been occurring is the Z39.50 [1] protocol. Z39.50 is a client-server protocol that defines capabilities for information retrieval systems to communicate with each other to search and retrieve information. The protocol, originally approved in 1992, is a US national standard developed and maintained by the National Information Standards Organization (NISO) [2]. NISO is the American National Standards Institute (ANSI) accredited standards developing body in the United States that produces standards in the areas of libraries, information service providers, and publishers.

The NISO voting members approved a new version of Z39.50 in 1995. This version significantly extends the functionality of the original protocol. Among the new capabilities are sorting; scanning; a segmentation facility that allows the transfer of larger data objects than was practically possible under the 1992 version; and an Extended Services Facility that allows services to be initiated within a Z39.50 session but executed outside of it. This makes it possible to use the Z39.50 protocol to request standard, defined services that are associated with the search and retrieval process but not explicitly part of that activity.

Among the Extended Services defined in the 1995 document are services for requesting printing or other delivery of records from result sets, initiation and maintenance of periodic queries, initiation of document delivery requests, and database update. Additional new functionality in the 1995 version includes an Explain Facility that allows a client to use the Z39.50 protocol itself to dynamically discover characteristics of a particular server; support for concurrent operations, allowing requests to be multiplexed over a single connection; better support for international character sets; and new record syntaxes, including the generic record syntax GRS-1, which provides a very general facility for extremely precise requests, including the ability to retrieve specific portions of items, variant forms of the same item, and meta-data about an item.

Because Z39.50 is intended to be used in a wide variety of application domains, it does not directly define or constrain either the attributes used to search data or the format of the data returned on retrieval. The standard defines and registers a basic attribute set known as Bib-1, intended for searching basic bibliographic data, and most of the major flavors of MARC are registered for use with Z39.50. This is an area in which a great deal of work has been going on in the last couple of years. An attribute set known as STAS, the Scientific and Technical Attribute Set, has been defined and registered for use with scientific data and has been implemented in several systems. Other attribute sets have also been defined or are under development.

The intent behind Z39.50 is that the various communities that make use of the protocol will define sets of rules for how Z39.50 is to be used in their application domains. These sets of rules can define such things as which Z39.50 features must be supported, which attribute sets and attribute combinations are to be used, the format of the data to be returned on retrieval, and various other aspects of the Z39.50 protocol, as well as things that are beyond the scope of Z39.50 but may be important for particular types of data or applications. These sets of rules are known as profiles or implementors agreements.
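To make the attribute machinery above concrete, the sketch below represents a Z39.50 Type-1 (RPN) query as plain Python data structures. This is not a real Z39.50 toolkit, and the dictionary layout is invented purely for illustration; only the Bib-1 values used in the comments (attribute type 1 = Use; Use value 4 = title; Use value 1003 = author) are drawn from the registered attribute set.

```python
# Illustrative sketch only: a Type-1 (RPN) query built from
# attributes-plus-term operands, the way Z39.50 combines search attributes.

def attr(attr_type, value):
    """One attribute pair as it would appear in a Type-1 query."""
    return {"attributeType": attr_type, "attributeValue": value}

def term(text, attributes):
    """An attributes-plus-term operand."""
    return {"term": text, "attributes": attributes}

def and_node(left, right):
    """Boolean AND of two operands, as in an RPN query tree."""
    return {"op": "and", "left": left, "right": right}

# "title = networks AND author = needleman", roughly the combination a
# profile such as ATS-1 might prescribe:
query = and_node(
    term("networks",  [attr(1, 4)]),      # Bib-1 Use attribute 4: title
    term("needleman", [attr(1, 1003)]),   # Bib-1 Use attribute 1003: author
)

def describe(node):
    """Flatten the query tree into a readable string."""
    if "op" in node:
        return f'({describe(node["left"])} {node["op"].upper()} {describe(node["right"])})'
    return node["term"]

print(describe(query))  # (networks AND needleman)
```

A profile, in these terms, is an agreement about which attribute combinations a server must understand; a client emitting only profile-sanctioned combinations can expect interoperable behavior.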

Several profiles for the use of Z39.50 have already been defined and others are under development at this time. There is a profile called ATS-1 (Author, Title, Subject) that defines a basic use of Z39.50 and the Bib-1 attribute set to provide access to bibliographic data for such things as on-line library systems.

GILS [3], the Government Information Locator Service, is a profile that defines the use of Z39.50 for providing access to government data. GILS is a US federal government standard and is being examined by other governments, both in the US and internationally. There is work going on to define the use of Z39.50 for access to geo-spatial data, and researchers are working to add natural language support to Z39.50.

Work is also going on to define the use of Z39.50 for accessing digital collections. In the museum community, an organization called the Consortium for the Computer Interchange of Museum Information (CIMI) [4] has a project known as CHIO (Cultural Heritage Information On-line), which is building systems to demonstrate on-line access to museum information. As part of that project, a profile for the use of Z39.50 to provide access to museum information is being developed.

There is an international version of Z39.50 known as International Organization for Standardization (ISO) 10162/10163 [5]. The US standard is a compatible superset of the ISO version, and many of the enhancements in the US version have been proposed for inclusion in it. Additionally, because of the desire to ensure future compatibility and international interoperability, there is now a proposal within the ISO committee responsible for 10162/10163 to replace the ISO version with Z39.50. It is possible that this will be voted on in late 1996 or early 1997.

Finally, while officially Z39.50 is defined by NISO, the actual technical development of the 1995 standard was done by a voluntary committee known as the Z39.50 Implementors Group (ZIG) [6]. The ZIG is a group of people representing organizations that are actually developing implementations of the Z39.50 protocol, including most of the major library automation vendors, major library utilities like OCLC and RLG, major universities and library consortia, and major on-line database vendors. The ZIG typically meets two to three times a year to work on enhancements to the protocol and on interoperability issues. In recent years there has been increasing international participation, both in those meetings and on the on-line mailing list over which most of the discussions occur. Because of the increasing international use of Z39.50, the ZIG has committed to holding one of its meetings each year in Europe.

Z39.56 Serial Item/Contribution Identifier Standard

Another standard that will play an important role in the world of electronic networked information is NISO Z39.56 [7], the Serial Item and Contribution Identifier (SICI). Z39.56 defines data elements and a structure for a standardized code to identify serial items (issues of journals) and contributions (articles) within them. The original version of the standard was approved by NISO in 1991. Normally NISO standards come up for review only every five years, but in 1994, due to the growth of on-line indexing and abstracting databases and document delivery services, NISO decided to form a committee to revise the standard to make it more usable in an on-line electronic environment.

A new version of Z39.56 was developed and, as of this writing, is in the ballot process. The new version incorporates several new mechanisms: a Code Structure Identifier (CSI) to define the type of SICI being dealt with; a Media Format Identifier (MFI) to designate the type of media the SICI refers to in cases where serials may be published in multiple formats; and a Derivative Part Identifier (DPI), which facilitates references to portions of a serial or contribution, such as the table of contents of an issue or the abstract of an article. The DPI is seen as being especially useful in applications such as on-line document delivery. Other changes include standardizing the punctuation used internally in SICI codes, lengthening the title code, and simplifying the rules for its construction. It should be noted that the SICI code was added as a search attribute to the basic bibliographic attribute set Bib-1 in Z39.50. At present there is not, as far as the author is aware, a corresponding international version of Z39.56, which of course does not prevent the US version from being used internationally.
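The overall shape of a SICI can be sketched as a small parser. Both the sample code string and the assumed field ordering in the control segment (CSI, then DPI, then MFI, then the standard version and a check character) are the author's reconstruction for illustration; consult Z39.56 itself for the authoritative grammar.

```python
import re

# Illustrative only: split a SICI code into its major segments.
# Assumed layout: ISSN (chronology) enumeration <contribution> CSI.DPI.MFI;version-check

SICI_RE = re.compile(
    r"^(?P<issn>\d{4}-\d{3}[\dX])"      # ISSN of the serial
    r"\((?P<chronology>[^)]*)\)"        # chronology (cover date)
    r"(?P<enumeration>[^<]*)"           # enumeration (volume:issue)
    r"(?:<(?P<contribution>[^>]*)>)?"   # contribution segment (article)
    r"(?P<csi>\d)\.(?P<dpi>\d)"         # Code Structure / Derivative Part IDs
    r"\.(?P<mfi>[A-Z]{2})"              # Media Format Identifier
    r";(?P<version>\d)-(?P<check>.)$"   # standard version and check character
)

def parse_sici(code):
    """Return the named segments of a SICI, or raise if it does not match."""
    m = SICI_RE.match(code)
    if not m:
        raise ValueError(f"not a well-formed SICI: {code!r}")
    return m.groupdict()

# A hypothetical contribution-level SICI:
fields = parse_sici("0095-4403(199502/03)21:3<12:WATIIB>2.0.TX;2-J")
print(fields["issn"], fields["contribution"], fields["mfi"])
```

The point of the structure is that a single short string carries the serial, the issue, and the article, which is what makes the SICI usable as a key in document delivery and check-in systems.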

The Interlibrary Loan (ILL) Protocol and Developments

The Interlibrary Loan (ILL) Protocol (ISO 10160/10161) [8] was approved by ISO in 1991. It was developed to permit the exchange of ILL messages between ILL systems that run on different hardware and software. Its goals are to overcome barriers to ILL communications by providing a standardized message set and format; to facilitate ILL automation, providing the foundation for automated requesting and supplying of material and for tracking requests; and to support resource sharing. It supports multiple models of ILL networking and defines a full set of services representing all stages of an ILL transaction. It supports both an electronic mail mode of operation using EDIFACT and a direct-connection model.

Canada, especially the National Library of Canada, has been heavily involved in both the development and implementation of the ILL protocol, and there are other implementations in Europe as well. Due in part to the centralized nature of ILL through the large utilities, implementation of the ILL protocol in the United States has until recently been slow. However, in 1993 the Association of Research Libraries formed the North American Interlibrary Loan and Document Delivery (NAILDD) Project to promote developments that will improve the delivery of library materials to users at costs that are sustainable for libraries. One of the objectives of the NAILDD project is to promote standards and automation efforts that will improve the efficiency of the ILL process. In the fall of 1995 NAILDD formed an ILL Protocol Implementors Group (IPIG) to promote the use of the ILL protocol in the United States and the development of the infrastructure required to support it. Using a model similar to the Z39.50 Implementors Group (ZIG) that promoted the development of Z39.50, the IPIG has assembled a group of library vendors and other organizations who have committed to building real working versions of the ISO ILL protocol within a defined time frame, and to setting up a testbed for interoperability testing among those implementations. Phase One calls for the participants to implement the ILL Request and Status/Error messages, using the ISO Basic Encoding Rules in a direct-connection mode on top of the TCP/IP transport protocols, by the summer of 1996. Later phases of the project will add additional ILL messages based on experience gained in the interoperability testing of this first phase, and will create additional testbeds.
Since many of the IPIG participants also have Z39.50 implementations, the direct-connect model using BER and TCP/IP was chosen to capitalize on investments those organizations had already made in developing their Z39.50 implementations.
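The Basic Encoding Rules mentioned above frame every protocol message as nested tag-length-value triples. The minimal sketch below shows only that framing idea in Python; real ILL APDUs are defined in ASN.1 and carry many nested fields, none of which are modeled here.

```python
# A minimal sketch of BER definite-length tag-length-value (TLV) encoding,
# the wire framing the IPIG chose for carrying ILL messages over TCP/IP.

def ber_length(n):
    """Encode a length in BER definite form (short or long form)."""
    if n < 0x80:
        return bytes([n])                     # short form: one byte
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body   # long form: count byte + length bytes

def ber_tlv(tag, value):
    """One tag-length-value triple: identifier octet, length octets, contents."""
    return bytes([tag]) + ber_length(len(value)) + value

# An OCTET STRING (universal tag 0x04) holding an arbitrary payload:
encoded = ber_tlv(0x04, b"ILL-Request")
print(encoded.hex())  # 040b494c4c2d52657175657374
```

Because every element announces its own length, a receiver can skip fields it does not understand, which is part of why BER suits extensible protocol messages.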

Character Set Standards

There are many different character set standards: some de facto, some national, and some international in scope. Many of them define single character sets or provide rules for mapping one character set to another, such as standards for the romanization of non-Roman scripts. Due to their global scope, however, two standards deserve mention here: ISO 10646 [9] and Unicode [10]. ISO 10646, the Universal Multiple-Octet Coded Character Set (UCS), was adopted by ISO in 1993. It is the first officially standardized coded character set whose eventual purpose is to include all characters used in all written languages in the world (including all mathematical and other symbols). The first edition covers all major and commercially important languages. Both a 2-octet and a 4-octet form of the coding space are defined, and the first 128 positions in the 2-octet coding space are used for the basic ASCII character set. In practice, currently only the 2-octet form is in use.

Unicode is a coded character set specified by a consortium, primarily of major computer and software manufacturers, whose goal was to overcome the chaos of different coded character sets when creating multilingual programs and internationalizing software. As of version 1.1, Unicode has been aligned with ISO 10646, and the intent is to keep it strictly compatible with the international standard; the Unicode consortium is a contributor to the ISO work to further develop 10646. Unicode can be characterized as an implementation of the 2-octet form of the UCS that includes such things as spacing diacritics and other combining characters, and that defines a more precise specification of the bi-directional behavior of characters in scripts such as Arabic and Hebrew. Version 2.0 of Unicode will extend its functionality to make use of the wider 4-octet coding space.
A number of semantics traditionally thought to be associated with character sets, most notably sorting or collation order, are explicitly excluded from the definition of Unicode/10646. Moreover, in a universal character set that unifies different languages' use of a single script, the sort order for a language cannot be determined simply from the order of elements in the script.
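The 2-octet coding space described above can be seen directly in any modern Python, which uses Unicode internally. The snippet below shows that ASCII characters keep their familiar values in the low octet while other scripts occupy higher positions in the same uniform 16-bit space.

```python
# Each character has a single code point; the big-endian UTF-16 encoding of
# characters in the Basic Multilingual Plane is exactly the 2-octet UCS form.
for ch in "A", "é", "中":
    code_point = ord(ch)
    two_octets = ch.encode("utf-16-be")   # big-endian 2-octet form, no BOM
    print(f"{ch!r}: U+{code_point:04X} -> {two_octets.hex()}")

# 'A': U+0041 -> 0041   (same value as in ASCII)
# 'é': U+00E9 -> 00e9
# '中': U+4E2D -> 4e2d
```

Note that the encoding says nothing about collation: whether "é" sorts before or after "z" is a language-level question, exactly the kind of semantics Unicode/10646 leaves out.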

While there has not yet been wide-scale deployment or implementation of either ISO 10646 or Unicode, this is expected to change over the next few years, since, at least for Unicode, there has been a commitment to it by major computer and software manufacturers. This implementation and deployment is extremely important: it addresses the long-standing inability of systems to deal with anything but relatively limited character sets, and it will provide the basis for resolving that situation in an internationally standardized way. Unicode and ISO 10646 do not, by themselves, address all of the requirements for fully supporting the representation and processing of the world's languages. Other standards must be developed, and existing standards must be extended, to make use of the functionality Unicode and 10646 provide. However, they do provide the basic foundation on which to start the task of constructing computer systems and software capable of working with many, and perhaps someday all, of the world's languages.

Text Formatting Standards

There are many text formatting standards in existence. Many of them are proprietary to vendors of products such as word processing software but have become de facto standards through widespread use. Two standards, however, are worthy of discussion here: SGML [11] and PDF [12].

SGML (Standard Generalized Markup Language) is an ISO standard (ISO 8879) for electronic document exchange, archival, and processing. SGML does not impose any specific structure on documents; rather, it is a language for writing formal definitions of document structures. Document structures known as DTDs (Document Type Definitions) are defined for particular categories of documents, and a DTD also defines the allowable tags that can be used within a particular document type. SGML software, by being configured to understand a particular DTD, can thus process all documents that have been encoded to conform to that DTD. It should be stressed that SGML DTDs, unlike many other text formatting systems, are not intended to describe the physical layout of a document, but rather its logical structure, hierarchy, and semantics. This is an extremely powerful concept: depending on the richness of the DTD, it enables sophisticated searching and navigation within documents, and allows such things as automatic indexing of fields within documents, depending again on the nature of the tagging the DTD provides. Since physical layout is not the purpose of the DTD, this knowledge of the logical structure allows the physical rendering of a document to depend on the output device, so that the same document encoded in SGML could be rendered in different ways depending, for example, on whether the output device is a computer screen or a printer. It should be noted that HTML [13] (Hypertext Markup Language), the language used to define Web pages, is nothing more than a very simple SGML DTD, and was in fact expressly designed to conform to SGML rules and to use SGML capabilities in its definition.
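Since HTML is itself an SGML DTD, it can illustrate the point above: the tags describe logical structure (a heading, a paragraph), not physical layout, so software can extract or re-render that structure independently. The sketch below uses Python's standard-library HTML parser on an invented fragment to pull out a document's logical outline.

```python
from html.parser import HTMLParser

class OutlineBuilder(HTMLParser):
    """Collect the logical outline (heading text) of an HTML document."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        # Only h1/h2 content counts as outline; any other tag ends a heading.
        self._in_heading = tag in ("h1", "h2")

    def handle_data(self, data):
        if self._in_heading and data.strip():
            self.outline.append(data.strip())
            self._in_heading = False

doc = "<h1>Standards</h1><p>Intro text.</p><h2>Z39.50</h2><p>Details.</p>"
builder = OutlineBuilder()
builder.feed(doc)
print(builder.outline)  # ['Standards', 'Z39.50']
```

The same tagged document could just as easily drive an index builder or a screen renderer, which is precisely the separation of logical structure from physical layout that SGML is designed around.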

There are some interesting activities going on involving SGML. Many major publishers have converted or are in the process of converting their production processes to make use of it. Because of the capabilities discussed above, this will allow them to have a single unified input source, yet create different outputs tailored to different media.

Project CHIO, mentioned above, is defining SGML DTDs to encode museum information, which will then be accessed using Z39.50.

There is a project being led by the University of California at Berkeley, but involving several other universities, called the Encoded Archival Description (EAD) project, that is developing a DTD to encode the finding aids that often accompany special collections in libraries and archives.

ISO 12083, Electronic Manuscript Preparation and Markup [14], is an international standard that defines four DTDs: for books, serials, articles, and mathematics. ISO 12083 has also been adopted as a US national standard. The TEI (Text Encoding Initiative) has defined DTDs to facilitate the encoding and exchange of machine-readable texts intended for literary, linguistic, historical, or other textual research. Much other work involving SGML is going on as well.

Another potentially important text formatting standard is PDF, the Portable Document Format. Unlike SGML, which is an openly available international standard, PDF is a proprietary standard of Adobe Systems and a follow-on to their PostScript printer language. And unlike SGML, which is concerned with logical structure, PDF defines the physical format and page layout of a document, although it does have searching, navigational, and other utility features. PDF gives electronic documents many of the qualities of paper. PDF is potentially important for several reasons: while it is a proprietary format, Adobe makes a reader freely available and provides software that can convert many popular word processing formats to PDF, as well as software to convert PostScript documents. Adobe also has software that can convert scanned paper documents to PDF. In addition, PDF seems to be an increasingly popular format for making documents available on the Web, and the Netscape browser, through its new plug-in technology, can now display PDF directly within the browser window.

Both SGML and PDF are potentially important technologies, and both may have a role to play in making on-line networked information available. SGML, because of its logical structures and search capabilities, may be of use for access and for long-term archival storage. PDF may become a de facto standard for on-line display because it preserves much of the familiar and comfortable print metaphor.

Electronic Data Interchange

Electronic Data Interchange (EDI) [15] is the exchange of commercially oriented information in standard electronic formats between automated systems. In commerce, industry, and government, EDI is used to replace paper purchase orders, invoices, price lists, shipping and customs documents, and other business documents. In the library and publishing arena, two organizations are working on developing EDI transaction sets. SISAC, the Serials Industry Systems Advisory Committee, has concentrated on EDI transactions for the serials industry and has developed, or is in the process of developing, transactions for serials orders, order acknowledgments, claims, cancellations, and invoices. SISAC also developed a machine-readable bar code representation of the 1991 version of the Z39.56 SICI code; major publishers are printing it on the covers of serials publications, and major library systems vendors have developed interfaces for those bar codes for use in automated check-in systems. BISAC, the Book Industry Systems Advisory Committee, plays a similar role for the book industry and has developed EDI transaction sets such as purchase order, purchase order acknowledgment, invoice, title status format, advanced ship notice, and ship note/invoice. Currently, both SISAC and BISAC EDI transactions use the US national standard EDI format, X12, but both plan to migrate their X12 implementations to the international standard EDIFACT format. The use of EDI, while perhaps not as glamorous as some of the other standards discussed above, is still important in that it allows libraries to operate more efficiently and thus provide better services to their users.
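The X12 format mentioned above is a stream of segments, conventionally terminated by "~", whose data elements are separated by "*". The fragment and parser below are a toy illustration of that framing only; the segment contents are invented and are not a real SISAC or BISAC transaction set.

```python
def parse_x12(data, seg_term="~", elem_sep="*"):
    """Split an X12-style stream into segments, each a list of elements."""
    segments = []
    for raw in data.split(seg_term):
        raw = raw.strip()
        if raw:  # skip empty trailing fragment after the last terminator
            segments.append(raw.split(elem_sep))
    return segments

# A hypothetical, greatly simplified purchase-order-like fragment:
stream = "ST*850*0001~PO1*1*5*EA*12.50~SE*4*0001~"
for seg in parse_x12(stream):
    print(seg[0], seg[1:])
# ST ['850', '0001']
# PO1 ['1', '5', 'EA', '12.50']
# SE ['4', '0001']
```

The point of the fixed framing is that a vendor's and a library's systems can exchange orders and invoices without either side knowing anything about the other's internal record formats.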

Internet Community Standards

There is a great deal of work going on in the Internet community on protocols and standards that has had, and will continue to have, a profound impact on the global information infrastructure, and it is worth focusing some attention on that work here. Internet standards are developed by the Internet Engineering Task Force (IETF) [16], which provides a forum for working groups to coordinate the technical development of new protocols and standards. The IETF mission includes identifying and proposing solutions to pressing operational problems in the Internet, specifying the development or usage of protocols and the near-term architecture to solve technical problems for the Internet, and providing a forum for the exchange of relevant information within the Internet community between vendors, users, researchers, government agencies, and network managers. The IETF is organized into several areas: Applications, Internet, IP Next Generation, Network Management, Operational Requirements, Routing, Security, Transport, and User Services. Working groups are formed within areas to solve specific identified problems or needs; because of the nature of some of those needs, some working groups are co-sponsored by more than one area. The IETF meets three times a year, but most of the work of the working groups is done by electronic mail on the publicly open mailing lists the groups create. The IETF is, for the most part, consensus driven: for something to become a standard it must represent not only the consensus of those attending the meetings but also of those contributing to the electronic discussions. In addition, standards go through several stages, and for a standard to reach full standard status there must be at least two independently developed, demonstrably interoperable implementations of it. Standards in the IETF are, for historical reasons, known as Requests for Comments (RFCs).

Activities going on in the IETF of especial importance to librarians and information professionals include the work of the HTML and HTTP working groups. HTML, as mentioned above, is the SGML-compliant DTD for Web page definition, and HTTP is the transport protocol the Web uses to move pages and other content between the server and the browser. While both of these protocols had their origins outside the IETF (this happens fairly often; one of the things the IETF does is take technologies from outside and standardize them), there has been a great deal of work in the IETF on enhancing and standardizing both HTML and HTTP. An IETF working group developed WHOIS++, a protocol for indexing and accessing directories of information; one component of WHOIS++ is an indexing protocol. There is currently an IETF working group trying to develop a common indexing protocol that can be used by the variety of protocols and applications in the Internet that have indexing requirements. IETF working groups developed MIME, the Multipurpose Internet Mail Extensions, which extend simple Internet mail to handle complex document types and multiple formats. IETF working groups are currently working on definitions for encapsulating EDI objects and SGML documents in mail using the MIME capabilities.
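MIME's multi-format capability can be shown in a few lines using Python's standard library: a single message carries both a plain-text and an HTML rendering of the same content, something the base RFC 822 mail format alone could not express. The addresses below are placeholders.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# A multipart/alternative message: the receiving mail reader picks the
# richest part it can display.
msg = MIMEMultipart("alternative")
msg["Subject"] = "Standards update"
msg["From"] = "sender@example.org"    # placeholder addresses
msg["To"] = "reader@example.org"

msg.attach(MIMEText("Plain-text version of the announcement.", "plain"))
msg.attach(MIMEText("<p>HTML version of the announcement.</p>", "html"))

print(msg.get_content_type())  # multipart/alternative
for part in msg.get_payload():
    print(part.get_content_type())
# text/plain
# text/html
```

The encapsulation work mentioned above amounts to registering further part types, so that an EDI transaction or an SGML document can ride inside ordinary Internet mail the same way these text parts do.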

In the Internet area, a working group recently released a draft document for a Service Location Protocol that addresses how to locate various services in a distributed internetworked environment. In the Security area, working groups are addressing Common Authentication Technology; Web transport security; Privacy Enhanced Mail (PEM), which adds privacy, security, and authentication to Internet mail; and the integration of PEM with the MIME standards.

Among the work taking place in the IETF with the greatest implications for librarians and information professionals are the Uniform Resource Identifier (URI) activities. An IETF working group standardized and built on the Uniform Resource Locator (URL) mechanism that had originally been defined in the Web, adding definitions for new objects and protocols to the URL specification. There has also been work to define Uniform Resource Names (URNs): persistent, unchanging names for Internet resources that might exist in multiple forms or copies, or move around the Internet. URNs are envisioned as a hierarchical, distributed naming structure that will replace URLs in Web pages (and in the other places URLs are currently used). Protocols will be developed and infrastructure deployed to allow resolution of a URN to whatever series of URLs happens to exist for that resource at any given time, much as domain names and the Domain Name System infrastructure replaced the use of raw IP addresses. This work has unfortunately, due to a variety of circumstances, not progressed as quickly as was initially envisioned, although experimental URN systems have been built and deployed to gain experience with what works best and with the infrastructure issues that must be dealt with to build large-scale production systems. A third area of the URI work has been the effort to develop Uniform Resource Citations (URCs). The URC work has been attempting to define the data elements and structures needed to describe Internet resources, which could include such things as the language, size, cost, format, and availability of a resource. This is another area in which work has progressed more slowly than hoped, although work in this area is also going on in other communities besides the IETF.
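The resolution idea can be reduced to a lookup: a stable name maps, through some resolution service, to whatever locations currently hold the resource. Everything in the sketch below, including the URN and the URLs, is hypothetical; it shows only the indirection, not any real resolver protocol.

```python
# Hypothetical resolver table: a persistent name maps to the locations
# that happen to hold the resource today. The table can change; the URN
# cited in a document does not.
RESOLVER_TABLE = {
    "urn:example:report-1996-04": [
        "http://mirror1.example.org/reports/1996-04.html",
        "ftp://archive.example.org/pub/reports/1996-04.txt",
    ],
}

def resolve(urn):
    """Return the list of URLs currently registered for a URN."""
    try:
        return RESOLVER_TABLE[urn]
    except KeyError:
        raise LookupError(f"no locations registered for {urn}")

for url in resolve("urn:example:report-1996-04"):
    print(url)
```

A citation that records the URN rather than a URL stays valid when the resource moves; only the resolver's table needs updating, just as renumbering a host changes DNS records but not the host name.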

The final IETF activities worth mentioning here are some of the work being done by working groups in the User Services area. Among other things, working groups in this area are developing a guide to help artists use the Internet and create and make content available on it. There is a working group chartered to address issues related to connecting primary and secondary schools worldwide to the Internet; a working group dealing with end-user training, including creating a catalogue of existing training materials, identifying gaps in those materials, and providing users with self-paced learning materials; and a working group developing a guide describing responsible use of the Internet.

Obviously, this is not an exhaustive description of the entire scope of work going on in the IETF, but rather an attempt to highlight those activities of especial interest and importance to librarians and information professionals, and ones that will have the greatest impact on them. Much of the work of the IETF is focused on much more traditional computer and networking protocol development and issues.

Additional Standards

A few additional standards are worth briefly mentioning here. NISO Z39.58-1992, Common Command Language for On-line Interactive Information Retrieval, and its international counterpart ISO 8777, which specify a uniform command terminology for on-line search systems, are worth mentioning if only to note that, while there has been some deployment of these standards, there have not been wide-scale implementations of them. With the shift in emphasis in on-line systems from terminal-based, command-driven interfaces to graphically oriented interfaces, it is likely that these standards will continue to decrease in importance, although they will probably still have some limited role to play for some time to come.

Other standards worth calling out include NISO Z39.53-1995, Codes for the Representation of Languages for Information Exchange, which defines almost 400 three-character alphabetic language codes; there is a corresponding international standard, ISO 639. ISO 3166, Codes for the Representation of Names of Countries, which was recently approved, is the definitive international standard for country code definitions.

In the area of data element definition standards there are NISO Z39.44, Serial Holdings Statements; NISO Z39.57, Holdings Statements for Non-Serial Items; and NISO Z39.63, Interlibrary Loan Data Elements. These are all currently being revised and updated, and three new standards are in development: NISO Z39.69, Record Format for Patron Records; NISO Z39.70, Format for Circulation Transactions; and NISO Z39.71, Holdings Statements for Bibliographic Items.

In the international arena, ISO 8459 defines a directory of bibliographic data elements for various library and information retrieval applications. ISO Draft International Standard (DIS) 690-2 specifies data elements to be included in bibliographic references to electronic documents and also sets out a prescribed order for those elements in the reference. In addition, it establishes conventions for the transcription and presentation of information derived from the source electronic document. In the database world, SQL, Structured Query Language, is an important standard that defines a language for building queries that can be executed against relational databases, and has been adopted by virtually all major relational database vendors.
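SQL's declarative style can be shown in miniature with Python's built-in SQLite binding: the program states what rows it wants, and the database decides how to find them. The holdings schema and the records below are invented purely for illustration.

```python
import sqlite3

# An in-memory relational table of (hypothetical) serial holdings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE holdings (issn TEXT, title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO holdings VALUES (?, ?, ?)",
    [("1234-5678", "Journal of Examples", 1995),
     ("2345-6789", "Example Review", 1996)],
)

# A declarative query: no loops, no access paths, just the condition.
rows = conn.execute(
    "SELECT title FROM holdings WHERE year >= ? ORDER BY title", (1996,)
).fetchall()
print(rows)  # [('Example Review',)]
```

The same SELECT statement runs unchanged against any vendor's conforming database, which is what makes SQL's near-universal adoption by relational database vendors so valuable.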

This is obviously not an exhaustive list of all relevant standards. There are many communities developing standards that are important for librarians and information professionals that space constraints preclude mentioning here, and this is a dynamic and changing arena with new developments constantly occurring.

Infrastructure Issues

Finally, it is worth spending a little time discussing some issues besides standards that are important in relation to building the global information superhighway. While all of the standards discussed above, and others, are crucial if that information superhighway is to exist, other things, such as a supporting infrastructure based on those standards, are equally vital to its success. The context for this discussion will be some lessons about infrastructure issues that came out of the experience the University of California had in a research venture to provide on-line access to the full electronic content of scholarly material. The University of California was one of nine US universities involved in a joint venture with Elsevier Science Publishers, known as the TULIP project, to provide its users with on-line access to the full bitmapped images of approximately 40 journals in the area of materials science. The major goals of the TULIP project were to learn what types of infrastructure were required to support delivery of this type of material to end users, to gain an understanding of what effect having such materials available would have on the scholarly research process, and to begin to develop economic models that made sense for the delivery of this type of electronic content. The project ran from 1992 through the end of 1995. While there is not space in this paper to provide details on the actual implementation (the bibliography at the end of this paper contains pointers to articles that describe in detail both the UC implementation and those of the other participants), the lessons learned from that implementation have important implications for the future of networked information.

The basic lesson learned in that project was that, as important as standards and technologies are in such systems, just as important (or perhaps even more crucial) is having in place the proper infrastructure to support those technologies. This supporting infrastructure covers a wide variety of areas, among them:

1) Storage - Providing on-line access to large amounts of electronic content requires large and ever-increasing amounts of computer disk storage to house the material. The TULIP project, which covered only some 40 journals over 4 years and contained only black and white images, required about 35 gigabytes of disk storage. When one contemplates scaling such a system to hundreds or even thousands of journals, and going beyond black and white images to color and other types of multimedia objects, the storage requirements quickly become immense. (A follow-on project to TULIP at the University of California is already storing over 200 gigabytes of images of journal articles.)

2) Network Bandwidth - Having adequate network bandwidth available is crucial to being able to access networked information resources. This includes adequate bandwidth in the global wide-area Internet, in local area networks within buildings and campuses down to the desktop, and to the end user's home. This need for ever-increasing bandwidth is being driven by new applications and the bandwidth-intensive resources being made available through them. We are already seeing the implications of the incredible growth of the World Wide Web over the last couple of years, and some of the performance problems that have resulted from a lack of adequate bandwidth. Applications such as mobile computing also present implications and opportunities in this area.

3) Equipment Infrastructure - New applications and network resources are driving the need for an adequate equipment infrastructure at an ever-increasing rate. These applications, and the data being made available through them, require increasingly faster computers to support them. Organizations need strategies for managing both the growing need for equipment and the ever-shorter cycles on which that equipment must be replaced with new-generation technology.

4) User Authentication, Encryption, and Electronic Commerce - These new applications and electronic resources will also require new and better authentication infrastructures that do not currently exist on a large scale. Simple application-based password schemes will not work in this environment and will need to be replaced by public/private key based authentication systems that can work across multiple application domains. In addition, data encryption protocols and a supporting infrastructure will be needed to protect both the privacy of users and the authenticity of information resources. Finally, multiple economic models will exist in this environment, and there will be a need for protocols and supporting infrastructure for electronic commerce in support of those models.

5) Printing - The growth of electronic networked information will by no means reduce the need for printing. In fact, due to user behavior patterns and the current state of the art in computer and display technology, there will be an ever-increasing need to print electronic information. One of the major lessons learned from the TULIP project at the University of California was that the university's printing infrastructure was not complete or ubiquitous enough to support the types of printing that users want to do. The infrastructure was not there to allow remote applications to print to all of the different types of printers that users had, to automatically determine what type of printer a user had locally, or to perform whatever type of charging was required by the local campus printing infrastructure.
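The storage and bandwidth pressures described in points 1 and 2 above can be made concrete with a back-of-envelope calculation. The TULIP figures (roughly 35 gigabytes for some 40 journals over four years of black and white images) come from the text; the per-page image size and the link speeds used below are illustrative assumptions, not measured values.

```python
# Back-of-envelope sketch of the storage and bandwidth demands noted in
# points 1 and 2 above. TULIP figures are from the text; the page image
# size and link speeds are illustrative assumptions.

TULIP_GB = 35          # approximate TULIP disk usage, from the text
TULIP_JOURNALS = 40    # approximate TULIP journal count, from the text

gb_per_journal = TULIP_GB / TULIP_JOURNALS  # ~0.875 GB per journal (4 years)

# Scaling the same black and white image collection to 1,000 journals:
projected_gb = gb_per_journal * 1000
print(f"Projected storage for 1,000 journals: {projected_gb:.0f} GB")

# Transfer time for a single 100 KB bitmapped page image (assumed size)
# over two link speeds typical of the era, in kilobits per second:
page_bits = 100 * 1024 * 8
for label, kbps in [("28.8 kbps modem", 28.8), ("10 Mbps Ethernet", 10_000)]:
    seconds = page_bits / (kbps * 1000)
    print(f"{label}: {seconds:.1f} s per page")
```

Even under these conservative assumptions the numbers climb quickly: a modest thousand-journal collection of plain black and white page images approaches a terabyte, before color images or other multimedia objects are considered, and a single page is a noticeable wait over a home modem link.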

These are by no means all of the infrastructure issues, but rather the major ones that came up within the context of the TULIP research project. Some of them are matters of technology and will improve over time as technology advances and supporting infrastructures are put in place. Even in areas of technology, however, new applications and the data types that make use of them will, for the foreseeable future, continue to place constantly increasing demands on whatever technology and infrastructure is in place. Moreover, many infrastructure issues go beyond technology and carry social, economic, legal, and political considerations. Even with all of the work and discussion going on around these issues, there is still a long way to go toward resolving them, and much work remains to determine what models make sense in an electronic environment and how many of our current print-oriented models and metaphors can and should be carried forward.

Conclusion

This paper has attempted to survey some of the major standards and standards developments in the world of electronic and networked information that librarians and information professionals should be aware of. It has also attempted to put those standards and developments into the context of the larger infrastructure issues that must be dealt with as part of building a world of networked information, using experiences gained and lessons learned from one research project to highlight some of those issues. It is not intended to serve as a complete or comprehensive study of those developments, nor does it cover all of the many activities and developments that have occurred or are currently under way. This is a fluid and changing arena, with new developments, projects, research, and deployments constantly taking place. The paper is intended to serve as a guide and road map and, hopefully, to encourage readers to seek out further information in areas of interest to them. Pointers, in both print and electronic format, to the topics covered (as well as some others) can be found in the references and bibliography sections below.

References

1) ANSI/NISO Z39.50-1995: Information Retrieval (Z39.50): Application Service Definition and Protocol Specification, NISO Press 1995, ISSN: 1041-5653. An on-line version is also available from the Library of Congress Web site for Z39.50 (see Bibliography below)

2) All NISO published standards and many draft standards can be obtained from: NISO Press Fulfillment Center, P.O. Box 338, Oxon Hill, MD, USA 20750-0338 (301) 567-9522 Fax: (301) 567-9553 US and Canada Toll Free: 1 800-282-6476

3) Application Profile for the Government Information Locator Service. (The GILS Profile is available from the Library of Congress Web site for Z39.50; see Bibliography below.)

4) Information on CIMI and the CHIO Project can be found on-line at:
http://www.cimi.org

5) ISO TC46/SC4/WG4 10162 Documentation - Search and Retrieve Service Definition

ISO TC46/SC4/WG4 10163 Documentation - Search and Retrieve Protocol Specification

6) The ZIG maintains an electronic mailing list where technical discussions occur and meetings are announced. To join, send electronic mail to:

LISTSERV@NERVM.NERDC.UFL.EDU with the body of the note containing:

Subscribe Z3950IW <Your Name>

7) Z39.56-1991 Serial Item and Contribution Identifier (SICI), NISO Press, ISBN: 1-880124-15-7 (The revised draft is also available from NISO Press as Z39.56-199x)

8) ISO 10160 Information and Documentation - Open Systems Interconnection -Interlibrary Loan Application Service Definition, 1993.

ISO 10161-1 Information and Documentation - Open Systems Interconnection -Interlibrary Loan Application Protocol Specification - Part 1: Protocol Specification, 1993.

9) ISO/IEC International Standard 10646-1:1993(E): Information Technology--Universal Multiple-Octet Coded Character Set (UCS)--Part 1: Architecture and Basic Multilingual Plane. International Organization for Standardization, Geneva, 1993

10) The Unicode Consortium: The Unicode Standard Worldwide Character Encoding, Version 1.0. Volume 1(Architecture, non-ideographic characters) Addison-Wesley, 1991

The Unicode Consortium: The Unicode Standard Worldwide Character Encoding, Version 1.0. Volume 2 (Ideographic characters) Addison-Wesley, 1992

Unicode Technical Report #4: The Unicode Standard 1.1. The Unicode Consortium, 1993

11) ISO/IEC International Standard 8879: Standard Generalized Markup Language (SGML). International Organization for Standardization, Geneva, 1986

12) Portable document format reference manual / Adobe Systems Incorporated ; Tim Bienz and Richard Cohn. Reading, Mass. : Addison-Wesley Pub. Co., c1993.

13) RFC 1866: T. Berners-Lee, D. Connolly, "Hypertext Markup Language - 2.0", 11/03/1995 (Available on-line at the IETF Web page listed below in the Bibliography). Note that there are companion documents, both RFCs and drafts, that define additional HTML features beyond what is in RFC 1866.

14) NISO/ANSI/ISO 12083 Electronic Manuscript Preparation and Markup. ISBN: 1-880124-20-3 (Available from NISO Press)

15) A few reference materials on EDI include:

The EDI handbook : trading in the 1990s / edited by Mike Gifkins & David Hitchcock ; with a foreword by Lord Young of Grafham. London : Blenheim Online, 1988.

Electronic Data Interchange (EDI) Gaithersburg, MD : Computer Systems Laboratory, National Institute of Standards and Technology, 1991. Series title: Federal information processing standards publication ; 161.

What is EDI? : a guide to electronic data interchange / Martin Parfett. 2nd ed. Manchester, England : NCC Blackwell, 1992.

16) A good introduction to the IETF is RFC 1718, The Tao of IETF -- A Guide for New Attendees of the Internet Engineering Task Force, October 1994. (Available on-line at the IETF Web site listed below in the Bibliography. All IETF RFCs and working drafts may be obtained on-line at that location, along with information about the organization of the IETF and announcements of upcoming meetings and proceedings of past ones.)

Bibliography

Printed Materials:

Moen, William: A Guide to the ANSI/NISO Z39.50 Protocol: Information Retrieval in the Information Infrastructure. NISO Press.

Z39.50 Implementation Experiences. NISO Press

The Government Information Locator System (GILS) Expanding Research and Development on the ANSI/NISO Information Retrieval Standard. ISBN: 1-880124-11-4, NISO Press.

From A to Z39.50 : a networking primer / by James J. Michael and Mark Hinnebusch. Westport : Mecklermedia, 1995.

Needleman, Mark, "The Z39.50 Protocol: An Implementor's Perspective." Resource Sharing and Information Networks, Haworth Press, volume 8, number 1, 1992.

Needleman, Mark, "The Z39.50 Information Retrieval Protocol: The Promise and the Myth", paper presented at the Central European Conference and Exhibition for Academic Libraries and Informatics, Vilnius, Lithuania, September 27-29, 1993.

Practical SGML / by Eric van Herwijnen. 2nd ed. Boston : Kluwer Academic Publishers, 1994.

The SGML handbook / Charles F. Goldfarb ; edited by Yuri Rubinsky. Oxford : Clarendon Press ; Oxford ; New York : Oxford University Press, 1990.

Zeeman, Joe, "Interlending in the Emerging Networked Environment: Implications for the ILL Protocol Standard." (Report #8 in the IFLA UDT Series on Data Communications Technologies and Standards for Libraries), ISBN: 0-9694214-8-6 (Available from NISO Press)

Holm, Liv, "Models for Open System Protocol Development." (Report #6 in the IFLA UDT Series on Data Communications Technologies and Standards for Libraries), ISBN: 0-9694214-7-8 (Available from NISO Press)

Special Issue on TULIP Project, edited by Nancy Gusack and Clifford A. Lynch. Library Hi Tech, Issue 52 (13:4), 1995, ISSN 0737-8831.

Gelfand, Julia and Needleman, Mark, "TULIP: Participating In An Experiment of Electronic Journal Access: Administrative and Systems Ensure Success," Paper Presented at the IATUL Annual Meeting, Sheffield University, Sheffield, West Yorkshire, United Kingdom, July 7, 1994.

BISAC X12 Implementation Guidelines for Electronic Data Interchange. ISBN: 0-940016-44-3 (Available from NISO Press)

Machine-Readable Coding Guidelines for the US Book Industry. ISBN: 0-940016-33-8 (Available from NISO Press)

SISAC X12 Implementation Guidelines for Electronic Data Interchange. ISBN: 0-940016-57-5 (Available from NISO Press)

Text Encoding Initiative (TEI) Guidelines (Available from NISO Press)

Selected Web Sites:

NISO - National Information Standards Organization:

http://www.niso.org

ANSI - American National Standards Institute:

http://www.ansi.org

NIST - National Institute for Standards and Technology:

http://www.nist.gov

IETF - Internet Engineering Task Force:

http://www.ietf.cnri.reston.va.us/home.html

ISO - International Organization for Standardization:

http://www.iso.ch (Available in both English and French)

IEC - International Electrotechnical Commission:

http://www.iec.ch

ITU - International Telecommunications Union:

http://www.itu.ch

DISA - Data Interchange Standards Association:

http://www.disa.org

Z39.50 Information and Pointers:

http://lcweb.loc.gov/agency/z3950

Unicode Information:

http://www.stonehand.com/unicode/standard.html

SGML Information:

http://www.sil.org/sgml/sgml.html

EDI Information and Resource Pointers:

http://www.premenos.com/Resources/

BISAC - Book Industry Systems Advisory Committee:

http://www.bookwire.com/bisg/bisac.html

SISAC - Serials Industry Systems Advisory Committee:

http://www.bookwire.com/bisg/sisac.html

-------------------------------------------------------------

Adobe and PostScript are trademarks of Adobe Systems Incorporated, which may be registered in certain jurisdictions.
