This version dated May 24, 2006 reflects
RFC 4452.
(Supersedes earlier draft from February 28, 2004.)
Feedback on this FAQ should be directed to <mailto:infoURI@oclc.org>.
Disclaimer:
This FAQ is provided for information purposes only
and should be treated as a work in progress.
A Uniform Resource Identifier (URI) provides a simple and extensible means for identifying a resource within the World Wide Web global information architecture. Each URI begins with a scheme name that refers to a specification for assigning identifiers within that scheme. As such, the URI syntax is a federated and extensible naming system wherein each scheme's specification may further restrict the syntax and semantics of identifiers using that scheme.
The final specification for INFO is RFC 4452. (The five Internet-Drafts that were published to support the INFO URI scheme application are available here on the INFO website.)
The INFO URI scheme was developed from within the library and publishing communities to expedite the referencing by URIs of information assets that have identifiers in public namespaces but have no representation within the URI allocation.
For various reasons (both cultural and technical) the creation and registration of a new URI scheme or URN namespace to support a given public namespace under the URI allocation may not have been attempted by the authority for that namespace. It is precisely to facilitate the representation of these public namespaces within the URI allocation that the INFO URI scheme was developed.
The motivation behind developing the INFO URI scheme was to allow legacy identification systems to become part of the World Wide Web global information architecture so that the information assets they identify can be referenced by Web-based description technologies such as XLink, RDF or Topic Maps. Note that we are concerned with "information assets", not "digital assets" per se - the information assets may be variously digital, physical or conceptual.
No. The INFO URI scheme does not compete with independent URI registrations but rather cooperates with independent URI registrations by providing a lightweight early URI registration mechanism to support referencing of public information assets ahead of any possible subsequent URI scheme or URN namespace application. Note that in the majority of cases no subsequent URI scheme or URN namespace application will be made by a Namespace Authority as the INFO resource identifier alone will be sufficient in providing an identification service. Only if additional services are required would a Namespace Authority seek to register a URI scheme independently.
No. INFO URIs exist primarily to support the representation of identity and comparison with other INFO URIs, and therefore any Web applications that can recognize a URI within a URI usage context should not break with this additional URI scheme. INFO URIs are conformant to the general syntax requirements of URIs as laid out in RFC 3986 and as such are readily parseable and recognizable as legitimate URI strings. Any dereference capabilities associated with any particular INFO namespace are outside the scope of INFO itself.
The INFO URI scheme was developed by members of the library and publishing communities working together under the auspices of ANSI/NISO.
Representing the library communities are Herbert Van de Sompel of LANL (Los Alamos National Laboratory) Research Library and Stu Weibel of OCLC (Online Computer Library Center), while representing the publishing communities are Tony Hammond of Nature Publishing Group and Eamonn Neylon of Manifest Solutions, a publishing technology consultancy. NISO, the National Information Standards Organization, is a standards developing organization accredited to the International Organization for Standardization (ISO) by the American National Standards Institute (ANSI), the sole US representative to ISO.
The scheme is called INFO as a shorthand for the information asset that it references by means of a URI.
Another reading of INFO is that INFO exists primarily to provide information about an information asset, the information disclosed being confined to identity alone. Richer sets of information, such as authority metadata, would require resolution services which are not directly supported by INFO and hence would require a corresponding independent URI scheme or URN namespace application.
An Internet-Draft for the INFO URI scheme was first announced September 30th, 2003 (-00). Subsequent revisions to this draft are available here and were issued in December 2003 (-01), July 2004 (-02), January 2005 (-03), and August 2005 (-04). RFC 4452 was published in April 2006.
The INFO URI scheme has already been fully embraced by a number of Web description technologies: the OpenURL Framework and the SRU/W information retrieval initiative.
The value proposition of INFO URIs is to facilitate Namespace Authorities in representing identifiers from their namespaces in URI form and so to make those identifiers available for public use in Web-based description technologies independently of any Web infrastructure considerations on behalf of the Namespace Authorities.
As such, INFO URIs are designed to be simple to deploy, both in terms of namespace registration and in the creation and general usage of these identifiers. These simplifications are fully in line with the INFO URI remit of providing a lightweight URI registration mechanism to facilitate the referencing of information assets under the URI allocation.
INFO URIs are not globally dereferenceable, nor is there any implication that they may be, although individual Namespace Authorities may register dereference methods on a per-namespace basis. The prime purpose of INFO is the disclosure of the identity of an information asset from a public namespace within the World Wide Web global information architecture.
A non-globally dereferenceable URI is one that supports no global resolution method, although per-namespace methods may exist as declared by the relevant Namespace Authorities.
INFO is focused exclusively on supporting identity. This focus leads to the following considerations:
The real need for presenting legacy identifiers in URI form is that many Web-based description technologies (e.g. XLink, RDF, Topic Maps, OpenURL, SRU/W) recognize URIs as the only form of globally unique identifier.
Besides this, the URI naming architecture offers to identifiers a common base syntax, a common base semantics, and a common nomenclature for talking about identifier constructs (e.g. URIs reference "resources" while generally identifiers for information assets will reference diverse kinds of things - objects, records, etc.). The URI thus provides a uniform (and unifying) naming architecture for identifiers.
The examples included in this FAQ may give some insight into the desirability of rendering identifiers as URIs: compare the identifiers in their native form with the same identifiers rendered in URI form.
No. Namespaces used under INFO must be registered in the INFO Registry.
A Namespace Authority is entitled to register a recognized namespace after suitable review.
A Namespace Authority is the body that owns and manages a public namespace.
The namespaces eligible for registration under INFO will typically be those of interest to the publishing, library and media communities. These communities necessarily have a very wide purview. Candidate namespaces will be those that are for public use only and that are not part of the URI allocation. Non-public namespaces are not eligible for registration.
For all registered namespaces the reader is advised to consult the INFO Registry. The following registered namespaces are typical:
No. On the contrary, by occupying its own toplevel URI scheme, INFO does not restrict a Namespace Authority from proceeding with registration under any portion of the URI allocation either as a toplevel URI scheme or as a URN namespace. In particular, an independent INFO URI scheme does not interfere with any of the existing URI namespaces which have their own defined semantics.
Some general considerations regarding URIs are:
HTTP URIs (RFC 2616) are Internet protocol elements for referencing hypertext documents which can be retrieved from a network authority using the HTTP transfer protocol. There is a common expectation that HTTP URIs can be dereferenced.
The following considerations hold in respect of HTTP URIs:
URN URIs (RFC 2141) are Internet protocol elements for referencing resources using persistent and location-independent identifiers, representations of which may be retrieved using various resolution mechanisms. There is a common expectation that URN URIs can be dereferenced, once suitable resolution mechanisms are defined (e.g. DDDS or other proprietary mechanisms). Indeed, RFC 1737 goes so far as to make a strong recommendation that "there be a mapping between the names generated by each naming authority and URLs".
Use of URN URIs requires a URN namespace registration. An informal URN namespace is of limited utility because its numerical nature obliterates any branding or name recognition and effectively renders the namespace anonymous. A formal URN namespace, on the other hand, would require a more substantial review than a corresponding registration under the INFO Registry. Based on experience with the initial INFO namespace target group, it is unlikely that many Namespace Authorities will proceed with independent applications as the burden of registering a URN namespace is high, especially in the case of organizations that are not strongly steeped in technology.
One particular impediment in applying for a URN namespace for INFO is that this would compromise any possible future URN namespace registration that a Namespace Authority might seek to make in respect of considerations of persistence, location independence and/or dereference to resource representations.
The following considerations hold in respect of URN URIs:
Two other generic URI schemes that might be considered are DATA URIs and TAG URIs.
DATA URIs (RFC 2397) are Internet protocol elements for the referencing of inline resources, i.e. the reference is an immediate address.
The following considerations hold in respect of DATA URIs:
TAG URIs (RFC 4151) are Internet protocol elements for referencing resources that are globally unique across network space and time. The basic premise is that a specific identifier is prefixed with a TAG authority which is constructed from a fully qualified domain name (either used alone or embedded within an email address) and a date in ISO 8601 format. The TAG authority should thus guarantee that the TAG URI is globally unique. TAG URIs may or may not have resolution mechanisms, although those mechanisms are not specified.
The following considerations hold in respect of TAG URIs:
The INFO Registry provides a mechanism for the registration of public namespaces that are used for the identification of information assets, and that are not referenceable within the URI allocation.
The INFO Registry is located on the INFO website at: <http://info-uri.info/registry/>
NISO is the Maintenance Agency for the INFO Registry and may delegate operational responsibility to an operating body, or Registry Operator.
A publicly articulated policy established under NISO governance will be made available on the INFO website <http://info-uri.info/>. The INFO Registry policy defines a review process for candidate namespaces and provides measures of quality control and suitability for entry of namespaces.
The INFO Registry provides an online registration form which solicits the relevant fields. This information will then be made available to a public approval process. Subject to a favourable review the namespace will be added to the INFO Registry.
The INFO Registry is publicly accessible and supports discovery (by both humans and machines) of:
The INFO URI syntax is very straightforward:
"info:" namespace "/" identifier [ "#" fragment ]
where
namespace is a registered namespace token,
identifier is a %-escaped identifier, and
fragment is an optional %-escaped identifier to a secondary resource.
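The three components described above can be pulled apart with plain string operations. The following is a minimal sketch rather than a reference implementation; the names `InfoURI` and `split_info_uri` are hypothetical and chosen only for this illustration, and the function does not check the characters used in each component.

```python
from typing import NamedTuple, Optional


class InfoURI(NamedTuple):
    """Components of an INFO URI (hypothetical helper type)."""
    namespace: str
    identifier: str
    fragment: Optional[str]


def split_info_uri(uri: str) -> InfoURI:
    """Split an INFO URI into namespace, identifier and optional fragment."""
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "info":
        raise ValueError(f"not an INFO URI: {uri!r}")
    # The fragment, if present, follows the first "#".
    rest, hash_mark, fragment = rest.partition("#")
    # The namespace token ends at the first "/"; any later slashes
    # belong to the identifier itself.
    namespace, slash, identifier = rest.partition("/")
    if not slash or not namespace:
        raise ValueError(f"missing namespace or '/': {uri!r}")
    return InfoURI(namespace, identifier, fragment if hash_mark else None)


parts = split_info_uri("info:ddc/22/eng//004.678")
print(parts.namespace, parts.identifier, parts.fragment)
```

Splitting on the first "/" only is deliberate: identifiers such as 22/eng//004.678 contain slashes of their own.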
The INFO URI syntax introduces certain restrictions on the generic URI syntax:
The INFO URI syntax is fully conformant with the generic URI syntax defined in RFC 3986.
This specification uses the Augmented Backus-Naur Form (ABNF) notation of RFC 4234 to define the URI. The following core ABNF productions are used by this specification as defined by Appendix B.1 of RFC 4234: ALPHA, DIGIT, HEXDIG. The INFO URI syntax is presented in two parts. Part A contains productions specific to the INFO URI scheme:
info-URI        = info-scheme ":" info-identifier [ "#" fragment ]
info-scheme     = "info"
info-identifier = namespace "/" identifier
namespace       = scheme
identifier      = *( pchar / "/" )
Part B contains generic productions from RFC 3986 which are repeated here both for completeness and for reference.
scheme      = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
fragment    = *( pchar / "/" / "?" )
unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
The fragment on an INFO URI signifies a secondary resource with respect to the primary resource identified by an absolute INFO URI without a fragment.
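As an illustration, the Part A and Part B productions can be transcribed into a regular expression for a purely syntactic check. This is a sketch under stated assumptions: the names `INFO_URI_RE` and `is_valid_info_uri` are hypothetical, the scheme is matched case-insensitively, and no lookup against the INFO Registry is attempted, so a syntactically valid URI may still use an unregistered namespace.

```python
import re

# Character classes transcribed from the Part B productions.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"
PCHAR = rf"(?:[{UNRESERVED}{SUB_DELIMS}:@]|{PCT_ENCODED})"

INFO_URI_RE = re.compile(
    rf"""
    info:                                     # info-scheme and colon
    (?P<namespace>[A-Za-z][A-Za-z0-9+\-.]*)   # namespace = scheme
    /
    (?P<identifier>(?:{PCHAR}|/)*)            # identifier: pchar or slash
    (?:\#(?P<fragment>(?:{PCHAR}|[/?])*))?    # optional fragment
    """,
    re.VERBOSE | re.IGNORECASE,
)


def is_valid_info_uri(uri: str) -> bool:
    """Return True if the whole string matches the INFO URI grammar."""
    return INFO_URI_RE.fullmatch(uri) is not None


print(is_valid_info_uri("info:lccn/2002022641"))
print(is_valid_info_uri("info:sici/0363-0277(19950315)120:5<>1.0.TX;2-V"))
```

The second call is rejected because "<" and ">" are not permitted by the pchar production and must be %-escaped.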
Some examples of (syntactically) valid INFO URIs are given below together with the identifiers they reference in their native form and the information assets that they identify:
a. Dewey Decimal Classification
Notes:
ddc is the INFO namespace component for a Dewey Decimal Classification namespace and 22/eng//004.678 is the identifier component for an identifier of an information asset within that namespace.
INFO URI: info:ddc/22/eng//004.678
Native identifier: 22/eng//004.678
Information Asset: Vocabulary Term "Internet"
b. Library of Congress Control Number
Notes:
lccn is the INFO namespace component for a Library of Congress Control Number namespace and 2002022641 is the identifier component for an identifier of an information asset within that namespace.
INFO URI: info:lccn/2002022641
Native identifier: 2002022641
Information Asset: Metadata Record "Newcomer, Eric. Understanding Web services: XML, WSDL, SOAP, and UDDI. Boston: Addison-Wesley, 2002."
c. Serial Item and Contribution Identifier (SICI)
Notes:
sici is the INFO namespace component for a Serial Item and Contribution Identifier namespace and 0363-0277(19950315)120:5%3C%3E1.0.TX;2-V is the identifier component for an identifier of an information asset in that namespace in escaped form, or in unescaped form 0363-0277(19950315)120:5<>1.0.TX;2-V.
INFO URI: info:sici/0363-0277(19950315)120:5%3C%3E1.0.TX;2-V
Native identifier: 0363-0277(19950315)120:5<>1.0.TX;2-V
Information Asset: Journal Issue "Library Journal, Vol. 120, no. 5. March 15, 1995."
d. Astrophysics Data System Bibcode
Notes:
bibcode is the INFO namespace component for an Astrophysics Data System bibcode namespace and 2003Icar..163..263Z is the identifier component for an identifier of an information asset within that namespace.
INFO URI: info:bibcode/2003Icar..163..263Z
Native identifier: 2003Icar..163..263Z
Information Asset: Abstract of Journal Article "K. Zahnle, P. Schenk, H. Levison and L. Dones, Cratering rates in the outer Solar System, Icarus, 163 (2003) pp. 263-289."
e. PubMed Identifier
Notes:
pmid is the INFO namespace component for a PubMed Identifier namespace and 12376099 is the identifier component for an identifier of an information asset within that namespace.
INFO URI: info:pmid/12376099
Native identifier: 12376099
Information Asset: Abstract of Journal Article "Wijesuriya SD, Bristow J, Miller WL. Localization and analysis of the principal promoter for human tenascin-X. Genomics. 2002 Oct;80(4):443-52."
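The step from a native identifier to its INFO URI form in the examples above is mostly a matter of %-escaping. The sketch below uses Python's standard `urllib.parse.quote` and keeps unescaped the characters that the grammar allows in the identifier component. The function name `make_info_uri` and the constant `_IDENTIFIER_SAFE` are hypothetical, and the sketch assumes the namespace token has already been registered and needs no escaping.

```python
from urllib.parse import quote

# Characters left unescaped in the identifier component: the sub-delims,
# ":", "@" and "/". Letters, digits and "-", ".", "_", "~" are never
# escaped by quote(); "~" is listed explicitly for older Python versions.
_IDENTIFIER_SAFE = "!$&'()*+,;=:@/~"


def make_info_uri(namespace: str, native_identifier: str) -> str:
    """Render a native identifier as an INFO URI under the given namespace."""
    return f"info:{namespace}/{quote(native_identifier, safe=_IDENTIFIER_SAFE)}"


print(make_info_uri("sici", "0363-0277(19950315)120:5<>1.0.TX;2-V"))
print(make_info_uri("ddc", "22/eng//004.678"))
```

Only the SICI example changes under escaping, because "<" and ">" are the only characters in these identifiers that the grammar does not permit in literal form.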
Some examples of (syntactically) valid INFO URIs are given below as they would be used by applications:
a. RDF Graph
Notes:
This RDF graph asserts a set of statements about the resource referenced by the URI info:pii/S0888-7543(02)96852-7 (which identifies the Publisher Item Identifier S0888-7543(02)96852-7) using the Dublin Core vocabulary.
<rdf:Description about="info:pii/S0888-7543(02)96852-7">
  <dc:creator>Wijesuriya, S.D.</dc:creator>
  <dc:title>Localization and analysis of the principal promoter for human tenascin-X.</dc:title>
  <dc:identifier>info:doi/10.1006/geno.2002.6852</dc:identifier>
</rdf:Description>
b. Topic Map
Notes:
This topic map entry defines the topic internet, which has a public subject indicator as referenced by the URI info:ddc/22/eng//004.678 (which identifies the Dewey Decimal Classification Internet) and a base name string of Internet.
<topic id="internet">
  <subjectIdentity>
    <subjectIndicatorRef xlink:href="info:ddc/22/eng//004.678" />
  </subjectIdentity>
  <baseName id="_id123">
    <baseNameString>Internet</baseNameString>
  </baseName>
</topic>
c. Extended XLink
<ce:inter-refs>
  <ce:inter-refs-text id="interref8">Parts I and II</ce:inter-refs-text>
  <ce:inter-ref-end xlink:href="info:pii/S0167-8396(00)00009-1">
    <ce:inter-ref-title>Part I</ce:inter-ref-title>
  </ce:inter-ref-end>
  <ce:inter-ref-end xlink:href="info:pii/S0167-8396(00)00010-8">
    <ce:inter-ref-title>Part II</ce:inter-ref-title>
  </ce:inter-ref-end>
  <ce:inter-refs-link/>
</ce:inter-refs>
d. RSS 1.0 Feed - ContextObject (1)
Notes:
The ContextObject shown here in an RSS feed is making use of the mod_context RSS 1.0 module. This is the same ContextObject as shown below in Example e) but with a different serialization.
<channel rdf:about="http://rss.example.com/">
  <title>My context-sensitive RSS feed</title>
  <link>http://rss.example.com/</link>
  <!-- ContextObject => { -->
  <ctx:ctx_ver>Z39.88-2004</ctx:ctx_ver>
  <ctx:rft_id>
    info:sici/0363-0277(19950315)120:5%3C%3E1.0.TX;2-V
  </ctx:rft_id>
  <ctx:req_id>
    mailto:a.n.other@example.net
  </ctx:req_id>
  <!-- } -->
  <items>
    <rdf:Seq>
      <rdf:li rdf:resource="http://rss.example.com/resource/1" />
      <rdf:li rdf:resource="http://rss.example.com/resource/2" />
    </rdf:Seq>
  </items>
</channel>
e. OpenURL Link - ContextObject (2)
Notes:
The ContextObject is shown here on an OpenURL which is requesting context-sensitive services from the link resolver <http://link.example.com/resolver> for the referent entity identified by the URI info:sici/0363-0277(19950315)120:5%3C%3E1.0.TX;2-V (which identifies the SICI code 0363-0277(19950315)120:5<>1.0.TX;2-V) and the requester entity identified by the URI mailto:a.n.other@example.net. The example OpenURL is text wrapped with whitespace for readability.
http://link.example.com/resolver?url_ver=Z39.88-2004
  &rft_id=info:sici/0363-0277(19950315)120:5%3C%3E1.0.TX;2-V
  &req_id=mailto:a.n.other@example.net
f. SRU Record
<record>
  <recordSchema>info:srw/schema/1/dc-v1.1</recordSchema>
  <recordPacking>xml</recordPacking>
  <recordData>
    <srw_dc:dc>
      <dc:title>This is a Sample Record</dc:title>
    </srw_dc:dc>
  </recordData>
  <recordPosition>1</recordPosition>
  <extraRecordData>
    <rel:rank>0.965</rel:rank>
  </extraRecordData>
</record>
A prime motivator for comparison of URIs is to improve performance in retrieval operations by using caching mechanisms. A secondary motivator is to establish the equivalence of resource identifiers in Web descriptions of information structures, so that those structures can be normalized or merged with other like structures, e.g. the merging of two RDF graphs or the merging of two Topic Maps.
As far as INFO URIs are concerned there are no associated retrieval mechanisms, and the prime motivator for comparison of INFO URIs is to establish the equivalence of identifiers for information assets.
INFO URIs only exist in absolute form - no relative URI forms are allowed. The normalization steps that should be applied are with respect to the three components of the URI: the "scheme" component, the "namespace" component and the "identifier" component.
The following generic normalization steps should be applied:
Before comparing INFO URIs they should be normalized by applying the standard INFO normalization rules (lowercasing the "scheme" and "namespace" components, unescaping non-reserved characters and uppercasing any %-escaped characters that remain). In the example below four unnormalized forms (Step A) are reduced to normalized forms (Step B). (Subsequent normalization steps are not shown as they would be namespace specific.)
The Registry may be consulted for namespace-specific rules on case normalization and punctuation normalization. If these namespace-specific rules are available INFO URIs may be reduced to a unique canonical form.
Step A - Unnormalized Forms
U1. INFO:PII/S0888-7543(02)96852-7
U2. info:PII/S0888754302968527
U3. info:pii/S0888%2D7543%2802%2996852%2D7
U4. info:pii/s0888-7543(02)96852-7
Step B - Normalized Forms
N1. info:pii/S0888-7543(02)96852-7
N2. info:pii/S0888754302968527
N3. info:pii/S0888-7543(02)96852-7
N4. info:pii/s0888-7543(02)96852-7
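The generic normalization rules can be sketched in a few lines of Python. This is an illustrative sketch, not the normative algorithm: the function name `normalize_info_uri` and its `also_unescape` parameter are hypothetical, and the input is assumed to be a syntactically valid INFO URI. By default the sketch unescapes only the characters in the `unreserved` production of the grammar given earlier. Under that grammar "(" and ")" are sub-delims, so a strict reading would leave %28 and %29 escaped, whereas the N3 form listed above shows them unescaped. The `also_unescape` parameter is included so that either reading can be reproduced.

```python
import re

# unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
_UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_info_uri(uri: str, also_unescape: str = "") -> str:
    """Apply the generic INFO normalization steps to a valid INFO URI."""
    scheme, _, rest = uri.partition(":")
    namespace, _, identifier = rest.partition("/")
    allowed = _UNRESERVED | set(also_unescape)

    def fix_escape(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        # Unescape characters considered safe; otherwise keep the escape
        # and uppercase its hex digits.
        return char if char in allowed else match.group(0).upper()

    # Lowercase scheme and namespace; the identifier keeps its case
    # because case rules are namespace specific.
    identifier = _PCT.sub(fix_escape, identifier)
    return f"{scheme.lower()}:{namespace.lower()}/{identifier}"


for u in (
    "INFO:PII/S0888-7543(02)96852-7",
    "info:PII/S0888754302968527",
    "info:pii/S0888%2D7543%2802%2996852%2D7",
    "info:pii/s0888-7543(02)96852-7",
):
    print(normalize_info_uri(u, also_unescape="()"))
```

With `also_unescape="()"` the four outputs match N1 to N4. N2 and N4 still differ from N1 after this step, which is why the namespace-specific rules on punctuation and case are needed to reach a single canonical form.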
The effort to create the INFO URI scheme emerged from the ANSI/NISO process to standardize the OpenURL Framework for Context-Sensitive Services (see the Part 1 and Part 2 ballot documents), which requires the ability to describe resources by means of globally unique identifiers. The Draft Standard for Trial Use released for Public Comment introduced a "proprietary" naming architecture which allowed information assets to be referenced by means of widely used non-URI identifiers (e.g. National Library of Medicine PubMed identifiers, International DOI Foundation Digital Object Identifiers, NASA Astrophysics Data System Bibcodes, and others) which would be registered under the OpenURL Framework.
The Public Comment period started March 12th, 2003, when the first document was released. Early public feedback led to the decision to fundamentally revise the naming architecture, and to base all resource identification requirements within the OpenURL Framework on URIs alone. Because it was deemed unrealistic to expect that all namespaces required in the OpenURL Framework would be registered within the URI allocation by the respective Namespace Authorities, the INFO URI effort was launched.
Development of the INFO URI scheme is being conducted under the auspices of NISO, together with consultation from the IETF and the W3C. A consultation document "Bootstrapping the Web" was prepared to analyse the options available.
On June 19th, 2003, representatives from NISO, NISO Committee AX on OpenURL Framework standardization, the IETF, and the W3C met to discuss the requirements with respect to the identification of resources in the OpenURL Framework. More specifically, this group discussed how to handle the ORI and XRI Naming Environments introduced in the OpenURL Framework Draft Standard for Trial Use as an integral part of the Internet's URI environment. There was a general consensus to proceed with registration of a toplevel URI scheme, and follow-on meetings with representatives from the various communities were scheduled.
It was recognized that these resource identifiers would have a much wider applicability to many Web-based applications and it was decided to decouple this URI scheme from the OpenURL Framework and to name it INFO. An Internet-Draft was duly prepared and posted to the Internet-Drafts repository on September 30th, 2003. This was also widely communicated to various mailing lists and there was much discussion of the new INFO URI scheme on the public uri@w3.org mailing list.
The final specification of INFO, RFC 4452, was published in April 2006. Together with this FAQ, it will guide Web applications, such as OpenURL Framework implementations and others, in making use of the INFO URI scheme to deploy legacy identifiers within the World Wide Web global information architecture.
Both the OpenURL Framework and INFO are now approved information standards and can be used securely in this capacity.
The INFO Registry has been established and namespace registrations are currently being processed. A Registry policy document is available as a work in progress from the INFO website. An RSS feed (defunct) is also available to alert subscribers to newly registered INFO namespaces.
NISO Committee AX completed its work on the OpenURL Framework standardization at its last face-to-face meeting on October 27th/28th, 2003 and handed over to NISO the final documents. The ANSI/NISO information standard Z39.88-2004 was duly balloted and approved for publication April 15th, 2005.