Session 5: (Thursday am) Link Aggregation
12 Untangling Compound Documents on the Web Nadav Eiron, Kevin S. McCurley

Data Mining, Hypertext Structure, Link Analysis, Search Engines, Semantic Web

Most text analysis is designed to deal with the concept of a "document", namely a cohesive presentation of thought on a unifying subject. By contrast, individual nodes on the World Wide Web tend to have a much smaller granularity than text documents. We claim that the notions of "document" and "web node" are not synonymous, and that authors often tend to deploy documents as collections of URLs, which we call "compound documents". In this paper we present new techniques for identifying and working with such compound documents, and the results of some large-scale studies on such web documents. The primary motivation for this work stems from the fact that information retrieval techniques are better suited to working on documents than on individual hypertext nodes.

13 Providing Support for Browsing Intricately Interconnected Paths Pratik Dave, Unmil P. Karadkar, Richard Furuta, Luis Francisco-Revilla, Frank Shipman, Suvendu Dash

Navigation, Path-centric browsing, Navigation metaphors, Directed paths, Walden's Paths, Path Engine

Paths have long been recognized as an effective medium for communicating knowledge. They have been included within hypermedia systems as supporting tools to organize and present information. Graph-centric and Node-centric browsing are the two commonly identified hypertext-browsing paradigms. We believe that Path-centric browsing, the browsing behavior exhibited by path interfaces, is an independent browsing paradigm that combines useful aspects of the two commonly supported cases. The Walden's Paths project supports Path-centric traversal over Web-based materials. This paper expands the notion of our paths to include more generalized structures and interconnections across paths. We present an architecture for describing complex networks of such paths. We discuss the design and present a prototype implementation of the Path Engine, a tool that provides a linear interface to intricately interconnected paths.

14 Publishing Evolving Metadocuments on the Web Andruid Kerne, Madhur Khandelwal, Vikram Sundaram

Short paper:

Metadocuments are documents that consist primarily of references to other documents. Our active browsing web visualization tool generates an evolving series of navigable metadocument snapshots over time. It conducts expression-directed automatic retrieval of information from the web. The granularity of browsing is shifted from documents to the finer-grained information elements, which are metadocument constituents. While the user can engage in direct manipulation expressions of interest and design, the program performs procedural visual composition of the information elements to form spatial hypertext. Because prior versions of the tool lacked the save/load capability, they were entirely process-oriented: the metadocuments existed only as transient states. This paper is an early report on our new metadocument authoring and publishing capability, and some of its potential uses. Saved metadocuments can be published on the web. Once published, they can serve both as static navigable metadocuments and as the jumping-off point for further evolutionary browsing of the information space represented by the collected elements.

15 Multi-Layered Cross-Media Linking Beat Signer, Moira C. Norrie

Short paper: Linking, Navigation

The integration of printed paper and digital information enables new forms of enhanced reading. We present digitally augmented paper as a specific application of our more general Integration Server (iServer) architecture for cross-media information management. Multi-layered linking is introduced as a way to manage the granularity of link anchors, and an application making active use of multi-layered links is presented. Furthermore, we point out how the concept of supporting multiple layers in link management can be applied to other media, such as XHTML in combination with the XML Linking Language (XLink).