Session 2: (Wednesday am) Emergent Web Patterns
4 Extracting Evolution of Web Communities from a Series of Web Archives Masashi Toyoda, Masaru Kitsuregawa

Link Analysis, web community, evolution

Recent advances in storage technology make it possible to store a series of large Web archives. It is now an exciting challenge for us to observe the evolution of the Web. In this paper, we propose a method for observing the evolution of web communities. A web community is a set of web pages created by individuals or associations with a common interest in a topic. So far, various link analysis techniques have been developed to extract web communities. We analyze the evolution of web communities by comparing four Japanese web archives crawled from 1999 to 2002. Statistics of these archives and of community evolution are examined, and the global behavior of evolution is described. Several metrics are introduced to measure the degree of web community evolution, such as growth rate, novelty, and stability. We developed a system for extracting the detailed evolution of communities using these metrics. It allows us to understand when and how communities emerged and evolved. Some evolution examples are shown using our system.
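
The abstract names growth rate, novelty, and stability but does not define them. The sketch below shows one way such metrics could be computed for a single community observed in two consecutive archives. The formulas, the function name `evolution_metrics`, and the sample URLs are illustrative assumptions, not the definitions used in the paper.

```python
def evolution_metrics(old_pages, new_pages):
    """Compare one community across two archive snapshots.

    All three formulas are assumed stand-ins for illustration only.
    """
    old, new = set(old_pages), set(new_pages)
    shared = old & new
    union = old | new
    return {
        # Assumed: net change in the number of member pages.
        "growth": len(new) - len(old),
        # Assumed: pages present in the later snapshot only.
        "novelty": len(new - old),
        # Assumed: Jaccard overlap between the two snapshots.
        "stability": len(shared) / len(union) if union else 1.0,
    }


# Hypothetical community membership in two snapshots.
snapshot_1 = {"a.example", "b.example", "c.example"}
snapshot_2 = {"b.example", "c.example", "d.example", "e.example"}
metrics = evolution_metrics(snapshot_1, snapshot_2)
print(metrics)
```

Under these assumed definitions, a community that keeps most of its pages scores high on stability, while one that mostly replaces its members scores high on novelty even if its size barely changes.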

5 The Connectivity Sonar: Detecting Site Functionality by Structural Patterns Einat Amitay, David Carmel, Adam Darlow, Ronny Lempel, Aya Soffer

Link Analysis, Hypertext Structure, Search Engines, Data Mining, World Wide Web, Web graphs, Web Information Retrieval

Web sites today serve many different functions, such as corporate sites, search engines, e-stores, and so forth. As sites are created for different purposes, their structure and connectivity characteristics vary. However, this research argues that sites of similar role exhibit similar structural patterns, as the functionality of a site naturally induces a typical hyperlinked structure and typical connectivity patterns to and from the rest of the Web. Thus, the functionality of Web sites is reflected in a set of structural and connectivity-based features that form a typical signature. In this paper, we automatically categorize sites into eight distinct functional classes, and highlight several search-engine related applications that could make immediate use of such technology. We purposely limit our categorization algorithms by tapping connectivity and structural data alone, making no use of any content analysis whatsoever. When applying two classification algorithms to a set of 202 sites of the eight defined functional categories, the algorithms correctly classified between 54.5% and 59% of the sites. On some categories, the precision of the classification exceeded 85%. An additional result of this work indicates that the structural signature can be used to detect spam rings and mirror sites, by clustering sites with almost identical signatures.
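
The idea of clustering sites with almost identical signatures can be sketched as follows. This is a minimal illustration, assuming each site's signature is a short numeric feature vector and that "almost identical" means a Euclidean distance below a chosen threshold. The feature values, the threshold, and the function name `cluster_by_signature` are hypothetical, and the abstract does not say which clustering method the authors used.

```python
from math import dist


def cluster_by_signature(signatures, threshold=0.05):
    """Group sites whose signature vectors lie within `threshold` of each other.

    Uses single-link grouping via union-find; this choice is an assumption.
    """
    names = list(signatures)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if dist(signatures[a], signatures[b]) <= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    # Only groups with more than one site are candidate mirrors or rings.
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical three-feature signatures.
sigs = {
    "site-a": (0.10, 0.80, 0.30),
    "site-b": (0.10, 0.81, 0.30),
    "site-c": (0.90, 0.10, 0.50),
}
clusters = cluster_by_signature(sigs)
print(clusters)
```

The pairwise comparison is quadratic in the number of sites, which is acceptable for a collection of a few hundred sites such as the 202 mentioned above but would need an index or hashing scheme at Web scale.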

6 Automatically Sharing Web Experiences through a Hyperdocument Recommender System Alessandra Alaniz Macedo, Khai Nhut Truong, Jose Antonio Camacho-Guerrero, Maria da Graca Campos Pimentel

Navigation, Linking, World Wide Web, Open Hypermedia, Semantics, Recommendation System

As an approach that applies to more than supporting user navigation on the Web, recommender systems have been built to assist and augment the natural social process of asking for recommendations from other people. In a typical recommender system, people provide recommendations as inputs, which the system aggregates and directs to appropriate recipients. In some cases, the primary transformation is in the aggregation; in others, the value of the system lies in its ability to make good matches between the recommenders and those seeking recommendations. In this paper we discuss architectural and design features of WebMemex, a system that (a) provides recommended information based on capturing the history of navigation from a list of people well known to the users (including the users themselves), (b) allows the user to have access from any networked machine, (c) demands user authentication to access the repository of recommendations, and (d) allows the user to specify when capturing of her history should be performed.
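
The aggregation step described above can be illustrated with a small sketch. It assumes that each person's captured history is a list of URLs and that a page is ranked by how many people on the user's list visited it. The function `recommend`, the ranking rule, and the sample data are hypothetical and are not taken from WebMemex.

```python
from collections import Counter


def recommend(histories, known_people, top_n=3):
    """Rank URLs by the number of listed people who visited them.

    `known_people` may include the user, matching feature (a) above.
    """
    counts = Counter()
    for person in known_people:
        # Count each URL once per person, however often it was visited.
        for url in set(histories.get(person, [])):
            counts[url] += 1
    # Most widely visited first; ties broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [url for url, _ in ranked[:top_n]]


# Hypothetical captured navigation histories.
histories = {
    "ana": ["x.example", "y.example", "x.example"],
    "bo": ["y.example", "z.example"],
    "cy": ["w.example"],
}
suggestions = recommend(histories, known_people=["ana", "bo"], top_n=2)
print(suggestions)
```

Here the history of "cy" is ignored because that person is not on the list, which mirrors the restriction to people well known to the user.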