HT '23: Proceedings of the 34th ACM Conference on Hypertext and Social Media


SESSION: Interactive Media: Art and Design

SPORE: A Storybreaking Machine

The paper presents SPORE, a Spatial Recommender System. As we enter a period of unprecedented collaboration between authors and computers, where artificial intelligence in particular seems likely to act increasingly in a co-authoring capacity, SPORE offers a different approach to collaboration. More organic and exploratory than other automated or procedural systems, SPORE aims to mimic the process of storybreaking that already exists in the creative industries.

IA, not only AI

This demo provides an overview of my current work, demonstrating my approach to what my mentor Doug Engelbart called 'Intelligence Augmentation' (IA), as opposed to following only the 'Artificial Intelligence' (AI) approach. The work has been implemented on the macOS platform by my independent software development company, 'The Augmented Text Company', in the form of the text tool 'Liquid', the word processor 'Author' and the PDF viewer 'Reader', using Visual-Meta to embed and read rich metadata in an open and robust manner.

HyperBrain: Human-inspired Hypermedia Guidance using a Large Language Model

We present HyperBrain, a hypermedia client that autonomously navigates hypermedia environments to achieve user goals specified in natural language. To achieve this, the client makes use of a large language model to decide which of the available hypermedia controls should be used within a given application context. In a demonstrative scenario, we show the client's ability to autonomously select and follow simple hyperlinks towards a high-level goal, successfully traversing the hypermedia structure of Wikipedia given only the markup of the respective resources. We show that hypermedia navigation based on language models is effective, and propose that this should be considered as a step to create hypermedia environments that are used by autonomous clients alongside people.

The Fact-Checking Observatory: Reporting the Co-Spread of Misinformation and Fact-checks on Social Media

In the context of the COVID-19 pandemic and the Russian invasion of Ukraine, tracking how misinformation and fact-checks spread on social media is key to understanding where fact-checking efforts need to be focused and which demographics are most likely to spread misinformation. In this article, we introduce the Fact-checking Observatory, a website that automatically generates human-readable weekly reports about the spread of misinformation and fact-checks on Twitter. The proposed approach differs from tools that provide one-off manual reports or visualisations by giving organisations and individuals easily readable, shareable, self-contained reports that cover both the spread of misinformation and that of fact-checks.

Hypertextuality and Virtual Reality: Translating Hypertext Functionality in Rob Swigart's Portal for the VR Game, DATA ENTRY: PORTAL

This article discusses how the hypertext functionality of Rob Swigart's Portal has been translated for the Virtual Reality (VR) game DATA ENTRY: PORTAL. The shift from the two-dimensional space of the original game to the three-dimensional environment of the adaptation necessitated reconceptualizing how narrative development progresses, navigation occurs, and choice is provided. With growing interest in retrogaming and in rebooting popular games built for consoles and desktop computers, this paper provides insights into how player experience can be enhanced by building hypertext functionality into the gameplay of VR environments.

Viki LibraRy: A virtual reality library for collaborative browsing and navigation through hypertext

We present Viki LibraRy, a virtual-reality-based system for generating and exploring online information as a spatial hypertext. It creates a virtual library based on Wikipedia, in which Rooms make data available via a RESTful backend. In these Rooms, users can browse all articles of the corresponding Wikipedia category in the form of Books, and they can reach other Rooms through virtual portals. Beyond that, explorations can be done alone or collaboratively, using Ubiq.

News in Time and Space: Global Event Exploration in Virtual Reality

We present News in Time and Space (NiTS), a virtual reality application for visualization, filtering and interaction with geo-referenced events based on GDELT. It can be used both via VR glasses and as a desktop solution for shared use by multiple users with Ubiq. The aim of NiTS is to provide overviews of global events and trends in order to create a resource for their monitoring and analysis.

Developing and implementing a superconnector of producers in the printing industry to facilitate book historical research: Enabling digitalization of research processes by consolidating data from multiple sources

The field of book history is on a collision course with the requirements of the next phase in its digital development. This study first explores the notion of "digitalisation", the forthcoming phase, by drawing analogies with the forerunners of the Digital Revolution.

The second section pinpoints the primary arenas where current practices could obstruct a seamless progression towards the ensuing phase.

Subsequently, 'superconnectors' are introduced as potential mitigators of these barriers. A particular instance of such a superconnector is then delineated.

This article lays the theoretical groundwork for TAPITTA, an exemplary superconnector that was demonstrated at the Hypertext 23 Conference in Rome, from Sep 4--8, 2023. Commencing Jan 1, 2024, it is accessible online at https://tapitta.be.

Transhistorical Urban Landscape as Hypermap

This article explores the conception, design, and implementation of a hypertextual map that we call a hypermap. Using Giovanni Nolli's 1748 map of the city of Rome as a backbone, we conducted an experiment based on one of the routes defined by Giuseppe Vasi's Grand Tour of Rome to collect various types of urban and environmental information. The aim is to connect a multiplicity of data of different natures and time periods to enhance the serendipitous elaboration of new narratives, interpretations, and data (namely "unfolding") not implicitly enacted by the purely analytical and mechanistic overlapping of gathered data ("folding"). This experiment is part of the research project Datathink, conducted at the Bibliotheca Hertziana - Max Planck Institute for Art History in Rome, and serves as a proof of concept for an augmented database of the urban landscape of the city of Rome and for new ways to facilitate access to, and enhancement of, cultural artifacts and knowledge.

DEMONSTRATION SESSION: Interactive Media: Art and Design: Walkthrough Demo

Geo-Contextualization and Aggregation of Information Resources

This paper explores the addition of spatial features to geographic objects described in textual and photographic information resources and the resulting research possibilities. We briefly describe one of the systems used by the BHMPI to identify geographic objects and the process of adding spatial features from open data providers. This is followed by a discussion of the possibilities of bridging information domains by using the spatial feature and the usefulness of geo-contextualization for named entity normalization. Finally, we explore the usefulness of geo-queries for research in art history and related fields.

POSTER SESSION: Interactive Media: Art and Design: Posters

Intuitive Semantic Graph Tool for Enhanced Archive Exploration

The paper introduces a new method for visualizing and navigating information in a cultural heritage archive in a simple and intuitive way. The proposed approach employs pre-trained language models to cluster data and create semantic graphs. The creation of multilayer maps enables deep exploration of archives with large datasets, while the ability to handle multilingual datasets makes it suitable for archives with documents in various languages. These features combine to provide a user-friendly tool that can be adapted to different contexts and provides an overview of archive contents, allowing even non-expert users to successfully query the archive.

Transparency in Messengers: A Metadata Analysis Based on the Example of Telegram

Social media platforms and messenger services such as Telegram have created a new way of publishing and consuming news. To combat the resulting negative effects, such as disinformation, social media analytics (SMA) can be applied to, inter alia, reconstruct the spread of information across platforms or to identify key actors and interactions between users. This paper examines metadata in Telegram to provide a privacy-friendly basis for further SMA. The characteristics of 770 Telegram channels and groups were derived by extracting initial key figures. We observed that messages that were created in channels were forwarded to other actors, while not a single original message created in groups was forwarded. This suggests that channels have a much greater impact on generating and spreading information to other Telegram actors than groups.

Decentralizing Social Media: An Examination of Blockchain-based Social Media Adoption and Use based on the Unified Theory of Acceptance and Use of Technology (UTAUT)

The study conducted semi-structured interviews with 31 early adopters of blockchain-based social media (BSM) platforms to understand their reasons for using these emerging platforms and to compare their usage with mainstream social media (MSM) platforms like Facebook and TikTok. Guided by the Unified Theory of Acceptance and Use of Technology (UTAUT) model, the manual content analysis of the interviews reveals that users' adoption of BSM platforms is primarily motivated by social influence, financial incentives, and a desire to bypass the content moderation policies implemented by MSM. At the same time, the steep learning curve and security and privacy concerns hinder the widespread adoption of these platforms. Finally, the study validates the suitability of the UTAUT model for examining the adoption and use of BSM platforms, but it also proposes to include two new factors, namely financial incentives and content moderation.

SESSION: Authoring, Reading, Publishing: Web reading

The boundary between reality and fiction in hyperfictions for smartphone

Hyperfictions for smartphones, described as smartfictions, are based on our ordinary practices with a smartphone, in particular instant messaging and notifications. Using the example of Bury Me, My Love, we analyse the play on the boundary between reality and fiction, as well as what these narratives reveal about our own use of the smartphone and the ethical issues they raise.

Beyond Hypertexting the Hypertext: Annotated and GIS Adaptations of Joyce's Ulysses as Case Studies for User Experience and Engagement

There are several hypermedia and geographic information system (GIS) based digital projects centered on James Joyce's Modernist novel Ulysses that aim to provide readers with better comprehension of the text. These digital projects use hypertext annotations, interactive maps, and/or extended reality (XR) to allow users to embed themselves in the setting and narrative, creating a greater possibility of reader enjoyment and engagement. My goal is to synthesize and contextualize the existing hypertext, hypermedia, GIS, XR, and other digital adaptations of Joyce's Ulysses, as well as discuss the history of the novel as a hypertextual work. I propose the development of a digital Ulysses project that applies hypertext/hypermedia, GIS, and wiki technology to provide users with easier access to all the materials necessary to contextualize the novel and add their own interpretations. This conceptual project will have larger implications in the world of digital humanities pedagogy and information literacy, as it radically alters the ways in which users process and access information that contributes to their understanding of literature.

Co-constructed readings of the Internet: voyaging on a digital storytelling platform

Using the oft-cited metaphor of hypertext and electronic fiction as a voyage for the reader, this paper considers the theme and practice of voyaging and map-reading on Voices of Rural India, a rural Indian digital storytelling platform that uses stories to invite tourists to visit. This website embodies the theme of travel and tourism while also allowing readers to navigate through the digital platform. Using sitemaps and close readings of the stories on the platform, this paper considers the notion of "community storytelling" and whether digital storytelling can be used to empower the reader, as well as the other parties involved in the creation of a text. This research builds on hypertext theory and scholarship in the area of collaborative writing and co-authorship, while arguing for new concepts of collective writing and media creation.

All click and no play: how interactive are interactive digital comics?

This contribution investigates a specific subcategory of digital comics -- part of what I call 'expanded digital comics' -- considered by many to be inherently more interactive than prototypical comics, and often related to video games. However, whether they are indeed more interactive, and what it means for a comic to be interactive at all, is an open question. Through a reframing of the concept of interactivity as one of the possible subtypes of agency (narrative, interpretative, material, and social) that a semiotic text allows for, this contribution surveys and discusses a selection of ostensibly interactive digital comics, interrogating the types and degrees of interaction they establish and reflecting on the specificities of their meaning-making processes.

SESSION: Authoring, Reading, Publishing: Hypertext Authoring

Are You the Main Character?: Visibility Labor and Attributional Practices on TikTok

This paper revisits hypertext theory from the 1990s (George Landow, Jay David Bolter, etc.) and database theory from the 2000s (Lev Manovich, Victoria Vesna, etc.) with attention to explaining new authoring practices on the video-sharing platform TikTok. Because hyperlinking is automated on the platform whenever composers select audio clips, effects, hashtags, and author references to other videos for "dueting" and "stitching" to remix from pre-existing databases of material, TikTok is characterized by rich attributional practices of citation. At the same time, the site's users are keenly aware that search and recommendation algorithms may obfuscate their published materials and that additional labor may be required for their contributions to be visible in the larger hyperlinked matrix of database participation. Users may also choose to de-link their content or the content of others. Case studies are drawn from variations of the "main character" meme and recent moral panics about supposedly dangerous TikTok "trends."

What Degree of Freedom for the Reader of Patrimonial Digital Editions?: The case of a large interconnected scholarly corpus of Ancient and Early Medieval Chinese literature

In this paper, we present a practical project of building up an interconnected body of Ancient and Medieval Chinese texts, which associates records in a database, marked up texts, and a set of designed architectures (useful to structure individual texts and also necessary to set up a website). We first explain how we delineated, acquired and structured the corpus under study. We then explain why and how we set up a database which plays an important role in the editorial pipeline. We finally present our editorial choices, and more specifically why we have decided to limit the exploration tools available on our website to the possibilities offered by hypertext. The edition of this very large scholarly corpus is intimately tied to a research project which also builds a knowledge network and employs a variety of methods -- from traditional text analysis to computational text mining -- to understand how snippets of knowledge circulated through texts. We ask: Does an edition produced in the course of a research project necessarily reflect or even bear the traces of this research project? To what extent should the characteristics -- thematic, methodological, etc. -- of this project influence future access to such an edition? To what extent should a reader's freedom be preserved?

Name Links: an Aesthetic Discussion

A correspondence between the link in hypertext and the sign (both semiotic and linguistic) is well established. Consisting of source and destination, links parallel the signifier and signified of the semiotic and linguistic sign, as they do wider models of and approaches to intertextuality. Deeper investigation of the connection between the sign and the link is, however, currently a rather neglected area for hypertext. Better understanding the complexities of the semiotic sign can nevertheless benefit the epistemology of hypertext and, more generally, our understanding of the complex meaning the link engenders.

To link or not to link -- and to what -- is an equally nuanced question for hypertext. In closed hypertext, such decisions are, while important, more limited in scope; in open hypertext, however, links represent a form of delegation (or sharing) of authority and responsibility. This contribution explores this aesthetic dimension of hypertext design, through reference to the semiotics of names developed through a case study on name links.

Showing the scars: A short case study of de-enhancement of hypertext works for circulation via fan binding or Kindle Direct Publishing

This short presentation examines instances of literary hypertexts intentionally stripped of that which makes them interconnected and updatable. To investigate aspects of how and why text creators, users, and intermediaries de-enhance hypertexts for reasons entirely distinct from the much-studied antipathy to hypertextuality found in some 20th-century literary cultures, it contrasts one commercial and one non-commercial (indeed, actively anti-commercial) example: the mass phenomenon of Kindle Direct Publishing and the niche practice of fan binding. Fan bindings, where fanfiction and other fan works are printed and bound as material objects, sometimes using Print on Demand (POD) services but more often by hand, circulate in a gift economy with distinctive ethical norms and, as transformative works in their own right, illustrate how meaning is made as well as lost in uncoupling works from their fan community contexts. Juxtaposing these examples problematises conceptions of either commercial self-publishing or non-commercial fan communities as offering uncomplicated refuge for interactive literature, and challenges narratives of literary communities as enduringly hostile to or no longer interested in experimentation with hypertextuality. The presentation addresses the conference topics of authorship and reading practices from a book history perspective, highlighting the wider significance of stances against hypertextuality and implications for hypertext creators and audiences across genres.

SESSION: Workflows and Infrastructures

Va.Si.Li-Lab as a collaborative multi-user annotation tool in virtual reality and its potential fields of application

Over the last thirty years, a variety of hypertext approaches and virtual environments -- some of them virtual hypertext environments -- have been developed and discussed. Although virtual and augmented reality technologies are developing rapidly and many of them are now affordable, their usability for hypertext systems has not yet been explored. At the same time, even for three-dimensional virtual and augmented environments, there is no generally accepted concept that is similar to, or nearly as elegant as, hypertext. This gap will have to be filled in the coming years, and a good concept should be developed; in this article we aim to contribute in this direction and introduce a prototype for a possible implementation of criteria for virtual hypertext simulations.

Deep Viewpoints: Scripted Support for the Citizen Curation of Museum Artworks

This paper describes the design and use of Deep Viewpoints, a software platform for eliciting and sharing citizen perspectives associated with museum artworks. The design of the platform is inspired by the process of Slow Looking in which museum visitors are guided to observe artworks and develop their own response. Within Deep Viewpoints, the processes of observing and responding to artworks are guided by a script comprising stages containing artworks, statements, and prompts or questions to which the follower of the script can respond. Scripts are intended for use either in the gallery or remotely. We describe the design of Deep Viewpoints and how it can be used to respond to scripts, view the responses of others and author new scripts. We then describe our experiences of using Deep Viewpoints with communities traditionally underserved by the museum sector to bring new perspectives to the museum collection. Crucially, the communities were not only involved in interpreting artworks with the guidance of the scripts but also in creating new scripts, mediating how others observe and think about art. Analysis of the authored scripts revealed a range of ways in which they were used to share interpretations of the artworks and mediate what questions others should ask themselves when viewing the artworks. Finally, we reflect on the potential role a scripted approach to Citizen Curation could play in promoting cultural engagement.

Evaluating a Radius-based Pipeline for Question Answering over Cultural (CIDOC-CRM based) Knowledge Graphs

CIDOC-CRM is an event-based international standard for cultural documentation that has been widely used for offering semantic interoperability in the Cultural Heritage (CH) domain. Although there are several Knowledge Graphs (KGs) expressed using CIDOC-CRM, the task of Question Answering (QA) has not been studied over such graphs. For this reason, in this paper we propose and evaluate a Radius-based QA pipeline over CIDOC-CRM KGs for single-entity factoid questions. In particular, we propose a generic QA pipeline that comprises several models and methods: a keyword search model for recognizing the entity of the question (and linking it to the KG); methods based on path expansion for constructing subgraphs of different radius (i.e., path lengths) starting from the recognized entity, to be used as context; and pre-trained neural models (based on BERT) for answering the question using that context. Moreover, since there are no available benchmarks over CIDOC-CRM KGs, we construct (using a real KG) an evaluation benchmark comprising 10,000 questions: 5,000 single-entity factoid, 2,500 comparative and 2,500 confirmation questions. For evaluating the QA pipeline, we use the 5,000 single-entity factoid questions. Concerning the results, the QA pipeline achieves satisfactory results both in the entity recognition step (78% accuracy) and in the QA process (51% F1 score).
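The radius-based context construction can be illustrated with a minimal sketch (the function name and the triples below are illustrative assumptions; the paper's actual pipeline operates over CIDOC-CRM knowledge graphs and uses BERT-based reader models, which are not shown here):

```python
from collections import deque

def radius_subgraph(triples, entity, radius):
    """Return the triples whose endpoints lie within `radius` hops of `entity`.

    triples: list of (subject, predicate, object) tuples; traversal is
    undirected, so paths are expanded in both directions.
    """
    # Build an undirected adjacency list over subjects and objects.
    adj = {}
    for s, _, o in triples:
        adj.setdefault(s, []).append(o)
        adj.setdefault(o, []).append(s)

    # Breadth-first search up to the given radius.
    dist = {entity: 0}
    queue = deque([entity])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand past the radius
        for neighbour in adj.get(node, []):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)

    # Keep only triples fully contained in the radius-bounded neighbourhood.
    return [t for t in triples if t[0] in dist and t[2] in dist]
```

Serialising the returned triples would then yield the textual context passed to the reader model; larger radii trade higher recall for noisier context.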

Mitigating Bias in GLAM Search Engines: A Simple Rating-Based Approach and Reflection

Galleries, Libraries, Archives and Museums (GLAM) institutions are increasingly opening up their digitised collections and associated data for engagement online via their own websites/search engines and for reuse by third parties. Although bias in GLAM collections is inherent, bias in the search engines themselves can be rated. This work proposes a bias rating method to reflect on the use of search engines in the GLAM sector along with strategies to mitigate bias. The application of this to an existing large art collection shows the applicability of the proposed method and highlights a range of existing issues.

SESSION: Workflows and Infrastructures: Curation and editions

Comparison of news commonality and churn in international news outlets with TARO

The past decades have seen an increase in academic research and public debates on online news and journalism in general, with an emphasis on fake news and low-quality reporting.

This paper presents TARO: a model and a software framework for the collection and analysis of online news sources.

The novel aspects of the TARO model and framework are: the distinction between abstract pieces of news and concrete news items, news comparison techniques based on similarity on embedded spaces, and the management of rolling news via so-called snapshot extensions. One advantage of TARO is the ability to perform comparative analysis of international news sources in various languages and across time zones.

To prove the applicability and soundness of TARO, two quantitative case studies related to the concept of churnalism are also presented in this paper. The two case studies provide quantitative insights into two tendencies of news outlets: news commonality (publishing the same news) and news churn (quickly removing recent news to make space for even more recent news).

Melody: A Platform for Linked Open Data Visualisation and Curated Storytelling

Data visualisation and storytelling techniques help experts highlight relations between data and share complex information with a broad audience. However, existing solutions targeted to Linked Open Data visualisation have several restrictions and lack the narrative element. In this article we present MELODY, a web interface for authoring data stories based on Linked Open Data. MELODY has been designed using a novel methodology that harmonises existing Ontology Design and User Experience methodologies (eXtreme Design and Design Thinking), and provides reusable User Interface components to create and publish web-ready article-alike documents based on data retrievable from any SPARQL endpoint. We evaluate the software by comparing it with existing solutions, and we show its potential impact in projects where data dissemination is crucial.

Orchestrating Cultural Heritage: Exploring the Automated Analysis and Organization of Charles S. Peirce's PAP Manuscript

This preliminary study introduces an innovative approach to the analysis and organization of cultural heritage materials, focusing on the archive of Charles S. Peirce. Given the diverse range of artifacts, objects, and documents comprising cultural heritage, it is essential to efficiently organize and provide access to these materials for the wider public. However, Peirce's manuscripts pose a particular challenge due to their extensive quantity, which makes comprehensive organization through manual classification practically impossible. In response to this challenge, our paper proposes a methodology for the automated analysis and organization of Peirce's manuscripts. We have specifically tested this approach on the renowned 115-page manuscript known as PAP. This study represents a significant step forward in establishing a research direction for the development of a larger project. By incorporating novel computational methods, this larger project has the potential to greatly enhance the field of cultural heritage organization.

SESSION: Social and Intelligent Media: Social media methods

Adaptive Navigational Support and Explainable Recommendations in a Personalized Programming Practice System

We present the results of a study where we provided students with textual explanations for learning content recommendations along with adaptive navigational support, in the context of a personalized system for practicing Java programming. We evaluated how varying the modality of access (no access vs. on-mouseover vs. on-click) can influence how students interact with the learning platform and work with both recommended and non-recommended content. We found that the persistence of students when solving recommended coding problems is correlated with their learning gain and that specific student-engagement metrics can be supported by the design of adequate navigational support and access to recommendations' explanations.

ContextBot: Improving Response Consistency in Crowd-Powered Conversational Systems for Affective Support Tasks

Crowd-powered conversational systems (CPCS) solicit the wisdom of crowds to quickly respond to on-demand users' needs. The very factors that make this a viable solution ---such as the availability of diverse crowd workers on-demand--- also lead to great challenges. The ever-changing pool of online workers powering conversations with individual users makes it particularly difficult to generate contextually consistent responses from a single user's standpoint. To tackle this, prior work has employed conversational facts extracted by workers to maintain a global memory, albeit with limited success. Through a controlled experiment, we explored whether a conversational agent, dubbed ContextBot, can provide workers with the required context on the fly for successful completion of affective support tasks in CPCS, and we examined the impact of ContextBot on the response quality of workers and their interaction experience. To this end, we recruited workers (N=351) from the Prolific crowdsourcing platform and carried out a 3×3 factorial between-subjects study. Experimental conditions varied based on (i) whether and how context elicitation was guided by motivational interviewing techniques (MI-adherent guidance, general guidance, and no guidance), and (ii) different conversational entry points for workers to produce responses (early, middle, and late). Our findings show that: (a) workers who entered the conversation earliest were more likely to produce highly consistent responses after interacting with ContextBot; (b) workers reported a better user experience after interacting with ContextBot when there was a long chat history to surf; (c) workers produced more professional responses, as endorsed by psychologists; and (d) interacting with ContextBot through task completion did not negatively impact workers' cognitive load. Our findings shed light on the implications of building intelligent interfaces for scaffolding strategies to preserve consistency in dialogue in CPCS.

Effects of the spiral of silence on minority groups in recommender systems

Recommender systems play a critical role in today's data-rich landscape, where the abundance of information necessitates their ability to present the most relevant choices for individual users. However, it has been noted that recommender systems often fail to offer the most suitable options for minority groups. Therefore, this paper examines, through a literature review, one of the potential reasons behind the underrepresentation of minority opinions within recommender systems. Specifically, we explore the connection between the spiral-of-silence concept and the dearth of suitable recommendations for minorities, and why this connection matters.

A Centrality for Social Media Users Focusing on Information-Gathering Ability

In this paper, we propose a centrality metric for social media users that focuses on their information-gathering ability. Existing methods of rating users in social graphs focus on various aspects of users, such as popularity, influence, and informational quality, but these aspects relate to users' information-transmitting ability. On social media, information-gathering ability is also important, and it varies widely from user to user. Two well-known metrics relate to it: the hub score in the HITS algorithm and Katz centrality. These two methods, however, were not designed for today's social media and do not take important aspects of it into consideration: HITS does not consider multi-hop information propagation, and Katz centrality assumes that all nodes in the graph are equally important as information sources and as information-propagation mediators. In the proposed method, we extend Katz centrality by introducing two properties of users: importance as an information source and information-forwarding probability. The results of our experiment on two Twitter follow graphs show that our metric produces a ranking different from those of the existing metrics, and suggest that it captures useful aspects of users that are not captured by existing metrics.
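As a rough illustration of the idea, the following sketch extends a Katz-style recursion with per-user source importance and forwarding probability (the update rule, parameter names, and example graph are illustrative assumptions, not the paper's exact formulation):

```python
def gathering_scores(edges, importance, forward_prob, alpha=0.5, iters=100):
    """Iteratively compute an information-gathering score per user.

    edges: (u, v) pairs meaning information flows from u to v
           (i.e., v can gather what u originates or forwards).
    importance: u's weight as an original information source.
    forward_prob: probability that u forwards information it has gathered.
    alpha: Katz-style attenuation applied per hop.
    """
    score = {u: 0.0 for u in importance}
    for _ in range(iters):
        new = {u: 0.0 for u in importance}
        for u, v in edges:
            # v gathers u's own output plus whatever u forwards onward,
            # attenuated by one hop; with alpha < 1 this converges.
            new[v] += alpha * (importance[u] + forward_prob[u] * score[u])
        score = new
    return score
```

On the chain a → b → c with unit importance and forwarding probability, c outscores b because it gathers both directly from b and, via b's forwarding, from a; plain Katz centrality would treat all three users as identical sources and mediators.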

SESSION: Social and Intelligent Media: Through the mirror of social media

Anatomy of Hate Speech Datasets: Composition Analysis and Cross-dataset Classification

Manifestations of hate speech in different scenarios are increasingly frequent on social platforms, and many works propose solutions for identifying this type of content in these environments. Most efforts to automatically detect hate speech follow the same supervised-learning process: annotators label a predefined set of messages, which are in turn used to train classifiers. However, annotators can create labels for different classification tasks, with divergent definitions of hate speech, binary or multi-label schemes, and various data-collection methodologies. Against this background, we examine the principal publicly available datasets for hate speech research. We investigate the types of hate speech (e.g., ethnicity, religion, sexual orientation) present in their composition, explore their content beyond the labels, and use cross-dataset classification to examine the use of the labeled data beyond its original work. Our results reveal interesting insights toward a better understanding of the hate speech phenomenon and improving its detection on social platforms.

Warning. This paper contains offensive words and tweet examples.

A Comparative Study of Affective and Linguistic Traits in Online Depression and Suicidal Discussion Forums

Depression is a type of mental illness that negatively impacts the lives of millions of people worldwide. Extreme depression is related to increasingly hopeless and worthless feelings, which may lead to suicide attempts. The widespread use of social media, coupled with the anonymity it provides, enables individuals to freely express and share their frustrations and low emotions on these platforms. As a preliminary study, we investigate how user-generated content regarding two mental-health issues, depression and suicidal tendencies, is related at the linguistic level, based on two Reddit mental-health forums. By collecting user posts from the r/depression and r/SuicideWatch forums, we seek to find the (dis)similarity of various affective, grammatical, and semantic attributes in these two groups. We find that while some of the affective features exhibit differences, overall, most attributes yield similar patterns in the two groups. The results suggest that it is very challenging to separate depressive posts from suicidal posts at the linguistic level, as they possess similar traits. Hence, it is imperative to monitor the content of the depression forum as vigilantly as the suicidal forum to identify any suicidal tendencies.

Why do we Hate Migrants?: A Double Machine Learning-based Approach

AI-based NLP literature has explored antipathy toward marginalized sections of society, such as migrants, and their social acceptance. Broadly, the extant literature has conceptualized this as an online hate-speech detection task and employed predictive ML models. However, a crucial omission in this literature is the genesis (or causality) of online hate, i.e., why do we hate migrants? Drawing insights from social science literature, we identify three antecedents of online hate: Cultural, Economic, and Security concerns. We then probe which of these concerns triggers higher toxicity on online platforms. We first use OLS-based regression analysis and the SHAP framework to identify the predictors of toxicity, and subsequently use Double Machine Learning (DML)-based causal analysis to investigate whether good predictors of toxicity are also causally significant. We find that the causal effect of Cultural concerns on toxicity is higher than that of Security and Economic concerns.
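The DML step the abstract refers to can be sketched with the standard partialling-out estimator (Chernozhukov et al.): cross-fitted ML models predict the outcome and the treatment from confounders, and the residual-on-residual regression yields the causal effect. The variable names below (a toxicity outcome, a concern score as treatment) are illustrative assumptions, not the paper's actual features:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

def dml_treatment_effect(X, t, y, n_splits=5, seed=0):
    """Partialling-out Double Machine Learning.

    X: confounders; t: treatment (e.g., a cultural-concern score);
    y: outcome (e.g., post toxicity).  Cross-fitting via
    cross_val_predict keeps each residual out-of-fold.
    """
    m_y = RandomForestRegressor(n_estimators=200, random_state=seed)
    m_t = RandomForestRegressor(n_estimators=200, random_state=seed)
    y_res = y - cross_val_predict(m_y, X, y, cv=n_splits)   # outcome residuals
    t_res = t - cross_val_predict(m_t, X, t, cv=n_splits)   # treatment residuals
    # Final stage: regress outcome residuals on treatment residuals.
    final = LinearRegression(fit_intercept=False).fit(t_res.reshape(-1, 1), y_res)
    return final.coef_[0]
```

Because both nuisance models partial out the confounders, the final coefficient estimates the causal effect of the treatment on the outcome rather than a mere predictive association, which is exactly the distinction the paper draws between SHAP-style predictors and causally significant factors.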

Catching Lies in the Act: A Framework for Early Misinformation Detection on Social Media

The proliferation of social media has intensified the necessity for automated misinformation detection. Existing methods often struggle with early detection, as key information is not readily available during the initial dissemination stages. In this paper, we introduce a novel model for early misinformation detection on social media that classifies information propagation paths and leverages linguistic patterns. Our model incorporates a causal user-attribute inference model to label users as potential misinformation propagators or believers. Designed for early detection, the model also includes two auxiliary tasks, forecasting the scope of misinformation dissemination and clustering similar nodes (users) based on their attributes, and outperforms the current state-of-the-art benchmarks.

SESSION: Social and Intelligent Media: Social Media Practices (Panel)

Ghost Booking as a New Philanthropy Channel: A Case Study on Ukraine-Russia Conflict

The term ghost booking has recently emerged as a new way to conduct humanitarian acts during the conflict between Russia and Ukraine in 2022. The phenomenon describes events where netizens donate to Ukrainian citizens through no-show bookings on the Airbnb platform. Impressively, a social fundraising act that used to be organized on donation-based crowdfunding platforms has shifted onto a sharing-economy platform market and thus gained more visibility. Although the donation purpose is clear, the motivation of donors in selecting a property to book remains concealed. Our study therefore explores peer-to-peer donation behavior on Airbnb, a platform originally intended for economic exchanges, and identifies which platform attributes effectively drive donation behaviors. We collect over 200K guest reviews from 16K Airbnb property listings in Ukraine by employing two collection methods (screen scraping and HTML parsing). We then distinguish ghost bookings among the guest reviews. Our analysis uncovers the relationship between ghost booking behavior and platform attributes, and pinpoints several attributes that influence ghost booking. Our findings highlight that donors incline toward credible properties that explicitly signal humanitarian need, i.e., hosts in penury.

The Looming Threat of Fake and LLM-generated LinkedIn Profiles: Challenges and Opportunities for Detection and Prevention

In this paper, we present a novel method for detecting fake and Large Language Model (LLM)-generated profiles in the LinkedIn Online Social Network immediately upon registration and before connections are established. Early fake profile identification is crucial to maintaining the platform's integrity, since it prevents imposters from acquiring the private and sensitive information of legitimate users and from gaining an opportunity to increase their credibility for future phishing and scamming activities. This work uses the textual information provided in LinkedIn profiles and introduces the Section and Subsection Tag Embedding (SSTE) method to enhance the discriminative characteristics of these data for distinguishing between legitimate profiles and those created by imposters, whether manually or with an LLM. Additionally, the dearth of a large publicly available LinkedIn dataset motivated us to collect 3600 LinkedIn profiles for our research. We release our dataset publicly for research purposes; to the best of our knowledge, it is the first large publicly available dataset for fake LinkedIn account detection. Within our paradigm, we assess static and contextualized word embeddings, including GloVe, Flair, BERT, and RoBERTa. We show that the suggested method can distinguish between legitimate and fake profiles with an accuracy of about 95% across all word embeddings. In addition, we show that SSTE has promising accuracy for identifying LLM-generated profiles, despite the fact that no LLM-generated profiles were employed during the training phase: it achieves an accuracy of approximately 90% when only 20 LLM-generated profiles are added to the training set. This is a significant finding, since the anticipated proliferation of LLMs makes it extremely challenging to design a single system that can identify profiles created with various LLMs.

The Interconnected Nature of Online Harm and Moderation: Investigating the Cross-Platform Spread of Harmful Content between YouTube and Twitter

The proliferation of harmful content shared online poses a threat to the integrity of online information and discussion across platforms. Despite the various moderation interventions adopted by social media platforms, researchers and policymakers are calling for holistic solutions. This study explores how a target platform could take advantage of content that has been deemed harmful on a source platform by investigating the behavior and characteristics of Twitter users responsible for sharing moderated YouTube videos. Using a large-scale dataset of 600M tweets related to the 2020 US election, we find that moderated YouTube videos are extensively shared on Twitter and that users who share these videos also endorse extreme and conspiratorial ideologies. A fraction of these users are eventually suspended by Twitter, but they do not appear to be involved in state-backed information operations. The findings of this study highlight the complex and interconnected nature of harmful cross-platform information diffusion, raising the need for cross-platform moderation strategies.

Positive by Design: The Next Big Challenge in Rethinking Media as Agents?

Hypertext and Web pioneers had high aspirations and expectations about the potential positive impact of technology. However, studies in the last decades have shown how widely adopted social and intelligent media are either amplifiers or a source of adverse detrimental effects on their users. On the one hand, we now have a better understanding of these negative phenomena and of strategies to identify and quantify their effects. On the other hand, as a community, we should take on the challenge of steering hypertext technologies toward positive applications. This position paper argues for a proactive role of the hypertext community in the design of agent media, the result of combining social media with intelligent algorithms. This silent paradigm shift has introduced third-party proactive agents into a wide range of human-to-human interactions. We are today at a point where social media and global web applications cannot operate without such systems, and they demand, in the author's opinion, a similarly proactive role from academia and scholars in understanding and driving their design toward positive goals. This position paper outlines why this challenge should be taken on, and how and why the Hypertext community could, in its own way, lead this vision forward.

SESSION: Reflections & Approaches: Reflections

Historiographies of Hypertext

Hypertext professionals have been writing the history of hypertext since Ted Nelson coined the term in the 1960s and claimed Vannevar Bush's Memex as a precursor to his Xanadu system. Despite the abundance of papers celebrating important figures and anniversaries in hypertext history, there has been less critical reflection on the methods for conducting this analysis. In this paper, I outline the dominant methods of writing histories of hypertext within the community. By tracing the overlaps and gaps within this literature, I argue for a greater focus on regular users of these technologies and for comparative analyses of hypertext in relation to broader trends. The paper concludes with a brief demonstration of how to apply this historical work through a case study of on-screen and hypertext reading.

Seven Hypertexts

What is Hypertext? It has been studied and explored for over 50 years but a complete definition seems ever more elusive. The term is invoked in multiple communities, and applied in radically different domains, but if we cannot reconcile the different perspectives then we will be unable to learn from our shared history, or from each other in the future. In this paper we argue that the longevity and variety of hypertext work makes a simple definition impractical. Instead we suggest different contexts in which hypertext work has been conducted, and then attempt to draw out the relationships and commonalities between them. We describe seven contexts drawn from the literature: Hypertext as a Tool for Thought, as Knowledge Representation, as Social Fabric, as Literature, as Games, as Infrastructure, and as Interface. We argue that these are connected by a common requirement for non-regularity, driven by post-structuralist philosophy, and enshrining existentialist values in our technology. It is the application of these ideas to different problems that gives rise to current Hypertext, as we see the same technical features, and engineering and creative challenges, manifest in otherwise quite different digital domains.

Interdisciplinary Teaching Toward the Next Generation Hypertext Researchers

Two years ago the idea of International Teaching and Research in Hypertext (INTR/HT) was introduced. This paper follows up on that idea and develops new thoughts on the topic, based on the experiences gained from three university courses taught under the umbrella of INTR/HT. We conclude with a model for the future of international and interdisciplinary education in hypertext that has the potential to raise the next generation of hypertext researchers through collaborative teaching and learning activities.

Scholarly Hypertext Revisited: Leveraging Multimodal Publication Formats for Creating Multiperspectivity and Transparent Data Interpretation in the (Digital) Humanities

In recent years, the digital and nondigital humanities have seen an increased need for scholarly hypertext, manifesting in innovative online publishing initiatives. This is mainly due to two reasons: first, the desire to conceptualize and convey multiperspectivity in research, and second, the aim to transparently communicate complex data-driven research. In this paper, I argue that multilinear and multimodal hypertext formats have been underestimated so far but, in fact, offer a more suitable solution for representing complex narratives and argumentations than hypertext networks. While theoretical underpinnings for scholarly hypertext have remained scarce, I substantiate my claim by drawing from non-dualist theory and multimodal research to fill this research gap. Multilinear and multimodal hypertext has the potential to lay open the architecture of our narratives and argumentation. It, therefore, provides an epistemic value and enables richer explorations in both digital humanities and classical humanities research.

PANEL SESSION: Reflections & Approaches: Panels

Hypertext as Method: Reflections on Hypertext as Design Logic

The proposed panel demonstrates how viewing hypertext as method and mode of inquiry (rather than simply technology) can foreground synergies between book history, textual studies, and computer science, and enhance the scope of research in the wider humanities community. Hypertext as method is explored through six interconnected papers, each showcasing a different interpretation or approach. The first discusses the role of hypertext as a pivot connecting the Humanities approach with the design of hypertext systems. The subsequent two papers discuss proto-hypertextual logic in specific historical instances before the final three demonstrate the explicit application of the hypertext method to contemporary book history challenges: webcomics, videogames, and interactive fiction. The aim is to demonstrate the potential of hypertext to energise collaboration among book historians, textual scholars, and hypertext scholars, who have often missed opportunities to collaborate with one another. The breadth of subjects covered by the panel showcases the potential of hypertext as method while providing possible avenues for hypertext as a community to build connections with other disciplinary areas.

WORKSHOP SESSION: Workshops & Tutorials: Workshops

Legal Information Retrieval meets Artificial Intelligence (LIRAI)

The Legal Information Retrieval meets Artificial Intelligence (LIRAI) workshop series aims to provide a venue for discussion of novel ideas, evaluations, and success stories concerning the application of Artificial Intelligence (AI) and Information Retrieval (IR) to the legal domain. All around the world, lawmakers, legal professionals, and citizens must cope with the sheer amount of legal knowledge present in legal documents. These documents can be norms, regulations, directives, legal cases, and other material relevant to legal practitioners, such as legal commentary. The continuous evolution of legal documents is a challenging setting, with implicit relationships playing an important role beyond explicit references. Recently, the adoption of shared machine-readable formats and FAIR principles, as well as methods and practices from the Semantic Web, has certainly improved the accessibility of legal knowledge and its interoperability. Still, retrieving legal knowledge and making sense of it are not solved problems, and the legal community often has special requirements for retrieval systems (e.g., high recall, explainability). AI is positioned as a lever to enhance our ability to find, understand, and correlate legal information, and to comprehend its relationship to reality in terms of compliance evaluation and risk/benefit analysis. We call for contributions on these topics in the form of papers, which will be collected in open-access proceedings published on CEUR-WS.org and thus indexed by Scopus, DBLP, Google Scholar, and other citation databases.

HUMAN'23: 6th Workshop on Human Factors in Hypertext

HUMAN 2023 is the 6th workshop in a series held at the ACM Hypertext conferences. The HUMAN workshop has a strong focus on the user and is thus complementary to the strong machine-analytics research direction seen at previous conferences.

The user-centric view on hypertext not only includes user interfaces and interaction, but also discussions about hypertext application domains as well as human-centered AI. Furthermore, the workshop raises the question of how original hypertext ideas (e.g., Doug Engelbart's "augmenting human intellect" [7] or Jeff Conklin's "hypertext as a computer-based medium for thinking and communication" [6]) can improve today's hypertext systems.

OASIS'23: 3rd International Workshop on Open Challenges in Online Social Networks

Online Social Networks (OSNs) have become part of everyday life for many people around the world. They are one of the main channels through which information can spread at lightning speed. People use them for the most disparate reasons: as sources of information in place of newspapers, to receive emotional or technical support, or to share their ideas and opinions to satisfy their need for sociality.

Since their introduction, people have questioned these services because they are affected by several problems. These problems include the preservation of users' privacy, fake news diffusion, the diffusion of illegal content, censorship vs. free speech, economic value redistribution, security vs. trust, and so on. The aim of this workshop is to help overcome these problems by setting up a platform where researchers can publish their contributions.

Contributions can address: innovative methods and algorithms for social graph mining, which can help develop more efficient information diffusion techniques; the problem of privacy and how it can be enforced in these systems, in particular the crucial relation between security, trust, and privacy in the OSN scenario; decentralisation and its impact on the implementation of social services; how privacy-preserving Artificial Intelligence techniques can be implemented; and technologies that enable the metaverse.

Web/Comics 2023: Webcomics and/as Hypertext

Web/Comics 2023 is the first in a new workshop series for the ACM Hypertext conference. The Web/Comics workshop focuses on the transformation of the comics medium enacted by hypertext through the emergence of webcomics, or "graphic sequential narratives that are created, published, and read online" [1].

The Web/Comics workshop brings together interdisciplinary perspectives from the humanities and technological communities to share work and discuss the latest research on webcomics from the perspectives of both communities. It aims to act as a bridge, increasing collaboration between the comics and hypertext research communities.

Researchers and practitioners working with webcomics or hypertext are invited to attend this workshop. Participants are asked to submit a short (between 2 and 4 pages) position paper on their current work. The planned event is a half-day hybrid workshop with sessions based around short presentations, with an emphasis on opportunities for dialogue and discussion in the final roundtable session.

NHT'23: Narrative and Hypertext 2023

NHT is a continuing workshop series that has been associated with the ACM Hypertext conference for over a decade. The workshop acts as a forum of discussion for the narrative systems community within the wider audience of the Hypertext conference. It features both presentations from authors of accepted short research papers and invited talks, providing a venue for important discussion of the issues facing, and the opportunities open to, the narrative and hypertext community. This year the workshop specifically targets the timely issue of "Mixed Reality Narrative Hypertext" while maintaining an open venue for wider relevant work.

TUTORIAL SESSION: Workshops & Tutorials: Tutorials

Design of Map-based Hypertext Systems

The tutorial focuses on the design of map-based hypertext systems. First, it introduces the concepts of spatial data types with reference to OpenStreetMap, basic notions of cartography such as scale and map themes, and an overview of the technologies behind digital maps. The tutorial will then provide practical guidelines about the connections between a) user interaction with maps, b) spatial data processing, and c) visualisation. Finally, it will walk participants through good and bad practices in the design of interactive maps.

From Trolling to Cyberbullying: Using Machine Learning and Network Analysis to Study Anti-Social Behavior on Social Media

The rise of social media and other web and mobile applications has transformed how people interact, but it has also created new challenges, such as anti-social behavior like trolling, cyberbullying, and hate speech. This behavior can have severe negative consequences for individuals and communities. This tutorial is intended for researchers and practitioners interested in computational social science and provides an overview of how to use machine learning and social network analysis techniques to detect and examine anti-social behavior in online discourse.