WebSci '21 Companion: Companion Publication of the 13th ACM Web Science Conference 2021

SESSION: 1.1 Workshop on AI and Inclusion (AAI)

Introduction to Workshop: Overcoming accessibility gaps on the Social Web

Mike Wald
Chaohai Ding
EA Draffan

The WebSci’21 workshop on AI and Inclusion invites academic and industry experts in digital accessibility and AI to present research and their individual views on the digital accessibility issues they perceive as causing barriers to access, alongside possible solutions and strategies powered by AI technologies to provide an inclusive social web.

Nipping Inaccessibility in the Bud: Opportunities and Challenges of Accessible Media Content Authoring

Carlos Duarte
Letícia Seixas Pereira
André Santos
João Vicente
André Rodrigues
João Guerreiro
José Coelho
Tiago Guerreiro

Social media represents a large part of the content available on the Web. While the accessibility of the UIs of existing social media platforms has been improving, the same cannot be said about the accessibility of the content authored by their users, specifically the multimedia content that is increasingly available given the ease of access to mobile devices with cameras. User research has revealed that accessible authoring practices are a foreign concept to most social media users, but also that they are motivated to adopt inclusive practices. Our work focuses on promoting awareness of accessible social media authoring practices and on assisting the authoring process. We have prototyped a Google Chrome extension and an Android application that can identify when a Twitter or Facebook user is authoring content with images and suggest a text alternative for the image. By suggesting the alternative, we raise awareness of the accessible authoring process and make it easier for the user to include it in the tweet or post. Text alternatives may be suggested from different sources: descriptions entered by other users for the same image, analysis of the main concepts present in the image, or text present in the image, for instance. Our prototypes can also provide text alternatives on demand for images on any web page or Android application, not just social media. In this paper, we highlight some of the challenges of offering this support on different technological platforms (web and mobile), as well as challenges that are raised by the domain characteristics (e.g. detecting the same image, supporting different languages) and that can be addressed through AI-based technologies.

British Sign Language (BSL) User's Gaze Patterns Between Hands and Face During Online Communication

Nez Parr
Biao Zeng

British Sign Language (BSL) uses various visual cues from hands, mouth, and facial expressions to convey information and communicate. During the lockdown, deaf people relied more on online BSL communication. This brings a challenge for most deaf people and calls for social inclusion in the cyberworld. This study used a free online eye-tracker app and investigated how deaf people perceive BSL on the internet. A free view task was employed to explore gaze patterns when mouth and hand information was matched or unmatched. The study found that 77.34% of gaze duration focused on the face, while the mouth accounted for 38.38% of the whole duration. In addition, results suggested that the mouth might play a primary role in conveying information when hand and mouth cues are incongruent.

SESSION: 1.2 Workshop on Data Literacy

2nd Data Literacy Workshop

Manuel Leon-Urrutia
Johanna Catherine Walker

The 2nd Data Literacy Workshop took place within the ACM Web Science Conference in 2021. Data literacy is an increasingly relevant concept within the area of Web Science, where transactions with data have a prominent role in the interplay of the web and society. This interdisciplinary workshop combines presentations of the papers selected after a peer review process, round tables and an interactive tutorial. The workshop promotes the cross-pollination of knowledge and experiences around Data Literacy, accrued by different communities of practice.

Teaching Data Journalism in a World of Tool and Tech Overload

Rahul Bhargava
Catherine D'Ignazio

The field of journalism is undergoing systemic disruption, with the subarea of data journalism transforming rapidly due to the introduction of new tools and techniques as well as the changes in reporting practices as journalists and newsrooms experiment and innovate. This paper explores the challenges for data journalism educators to teach in such a rapidly shifting landscape. Drawing from our experiences teaching journalism students in higher education, we assert that the goal of data journalism education amidst this complexity is not to teach tech, nor even to teach technical skills, but rather to model for students strategies of dealing with transformation and complexity. These include peer learning, hands-on learning activities, modeling learning and information seeking, and establishing a culture of critique. We introduce a number of activities that put those approaches into practice, drawing on learning literature to support our fellow educators shifting from the “banking model” of education [10] to a learner-centered model [23]. Working with students to co-create knowledge, acting as a “Guide on the Side” [15], can help better prepare students for the constantly evolving ecosystem of technologies and tools that support data journalism.

DALIDA: Data Literacy Discussion Workshops for Adults

Christophe Debruyne
Anne Kearns
Ciaran O'Neill
Mary Colclough
Laura Grehan
Declan O'Sullivan

Data literacy is the ability to identify, collect, process, and interpret data to gain and communicate insights. It relies on many disciplines ranging from maths and statistics to language and arts. Its importance is recognized for industry and business, but learning opportunities outside of tertiary education are lacking. In an increasingly data-driven society, the importance of data literacy is not limited to a professional context. Data literacy involves critical thinking and is crucial to becoming responsible, involved, informed, and contributing members of society. Unfortunately, we believe that socially, economically, or educationally disadvantaged groups do not have access to resources that help them gain (at least) an awareness of the topic. DALIDA is a public engagement project that aims to design workshops about this topic for that particular audience. Recognizing that data literacy is a very complex subject matter, we avail of co-creation activities to ensure that the workshops are attractive and engaging. This paper presents DALIDA, which commenced in 2021, and reports on our approach and current progress.

Defining Data Literacy Communities by Their Objectives: A Text Mining Analysis

Ahmed Mohamed Fahmy Yousef
Johanna Catherine Walker
Manuel Leon-Urrutia

Data literacy is a multidimensional concept that attracts the attention of a variety of communities of practice, from different angles. The authors grouped these communities of practice in three categories: education, fields and professions, and citizenship. The meaning of data literacy varies depending on who uses it, and its concept is often conveyed in terms other than data literacy. This paper addresses the problematization of data literacy as a term by examining academic literature around it. To this end, a desk study was carried out to gather sources where the term is used. After an extensive search in the main academic databases and a subsequent PRISMA selection process, automated content analysis was applied to the gathered sources. The findings suggest that the concept of data literacy has a different treatment in different communities of practice. For example, librarians and citizen scientists have a different understanding of the concept of data literacy.

AI in My Life: AI, Ethics & Privacy Workshops for 15-16-Year-Olds

Malika Bendechache
Irina Tal
Pj Wall
Laura Grehan
Emma Clarke
Aidan Odriscoll
Laurence Van Der Haegen
Brenda Leong
Anne Kearns
Rob Brennan

The ‘AI in My Life’ project will engage 500 Dublin teenagers from disadvantaged backgrounds in a 15-week (20-hour) co-created, interactive workshop series encouraging them to reflect on their experiences in a world shaped by Artificial Intelligence (AI), personal data processing and digital transformation. Students will be empowered to evaluate the ethical and privacy implications of AI in their lives and to protect their digital privacy, and the series aims to raise their awareness of STEM careers and university. It extends the ‘DCU TY’ programme for innovative educational opportunities for Transition Year students from underrepresented communities in higher education.

Privacy and cybersecurity researchers and public engagement professionals from the SFI Centres ADAPT and Lero will join experts from the Future of Privacy Forum and the INTEGRITY H2020 project to deliver the programme to the DCU Access 22-school network. DCU Access has a mission of creating equality of access to third-level education for students from groups currently underrepresented in higher education. Each partner brings proven training activities in AI, ethics and privacy. A novel blending of material into a youth-driven narrative will be the subject of initial co-creation workshops and supported by pilot material delivery by undergraduate DCU Student Ambassadors. Train-the-trainer workshops and a toolkit for teachers will enable delivery. The material will use a blended approach (in person and online) for delivery during COVID-19. It will also enable wider use of the material developed. An external study of programme effectiveness will report on participants’ enhanced understanding of AI and its impact, improved data literacy skills in terms of their understanding of data privacy and security, empowerment to protect privacy, growth in confidence in participating in public discourse about STEM, and increased propensity to consider STEM subjects at all levels, as well as on the greater capacity of teachers to facilitate STEM interventions. This paper introduces the project, presents more details about the co-creation workshops, which are a particular step in the proposed methodology, and reports some preliminary results.

Engaging Students in Data Literacy: Lessons Learned from Data Intensive Classrooms

Thilanka Munasinghe
Amy Svirsky

This paper offers one approach to teaching relevant data literacy skills, knowledge, and attitudes using authentic assignments and assessments that provide students with motivational, real-world data applications through which to demonstrate their data literacy skills. Students are expected to demonstrate data literacy skills through: demonstrating knowledge of relevant analytic methods, recognizing and applying quantitative algorithms and techniques, and interpreting results; demonstrating strategic thinking skills, combined with a solid technical foundation in data and model driven decision making; developing the ability to apply critical and analytical methods to formulate and solve science, engineering, medical, and business problems; and effectively communicating analytic findings to non-specialists. We discuss some of the pilot projects that we conducted inside data-intensive courses such as Data Analytics and Data Science and share insights from the lessons learned to provide ideal pedagogical environments to make our students data literate.

SESSION: 1.3 Workshop on Facilitating health and social care transformation through trustworthy and collaborative data sharing

HSCT 2021 - Joined Up Data Equals Better Care: Facilitating Health and Social Care Transformation through Trustworthy and Collaborative Data Sharing: Welcome and Workshop Summary

Michael Boniface
Wendy Hall
Sophie Stalla-Bourdillon
Brian Pickering
Steve Taylor
Laura Carmichael
Jack Hardinges

To realise the benefits of health and social care transformation for communities and individuals, we need strong multi-disciplinary and cross-organisational understanding of how data sharing initiatives – both existing and emerging (e.g., data collaboratives, data foundations, data trusts) – support multi-party sharing of regulated data in ways that are socially acceptable, trustworthy, sustainable, and scalable. The workshop will provide a forum for discussion by bringing together a multi-disciplinary group of researchers and practitioners with a wide range of specialisms – e.g., health care and social care practice, cyber-security, data governance, (health) data science, social science, ethics, law, public health planning and policy, technology and innovation – to explore the state of the art, challenges, and future research directions for trustworthy and collaborative sharing of regulated data, as well as the insights generated from these activities.

Digital transformations in Domestic Abuse support: implications for data sharing

Rebecca Taylor
Bea Gardner
Mark Weal

We report here on an empirical scoping study focused on the digital transformation that took place in domestic abuse (DA) and Violence Against Women and Girls (VAWG) multiagency support as a result of the restrictions imposed by the pandemic. Interviews were conducted with third sector providers, local authority stakeholders, and other agencies in two geographical areas. Our emerging findings offer insights into data sharing and linkage from different stakeholder perspectives. We argue that improving case management and service management data sharing processes would have the greatest immediate impact on effective service delivery and collaborative practice, and would support data linkage in the future.

SESSION: 1.4 Workshop on Research Infrastructure for Web Science

Research Infrastructure for Web Science

David De Roure
Pip Willcox

Web Science researchers use a rich set of data sources, software tools, and computational infrastructure in many aspects of their work, often creating new methods and tools. The Research Infrastructure for Web Science (RI4WebSci) workshop is a forum to share experience, practice and innovations in all these aspects of Web Science, and to identify requirements for future Web Science infrastructure.

Sounding out the System: Multidisciplinary Web Science Platforms for Creative Sonification

Iain Emsley
Alan Chamberlain

In this paper, we present our initial findings in using digital methods to consider the way that different devices can connect to the same object. We take a more experimental view of the ways in which network data might be used in compositions, to help us move beyond traditional sonification techniques into more musical territories. This enables us to start to understand the ways in which archival data and tools might be used as a creative response to the data and provide a more human way of engaging with the data archive. Such approaches can inform the ways in which future research platforms for Web Science can be developed in a truly multidisciplinary way that matches the needs of the wider research community and supports public engagement.

Petri Nets for Modelling Communal Flocking Along Paths of Possible Experience

Robert Walton
David De Roure

Web science researchers identify social structures that form as content is shared on social media. These structures channel the flow of content and the resulting experiences of ideas, feelings and actions. The human drive to seek shared experiences shapes structures that can tend toward homogeneity of interest, beliefs and even moment-by-moment experience. Improved modelling tools are needed to understand the ecosystems formed by these processes.

We propose a generative model in which experiences build on each other, with past experiences filtering which content rises into awareness and how it is experienced. We then ask what simple social rules would cause individuals to flock together along the possible paths of experience supported by this model.

We created a method for building Petri net models to explore these principles and to act as the foundation for future work. In the limit as the number of individuals is increased, the dynamics of this flocking behaviour become governed by differential equations; these form a solid foundation for the development of further theory. Software to recreate these results and to create and play with models is available.
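
For readers unfamiliar with Petri nets, the following is a minimal illustrative sketch of how a place/transition net can be simulated. It is not the authors' software or model: the place names (`unseen`, `A`, `B`, `AB`), the `Transition` class, and the uniform random firing rule are all assumptions made for this example. Each token stands for one individual, and each transition moves an individual one step along a possible path of experience.

```python
import random
from dataclasses import dataclass


@dataclass
class Transition:
    """A Petri net transition (hypothetical structure for illustration)."""
    name: str
    inputs: dict   # place name -> number of tokens consumed
    outputs: dict  # place name -> number of tokens produced


def enabled(marking, t):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(p, 0) >= n for p, n in t.inputs.items())


def fire(marking, t):
    """Return the new marking after firing t; the old marking is not modified."""
    if not enabled(marking, t):
        raise ValueError(f"transition {t.name!r} is not enabled")
    new = dict(marking)
    for p, n in t.inputs.items():
        new[p] = new.get(p, 0) - n
    for p, n in t.outputs.items():
        new[p] = new.get(p, 0) + n
    return new


def simulate(marking, transitions, steps, rng):
    """Fire uniformly chosen enabled transitions for at most `steps` steps."""
    for _ in range(steps):
        choices = [t for t in transitions if enabled(marking, t)]
        if not choices:
            break  # no transition can fire: the net is dead
        marking = fire(marking, rng.choice(choices))
    return marking


# Assumed toy net: experience "AB" is reachable only via "A" or via "B".
TRANSITIONS = [
    Transition("see_A", {"unseen": 1}, {"A": 1}),
    Transition("see_B", {"unseen": 1}, {"B": 1}),
    Transition("A_then_B", {"A": 1}, {"AB": 1}),
    Transition("B_then_A", {"B": 1}, {"AB": 1}),
]

if __name__ == "__main__":
    print(simulate({"unseen": 5}, TRANSITIONS, 100, random.Random(0)))
```

In this toy net every firing consumes and produces exactly one token, so the number of individuals is conserved. Social rules such as flocking would have to be added as extra firing conditions or rates, which this sketch does not attempt.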

SESSION: 1.5 Workshop on Web and Philosophy

Web and Philosophy: A Decade Retrospective

Harry Halpin
Alexandre Monnin

This retrospective of the Web and Philosophy (PhiloWeb) symposia traces the evolution of the philosophy of the web over a decade, from its origins at La Sorbonne through the Googleplex and beyond. The papers in the proceedings, as well as the invited talks given in the online retrospective (PhiloWeb 2021), are outlined. A call to arms to put the “philosophy” back into the “philosophical engineering” of the web is shown to be necessary in order to redeem the revolutionary horizons opened by the web.

Noospheric consciousness: integrating neural models of consciousness and of the web

Shima Beigi
Francis Heylighen

The world-wide web has been conceptualized as a global brain for humanity due to its neural network-like organization. To determine whether this global brain could exhibit features associated with consciousness, we review three neuroscientific theories of consciousness: information integration, adaptive resonance and global workspace. These theories propose that conscious states are characterized by a globally circulating, resonant pattern of activity that is sufficiently coherent to be examined and reflected upon. We then propose a correspondence between this notion and Teilhard de Chardin's concept of the noosphere as a forum for collective thinking, and explore some implications of this self-organizing dynamics for the evolution of shared, global understanding.

On post-truth and correctness over the Web

Petros Stefaneas

We propose the extension of previous work done on “the Web as a tool for proving” [2] to suggest a basis for a meta-theory for post-truth. We believe that such a theory cannot be based on the current philosophical theories about truth such as the correspondence theory, the coherence theory, the pragmatic theory or the consensus theory [8]. Neither can it be based on the constructivist approach that truths in general are social constructions. “Post-truths” are often considered in the literature as real existing entities [7]; the only aspect that takes place in the real world is the process of assigning truth-values to assertions. Our view is that the truth-assignment processes involve at least two agents – the claim initiator and the truth interpreter – which can be persons, groups of persons or even machines. The same kind of approach applies to Web-based proving, which occurs at a particular time and place and involves particular people, some of whom can act as administrators. Underlying the truth-assignment processes and the proving processes over the Web, we have some kind of social event. The nature of the Web allows the participation of people who do not necessarily have particular skills as members of a certain group. Web proving has been studied [2] as a particular type of Goguen's proof events [3][4]. Proof events are social events that involve particular persons who form social groups of experts with particular knowledge and skills. These groups are open and have no internal hierarchical structure, but they usually have at least one administrator who acts as an overseer of the correctness of the proving processes. In any case, a proof event, apart from involving specialists, involves mediating objects such as spoken words, data, videos, scientific papers, etc. Web-based proof-events have a social component, communication medium, prover-interpreter interaction, interpretation process, understanding and validation, historical component, and styles.

By truth-interpretation we understand the determination of the definition or meaning of the signs that are fixed by the language or semiotic code of communication used for the claim initiation or for what is thought of as truth-assignment. Interpretation is an active process of interactive nature, as allowed by the open Web architecture [1][2].

The concept of post-truth has received a lot of publicity in the last few years [7]. It refers to situations in which claims can be accepted on the basis of beliefs or emotions and not of real facts [7] [8] [9]. As a concept, it originates from the study of misinformation and in particular fake news [6] [10]. However, its formal epistemological definition is an open question. In our view, we need to re-approach the process of truth-assignment within Web technology in order to develop a proper theory of post-truth based on agents within a social environment. Many social agents can contribute towards this goal. Still, two of them play the most crucial part in this direction: the claim initiator and the truth-interpreter, both seen as integral parts of the social approach to truth-assignment processes. This theory needs to be accompanied by the necessary socio-theoretical approach regarding the concepts of correctness and proving. Mathematical logic uses mainly mechanical methods, such as rules of inference and validation, and in a sense considers that most of the truth-related components of a logical system can be constructed as finite sequences of such rules. This approach leaves out the social dimension of the truth-assignment process as well as the social dimension of proofs, including their histories, attempts to arrive at a true conclusion, motivations, misleading interpretations, etc. Truth-assignments have much greater diversity than most logicians would easily accept.

Any post-truth statement depends heavily on the processes used for the truth assignment. Depending on which processes are used, the output differs. To conclude that “an assertion is true”, a broader notion of correctness is desirable. In the literature, correctness refers to whether or not an argument or proof follows a logical path from premises to conclusions [12]. We suggest that, apart from this logical correctness, we need a rule-based and a morally-based conception of correctness to encapsulate the social aspects accurately. Correctness of an action over the Web needs to be redefined as an action that is not only logically acceptable but also complies with these kinds of social norms [11]. The open architecture of the Web facilitates certain social behaviors and prevents others; thus it provides novel features of the truth assignment process far beyond the traditional approach to argumentation within a natural language.

Bigger Isn’t Better: The Ethical and Scientific Vices of Extra-Large Datasets in Language Models

Trystan S. Goetze
Darren Abramson

The use of language models in Web applications and other areas of computing and business has grown significantly over the last five years. One reason for this growth is the improvement in performance of language models on a number of benchmarks, but a side effect of these advances has been the adoption of a “bigger is always better” paradigm when it comes to the size of training, testing, and challenge datasets. Drawing on previous criticisms of this paradigm as applied to large training datasets crawled from pre-existing text on the Web, we extend the critique to challenge datasets custom-created by crowdworkers. We present several sets of criticisms, where ethical and scientific issues in language model research reinforce each other: labour injustices in crowdwork, dataset quality and inscrutability, inequities in the research community, and centralized corporate control of the technology. We also present a new type of tool for researchers to use in examining large datasets when evaluating them for quality.

Extended Computation: Wide Computationalism in Reverse

Paul Smart
Wendy Hall
Michael Boniface

Arguments for extended cognition and the extended mind are typically directed at human-centred forms of cognitive extension—forms of cognitive extension in which the cognitive/mental states and processes of a given human individual are subject to a form of extended or wide realization. The same is true of debates and discussions pertaining to the possibility of Web-extended minds and Internet-based forms of cognitive extension. In this case, the focus of attention concerns the extent to which the informational and technological elements of the online environment form part of the machinery of the (individual) human mind. In this paper, we direct attention to a somewhat different form of cognitive extension. In particular, we suggest that the Web allows human individuals to be incorporated into the computational/cognitive routines of online systems. These forms of computational/cognitive extension highlight the potential of the Web and Internet to support bidirectional forms of computational/cognitive incorporation. The analysis of such bidirectional forms of incorporation broadens the scope of philosophical debates in this area, with potentially important implications for our understanding of the foundational notions of extended cognition and the extended mind.

An ‘Aristotelian’ Philosophy of the Internet

Laszlo Ropolyi

The paper argues for the necessity of building up a philosophy of the internet and proposes a version of it, an ‘Aristotelian’ philosophy of the internet. First, a short overview of some recent trends in internet research is presented. This train of thought leads to a proposal to understand the nature of the internet in the spirit of Aristotelian philosophy, i.e., to conceive “the internet as the internet”, as a totality of all its aspects, as a whole entity. For this purpose, the internet is explained in four – easily distinguishable, but obviously connected – contexts: we regard it as a system of technology, as an element of communication, as a cultural medium and as an independent organism. Based on these investigations we conclude that the internet is the medium of a new mode of human existence created by late modern man; a mode that is built on earlier (i.e., natural and social) spheres of existence and yet is markedly different from them. We call this newly formed existence web-life.

SESSION: 1.6 Workshop on Web Science for Digital Capabilities

Web Science for Digital Empowerment: WSDC 2021

Bidisha Chaudhuri
Srinath Srinivasa

Digital Capabilities is an emerging term that relates the economic notion of capability with ICT, especially Web and Internet technologies as an enabler. This workshop aims to bring together a disparate and eclectic population of researchers, practitioners and policy-makers to help build this emerging discipline.

Assisted Telemedicine Model for Rural Healthcare Ecosystem

Divya Raj
Srikanth T K

This project involved a study and field trials to analyze and validate the relevance and feasibility of an “Assisted Telemedicine” model for addressing the accessibility gaps in the rural primary healthcare ecosystem. The work also involved designing, in a participatory design model, a blueprint of an Assisted Telemedicine app to cater to healthcare consultation needs during and beyond Covid-19. A customized app was created for the “assisted telemedicine” model, and features were incrementally added based on observations and inputs received from various stakeholders. Initial studies indicate that this model of health care delivery can benefit a range of demographics and can find acceptance among the different stakeholders. The potential impact of this intervention is also studied from the perspective of the Capability Approach.

AIoT: AI meets IoT and Web in Smart Healthcare

Asoke Talukder
Roland Haas

Extending quality healthcare to all communities is an important goal for sustainable societies and economies. Even in the most sophisticated healthcare systems, doctors tend to spend less time and are under increasing workload and time pressure. Telemedicine, AI, Web technologies, and Big Data can play an important role in increasing efficiency in diagnosis and improving the quality of care. Telemedicine over the Web is becoming increasingly popular, and the COVID-19 pandemic has accelerated this trend. In this paper we present a novel smartphone-based care solution that uses progressive web apps (PWA) to capture patient data, integrates this data with a diverse set of medical knowledge sources and deploys AI to support differential diagnosis and patient stratification. The system can suggest action and treatment plans and has been designed with special consideration for cyber security. The smart care system is based on next-generation web technologies like PWA, WebBluetooth, Web Speech API, WebUSB, and WebRTC and integrates well into the concept of a smart hospital.

Towards Evaluating Students’ Digital Capabilities: An Analysis of UK Further Education Student Surveys

Tim O'Riordan
Daniel Dennis

The ability of students to access learning via technology is a key factor in sustainable development goals. During the COVID-19 emergency most students’ educational experience moved from face-to-face physical classrooms to web-based environments, which exposed disparities in students’ digital resources and competence and placed greater attention on the need to address these inequalities. Digital competence is typically measured in terms of an individual's ability to use digital technology to achieve their work, study or personal objectives. Being digitally competent is significant with regard to an individual being able to achieve things that they value, but is only part of an overall evaluation of their digital capability. This paper argues that in addition to competence, assessment of digital capability should include an evaluation of a person's access to technology as well as their attitudes towards its value in achieving their goals.

This paper is a work in progress exploring findings derived from research evaluating strategies to improve staff capability and confidence in using online learning technologies at eight FE colleges in the south east of England. In this research students undertook surveys that included self-assessment of their digital competences following the DigComp model, information on their use of digital devices and home network reliability, and evaluated their enjoyment of and confidence in using online learning technologies. This current paper explores the outcomes from these surveys.

Evaluation of survey data revealed a significant digital divide between those who had access to suitable devices and reliable network connections and those who did not. Results show significant associations between students’ access to the technology they need to take part in online lessons, their self-assessed competence, and their capability to fully engage with, and their satisfaction with, online learning. This paper suggests that these factors should be considered as part of a ‘digital capabilities index’ when undertaking evaluations of individual student needs and identifying potential ‘at risk’ students.

Leveraging technology to improve quality of mental health care in Karnataka

Srikanth T K
Girish N Rao
Rajani Parthasarathy
Divya Raj
Suresh Bada Math
Seema Mehrotra
Jagadisha Tirthahalli
Naveen C Kumar
Paulomi Sudhir
Deepak Jayarajan

Health is multidimensional and “there is no health without mental health”. We document ongoing technology-enabled initiatives for enhancing mental health care in the state of Karnataka. Multi-disciplinary teams have collaborated on a set of four projects to design and deploy digital technologies across different parts of the healthcare continuum, addressing beneficiaries (patients and care-givers), different levels of mental health care providers (doctors, social workers, etc.) and health administrators alike. The vision is to define and develop a digital platform for mental health care and services within the state of Karnataka that is scalable across India.

Event Detection in Twitter using Social Synchrony and Average Number of Common Friends

Nirmal Kumar Sivaraman
Jaswant Reddy Tokala
Radha Sai Ch V Rupesh
Sakthi Balan Muthiah

Detecting events from social media data is an important problem. In this paper, we propose a method to detect events by looking at a novel parameter: the average number of common friends of all the pairs of users present in the dataset. This is a trait of herding. We analyze only the metadata for this and not the content of the tweets. We evaluate our method on a dataset of 3.3 million tweets that we collected. Our method outperformed the state-of-the-art method in recall and F1 score. To test the generality of our method, we also applied it to a publicly available dataset of 1.28 million tweets, and the results are encouraging.
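
The parameter named in this abstract, the average number of common friends over all pairs of users, can be illustrated with a minimal Python sketch. The abstract does not say how friend lists are obtained, how tweets are windowed in time, or how the value is turned into an event decision, so the function name `average_common_friends`, the dictionary input format, and the toy data are assumptions made only for illustration, not the authors' implementation.

```python
from itertools import combinations


def average_common_friends(friends):
    """Average number of shared friends over all unordered pairs of users.

    `friends` maps each user id to the set of that user's friend ids.
    This input format is an assumption; the abstract does not specify one.
    """
    pairs = list(combinations(sorted(friends), 2))
    if not pairs:
        # Fewer than two users: no pairs to average over.
        return 0.0
    total = sum(len(friends[u] & friends[v]) for u, v in pairs)
    return total / len(pairs)


# Hypothetical toy data: three users who tweeted in the same time window.
example = {
    "a": {"x", "y", "z"},
    "b": {"y", "z"},
    "c": {"z", "w"},
}
value = average_common_friends(example)
```

Here the pairs (a, b), (a, c) and (b, c) share 2, 1 and 1 friends respectively, so the average is 4/3. Enumerating all pairs is quadratic in the number of users, so a dataset of millions of tweets would likely need sampling or a different counting scheme.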

Diabetes Tracker: An Information System to assist and track nutritional information

Vipula Rawte
Hongyi Huang
Michael Morrison
Janine Wu
Travis Peterson
Thilanka Munasinghe

Chronic diseases such as diabetes have become a genuine health concern in people’s lives. Millions of people are already diagnosed with diabetes, and this number is expected to keep growing, along with the price of any kind of aid that can be provided for the disease. This makes it all the more important to be able to offer assistance to those suffering from it, regardless of economic status. This paper discusses a prototype information system being built to provide remote healthcare via informative assistance to anyone with diabetes. To reach as wide an audience as possible, this system will be implemented on both web and mobile application-based platforms. We intend to extend this information system so that it can provide self-health management advice for users depending on information related to them, their condition, and the information they enter, such as their diet intake and current body condition. At this stage, we are focusing on making an application that is easy to use and understand and that does not include complicated warning systems and notifications. In the future, we intend to add carefully evaluated functionalities that are user-friendly for general users who are not tech-savvy. Thus, our objective for this work is two-fold. Firstly, we plan to implement successful web and mobile-based applications that are easy and helpful to use, providing relief to those who want and need it. Secondly, we hope to advance our research, assist others in the research community, and inspire others trying to solve the same problem.

SceVar (Scenario Variations) Database: Real World Statistics driven Scenario Variations for AV Testing in Simulation: Abstraction of static and dynamic entities from road network-traffic ecosystem and their interactions or relationships in semantic data models for realistic simulation-based testing of AVs

Sagar Pathrudkar
Saikat Mukherjee
Vijaya Sarathi
Manish Chowdhary

Autonomous vehicles (AVs) are on a journey from incubation to widespread usage. A major challenge for self-driving cars in reaching widespread usage is providing guarantees about their safety and reliability. One approach to increasing safety and reliability is testing in simulation. However, simulation-based testing can be beneficial only if the simulations mimic real-world phenomena. Additionally, AV environments are characterized by high dimensionality, nonlinearity, stochasticity, and non-stationarity, so it is difficult in practice to exhaustively list and test all possible scenario variations. While this is a typical state space explosion problem, it is essential to remove unrealistic scenarios from the test space to reach a more realistic and plausible list of scenario variations. In this work, we present SceVar, a scenario variations database that analyzes real-world driving data to extract realistic traffic patterns and driving behaviors, which are then used to create a massive number of scenario variations for (regression) testing in simulation. We also envisage that such statistical data can be used by AV regulatory and testing agencies to certify vehicles for usage in specific operational design domains (ODDs).

SESSION: 1.7 Workshop on Democratic Futures and the Web

Introduction to WebSci’21 Workshop ‘Democratic Futures and the Web’

Silke Roth
Valentina Cardo
Matt Ryan

SESSION: 1.8 Tutorials

A new system for evaluating brand importance: A use case from the fashion industry

Andrea Fronzetti Colladon
Francesca Grippa
Ludovica Segneri

Today, brand managers and marketing specialists can leverage huge amounts of data to reveal patterns and trends in consumer perceptions, monitoring positive or negative associations of brands with respect to desired topics. In this study, we apply the Semantic Brand Score (SBS) indicator to assess brand importance in the fashion industry. To this end, we measure and visualize text data using the SBS Business Intelligence App (SBS BI), which relies on methods and tools of text mining and social network analysis. We collected and analyzed about 206,000 tweets that mentioned the fashion brands Fendi, Gucci and Prada during the period from March 5 to March 12, 2021. From the analysis of the three SBS dimensions (prevalence, diversity and connectivity), we found that Gucci dominated the discourse, with high values of SBS. We use this case study as an example to present a new system for evaluating brand importance and image through the analysis of (big) textual data.

Interactive Demonstrations and Hands-On Use of the net.science Cyberinfrastructure for Network Science Chairs’ Welcome and Tutorial Summary

Golda Barrow
Chris J. Kuhlman
Dustin Machi
S. S. Ravi

Networks are readily identifiable in many aspects of society: cellular telephone networks and social networks are two common examples. Networks are studied within many academic disciplines. Consequently, a large body of (open-source) software is being produced to perform computations on networks. A cyberinfrastructure for network science, called net.science, is being built to provide a computational platform and resource for both producers and consumers of networks and software tools. This tutorial is a hands-on demonstration of some of net.science’s features.

The decentralization of Social Media through the blockchain technology

Barbara Guidi
Andrea Michienzi

Online Social Networks (OSNs) have become one of the most popular applications in the daily lives of users worldwide. Today, there are more than 4 billion Social Media users, and this number increases year after year, with a high impact on privacy. In recent years, the decentralization of social services has been considered a big opportunity to overcome the main privacy issues in OSNs, as well as other problems (fake news, censorship, etc.). Blockchain technology is today the most well-known decentralized technique, and it has been taken into account in developing the new generation of decentralized social platforms. Blockchain-based Online Social Media (BOSMs) are decentralized Social Media platforms that use blockchain as the underlying technology or as a tool to provide rewarding strategies. In this tutorial, we will highlight the BOSMs scenario by presenting their main characteristics and how data could be collected and analysed.

SESSION: 1.9 PhD Symposium

Introduction to Ph.D. Symposium

Leslie Alan Carr
Jisun An

Factors That May Prevent Meaningful Digital Legacy

Duncan Reid

We increasingly live our lives digitally and capture memories on devices such as smartphones. Over time, we collect an array of digital assets, especially photographs, that we may wish to leave for others as a digital legacy. In this PhD project, a mixed method approach was used to compare survey data with semi-structured interviews to see how technological advances may have aided photo management practices, and to what extent other types of digital assets might better represent the end user in terms of legacy. Based on the findings, it would appear that digital assets support various personas during life and should continue to do so after death. Additionally, possible design implications emerged that might aid more meaningful digital legacy curation.

Comparing two sentiment analysis approaches by understand the hesitancy to COVID-19 vaccine based on Twitter data in two cultures

Malak Alsabban

Data has become a precious resource, and Twitter is one of the most important sources, especially during the pandemic. In this study, I aim to compare two sentiment analysis approaches (lexicon-based and machine learning) using tweets related to the COVID-19 vaccine in two different cultures (English and Arabic).
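
To make the lexicon-based side of this comparison concrete, here is a minimal Python sketch of the general technique: count the words of a tweet that appear in a positive or a negative word list and label the tweet by the sign of the difference. The word lists, the function name `lexicon_sentiment`, and the tokenization rule are all hypothetical and English-only; the abstract does not name the lexicon or tooling actually used in the study.

```python
import re

# Hypothetical toy lexicons, invented for illustration only.
POSITIVE = {"safe", "effective", "grateful", "protected"}
NEGATIVE = {"afraid", "unsafe", "worried", "dangerous"}


def lexicon_sentiment(text):
    """Label text as positive, negative or neutral by counting lexicon hits."""
    # Lowercase and keep simple alphabetic tokens (an assumed tokenization).
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


label = lexicon_sentiment("The vaccine is safe and effective")
```

A machine learning approach would instead learn the mapping from text to label from annotated tweets, which is the contrast this study sets out to examine.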

Exo-SIR: An Epidemiological Model to Quantify the Exogenous Information Diffusion and its Application to Detect Events

Nirmal Sivaraman

Online social media has changed the way we function as a society. People talk about topics ranging from birthday wishes to political issues. We study how information diffuses through online social networks. We propose a model called Exo-SIR that quantifies the extent to which the diffusion is powered by external sources of information. We also propose that this model could be used for detecting events by analyzing patterns of user behavior.

Exploring the Potential of Knowledge Graphs to Support Distant Knowledge Search for Innovation

Dawa Chang

This research aims to investigate how and under what conditions Knowledge Graphs (KGs) can support ideation tasks in the innovation process in new product and service development. Overcoming humans’ cognitive limitations of creativity and enhancing their abilities to search and acquire “distant knowledge”, i.e., knowledge that exists outside individuals’ immediate technological or organizational boundaries, have been long-term topics in innovation management, motivated by their significance for more substantial innovation which shall guarantee organizations’ sustainability and long-term success. Consequently, many studies have been conducted to develop methods to tackle cognitive limitations and represent relevant knowledge more effectively to individuals. Research into the potential of KGs to support this process, however, has been limited, despite their abilities to represent and structure knowledge. Our research seeks to investigate the potential of KGs to support innovators through efficient and effective exploration of distant knowledge.

Algorithmic ecologies of justice: Using computational social science methods to co-design tools of resistance, resilience and care with communities

Yadira Sanchez Benitez

This research aims to study the interlinks between social and technical harms caused by the powerful in Mexico, using social media research and interviews to document the social and technical harms affecting rural communities in Mexico. The purpose is to co-design, with these communities, bespoke algorithmic ecologies and tools of justice that allow for accountability, resistance and care.

Researching the Impact of Music Streaming on Social and Personal Listening Behaviours

Allison Noble

Throughout the years, the act of music listening has followed a trajectory of digital evolution, with the development of each new listening device altering social, industrial and musical practices. In this current era where music streaming platforms continue to dominate in popularity, questions must be raised in regard to the effects that these socially connected, personalized, black-boxed platforms are having on societal, personal and communal acts associated with music listening. The aim of this proposal is to outline why music interaction must be examined within an interdisciplinary, Web Science-led context to further understand the social effects of music streaming.

Recognizing Hate-prone Characteristics of Online Hate Speech Targets

Raneem Alharthi

Recognizing the characteristics of online hate speech targets can provide important information that helps predict potential targets and protect them. Understanding the characteristics of online hate targets on social media platforms, where hate has been increasing dramatically of late, can improve awareness of how different people react to online hate, which we anticipate will manifest differently for different demographic groups. Targets of online hate play an important role in providing an extensive and informative explanation of the online hate event, yet this has barely been studied in previous research. In this PhD research, we propose a hate-prone characteristics recognition framework for online hate targets. It consists of several modules: data collection, data pre-processing, feature extraction, contextualisation, the hate-prone characteristics recognition model, which can recognise common online hate-prone characteristics to enhance online hate prevention services, and finally the hateful replies prediction model. This online hate prediction model has the potential to be personalised/adaptive in future applications.

Building a Social Machine for Graduate Mobility

Neha Keshan

The paper discusses the construction of a social machine to solve a complex problem faced by doctoral students: “What Comes Next?”, a question that is especially important during this pandemic, when many traditional career paths have been compromised. This system utilizes TimeBank, SkillBank and Knowledge Graphs to provide users with an OpportunityBank: an opportunity to connect with other doctorates, to understand different career paths after graduation and alternative uses of acquired skills, and to network by sharing available positions, research work, and data. Users can collaborate while keeping an eye on what is happening on other social platforms. The system uses ‘prosopography narrative’ and provides a platform for ‘shadow institutes’, while maintaining provenance, trust, and accessibility to required information through knowledge graphs.