ACM Transactions on the Web (TWEB)

Latest Articles

FusE: Entity-Centric Data Fusion on Linked Data

Many current web pages include structured data which can directly be processed and used. Search engines, in particular, gather that structured data and provide question answering capabilities over the integrated data with an entity-centric presentation of the results. Due to the decentralized nature of the web, multiple structured data sources can...

What Web Template Extractor Should I Use? A Benchmarking and Comparison for Five Template Extractors

A Web template is a resource that implements the structure and format of a website, making it ready...

Polarization and Fake News: Early Warning of Potential Misinformation Targets

Users’ polarization and confirmation bias play a key role in misinformation spreading on online social media. Our aim is to use this information to determine in advance potential targets for hoaxes and fake news. In this article, we introduce a framework for promptly identifying polarizing content on social media and, thus,...

Cashtag Piggybacking: Uncovering Spam and Bot Activity in Stock Microblogs on Twitter

Microblogs are increasingly exploited for predicting prices and traded volumes of stocks in financial markets. However, it has been demonstrated that much of the content shared in microblogging platforms is created and publicized by bots and spammers. Yet, the presence (or lack thereof) and the impact of fake stock microblogs has never been...

Layout Cross-Platform and Cross-Browser Incompatibilities Detection using Classification of DOM Elements

Web applications can be accessed through a variety of user agent configurations, in which the...

Exploiting Usage to Predict Instantaneous App Popularity: Trend Filters and Retention Rates

Popularity of mobile apps is traditionally measured by metrics such as the number of downloads, installations, or user ratings. A problem with these...

About TWEB

The journal Transactions on the Web (TWEB) publishes refereed articles reporting the results of research on Web content, applications, use, and related enabling technologies.

The scope of TWEB is described on the Call for Papers page. Authors are invited to submit original research papers for consideration by following the directions on the Author Guidelines page.

Forthcoming Articles
'The Enemy Among Us': Detecting Cyber Hate Speech with Threats-based Othering Language Embeddings

Offensive or antagonistic language targeted at individuals and social groups based on their personal characteristics (also known as cyber hate speech or cyber hate) has been frequently posted and widely circulated via the World Wide Web. This can be considered a key risk factor for individual and societal tension linked to regional instability. Automated Web-based cyber hate detection is important for observing and understanding community and regional societal tension, especially in online social networks where posts can be rapidly and widely viewed and disseminated. Previous work has used lexicons, bags-of-words, or probabilistic language parsing approaches, but these often share a weakness: cyber hate can be subtle and indirect, so relying on the occurrence of individual words or phrases can lead to a significant number of false negatives and an inaccurate representation of the trends in cyber hate. This problem motivated us to challenge thinking around the representation of subtle language use, such as references to perceived threats from "the other", including immigration or job prosperity, in a hateful context. We propose a novel framework that utilises language use around the concept of othering and intergroup threat theory to identify these subtleties, and we implement a novel classification method using embedding learning to compute semantic distances between parts of speech considered to be part of an othering narrative. To validate our approach, we conduct several experiments on different types of cyber hate, namely religion, disability, race and sexual orientation. Applying our model yields F-measure scores for classifying hateful instances of 0.93, 0.86, 0.97 and 0.98 respectively, a significant improvement in classifier accuracy over the state-of-the-art.

Learning Linear Influence Models in Social Networks from Transient Opinion Dynamics

Social networks, forums, and social media have emerged as global platforms for forming and shaping opinions on a broad spectrum of topics like politics, sports and entertainment. Users (also called actors) often update their evolving opinions, influenced through discussions with other users. Theoretical models of opinion dynamics in social networks, and analyses of them, abound in the literature. However, these models are often based on concepts from statistical physics. Their goal is to establish various regulatory phenomena like steady-state consensus or bifurcation. Analysis of transient effects is largely avoided. Moreover, many of these studies assume that actors' opinions are observed globally and synchronously, which is rarely realistic. In this paper, we initiate an investigation into a family of novel data-driven influence models that accurately learn and fit realistic observations. We estimate, rather than presume, edge strengths from observed opinions at nodes. Our influence models are linear, but not necessarily positive or row stochastic in nature. As a consequence, unlike the previous studies, they do not depend on system stability or convergence during the observation period. Furthermore, our models take into account a wide variety of data collection scenarios. In particular, they are robust to missing observations for several time steps after an actor has changed its opinion. In addition, we consider scenarios where opinion observations may be available only for aggregated clusters of nodes, a practical restriction often imposed to ensure privacy. Finally, to provide a conceptually interpretable design of edge influence, we offer a relatively frugal variant of our influence model, where the strength of influence between two connecting nodes depends on the node attributes (demography, personality, expertise, etc.). Such an approach reduces the number of model parameters, reduces overfitting, and offers a tractable and explicable sketch of edge influences in the context of opinion dynamics. With six real-life datasets crawled from Twitter and Reddit, as well as three more datasets collected from in-house experiments (with 102 volunteers), our proposed system gives a significant accuracy boost over four state-of-the-art baselines. We also observe that a careful design of edge strengths using node properties is crucial, since it offers substantially better performance than the one with independent edge weights.

User Studies on End-User Service Composition: a Literature Review and a Design Framework

Context: End-user service composition (EUSC) is a service-oriented paradigm that aims to empower end users and allow them to compose their own web applications from reusable service components. User studies have been used to evaluate EUSC tools and processes. Such an approach should benefit software development, because incorporating end users' feedback into software development should make software more useful and usable. Problem: There is a gap in our understanding of what constitutes a user study, and how a good user study should be designed, conducted and reported. Goal: This paper aims to address this gap. Method: The paper presents a systematic mapping study of 46 selected user studies for EUSC. Guided by a review framework, the paper systematically and consistently assesses the focus, methodology and cohesion of each of these studies. Results: The paper concludes that the focus of these studies is clear, but their methodology is incomplete and inadequate, and their overall quality is poor. The findings lead to the development of a design framework and a set of checklist guidelines for designing, conducting and reporting good user studies for EUSC. The detailed analysis and the insights obtained from it should be applicable to the design of user studies for service-oriented systems in general.

Detecting Cyberbullying and Cyberaggression in Social Media

Cyberbullying and cyberaggression are increasingly worrisome phenomena that affect people across all demographics. Already in 2014, more than half of young social media users worldwide experienced them in some form, being exposed to prolonged and/or coordinated digital harassment. Victims can experience a wide range of emotional consequences such as embarrassment, depression, and isolation from other community members, which can lead to even more serious consequences such as suicide attempts. Nevertheless, tools and technologies to understand and mitigate them are scarce and mostly ineffective. In this paper, we take the first concrete steps to understand the characteristics of abusive behavior on Twitter, one of today's largest social networks. We analyze 1.2 million users and 2 million tweets, comparing users participating in discussions around seemingly normal topics like the NBA to those more likely to be hate-related, such as the Gamergate controversy or the gender pay inequality at the BBC. We also explore specific manifestations of abusive behavior, i.e., cyberbullying and cyberaggression, in one of the hate-related communities (Gamergate). We present a robust methodology to distinguish bullies and aggressors from regular users by considering text-, user-, and network-based attributes. Using various state-of-the-art machine learning algorithms, we can classify these accounts with over 90% accuracy and AUC. Finally, we look at the current status of the Twitter accounts of users marked as abusive by our methodology and discuss the performance of the mechanisms used by Twitter to suspend users.
