An Empirical Evaluation of Property Recommender Systems for Wikidata and Collaborative Knowledge Bases

Title: An Empirical Evaluation of Property Recommender Systems for Wikidata and Collaborative Knowledge Bases

Authors: Eva Zangerle, Wolfgang Gassler, Martin Pichl, Stefan Steinhauser, Günther Specht (University of Innsbruck)

Abstract: The Wikidata platform is a crowdsourced, structured knowledge base aiming to provide integrated, free and language-agnostic facts which are used by, among others, the Wikipedias. Users who actively enter, review and revise data on Wikidata are assisted by a property suggesting system which provides users with properties that might also be applicable to a given item. We argue that evaluating and subsequently improving this recommendation mechanism, and hence assisting users, can directly contribute to an even more integrated, consistent and extensive knowledge base serving a huge variety of applications. However, the quality and usefulness of such recommendations have not been evaluated yet. In this work, we provide the first evaluation of different approaches aiming to provide users with property recommendations in the process of curating information on Wikidata. We compare the approach currently used on Wikidata with two state-of-the-art recommendation approaches stemming from the field of RDF recommender systems and collaborative information systems. Further, we also evaluate hybrid recommender systems combining these approaches. Our evaluations show that the current recommendation algorithm works well with regard to recall and precision, reaching a recall@7 of 79.71% and a precision@7 of 27.97%. We also find that, generally, incorporating contextual as well as classifying information into the computation of property recommendations can further improve its performance significantly.
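For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how precision@k and recall@k are commonly computed for a single ranked recommendation list. The function name, the property identifiers and the held-out set are illustrative assumptions; the abstract does not describe the authors' exact evaluation protocol.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Return (precision@k, recall@k) for one ranked list of recommendations.

    recommended: ranked list of property IDs (best first)
    relevant:    set of property IDs that were held out for this item
    """
    top_k = recommended[:k]
    hits = sum(1 for prop in top_k if prop in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: seven recommendations, two held-out properties.
ranked = ["P31", "P21", "P569", "P19", "P27", "P106", "P735"]
held_out = {"P21", "P569"}

p, r = precision_recall_at_k(ranked, held_out, k=7)
print(f"precision@7 = {p:.4f}, recall@7 = {r:.4f}")
```

In this toy case both held-out properties appear in the top seven, so recall@7 is 1.0, yet precision@7 cannot exceed 2/7 because only two properties were held out. This is one common reason why precision@k can look low next to a high recall@k.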

This contribution to OpenSym 2016 will be made available as part of the OpenSym 2016 proceedings on or after August 17, 2016.

Evaluating and Improving Navigability of Wikipedia: A Comparative Study of Eight Language Editions

Title: Evaluating and Improving Navigability of Wikipedia: A Comparative Study of Eight Language Editions

Authors: Daniel Lamprecht (KTI, Graz University of Technology), Dimitar Dimitrov (GESIS – Leibniz Institute for the Social Sciences), Denis Helic (KTI, Graz University of Technology) and Markus Strohmaier (GESIS – Leibniz Institute for the Social Sciences and University of Koblenz-Landau)

Abstract: Wikipedia helps its users reach a wide variety of goals: looking up facts, researching a topic, making an edit or simply browsing to pass time. Some of these goals, such as the lookup of facts, can be effectively supported by search functions. However, for other use cases such as researching an unfamiliar topic, users need to rely on the links that connect articles. In this paper, we investigate the state of navigability in the article networks of eight language versions of Wikipedia. We find that, when taking all links of articles into account, all language versions enable mutual reachability for almost all articles. However, previous research has shown that visitors of Wikipedia focus most of their attention on the areas located close to the top. We therefore investigate different restricted navigational views that users could have when looking at articles. We find that restricting the view of articles strongly limits the navigability of the resulting networks and impedes navigation. Based on this analysis, we then propose a link recommendation method that augments the link network to improve its navigability. Our approach selects links from a less restricted view of the article and proposes to move these links into more visible sections. The recommended links are therefore relevant for the article. Our results are relevant for researchers interested in the navigability of Wikipedia and open up new avenues for link recommendations in Wikipedia editing.
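The effect of a restricted view on reachability can be illustrated with a small sketch. The toy link network, the `restrict` helper (which keeps only the first few links of each article as a stand-in for "links near the top") and the reachability measure are all assumptions made for illustration; they are not the authors' data or method.

```python
from collections import deque


def reachable_fraction(links):
    """Fraction of ordered article pairs (a, b), a != b, with b reachable from a."""
    nodes = list(links)
    reached = 0
    for start in nodes:
        seen = {start}
        queue = deque([start])
        while queue:  # breadth-first search over outgoing links
            current = queue.popleft()
            for target in links.get(current, []):
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
        reached += len(seen) - 1  # do not count the start article itself
    n = len(nodes)
    return reached / (n * (n - 1))


def restrict(links, first_n):
    """Keep only the first `first_n` links of every article (hypothetical view)."""
    return {article: targets[:first_n] for article, targets in links.items()}


# Toy network: each article lists its outgoing links in page order.
full_view = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["C", "A"],
}

full_score = reachable_fraction(full_view)
restricted_score = reachable_fraction(restrict(full_view, first_n=1))
print(full_score, restricted_score)
```

With all links, every article in the toy network can reach every other one. Keeping only the first link of each article leaves 7 of the 12 ordered pairs reachable, which mirrors in miniature the loss of navigability the abstract describes.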

A Framework for Open Assurance of Learning

Title: A Framework for Open Assurance of Learning

Authors: Gokul Bhandari and Maureen Gowing (Odette School of Business, University of Windsor)

Abstract: Assurance of Learning (AOL) refers to the outcomes assessment process which involves the systematic collection, review, and use of information about educational programs undertaken for the purpose of improving student learning and development [8]. While emerging trends such as open education, open learning, learning analytics, academic analytics, and big data in education have recently become mainstream, studies regarding the design and development of open source analytics applications for AOL are non-existent. In this paper, we describe an application called AOL Analyzer that we developed for our business school last year to assist in the analysis of AOL results reported by faculty. To the best of our knowledge, this is the first paper to bridge the existing gap in AOL analytics research.

User Generated Services during Software Introductions

Title: User Generated Services during Software Introductions

Authors: Martin Schymanietz and Nivedita Agarwal (University of Erlangen-Nuremberg)

Abstract: In this paper, we describe the lack of user participation and involvement during software introductions. Big projects with a volume larger than 10 million US$ in particular are very likely to miss important benchmarks such as the budget, or even to fail completely. To fight these costly failures and support software introductions, we propose a service system that integrates the user into the software rollout. This service system consists of three service modules that are supported by components for feedback and communication as well as user incentives and motivation. The service modules shall empower the users to give support and deliver tutorials or training to other users, and furthermore establish a project-specific platform which encourages continuous improvement of the current software solution.

Exploring the roles of external facilitators in IT-driven open strategizing

Title: Exploring the roles of external facilitators in IT-driven open strategizing

Authors: Josh Morton, Alex Wilson and Louise Cooke (Loughborough University; School of Business and Economics)

Abstract: This paper examines the different roles external facilitators have in information-technology-driven open strategizing. Using a strategy-as-practice lens and drawing on two empirical cases of open strategy in organizations, our paper highlights four emerging roles of external facilitators, which we call structuring, promoting, moderating and analyzing. In concluding the paper, we call for further research relating to external facilitators and open strategy.

Comparing OSM Area-Boundary Data to DBpedia

Title: Comparing OSM Area-Boundary Data to DBpedia

Authors: Doris Silbernagl, Nikolaus Krismer and Günther Specht (Department of Computer Science, University of Innsbruck, Austria)

Abstract: OpenStreetMap (OSM) is a well-known and widely used source of geographic data. This kind of data can also be found in Wikipedia in the form of geographic locations, such as cities or countries. In addition to the geographic coordinates, statistical data about the area of these elements may also be present. Since it is possible to extract these data from OpenStreetMap as well, it is sensible to examine the quality of the OSM information about those specific boundary elements and compare it to another crowd-sourced source such as Wikipedia. Hence, in this paper, OSM data of different countries are used to calculate the area of valid boundary (multi)polygons, which are then compared to the respective entries in DBpedia (a large-scale knowledge base extracted from Wikipedia).
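The kind of comparison described above can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes the boundary has already been projected to planar coordinates in metres (real OSM data uses latitude and longitude and needs an equal-area projection or a geodesic area computation first), it handles a single simple polygon rather than multipolygons, and the reference value is a made-up placeholder for a DBpedia area entry.

```python
def polygon_area(points):
    """Planar area of a simple polygon via the shoelace formula.

    points: list of (x, y) vertices in metres, in order, without repeating
    the first vertex at the end.
    """
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def relative_deviation(computed, reference):
    """Absolute difference between two areas, relative to the reference."""
    return abs(computed - reference) / reference


# Hypothetical 4 km x 3 km rectangular boundary in projected metres.
boundary = [(0, 0), (4000, 0), (4000, 3000), (0, 3000)]
computed_km2 = polygon_area(boundary) / 1e6

# Hypothetical reference area in square kilometres.
reference_km2 = 12.5
deviation = relative_deviation(computed_km2, reference_km2)
print(f"computed {computed_km2} km^2, deviation {deviation:.2%}")
```

A relative deviation is used here because absolute differences are hard to compare between a small municipality and a whole country.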

Predicting the quality of user contributions via LSTMs

Title: Predicting the quality of user contributions via LSTMs

Authors: Rakshit Agrawal and Luca de Alfaro (University of California, Santa Cruz)

Abstract: In many collaborative systems it is useful to automatically estimate the quality of new contributions; the estimates can be used, for instance, to flag contributions for review. To predict the quality of a contribution by a user, it is useful to take into account both the characteristics of the revision itself and the past history of contributions by that user. In several approaches, the user’s history is first summarized into a number of features, such as number of contributions, user reputation, time from previous revision, and so forth. These features are then passed along with features of the current revision to a machine-learning classifier, which outputs a prediction for the user contribution. The summarization step is used because the usual machine learning models, such as neural nets and SVMs, rely on a fixed number of input features. We show in this paper that this manual selection of summarization features can be avoided by adopting machine-learning approaches that are able to cope with temporal sequences of input.

In particular, we show that Long Short-Term Memory (LSTM) neural nets are able to directly process the variable-length history of a user’s activity in the system, and produce an output that is highly predictive of the quality of the next contribution by the user. Our approach does not eliminate the process of feature selection, which is present in all machine learning. Rather, it eliminates the need for deciding which features from a user’s past are most useful for predicting the future: we can simply pass the entire past to the machine-learning apparatus, and let it come up with an estimate for the quality of the next contribution.

We present models combining LSTM and NN for predicting revision quality and show that the prediction accuracy attained is far superior to the one obtained using the NN alone. More interestingly, we also show that the prediction attained is superior to the one obtained using user reputation as a feature summarizing the quality of a user’s past work. This can be explained by noting that the primary function of user reputation is to provide an incentive towards performing useful contributions, rather than to be a feature optimized for prediction of future contribution quality.

We also show that the LSTM output changes in a natural way in response to user behavior, increasing when the user performs a sequence of good-quality contributions, and decreasing when the user performs a sequence of low-quality work. The LSTM output for a user could thus be usefully shown to other users, alongside the user’s reputation and other information.
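To make the idea of consuming a variable-length history concrete, here is a minimal sketch of a single-unit LSTM written in plain Python. It is not the authors' model: the weights are hand-picked rather than trained, each past revision is reduced to one scalar (here +1 for a good and -1 for a poor contribution, an assumption made purely for illustration), and all names are hypothetical. It only shows that the same recurrence accepts histories of any length and returns one fixed-size summary.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hand-picked, untrained weights for a one-unit LSTM (illustrative only).
WEIGHTS = {
    "wi": 1.0, "ui": 0.0, "bi": 0.0,  # input gate
    "wf": 0.0, "uf": 0.0, "bf": 1.0,  # forget gate
    "wo": 0.0, "uo": 0.0, "bo": 0.0,  # output gate
    "wg": 2.0, "ug": 0.0, "bg": 0.0,  # candidate cell value
}


def lstm_summary(history, w=WEIGHTS):
    """Run a one-unit LSTM over a list of scalars; return the final hidden state."""
    h, c = 0.0, 0.0
    for x in history:
        i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])
        f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])
        o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])
        g = math.tanh(w["wg"] * x + w["ug"] * h + w["bg"])
        c = f * c + i * g       # update the cell state
        h = o * math.tanh(c)    # expose part of it as the hidden state
    return h


# Histories of different lengths go through the same function unchanged.
good_user = lstm_summary([1, 1, 1, 1, 1])
poor_user = lstm_summary([-1, -1])
new_user = lstm_summary([])
print(good_user, poor_user, new_user)
```

In a trained model the final hidden state would be fed, together with features of the current revision, into a classifier; here it simply ends up positive for the run of good contributions and negative for the run of poor ones.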

Differentiating Communication Styles of Leaders on the Linux Kernel Mailing List

Title: Differentiating Communication Styles of Leaders on the Linux Kernel Mailing List

Authors: Daniel Schneider, Scott Spurlock and Megan Squire (Elon University)

Abstract: Much communication between developers of free, libre, and open source software (FLOSS) projects happens on email mailing lists. Geographically and temporally dispersed development teams use email as an asynchronous, centralized, persistently stored institutional memory for sharing code samples, discussing bugs, and making decisions. Email is especially important to large, mature projects, such as the Linux kernel, which has thousands of developers and a multilayered leadership structure. In this paper, we collect and analyze data to understand the communication patterns in such a community. How do the leaders of the Linux Kernel project write in email? What are the salient features of their writing, and can we discern one leader from another? We find that there are clear written markers for two leaders who have been particularly important to recent discussions of leadership style on the Linux Kernel Mailing List (LKML): Linus Torvalds and Greg Kroah-Hartman. Furthermore, we show that it is straightforward to use a machine learning strategy to automatically differentiate these two leaders based on their writing. Our findings will help researchers understand how this community works, and why there is occasional controversy regarding differences in communication styles on the LKML.

Motivation of Newcomers to FLOSS Projects

Title: Motivation of Newcomers to FLOSS Projects

Authors: Christoph Hannebauer and Volker Gruhn (paluno – The Ruhr Institute for Software Technology, University of Duisburg-Essen)

Abstract: While the motivations of Free/Libre and Open Source Software (FLOSS) developers have been the subject of extensive research, the motivations for their initial contribution to a FLOSS project have received only little attention. This survey of 94 newcomers to the FLOSS projects Mozilla and GNOME identifies the motivations for the modification of the FLOSS components and for the submission of these modifications back to the FLOSS project. With the responses, we test a hypothesis based on the previous qualitative research on newcomer motivations: Most newcomers modify a component because they need the modification for themselves. Surprisingly, this is not the case for our respondents, who have a variety of primary modification motivations. Newcomer occupation is discussed as a reason for this difference from previous results.

Observing Custom Software Modifications: A Quantitative Approach of Tracking the Evolution of Patch Stacks

Title: Observing Custom Software Modifications: A Quantitative Approach of Tracking the Evolution of Patch Stacks

Authors: Ralf Ramsauer (Technical University of Applied Sciences Regensburg); Daniel Lohmann (Friedrich-Alexander University Erlangen-Nuremberg); Wolfgang Mauerer (Technical University of Applied Sciences Regensburg and Siemens AG, Munich)

Abstract: Modifications to open-source software (OSS) are often provided in the form of “patch stacks”: sets of changes (patches) that modify a given body of source code. Maintaining patch stacks over extended periods of time is problematic when the underlying base project changes frequently. This necessitates a continuous and engineering-intensive adaptation of the stack. Nonetheless, long-term maintenance is an important problem for changes that are not integrated into projects, for instance when they are controversial or only of value to a limited group of users. We present and implement a methodology to systematically examine the temporal evolution of patch stacks, track non-functional properties like integrability and maintainability, and estimate the eventual economic and engineering effort required to successfully develop and maintain patch stacks. Our results provide a basis for quantitative research on patch stacks, including statistical analyses and other methods that lead to actionable advice on the construction and long-term maintenance of custom extensions to OSS.
