Predicting stock prices remains an important subject of discussion among financial analysts and researchers. Advances in technologies such as artificial intelligence and machine learning have paved the way for more accurate stock-price prediction in recent years. Of late, Support Vector Machines (SVM) have gained popularity among the Machine Learning (ML) algorithms used for predicting stock prices. However, a high percentage of studies in algorithmic investment based on SVM have overlooked the tendency of SVM to overfit when the input dataset is noisy and high-dimensional. Therefore, this study proposes a novel homogeneous ensemble classifier called GASVM, based on a support vector machine enhanced with a Genetic Algorithm (GA) for feature selection and SVM kernel parameter optimisation, for predicting the stock market. The GA was introduced in this study to optimise the diverse design factors of the SVM simultaneously. Experiments carried out with over eleven (11) years of stock data from the Ghana Stock Exchange (GSE) yielded compelling results. The outcome shows that the proposed model (GASVM) outperformed other classical ML algorithms (Decision Tree (DT), Random Forest (RF) and Neural Network (NN)) in predicting 10-day-ahead stock price movement. GASVM achieved a prediction accuracy of 93.7%, compared with 82.3% (RF), 75.3% (DT), and 80.1% (NN). It can, therefore, be deduced from these results that the proposed GASVM technique offers a practical approach to feature selection and parameter optimisation of the different design features of the SVM, and thus removes the need for labour-intensive parameter optimisation.
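As an illustration of how a GA can search over a feature subset and SVM kernel parameters at the same time, the following is a minimal sketch of one possible chromosome encoding. It is not the GASVM implementation from the study. The number of features, the parameter ranges, the genetic operators, and in particular `toy_fitness` (a stand-in for a cross-validated SVM accuracy score) are all assumptions made for this example.

```python
import random

N_FEATURES = 8                  # hypothetical number of candidate features
LOG_C_RANGE = (-2.0, 3.0)       # assumed search range for log10(C)
LOG_GAMMA_RANGE = (-4.0, 1.0)   # assumed search range for log10(gamma)


def random_chromosome(rng):
    """Binary feature mask followed by two real genes: log10(C), log10(gamma)."""
    mask = [rng.randint(0, 1) for _ in range(N_FEATURES)]
    return mask + [rng.uniform(*LOG_C_RANGE), rng.uniform(*LOG_GAMMA_RANGE)]


def decode(chrom):
    """Split a chromosome into selected feature indices, C and gamma."""
    features = [i for i, bit in enumerate(chrom[:N_FEATURES]) if bit]
    return features, 10 ** chrom[-2], 10 ** chrom[-1]


def toy_fitness(chrom):
    """Stand-in for cross-validated SVM accuracy (an assumption, not the paper's).

    Rewards selecting features 0-3, penalises features 4-7, and prefers
    log10(C) near 1 and log10(gamma) near -2.
    """
    mask, log_c, log_gamma = chrom[:N_FEATURES], chrom[-2], chrom[-1]
    score = sum(mask[:4]) - sum(mask[4:])
    return score - 0.1 * (log_c - 1) ** 2 - 0.1 * (log_gamma + 2) ** 2


def crossover(a, b, rng):
    """Single-point crossover over the whole chromosome."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]


def mutate(chrom, rng, rate=0.1):
    """Flip mask bits and jitter the real-valued genes, clamped to their ranges."""
    out = list(chrom)
    for i in range(N_FEATURES):
        if rng.random() < rate:
            out[i] = 1 - out[i]
    for i, (lo, hi) in ((N_FEATURES, LOG_C_RANGE), (N_FEATURES + 1, LOG_GAMMA_RANGE)):
        if rng.random() < rate:
            out[i] = min(hi, max(lo, out[i] + rng.gauss(0, 0.5)))
    return out


def evolve(fitness, generations=40, pop_size=30, seed=0):
    """Elitist GA: keep the better half, refill with mutated offspring."""
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness)


best = evolve(toy_fitness)
selected_features, C, gamma = decode(best)
```

In a real experiment, `toy_fitness` would be replaced by a function that trains an SVM on the decoded feature subset with the decoded `C` and `gamma` and returns a validation score.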
In professional sports clubs, the growing number of individual IT systems increases the need for central information systems. Various solutions from different suppliers lead to a fragmented situation in sports. Therefore, a standardized and independent general concept for a club information system (CIS) is necessary. Due to the different areas involved, an interdisciplinary approach is required, which can be provided by sports informatics. The purpose of this paper is the development of a general, sports-informatics-driven concept for a CIS, using methods and models from existing areas, especially business intelligence (BI). Software engineering provides general methods and models. Business intelligence addresses similar problems in industry. Therefore, existing best practice models are examined and adapted for sport. From sports science, training systems and information systems in sports are considered in particular. Practical relevance is illustrated by an example from Liverpool FC. Based on these areas, the requirements for a CIS are derived, and an architectural concept with its different components is designed and explained. To better understand the practical challenges, a participatory observation was conducted during years of working in sports clubs. This paper provides a new sports informatics approach to the general design and architecture of a CIS using best practice models from BI. It illustrates the complexity of this interdisciplinary topic and the relevance of a sports informatics approach. This paper is meant as a conceptual starting point and shows the need for further work in this field.
All highly centralised criminal enterprises share similar traits which, if recognised, can help in the criminal investigative process. While conducting a complex confederacy investigation, law enforcement agents should not only identify the key participants but also grasp the nature of the inter-connections between the criminals in order to understand and determine the modus operandi of an illicit operation. We studied community detection in criminal networks using graph theory and formally introduced an algorithm that opens a new perspective on community detection compared with the traditional methods used to model the relations between objects. Community structure, generally described as densely connected nodes with similar patterns of links, is an important property of complex networks. Our method differs from the traditional methods by allowing law enforcement agencies to compare the detected communities and thereby view the criminal network from a different perspective. In the paper, we compare our algorithm with the well-known Girvan-Newman algorithm. We consider this method an alternative or an addition to the traditional community detection methods mentioned earlier, as the proposed algorithm allows, and will assist in, the detection of different patterns and structures of the same community for enforcement agencies and researchers. This methodology for community detection has not been extensively researched. Hence, we identified it as a research gap in this domain and decided to develop a new method of criminal community detection.
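For readers unfamiliar with the Girvan-Newman baseline mentioned above, the following is a minimal pure-Python sketch of its core step: repeatedly removing the edge with the highest edge betweenness until the network splits. It illustrates the baseline only, not the algorithm proposed in the paper. The function names and the toy network are assumptions made for this example, and the graph is assumed to be undirected, unweighted and free of self-loops.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0.0 for v in adj}
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    # Each undirected pair is counted once from each endpoint.
    return {e: s / 2.0 for e, s in score.items()}


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def girvan_newman_split(edges):
    """Remove highest-betweenness edges until the component count increases."""
    adj = build_adjacency(edges)
    start = len(components(adj))
    while len(components(adj)) == start and any(adj.values()):
        eb = edge_betweenness(adj)
        u, v = max(eb, key=eb.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Hypothetical network: two tightly knit groups linked by a single contact.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("c", "d"),
             ("d", "e"), ("e", "f"), ("d", "f")]
groups = girvan_newman_split(toy_edges)
```

In the toy network, the single link between the two groups carries every shortest path that crosses between them, so it is the first edge to be removed.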
Decision making in sport involves forecasting and selecting choices from different options of action, care, or management. These processes are conditioned by the available information (sometimes limited, fallible, or excessive), the cognitive limitations of the decision-maker (heuristics and biases), the finite amount of time available to make the decision, and the levels of risk and reward. Decision support systems (DSS) have become increasingly common in sporting contexts such as scheduling optimization, skills evaluation and classification, decision-making assessment, talent identification and team selection, or injury risk assessment. However, no specific, formalised framework exists to help guide either the development or evaluation of these systems. Drawing on a variety of literature, this paper proposes a decision support system development framework for specific use in high-performance sport. It proposes three separate criteria for this purpose: 1) Context Satisfaction, 2) Output Quality, and 3) Process Efficiency. Underpinning these criteria are six specific components: Feasibility, Delivered knowledge, Decisional guidance, Data quality, System error, and System complexity. The proposed framework offers a systematic approach for users to ensure that each of the six components is considered and optimised before, during, and after developing the system. A DSS development framework for high-performance sport should help to improve both short- and long-term decision-making in a variety of sporting contexts.
In tennis, data have been accumulating and research on tactical analysis has been conducted. Estimating strategically important factors would have the benefit of providing players with useful advice and helping audience members understand what tennis players are good at. Previous research has investigated ways of predicting Association of Tennis Professionals (ATP) match outcomes as well as estimating the factors that are important for victory using machine learning models. A limitation of previous research is that the identified victory factors lack concreteness. Because we believed the root of this problem was that previous researchers used game summaries as features and did not consider the rally process within points, this research focused on calculating the frequency of single shots, two-shot patterns, and specific effective shot patterns from each point rally of ATP singles matches. We then used those data to predict point winners and to identify useful features using L1-regularized logistic regression. The highest accuracy obtained was 66.5%, and the area under the curve (AUC) was 0.689. The most prominent feature we found was the ratio of specific shots by specific players. These results suggest that our method can reveal tactical factors more concretely than previous studies.
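The rally-level features described above can be pictured with a short sketch. It counts single shots and consecutive two-shot patterns in one point rally. The rally representation (a list of `(player, shot)` pairs), the shot labels and the feature-name format are assumptions made for this example, not the encoding used in the study.

```python
from collections import Counter


def shot_pattern_features(rally):
    """Count single shots and consecutive two-shot patterns in one rally.

    `rally` is a list of (player, shot) pairs in the order they were hit.
    """
    feats = Counter()
    # Single-shot frequencies, keyed by who hit which shot.
    for player, shot in rally:
        feats[f"{player}:{shot}"] += 1
    # Two-shot patterns: each shot paired with the one that follows it.
    for (p1, s1), (p2, s2) in zip(rally, rally[1:]):
        feats[f"{p1}:{s1}>{p2}:{s2}"] += 1
    return feats


# Hypothetical three-shot rally with made-up shot labels.
example_rally = [
    ("server", "serve_wide"),
    ("receiver", "backhand_cross"),
    ("server", "forehand_line"),
]
example_features = shot_pattern_features(example_rally)
```

Feature counts of this kind, collected over many points, could then serve as inputs to an L1-regularized logistic regression, whose sparsity penalty drives the weights of uninformative patterns to zero.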
Data is the key to information mining that unveils hidden knowledge. The ability to reveal knowledge relies on the extractable features of a dataset and on the depth of the mining model. However, many of these datasets contain sensitive information that can give rise to privacy violations, yet they are used to build deep neural network (DNN) models. Recent approaches to enforcing privacy and protecting data sensitivity in DNN models reduce accuracy, giving rise to a significant accuracy disparity between a non-private DNN and a privacy-preserving DNN model. This accuracy gap is due to the flooding of the model with large amounts of uncalibrated noise and the inability to quantify the right level of noise required to perturb distinct neurons in the DNN model. Consequently, this has hindered the use of privacy-protected DNN models in real-life applications. In this paper, we present a neuron noise-injection technique based on layer-wise buffered contribution ratio forwarding and the ϵ-differential privacy technique to preserve privacy in a DNN model. We adapt a layer-wise relevance propagation technique to compute the contribution ratio of each neuron in our network at the pre-training phase. Based on the proportion of each neuron’s contribution ratio, we generate a noise-tuple via the Laplace mechanism, which helps to eliminate unwanted noise flooding. The noise-tuple is subsequently injected into the training network through its neurons to preserve the privacy of the training dataset in a differentially private manner. Hence, each neuron receives the right proportion of noise as estimated via its contribution ratio and, as a result, the unquantified noise that reduces the accuracy of privacy-preserving DNN models is avoided.
Extensive experiments were conducted on three real-world datasets, and the results show that our approach narrows the existing accuracy gap to close proximity and outperforms the state-of-the-art approaches in this context.
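To make the idea of per-neuron noise calibration concrete, the sketch below draws one Laplace noise value per neuron, with the scale of each draw tied to that neuron's contribution ratio. The abstract does not give the exact allocation rule, so the rule used here is an assumption: the total budget ϵ is split across neurons in proportion to their contribution ratios, so that neuron i receives noise of scale sensitivity / (ϵ · ratio_i). The function names, the relevance values and the sensitivity of 1.0 are likewise hypothetical.

```python
import random


def contribution_ratios(relevances):
    """Normalise absolute relevance scores so that they sum to 1."""
    total = sum(abs(r) for r in relevances)
    if total == 0:
        raise ValueError("at least one relevance score must be non-zero")
    return [abs(r) / total for r in relevances]


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noise_tuple(relevances, epsilon, sensitivity=1.0, seed=0):
    """Return (noise values, Laplace scales), one entry per neuron.

    ASSUMPTION: neuron i gets the budget share epsilon * ratio_i, so its
    Laplace scale is sensitivity / (epsilon * ratio_i). The paper's actual
    allocation rule may differ.
    """
    rng = random.Random(seed)
    ratios = contribution_ratios(relevances)
    if any(r == 0 for r in ratios):
        raise ValueError("every neuron needs a non-zero contribution ratio")
    scales = [sensitivity / (epsilon * r) for r in ratios]
    noise = [laplace_noise(b, rng) for b in scales]
    return noise, scales


# Hypothetical relevance scores for three neurons in one layer.
noise, scales = noise_tuple([0.5, 0.3, 0.2], epsilon=1.0)
```

Under this assumed rule, the per-neuron budgets sum to ϵ, and neurons with a larger contribution ratio receive noise of smaller scale.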
Driven by the increased availability of position and performance data, automated analyses are becoming daily routine in many top-level sports. Methods from the domains of data mining and machine learning are increasingly used to generate new insights from massive amounts of data. This study evaluates the performance of four current models (multi-layer perceptron, convolutional network, recurrent network, gradient boosted tree) in classifying tactical behaviors on a beach volleyball dataset consisting of 1,356 top-level games. A three-way between-subjects analysis of variance was conducted to determine the effects of model, input features and target behavior on classification accuracy. Results show significant differences in classification accuracy between models as well as significant interaction effects between factors. Our models achieve classification performance similar to previous work in other sports. Nonetheless, they are not yet at a level that warrants practical application in day-to-day performance analysis in beach volleyball.
The day-to-day use of digital devices with Internet access, such as tablets and smartphones, has increased exponentially in recent years and this has had a consequent effect on the usage of the Internet and social media networks. When using social networks, people share personal data that is broadcast between users, which provides useful information for organizations. This means that characterizing users through their social media activity is an emerging research area in the field of Natural Language Processing (NLP) and this paper will present a review of how personality can be detected using online content.
Approach: A systematic literature review identified 30 papers published between 2007 and 2019, while particular inclusion and exclusion criteria were used to select the most relevant articles.
Outcomes: This review describes a variety of challenges and trends, as well as providing ideas for the direction of future research. In addition, personality trait identification and techniques were classified into different types, including deep learning, machine learning (ML) and semi-supervised/hybrid.
Implications: This paper’s outcomes will not only facilitate insight into the various personality types and models but will also provide knowledge about the relevant detection techniques.
Novelty: While prior studies have conducted literature reviews in the personality trait detection field, the systematic literature review in this paper provides specific answers to the proposed research questions. This is novel to this field as this particular type of study has not been conducted before.