Web forums are widely used to exchange information and opinions, and also to disseminate propaganda. Online content can be misused, however, when the information being distributed, such as radical opinions, is unsolicited or inappropriate. This study introduces a technique that combines machine learning and semantic-oriented approaches to identify radical opinions in hate group Web forums. Four types of text features (syntactic, stylistic, content-specific, and lexicon features) are extracted as predictors for text classification, and three classification techniques (SVM, Naïve Bayes, and AdaBoost) are implemented. Postings from two hate group Web forums are collected, and the preliminary results are encouraging. In addition, cross-validation indicates that the proposed technique is stable and extensible to timeframes beyond that of the training data. The proposed technique can also be an effective tool for other sentiment classification problems.
©2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
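
The pipeline outlined in the abstract (four feature groups feeding a classifier) can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not give the feature definitions, so the word lists `FUNCTION_WORDS` and `LEXICON`, the binning of the stylistic features, and the four toy postings are all hypothetical placeholders. Of the three classifiers named, only Naïve Bayes is shown, written in plain Python so that the example is self-contained.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical placeholder resources; the study's actual lexicon and
# feature definitions are not given in the abstract.
FUNCTION_WORDS = {"the", "a", "of", "and", "to", "we", "they", "you"}
LEXICON = {"hate", "enemy", "destroy", "traitor"}


def extract_features(text):
    """Return a bag of feature tokens drawn from four feature groups."""
    words = re.findall(r"[A-Za-z']+", text)
    lower = [w.lower() for w in words]
    feats = Counter()
    # Syntactic: function words and punctuation marks.
    feats.update("syn:" + w for w in lower if w in FUNCTION_WORDS)
    feats.update("syn:" + c for c in text if c in "!?.,;:")
    # Stylistic: coarse bins for average word length and capitalisation.
    if words:
        avg_len = sum(len(w) for w in words) / len(words)
        feats["sty:long_words" if avg_len > 5 else "sty:short_words"] += 1
        if any(w.isupper() and len(w) > 1 for w in words):
            feats["sty:all_caps"] += 1
    # Content-specific: word unigrams and bigrams.
    feats.update("uni:" + w for w in lower)
    feats.update("bi:" + a + "_" + b for a, b in zip(lower, lower[1:]))
    # Lexicon: hits against the (hypothetical) sentiment word list.
    feats.update("lex:" + w for w in lower if w in LEXICON)
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over feature tokens."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = extract_features(text)
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        feats = extract_features(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feat_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for feat, n in feats.items():
                if feat in self.vocab:  # ignore features never seen in training
                    score += n * math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy postings invented for illustration only.
texts = [
    "They are the enemy and we must destroy them!",
    "Every traitor will pay, destroy the enemy!",
    "The meeting is moved to Tuesday afternoon.",
    "Thanks for sharing the recipe, it was great.",
]
labels = ["radical", "radical", "neutral", "neutral"]
model = NaiveBayes().fit(texts, labels)
```

Calling `model.predict(...)` on a new posting returns the label with the highest smoothed log-probability. Prefixing each token with its feature group (`syn:`, `sty:`, `uni:`, `bi:`, `lex:`) keeps the four groups distinguishable, so a group can be dropped to see how much it contributes.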