Over a third of adults go online to diagnose their health condition. Direct-to-consumer (DTC) interactive diagnostic apps, whose information personalization capabilities go beyond those of static search engines, are rapidly proliferating. While these apps promise faster, more convenient and more accurate information to improve diagnosis, little is known about the state of the evidence on their performance or the methods used to evaluate them. We conducted a scoping review of the peer-reviewed and gray literature from January 1, 2014 to June 30, 2017. The largest category of evaluations involved symptom checkers that applied algorithms to user-answered questions, followed by sensor-driven apps that applied algorithms to smartphone photos; a handful of evaluations examined crowdsourcing. The most common clinical areas evaluated were dermatology and general diagnostic and triage advice for a range of conditions. Evaluations were highly variable in methodology and conclusions, with about half describing app characteristics and half examining actual performance. Apps were found to vary widely in functionality, accuracy, safety and effectiveness, although the usefulness of this evidence was limited by a frequent failure to report results for individual named apps. Overall, the current evidence base on DTC interactive diagnostic apps is sparse in scope, uneven in the information provided and inconclusive with respect to safety and effectiveness, with no studies of clinical risks and benefits involving real-world consumer use. Given that DTC diagnostic apps are rapidly evolving, rigorous and standardized evaluations are essential to inform decisions by clinicians, patients, policymakers and other stakeholders.