Data Mining Approach To Classify Cases of Lung Cancer
According to the World Cancer Research Fund, a leading authority on cancer prevention research, lung cancer is the most commonly occurring cancer in men and the third most commonly occurring cancer in women, and its 5-year relative survival rate is low. Smoking is the major risk factor for lung cancer, and the associated symptoms include cough, fatigue, shortness of breath, chest pain, weight loss, and loss of appetite. This study aims to build a data mining classification model that predicts whether or not a patient has lung cancer from key features such as the symptoms listed above. Following the CRISP-DM methodology and using the RapidMiner software, different models were built across different scenarios, algorithms, sampling methods, and data approaches. The best model, an Artificial Neural Network evaluated with cross-validation, achieved an accuracy of 92.24%, a specificity of 95.90%, and a sensitivity of 68.29%.
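The three reported metrics all derive from a binary confusion matrix. The following is a minimal sketch of how they relate, not the study's RapidMiner process: the class name `ConfusionCounts`, the function names, and the counts are hypothetical and chosen only for illustration. They are not the study's data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a binary confusion matrix (positive = has lung cancer)."""
    tp: int  # true positives: cancer cases correctly flagged
    fp: int  # false positives: healthy patients wrongly flagged
    tn: int  # true negatives: healthy patients correctly cleared
    fn: int  # false negatives: cancer cases missed


def accuracy(c: ConfusionCounts) -> float:
    """Share of all patients classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c: ConfusionCounts) -> float:
    """Share of actual cancer cases the model detects (recall)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of actual non-cancer cases the model clears."""
    return c.tn / (c.tn + c.fp)


# Hypothetical counts (NOT from the study): 50 positives, 250 negatives.
counts = ConfusionCounts(tp=30, fp=10, tn=240, fn=20)

print(f"accuracy:    {accuracy(counts):.2%}")
print(f"sensitivity: {sensitivity(counts):.2%}")
print(f"specificity: {specificity(counts):.2%}")
```

With these illustrative counts, accuracy is high even though 20 of the 50 positive cases are missed. When negatives outnumber positives, accuracy is dominated by specificity, so sensitivity should be read alongside it when judging a screening model.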