Summary
Objectives:
The main objective of this research is to apply clustering and cluster validity
methods to estimate the number of clusters in cancer tumour datasets. A weighted
voting technique is used to improve the prediction of the number of clusters
based on different data mining techniques. These tools may be used to identify
new tumour classes from DNA microarray datasets. This estimation approach may
provide a useful tool to support biological and biomedical knowledge discovery.
Methods:
Three clustering algorithms and two validation algorithms were applied to two cancer
tumour datasets. Recent studies confirm that there is no universal pattern recognition
and clustering model to predict molecular profiles across different datasets. It is
therefore useful not to rely on a single clustering or validation method, but to apply
a variety of approaches. A combination of these methods may be successfully used for
the estimation of the number of clusters.
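As a rough illustration of how estimates from several clustering and validation
combinations might be merged by weighted voting, consider the minimal sketch below.
The method labels, weights, and estimated values are hypothetical placeholders and
are not taken from the study; the actual weighting scheme may differ.

```python
from collections import defaultdict


def weighted_vote(estimates, weights):
    """Combine per-method estimates of the number of clusters by weighted voting.

    estimates: dict mapping a (hypothetical) method label to its estimated k
    weights:   dict mapping the same labels to non-negative weights
               (methods without a weight default to 1.0)
    Returns the winning k and the total weight accumulated by each candidate k.
    """
    scores = defaultdict(float)
    for method, k in estimates.items():
        scores[k] += weights.get(method, 1.0)
    # The candidate with the largest total weight wins; ties go to the smaller k.
    best_k = min(scores, key=lambda k: (-scores[k], k))
    return best_k, dict(scores)


# Illustrative values only: three clustering algorithms x two validity indices.
# The labels are placeholders, not the algorithms used in the study.
estimates = {
    ("clusterer_1", "index_A"): 3,
    ("clusterer_1", "index_B"): 3,
    ("clusterer_2", "index_A"): 2,
    ("clusterer_2", "index_B"): 4,
    ("clusterer_3", "index_A"): 3,
    ("clusterer_3", "index_B"): 2,
}
weights = {
    ("clusterer_1", "index_A"): 1.0,
    ("clusterer_1", "index_B"): 0.8,
    ("clusterer_2", "index_A"): 0.6,
    ("clusterer_2", "index_B"): 0.5,
    ("clusterer_3", "index_A"): 0.9,
    ("clusterer_3", "index_B"): 0.7,
}

best_k, scores = weighted_vote(estimates, weights)
print("Estimated number of clusters:", best_k)
```

In this sketch each combination casts one vote for its estimated number of clusters,
and the vote counts in proportion to the weight assigned to that combination.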
Results:
The methods implemented in this research may contribute to the validation of clustering
results and the estimation of the number of clusters. The results show that this estimation
approach may represent an effective tool to support biomedical knowledge discovery
and healthcare applications.
Conclusion:
The methods implemented in this research may be successfully used to estimate the
number of clusters and may contribute to the validation of clustering results.
These tools may be used for the identification of new tumour classes using gene
expression profiles.
Keywords
Gene expression - data mining - clustering - cluster evaluation - validity indices