Platinum Data Mining, Classification, and Prediction Using SLAM™
Please note: these functions are presented as a conceptual 'workflow' for introductory purposes only. Within GeneLinker™, you are free to apply any appropriate function to your data at any time.
1. Import Gene Expression Data
A training dataset (expression values with known classes) is required to train an artificial neural network (ANN) classifier. A test dataset can be imported to test a trained classifier. The two datasets must be studies of the same phenomenon (i.e. the variable type for both is the same, e.g. SRBC Tumors).
2. Import Variable Data
Import the classes (e.g. EWS, NB, BL, RMS) for the training dataset.
3. Discretize the Expression Data
Expression data is continuous. To apply the SLAM™ data mining algorithm, the data must first be discretized.
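As a rough illustration of what discretization means, the sketch below maps continuous expression values into a small number of equal-width bins. This is only one possible scheme and is not necessarily what GeneLinker™ does internally; the function name `discretize_equal_width`, the choice of three bins, and the sample values are all assumptions made for the example.

```python
def discretize_equal_width(values, n_bins=3):
    """Map continuous values to integer bin labels 0 .. n_bins-1.

    Hypothetical helper: equal-width binning over the observed range.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into the first bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Made-up expression values for a single gene across six samples.
expression = [0.1, 0.4, 1.2, 2.5, 2.9, 3.0]
bins = discretize_equal_width(expression, n_bins=3)
print(bins)
```

With three bins, the labels could be read as "low", "medium", and "high" expression.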
4. Apply SLAM™ Association Mining and Visualize the Results
SLAM™ (Sub-Linear Association Mining) is a technology that finds hidden linear and non-linear correlations in discretized gene expression data. The SLAM™ association viewer displays the results of running SLAM™ and allows you to work with the results.
{image}
5. Create Gene List
As an aid to supervised learning, a gene list is created from the genes (features) identified as significant by SLAM™. If necessary, this gene list can be used to filter the test dataset to ensure it contains the same genes as the training dataset.
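The filtering idea can be sketched as follows, assuming a dataset is held as a mapping from gene name to a list of expression values. The function name `filter_to_gene_list`, the data layout, and the gene names are hypothetical and chosen only for illustration; they do not reflect GeneLinker™'s internal representation.

```python
def filter_to_gene_list(dataset, gene_list):
    """Keep only the genes named in gene_list, in gene_list order.

    Hypothetical helper: raises KeyError if the dataset lacks a listed gene,
    since a classifier needs the same features it was trained on.
    """
    missing = [gene for gene in gene_list if gene not in dataset]
    if missing:
        raise KeyError(f"dataset is missing genes: {missing}")
    return {gene: dataset[gene] for gene in gene_list}


# Made-up test dataset: gene name -> expression values per sample.
test_dataset = {
    "GENE_A": [0.2, 1.1],
    "GENE_B": [2.3, 0.4],
    "GENE_C": [1.7, 1.9],
}
gene_list = ["GENE_A", "GENE_C"]  # e.g. genes flagged as significant
filtered = filter_to_gene_list(test_dataset, gene_list)
print(list(filtered))
```

Raising an error on a missing gene, rather than silently skipping it, keeps a mismatch between training and test features visible.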
6. Create an ANN Classifier and View Training Results
Creating an ANN classifier is the process of training a committee of neural networks on data whose classes are already known. The training results can be displayed in a classification plot or an MSE (mean squared error) plot.
{image}
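For readers unfamiliar with MSE, the sketch below computes the mean squared error between a network's outputs and the known class targets. It assumes a one-hot target encoding (1 for the true class, 0 elsewhere), which is a common convention but not something this page specifies; the function name and the numbers are made up for illustration.

```python
def mean_squared_error(outputs, targets):
    """Average of squared differences between outputs and targets."""
    if len(outputs) != len(targets):
        raise ValueError("outputs and targets must have the same length")
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


# Made-up example: the network is fairly confident in the first of two classes.
outputs = [0.9, 0.1]
targets = [1.0, 0.0]  # assumed one-hot encoding of the known class
mse = mean_squared_error(outputs, targets)
print(round(mse, 4))
```

A lower MSE during training indicates that the network's outputs are moving closer to the known classes.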
7. Classify Data and Visualize the Classification Results
Classification is the process of using a trained classifier to predict the classes of the test dataset.
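One simple way a committee of networks could arrive at a single predicted class is by majority vote, sketched below. This page does not state how GeneLinker™ combines the committee's outputs, so the voting rule, the function name `committee_vote`, and the treatment of samples without a strict majority are assumptions for illustration only.

```python
from collections import Counter


def committee_vote(predictions):
    """Return the class predicted by more than half of the committee.

    Hypothetical rule: returns None when no class has a strict majority,
    leaving the sample unclassified rather than guessing.
    """
    if not predictions:
        return None
    label, count = Counter(predictions).most_common(1)[0]
    return label if count > len(predictions) / 2 else None


# Made-up predictions from three committee members for one test sample.
print(committee_vote(["EWS", "EWS", "NB"]))
```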