Overview of Estimating Missing Values

 

Overview

Missing (null) values can lead to erroneous conclusions about data. Substituting values for them can likewise introduce inaccuracies and inconsistencies. Missing values can degrade discovery results, and the resulting errors or data skews can propagate across subsequent runs and accumulate into a larger error. In addition, most analysis methods cannot be performed on data that contains missing values.

Missing values may prevent proper classification, and a poor substitution scheme may cause classification errors. For example, if every missing value is replaced with the single most likely value, the substituted values are less likely to help define class (cluster) boundaries.
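
One way to see this effect is to compare the spread of a gene's values before and after substitution. The sketch below is illustrative only: the data are made up, and using the mean as the "most likely value" is an assumption, not a description of what GeneLinker™ does internally.

```python
from statistics import mean, pstdev

# Hypothetical expression values for one gene; None marks a missing value.
values = [1.0, None, None, 9.0, 5.0]

# Assumption: the mean of the observed values stands in for the "most likely value".
observed = [v for v in values if v is not None]
fill = mean(observed)

# Every missing entry receives the same substitute.
imputed = [fill if v is None else v for v in values]

# Population standard deviation before and after substitution.
spread_before = pstdev(observed)
spread_after = pstdev(imputed)
print(f"spread of observed values: {spread_before:.2f}")
print(f"spread after substitution: {spread_after:.2f}")
```

Because every substituted entry sits at the center of the observed values, the spread shrinks, so the filled-in entries contribute little to separating one class from another.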

 

Actions

Two-Step Process for Resolving Missing Values:

1. Remove (filter out) genes that have at least a specified number of missing values.

2. Replace the remaining missing values. GeneLinker™ offers three techniques for estimating missing values: a measure of central tendency, nearest neighbors, and an arbitrary value.
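
The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GeneLinker™ code: the function names, the `max_missing` parameter, the use of `None` for a missing value, and the choice of the per-gene mean as the replacement are all hypothetical.

```python
from statistics import mean

def filter_genes(rows, max_missing):
    """Step 1: keep only genes with at most `max_missing` missing values."""
    return {
        gene: values
        for gene, values in rows.items()
        if sum(v is None for v in values) <= max_missing
    }

def impute_row_mean(values):
    """Step 2: replace each missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to estimate from; leave the row unchanged.
        return list(values)
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def resolve_missing(rows, max_missing):
    """Apply the filter first, then estimate what is still missing."""
    kept = filter_genes(rows, max_missing)
    return {gene: impute_row_mean(values) for gene, values in kept.items()}

# Hypothetical data: gene name -> expression values across four samples.
data = {
    "geneA": [1.0, None, 3.0, 2.0],
    "geneB": [None, None, None, 0.5],
    "geneC": [0.2, 0.4, 0.6, 0.8],
}
result = resolve_missing(data, max_missing=1)
print(result)
```

Here geneB is filtered out because it has more than one missing value, and the single gap in geneA is filled with the mean of its observed values. Filtering first keeps genes with too little observed data from being estimated at all.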

 

Related Topics:

Estimating Missing Values by a Measure of Central Tendency

Missing Value Estimation by Nearest Neighbors

Replacing Missing Values With an Arbitrary Value