Please read up on probability distributions a little, especially the section "Common probability distributions and their applications".
Understanding how data is distributed is very important in data mining, because we need to find patterns.
The normal distribution is very useful for error detection and outlier detection.
Take the normal curve, also called the bell-shaped curve or Gaussian curve. The center of the curve has the highest probability of an event happening. In terms of data analysis, about 68% of the values fall in the central part of the curve, within one standard deviation of the mean.
For a symmetric curve, the mean, median and mode are the same and fall at the center. For a skewed distribution, they differ. We can plot a histogram to see the shape of the distribution.
Normal distribution curve showing higher probability of events happening in the center of the curve.

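Distance from the mean is measured in standard deviations (sigma): roughly 68% of the values lie within 1 sigma, 95% within 2 sigma and 99.7% within 3 sigma. Six Sigma projects build on this idea and aim at impurity reduction or error reduction, so that almost all outcomes of a process fall within acceptable limits.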
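The share of a normal distribution covered by 1, 2 and 3 sigma can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `coverage_within` is made up for this illustration.

```python
from statistics import NormalDist


def coverage_within(k, dist=NormalDist()):
    """Probability that a normal value lies within k standard deviations of the mean.

    `coverage_within` is a hypothetical helper for this post, not a library function.
    """
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    # The area under the curve between the two bounds is a difference of CDF values.
    return dist.cdf(upper) - dist.cdf(lower)


for k in (1, 2, 3):
    print(f"within {k} sigma: {coverage_within(k):.4f}")
# within 1 sigma: 0.6827
# within 2 sigma: 0.9545
# within 3 sigma: 0.9973
```

The result does not depend on the mean or standard deviation chosen, so the same percentages apply to any normal distribution.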

The image below shows what happens to the mean, median and mode when the normal curve is skewed.
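The same effect can be seen on a small data set. The sketch below uses three hand-made samples (invented for this illustration, not real data) and the standard library to compare the mean, median and mode.

```python
from statistics import mean, median, mode

# Hand-made samples, chosen only to illustrate the effect.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 9, 19]        # long tail toward large values
left_skewed = [20 - x for x in right_skewed]        # mirror image: long tail toward small values


def summarize(values):
    """Return (mean, median, mode) of a sample."""
    return mean(values), median(values), mode(values)


for name, sample in [("symmetric", symmetric),
                     ("right skewed", right_skewed),
                     ("left skewed", left_skewed)]:
    m, med, mo = summarize(sample)
    print(f"{name}: mean={m}, median={med}, mode={mo}")
```

For the symmetric sample all three values are 3. For the right-skewed sample the mean (5) sits above the median (3) and the mode (2), and for the left-skewed sample the order is reversed: mean 15, median 17, mode 18.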

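Quantile-Quantile Plot
A box-and-whisker plot, or boxplot, is a diagram based on the five-number summary of a data set (minimum, first quartile, median, third quartile and maximum), so it relies only on the median, quartile and range calculations. Skewness can also be identified with quantile-quantile (Q-Q) plots, which plot the sample quantiles against the quantiles of a reference distribution such as the normal distribution: points that bend away from a straight line indicate skew.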
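To make the Q-Q idea concrete, here is a rough sketch that computes the points of a normal Q-Q plot without drawing it. The function name `qq_points`, the sample data and the plotting positions `(i + 0.5) / n` are assumptions made for this example; plotting libraries may use slightly different positions.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Pair each sorted sample value with the matching quantile of a fitted normal.

    Hypothetical helper for illustration. Returns a list of
    (theoretical_quantile, sample_value) tuples.
    """
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    # (i + 0.5) / n is one common choice of plotting position (an assumption here).
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(data)]


sample = [1, 2, 2, 2, 3, 3, 4, 9, 19]  # invented sample with a long right tail
for theoretical, observed in qq_points(sample):
    print(f"theoretical={theoretical:7.2f}  observed={observed}")
```

If the sample were normal, the observed values would track the theoretical ones closely. In this sample both the smallest and the largest observed values lie above their theoretical quantiles, which is the curved pattern a right-skewed sample produces on a Q-Q plot.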
Positive skew means the long tail is on the right: a few unusually large, non-conforming events pull the mean upward, so the mean typically ends up above the median and the mode, which stay with the bulk of the data toward the minimum.
Negative skew means the long tail is on the left: a few unusually small, non-conforming events pull the mean downward, so the mean typically ends up below the median and the mode, which shift toward the maximum.
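The sign of the skew can also be computed directly. The sketch below uses the moment-based skewness coefficient (third central moment divided by the 1.5 power of the second central moment); the helper name `skewness` and the sample data are made up for this example, and statistics packages may apply a small-sample correction that this sketch omits.

```python
from statistics import fmean


def skewness(values):
    """Moment-based sample skewness: m3 / m2 ** 1.5 (no small-sample correction)."""
    mu = fmean(values)
    m2 = fmean((x - mu) ** 2 for x in values)  # second central moment
    m3 = fmean((x - mu) ** 3 for x in values)  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 9, 19]
left_skewed = [20 - x for x in right_skewed]

print(skewness(symmetric))     # zero for a perfectly symmetric sample
print(skewness(right_skewed))  # positive: long tail on the right
print(skewness(left_skewed))   # negative: long tail on the left
```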

The image below shows how a histogram relates to the normal distribution curve and the box plot.

The image below shows how to identify outliers from a box plot. Even in the histogram above, there are dots representing outliers.
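A box plot's outlier dots can be reproduced numerically. The sketch below assumes the widely used 1.5 × IQR rule for the whiskers (the image may use a different convention); the helper names and the sample are invented for this illustration.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) of a sample."""
    data = sorted(values)
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]


def boxplot_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is the usual convention (assumed here)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in sorted(values) if x < lower or x > upper]


sample = [1, 2, 2, 2, 3, 3, 4, 9, 19]
print(five_number_summary(sample))
print(boxplot_outliers(sample))  # [9, 19]
```

Here the quartiles are 2, 3 and 4, so the interquartile range is 2 and the fences sit at -1 and 7. The values 9 and 19 fall outside the upper fence and would be drawn as separate dots.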
Kurtosis
Kurtosis is another measure of deviation from the normal distribution, just like skewness. While skewness tells us whether the event population is shifted in the positive or negative direction, kurtosis tells us how the dispersion takes place.
Skewness breaks the symmetry of the distribution: the bulk of the population leans to one side of the central axis and a longer tail forms on the other. Kurtosis keeps the symmetry and instead redistributes the population between the center and the tails, which makes the curve flatter or sharper than the normal curve.
Put differently, skewness is a change in where the population sits, while kurtosis is a change in how the population spreads around the same center.
Skewness denotes a sideways movement, while kurtosis denotes how peaked the bell curve is and how heavy its tails are: high kurtosis means a sharper peak with heavier tails, and low kurtosis means a flatter curve with lighter tails.
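Kurtosis can be computed the same way as skewness, from central moments. The sketch below uses the moment-based excess kurtosis (fourth central moment divided by the squared second central moment, minus 3, so that a normal distribution scores 0). The helper name `excess_kurtosis` and both samples are made up for this example.

```python
from statistics import fmean


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (no small-sample correction)."""
    mu = fmean(values)
    m2 = fmean((x - mu) ** 2 for x in values)  # second central moment
    m4 = fmean((x - mu) ** 4 for x in values)  # fourth central moment
    return m4 / m2 ** 2 - 3


flat = list(range(1, 10))                        # evenly spread, no peak, no heavy tails
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 10]          # sharp peak with two extreme values

print(excess_kurtosis(flat))    # negative: flatter than the normal curve
print(excess_kurtosis(peaked))  # positive: sharper peak and heavier tails
```

Both samples are symmetric, so their skewness is zero; only the kurtosis tells them apart.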





 
 
 