Statswork

Statistical Data Analysis

In Brief

“Statistics is the only science where the experts may come up with different conclusions from the same data”

What is Statistical data analysis?

Statistics is the branch of mathematics used to describe, summarize, and compare data. Statistical data analysis is the process of applying statistical methods to data, and it involves the collection of data, the interpretation of data, and lastly the validation of the results. Numerous statistical tools such as SAS, SPSS, and Stata are available nowadays to analyse data for problems ranging from simple to complex, depending on the nature of the study.

Types of statistical data analysis

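In the field of statistics, there are two widely used statistical methods in data analysis: descriptive statistics, which summarize the data at hand, and inferential statistics, which draw conclusions about a population from a sample.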

The table below shows the descriptive statistics of the study variables.

The figure shows the descriptive statistics of the study variables.
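As a rough illustration of what such a descriptive summary contains, the following minimal sketch computes common descriptive statistics with Python's standard library. The `describe` helper and the exam scores are hypothetical and are not taken from any study mentioned here.

```python
import statistics

def describe(values):
    """Return common descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (divides by n - 1)
        "min": min(values),
        "max": max(values),
    }

# Hypothetical sample: exam scores of eight students (made-up data)
scores = [62, 75, 68, 90, 85, 75, 70, 79]
summary = describe(scores)

for name, value in summary.items():
    print(f"{name}: {value}")
```

The mean and median describe the centre of the data, while the standard deviation, minimum, and maximum describe its spread.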

Uses of Statistics

Furthermore, statistics plays a major role in the following areas:

What does data mean in statistics?

The nature of the data plays a vital role in statistics. It is of utmost importance to identify the nature of the data before planning the analysis. Data are usually of two broad kinds: numerical data, which are either discrete or continuous, and categorical data, which are either nominal or ordinal. Almost every sampled variable belongs to one of these two groups, categorical or numerical, and they are described in the following table for easy understanding.

PMF and PDF

In theory, every statistical variable follows some probability distribution. In statistical data analysis, continuous data are described by a continuous distribution through its probability density function (pdf), whereas discrete data are described by a discrete distribution through its probability mass function (pmf). The term ‘density’ is used for continuous data because a continuous quantity cannot be counted, only measured: the probability of any single exact value is zero, so probabilities are obtained as areas under the density curve. The normal, Poisson, and binomial distributions are among the most commonly used distributions in statistical analysis.
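The difference between a pmf and a pdf can be checked numerically. The sketch below is only an illustration using the standard formulas for the binomial pmf and the normal pdf; the function names and parameter values are chosen here for the example.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials (discrete)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution at x (continuous)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# pmf values are probabilities: summed over all possible counts they give 1.
pmf_total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))

# pdf values are densities, not probabilities: the area under the curve is 1.
# Here the area is approximated by a Riemann sum over the interval [-8, 8).
step = 0.001
pdf_area = sum(normal_pdf(-8 + i * step) * step for i in range(16000))

print(f"sum of binomial pmf: {pmf_total:.6f}")
print(f"area under normal pdf: {pdf_area:.6f}")
```

Both totals come out at approximately 1, but for different reasons: the pmf is summed over countable outcomes, while the pdf is integrated over a range of measured values.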

In practice, statistical data analysis is broadly classified into two types: univariate and multivariate. If we wish to analyse data that contain only one variable, then univariate statistical analyses such as the t-test, z-test, F-test, one-way ANOVA, etc., can be performed. If the data contain two or more variables, then multivariate techniques such as factor analysis, regression analysis, discriminant analysis, etc., can be performed, depending on the nature of the study.

T-test

The t-test is a statistical test that compares the mean values of two different groups to determine whether the difference between them is statistically significant, that is, larger than would be expected from chance variation alone.
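To make the comparison concrete, here is a minimal sketch of the classical two-sample (Student's) t statistic with a pooled variance, which assumes independent groups with equal variances. The helper name `two_sample_t` and the data are hypothetical; in practice a statistical package would also report the p-value.

```python
import math
import statistics

def two_sample_t(a, b):
    """Student's t statistic for two independent samples (pooled variance).

    Returns the t statistic and its degrees of freedom.
    """
    n1, n2 = len(a), len(b)
    mean1, mean2 = statistics.mean(a), statistics.mean(b)
    var1, var2 = statistics.variance(a), statistics.variance(b)  # sample variances
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Hypothetical measurements for two groups (made-up data)
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

t_stat, df = two_sample_t(group_a, group_b)
print(f"t = {t_stat:.3f}, degrees of freedom = {df}")
```

For these made-up data the statistic is about 4.0 with 8 degrees of freedom, which exceeds the two-sided 5% critical value of roughly 2.306, so the difference in means would be judged significant at that level.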

Analysis of Variance

Analysis of Variance (ANOVA) is a method used to decide whether the mean value of a dependent variable is the same across two or more groups that are independent of each other. It does so by comparing the variation between the group means with the variation within the groups.
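The idea behind one-way ANOVA can be sketched as the ratio of between-group to within-group variation, the F statistic. The following is a minimal illustration with hypothetical data and a hypothetical helper name, `one_way_anova_f`; it computes only the F statistic and its degrees of freedom, not the p-value.

```python
import statistics

def one_way_anova_f(groups):
    """F statistic for one-way ANOVA on a list of independent groups.

    Returns (F, between-group degrees of freedom, within-group degrees of freedom).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical observations for three independent groups (made-up data)
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f_stat:.3f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F value indicates that the group means differ by more than the spread within the groups would suggest; when all group means are equal, the between-group variation and therefore F are zero.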

Apart from the above-mentioned kinds of statistical analysis, there are other important analyses that every data scientist should know. They are:

To sum up, statistical data analysis can be simplified into five steps, as follows:
