Touch of data and statistics


Calculator

Type calculations freely in the text area below and activate them by clicking or moving outside the input area. Enter at most one calculation per row; you can split an equation across multiple lines by enclosing it in ( ). Separate distinct elements on the same row with the ; character, but do not end a row with ;. Blank rows are ignored. The calculator uses the math.js JavaScript library. The answers (A1, A2, ...) of the calculations are printed in the output box.
Input:
Output:
(Learn more about the syntax used in math.js, especially math.parser().)

Basic concepts

Let's learn a bit about how to analyze different kinds of statistical data. This page will contain material that may be useful when you want more control over statistical information and the methods that shape the data (statistics, cost calculations, ...).

What is statistics? Statistics is about learning from data and making useful models "of the world": describing, analysing, reshaping, correlating, predicting, simulating and making causal inferences about the phenomenon in question. It is about getting control over things with the help of data. The data you have is often only a partial "image" of the real world, so you also need good concepts "above" it (theory, common sense, ...) and an understanding of, for example, how much subjectivity or objectivity influences the way the real data is generated. Remember also that statistics is not just about numbers: written text, for example, can also be analysed with certain statistical methods.

Description of the simple set of numbers

Let's generate a set of numbers, which we call the variable or data "x" (the variable could be, e.g., a monthly wage).



Summary statistics (the above variable x):
"The Extremist" measure: ?
What is the cumulative probability? The cumulative probability tells how much probability, or relative frequency mass, has accumulated at or below a certain value of the statistical variable x. In the graph above, x runs along the horizontal axis and the corresponding cumulative probability values are on the vertical axis. The x values are ordered from the lowest to the highest, and the probability grows from 0 to 100 (%) without decreasing at any point. From the graph you can read off the percentile points (e.g. 50 % = the median point). Cumulative values may not be as intuitive as their histogram or bar diagram counterpart, but they are one way to look at the distribution of x, and the cumulative form is also what determines the theoretical distribution of x (e.g. normal).
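The idea of cumulative probability can be sketched in a few lines of Python. This is only an illustration, not the code behind the graph on this page: the function names and the wage figures are made up for the example, and the input is assumed to be non-empty.

```python
def cumulative_probabilities(values):
    """Return (ordered values, cumulative percentages) for a set of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    # The i-th smallest value (counting from 1) has i/n of the observations
    # at or below it, expressed here as a percentage.
    cumulative = [100.0 * (i + 1) / n for i in range(n)]
    return ordered, cumulative


def percentile_point(values, percent):
    """Smallest observed value whose cumulative percentage reaches `percent`."""
    ordered, cumulative = cumulative_probabilities(values)
    for value, cum in zip(ordered, cumulative):
        if cum >= percent:
            return value
    return ordered[-1]


# Hypothetical monthly wages, used only as example data.
x = [2500, 3100, 1800, 4000, 2900]

ordered, cumulative = cumulative_probabilities(x)
for value, cum in zip(ordered, cumulative):
    print(value, cum)

print("median point:", percentile_point(x, 50))
```

The cumulative percentages never decrease as x grows, and the 50 % point picked out here is the middle value of the ordered data.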

Relative differences and changes

Relative difference between two numbers



The logarithmic difference or change is useful in certain situations; it is close to the traditional relative change when the change is small.
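A small Python sketch can make the comparison concrete. The function names and the example numbers are assumptions chosen for illustration; both functions assume positive inputs.

```python
import math


def relative_change(old, new):
    """Traditional relative change from old to new, as a fraction of old."""
    return (new - old) / old


def log_change(old, new):
    """Logarithmic change: natural log of the ratio new/old."""
    return math.log(new / old)


# One small change and one large change, to compare the two measures.
for old, new in [(100.0, 101.0), (100.0, 150.0)]:
    print(old, new, relative_change(old, new), log_change(old, new))
```

For the small change the two measures nearly coincide, while for the large change they clearly differ. One property that can make the logarithmic change convenient is that it is symmetric: the change from a to b is exactly the negative of the change from b to a.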

Weighting observations

Weighted average

The weighted average of a variable x means that its values are weighted in a certain way. To calculate the weighted average you need another variable, the weight variable w. The values of w must be zero or greater (>= 0), and the x and w variables must be equal in length (number of observations).


x (variable) w (weighting):
Result: weighted average (mean) of x (by w) = .
How to calculate the weighted average? In the normal (arithmetic) average each x value is weighted by the same number, 1. For example, the normal average of 2 and 3 = (1*2 + 1*3)/(1+1) = (2+3)/2 = 2.5. Written with weight symbols this is (w1*2 + w2*3)/(w1+w2), where the weights are w1 = w2 = 1. In a weighted average the weights w1, w2, ... can differ from each other, or be the same :). So the weighted average is more general than the "simple" average.
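The calculation can be sketched in Python as follows. This is a minimal illustration, not the code used by the calculator on this page: the function name is made up, and the check that the weights do not all equal zero is an added assumption, needed to avoid dividing by zero.

```python
def weighted_average(x, w):
    """Weighted average of x by w: sum(w_i * x_i) / sum(w_i)."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same number of observations")
    if any(weight < 0 for weight in w):
        raise ValueError("weights must be zero or greater")
    total_weight = sum(w)
    if total_weight == 0:
        # Assumption of this sketch: at least one weight must be positive.
        raise ValueError("at least one weight must be greater than zero")
    return sum(xi * wi for xi, wi in zip(x, w)) / total_weight


# Equal weights reproduce the normal average of 2 and 3.
print(weighted_average([2, 3], [1, 1]))  # 2.5

# Giving the value 3 three times the weight pulls the result towards 3.
print(weighted_average([2, 3], [1, 3]))  # 2.75
```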

More to come soon...