Logbook activity – discrete and continuous data
There are two types of data variable: discrete and continuous. A variable is something that can change in value; it is usually numeric, but not always. A discrete variable takes individual values that are separate and distinct from one another, so it can be counted exactly, for example shoe size or the number of siblings someone has. A continuous variable can take any value within a range, from a small number to a large one, for example height or the time in seconds an athlete takes to run a 100-metre race.
Sometimes the distinction between a discrete and a continuous variable is unclear. For example, a person’s age is a continuous variable, as someone can be 21.34342 years old if age is measured accurately. However, we don’t actually state our age to several decimal places; in this case we would say that someone is 21 rather than 21.34342 years old. There are many examples of this type of data. Another is world population, which is rounded to the nearest 100,000.
A frequency distribution is used when a large set of data is involved. It groups the data into classes (intervals or categories) and records how many values fall into each class.
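As a rough illustration of grouping data into classes, the sketch below counts how many values fall into each equal-width class of the form "lower and under upper". The function name `frequency_distribution`, the starting point of 110, the class width of 10 and the sample heights are all assumptions made up for this example; they are not the data collected for this logbook.

```python
from collections import Counter


def frequency_distribution(values, start, width):
    """Count how many values fall in each class [lower, lower + width)."""
    counts = Counter()
    for v in values:
        # Index of the class this value belongs to, counting from `start`.
        index = int((v - start) // width)
        lower = start + index * width
        counts[(lower, lower + width)] += 1
    # Return the classes in increasing order of their lower boundary.
    return dict(sorted(counts.items()))


# Made-up sample of heights in cm (illustrative only).
heights_cm = [152, 158, 163, 167, 171, 174, 176, 179, 182, 191]

for (lower, upper), count in frequency_distribution(heights_cm, 110, 10).items():
    print(f"{lower} and under {upper}: {count}")
```

Each value is counted in exactly one class, because a value equal to a class boundary always goes into the class that starts at that boundary.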
- Mode average: the value that occurs most frequently in the data set.
- Median average: the value in the middle after the data set has been ranked in increasing order (from smallest to largest).
- Quartiles: summary measures that divide a ranked data set (smallest to largest) into four equal sections. The middle quartile is the median. Approximately 25% of the values in the ranked data lie below the lower quartile and approximately 75% lie below the upper quartile. The inter-quartile range is the upper quartile minus the lower quartile.
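The measures above can be sketched with Python's standard `statistics` module. The function name `summarise` and the sample heights are made up for illustration. Note that textbooks and software use slightly different conventions for locating quartiles; this sketch relies on the default ("exclusive") method of `statistics.quantiles`, which may not match a hand calculation exactly.

```python
import statistics


def summarise(values):
    """Return the mode, median, quartiles and inter-quartile range."""
    ranked = sorted(values)  # rank from smallest to largest
    # n=4 splits the ranked data into four sections, giving three cut points.
    lower_q, _middle_q, upper_q = statistics.quantiles(ranked, n=4)
    return {
        "mode": statistics.mode(ranked),
        "median": statistics.median(ranked),
        "lower_quartile": lower_q,
        "upper_quartile": upper_q,
        "interquartile_range": upper_q - lower_q,
    }


# Made-up sample of heights in cm (illustrative only).
sample_heights_cm = [175, 160, 170, 185, 165, 170, 180]
print(summarise(sample_heights_cm))
```

For this sample the mode and the median are both 170, the lower and upper quartiles are 165 and 180, and the inter-quartile range is 15.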
This is a histogram showing the height of students (collected from first-year students at Middlesex University). A histogram is a summary graph that shows a count of the data points falling in each of a number of ranges. It approximates the frequency distribution of the data. Examples of the classes I have used for height are 110 and under 120, and 120 and under 130. Essentially a histogram is similar to a bar chart, but in a histogram the bars are joined up because it shows continuous data and represents a frequency distribution. Also, in a bar chart all the bars have the same width. In a histogram they don’t necessarily have to, because it is the area of each bar, not its height, that represents the frequency.
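Because the area of a bar represents the frequency, a bar's height in a histogram with unequal class widths is the frequency density, that is, the frequency divided by the class width. The sketch below shows this calculation; the function name `frequency_densities` and the class frequencies are made up for illustration and are not the figures behind the graph described here.

```python
def frequency_densities(classes):
    """Turn (lower, upper, frequency) classes into (lower, upper, density).

    Density = frequency / class width, so that width * density (the bar's
    area) gives back the frequency.
    """
    return [
        (lower, upper, frequency / (upper - lower))
        for lower, upper, frequency in classes
    ]


# Made-up classes with unequal widths (illustrative only).
example_classes = [(110, 130, 4), (130, 150, 10), (150, 160, 15), (160, 170, 30)]

for lower, upper, density in frequency_densities(example_classes):
    print(f"{lower} and under {upper}: bar height {density}")
```

The first two classes are twice as wide as the last two, so their bars are drawn lower than a plain count would suggest.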
One main advantage of a histogram is that it shows the shape of the distribution for a large set of data, so it is visually strong and easy to understand (as the data is continuous); this is why I chose to create a histogram of the data given to me. Its biggest disadvantage is that detail is lost because the data is grouped. Another disadvantage is that it can be very difficult to compare two data sets. This graph shows that most people’s heights are grouped around the 170-180 cm mark. Very few people are 200 cm, and very few are 130-140 cm. The graph generally rises and then falls again in a pyramid shape.
I chose height as my continuous variable. Height is a continuous variable because it can be measured to any degree of precision; any value within a range is possible for the height of an individual. A definition of a continuous variable is: “A variable that can assume any numerical value over a certain interval or intervals” (Source: textbook Introductory Statistics by Prem S. Mann). The data collected contained some unrealistic figures, so the raw data would not have been suitable for producing this graph. As a result, I deleted the unrealistic figures from the data: the heights of 50cm, 53cm, 70cm, 84cm, 260cm and 360cm.
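Removing unrealistic figures like these can be sketched as a simple range filter. The plausible limits of 100 cm and 250 cm, the function name `remove_unrealistic` and the three realistic heights in the sample are assumptions chosen for this example; the deletion described above was a judgement made on the collected data and did not necessarily use these cut-offs.

```python
# Assumed limits for a plausible adult height in cm (hypothetical values).
MIN_PLAUSIBLE_CM = 100
MAX_PLAUSIBLE_CM = 250


def remove_unrealistic(heights_cm, low=MIN_PLAUSIBLE_CM, high=MAX_PLAUSIBLE_CM):
    """Split heights into those inside and those outside the plausible range."""
    kept = [h for h in heights_cm if low <= h <= high]
    removed = [h for h in heights_cm if not (low <= h <= high)]
    return kept, removed


# The six unrealistic figures mentioned above, plus three made-up realistic ones.
raw_heights_cm = [50, 53, 70, 84, 165, 172, 178, 260, 360]

kept, removed = remove_unrealistic(raw_heights_cm)
print("kept:", kept)
print("removed:", removed)
```

Returning the removed values as well as the kept ones makes it easy to record exactly which figures were taken out.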