1.1. Types and Sources of Data
Learning Outcome
Students will be able to distinguish between types and sources of data.
Sample Tasks:
Differentiate between categorical and quantitative data.
Classify data as nominal, ordinal, or ratio level and discuss the characteristics and limitations of each level.
Distinguish between a set of data and a source of data.
Our first reading, from Learning Data Science [LGN23], describes the feature types one finds in data. (It then works through some examples using the pandas library, which we learn about in Section 1.4.1.)
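As a preview of how feature types can show up in code, here is a minimal sketch in pandas. The survey table and its column names are invented for illustration and do not come from the reading. It shows one way to mark a feature as nominal (unordered categories), ordinal (ordered categories), or quantitative (numbers on which arithmetic is meaningful).

```python
import pandas as pd

# Hypothetical survey data, invented purely for illustration.
survey = pd.DataFrame({
    "blood_type": ["A", "O", "B", "O"],
    "satisfaction": ["fair", "good", "poor", "good"],
    "height_cm": [162.5, 180.0, 171.2, 158.9],
})

# Nominal: categories with no natural order.
survey["blood_type"] = pd.Categorical(survey["blood_type"])

# Ordinal: categories with a meaningful order, declared explicitly.
survey["satisfaction"] = pd.Categorical(
    survey["satisfaction"],
    categories=["poor", "fair", "good"],
    ordered=True,
)

# Ordered categories support comparisons and min/max.
highest = survey["satisfaction"].max()
at_least_fair = (survey["satisfaction"] >= "fair").sum()

# Arithmetic such as a mean only makes sense for quantitative features.
mean_height = survey["height_cm"].mean()

print(highest, at_least_fair, mean_height)
```

Declaring the order explicitly matters: without `ordered=True`, pandas treats the categories as unordered labels, and a comparison such as `>= "fair"` raises an error rather than silently falling back to alphabetical order.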
Reading Question
Your data contains a feature with possible entries “small”, “medium”, and “large”. What type is this feature?
Our second set of readings, also from Learning Data Science [LGN23], walks through some of the issues to consider when evaluating data sources and the data sets they provide.
Reading Questions
Your professor’s kid keeps track of all the Halloween candy they receive and records the following measurements for each piece:
Brand
Length in the longest direction
Sugar in grams
What are the types of each of these features?
What other features could be obtained that would be of other types?
How reliable is this data? What does it give us information about?
Further Resource
From the Python Data Science Handbook [Van16]