tl;dr: As machine learning and big data become more popular, we spend less of our time analyzing data and more of it preparing data so our computers can analyze it. So what is the main difference between the way a machine treats a number and the way a human treats a number? In a word: context.
Lately, I’ve been thinking a lot about the differences between the way humans learn and the way machines learn. I’m happy to report that I am nowhere near finding an answer, but I’ve already encountered many interesting ideas along the way. I’ll go into one of my most recent discoveries once I start making more sense of it, but for now, and for the purposes of this class, I’d rather talk about another idea I’ve been mulling over recently, regarding numbers and narratives.
Humans have a love of context, especially in the form of narration; we are almost hardwired from a young age to prefer stories to facts. Even as our species advances and we begin to realize that, at a fundamental level, most of our universe, and indeed aspects of our own behavior, operates in a seemingly ‘random’ manner, we are still keen on transforming this string of theoretically random events into causalities and correlations; separable forces and actors whose influence can be measured and weighted. However, while a computer may be able to describe how a group of points relate to each other, only a human can really describe why a group of points relate to each other (although that reasoning could be behind the first robot scientists). Simply put, what is the difference between a graph and a doodle if not context?
Every time you scroll too quickly in Excel and immediately forget what all those arrays of numbers mean, it is essentially your logic yelling at you for a frame of reference; in this case, the humble header row. Yet to a computer, a header row is a human inconvenience, an arbitrary construct to be ignored during analysis. However, even with access to the finest analytical tools, the output of such analysis is meaningless until described, at which point it transforms into a rich story of how X changed and Y deviated and Z varied. A computer does not require context to compute, but a computation without context is rarely meaningful; what is a formula if not an accurate descriptor of a data set glued to a good story?
So how do we approach the numbers we randomly encounter on the internet? Most of them come with a convenient context, a ‘$’ being the most common prefix, yet even then most numbers are prone to contextual abuse, intentional or otherwise. In this new numerical, data-driven world of ours, we toss around words like ‘a million’, ‘2 billion’, and ‘5.3 trillion’ every day, but I don’t think we often truly grasp the scale of the numbers we’re dealing with. For example:
A million seconds is 12 days.
A billion seconds is 31 years.
A trillion seconds is 31,688 years.
I believe context is fundamental to how most humans interpret data, and it’s something we need to consider more thoroughly when using figures. Tossing data in the face of our target demographic, colleagues, or shareholders instead of rationalizing it in a meaningful way detracts from its value, and asks humans to behave like the robots they aren’t. A number without a story is just a number; it is up to us to define its context.
— — —
P.S. One beautiful approach I’ve discovered recently is this Chrome Extension called Dictionary of Numbers, which turns this:
"The hurricane displaced over 100,000 people and cost an estimated $3 million in damages."
into something like this:
"The hurricane displaced over 100,000 people [≈ population of Aruba] and cost an estimated $3 million [≈ cost of 30-second Super Bowl advertisement] in damages."
It’s not always the most relevant comparison, but it’s really interesting how many numbers I encounter on an average day of browsing that benefit from a little context.
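The core idea behind the extension is simple enough to sketch: match a number against a table of human-scale reference points and append the closest one. This is only a toy illustration; the reference values and the matching rule here are my own assumptions, not the extension’s actual database or logic.

```python
import math

# Illustrative reference points (value, description) — assumed for this sketch,
# not taken from Dictionary of Numbers.
REFERENCE_POINTS = [
    (1.0e5, "population of a small city"),
    (3.0e6, "cost of a 30-second Super Bowl ad, in $"),
    (1.0e9, "roughly the number of seconds in 31 years"),
]

def add_context(value: float) -> str:
    """Annotate a number with the reference point closest in order of magnitude."""
    _, label = min(REFERENCE_POINTS, key=lambda p: abs(math.log10(value / p[0])))
    return f"{value:,.0f} [≈ {label}]"

print(add_context(100_000))  # → 100,000 [≈ population of a small city]
```

Comparing in log space rather than by absolute difference matters here: to a reader, 100,000 is “about a city,” whether it is 90,000 or 120,000.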