Field Note · 2 min · February 8, 2026

On the Strange Honesty of Data

Every dataset is an argument about what was worth recording. We talk about data as if it were found rather than made — as if it sat in the world like ore, waiting to be mined. But someone always chose the columns. Someone decided what counted as an event, where to put the decimal, which messy reality to flatten into a clean category. By the time you load the file, the most important decisions have already been made, silently, by people you'll never meet.

Absence is information

The most honest column in any table is often the one that is missing. A churn dataset that never recorded why people left. A medical record that captures every test that was ordered and nothing about the patients who were never tested. The gaps are not neutral. They are the fossilized shape of what someone, at some point, did not think to ask.

Data doesn't lie. But it answers only the questions you already knew to ask — and stays silent about the ones you didn't.

This is the trap of being data-driven. The data will happily drive you in circles around your own assumptions, because it was collected inside those assumptions. It can sharpen a question you already have. It cannot hand you the question you failed to imagine.

Reading the negative space

So the most useful skill in working with data is not statistical. It is a kind of peripheral vision: the habit of asking what isn't here, who isn't counted, which outcome was never logged because no one expected it. The numbers tell you what was measured. The honest analyst spends just as long on what wasn't — and treats that silence not as missing data, but as data of its own.
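One way to make that habit concrete is to compare what was recorded against a list of who could have been recorded. The sketch below is only an illustration: the roster, the test records, and the helper name `never_recorded` are all hypothetical, and it assumes you can get such a roster from somewhere outside the dataset, which is precisely the part the data cannot supply on its own.

```python
# Hypothetical data: a roster of everyone who could have been tested,
# and the test records that actually exist.
enrolled = {"p01", "p02", "p03", "p04", "p05"}
test_records = [
    {"patient": "p01", "test": "lipid panel"},
    {"patient": "p03", "test": "a1c"},
    {"patient": "p01", "test": "a1c"},
]


def never_recorded(population, records, key):
    """Return the members of `population` that appear in no record."""
    seen = {record[key] for record in records}
    return population - seen


# Who is in the roster but absent from the data?
missing = never_recorded(enrolled, test_records, "patient")

# Share of the roster that the data says anything about at all.
coverage = 1 - len(missing) / len(enrolled)

print(sorted(missing))
print(f"{coverage:.0%} of the roster appears in the data")
```

The set difference is trivial. The hard part is deciding what belongs in `enrolled`, because that choice is itself an assumption about who should have been counted.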