We Don't Have a Methodology

Boker Labs is an applied AI lab focused on software security. Maybe twice a month, someone asks us what our methodology is. It's the question we get most often and the one we have the least satisfying answer to.

The reason we don't have a satisfying answer is that methodologies are a survivor's narrative imposed on work that was, at the time, not following any methodology. Read the writeup of a finding that paid serious money. The writeup describes a clean linear path from observation to compromise. The actual work was a thousand small probes, almost all of which went nowhere. The path in the writeup is the one that turned out, after the fact, to matter.

We can describe what we do. The description isn't a methodology.

What we actually do

We read responses. Most of our findings come from looking at a response and noticing something doesn't fit. The fit isn't formalizable. It's what you have when you've read enough responses on a surface to know what they should look like, and the response in front of you is wrong in a way you can't immediately name.

We chain primitives. Most single observations don't pay. Most pairs don't either. The third observation, combined with the first two, sometimes does. We don't know in advance which observations will combine. We file them all and look at the pile.

We retest. The most reliable source of new findings we have is going back to an old confirmed finding six months later and asking whether the surrounding code drifted. It usually has. The fix that closed the original sometimes opens an adjacent one.

We run agents. We've built a small set of subagents that handle the parts of the work that don't require judgment. Enumeration. Bundle harvesting. Retest replay. Response capture. Every agent we've built has made the human work cheaper to start and harder to skip.

We talk to each other. Two researchers looking at the same target find different things, not because one is better but because the second is asking different questions. Most of our high-severity findings come out of conversations.

That's the whole list.
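
Of the items above, retest replay is the one that reduces most readily to code, so here is a minimal sketch of what such a replay step might look like. It is not our tooling. The names Finding, Response, fingerprint, and replay are hypothetical, the watched headers are an arbitrary choice, and the send callable stands in for whatever HTTP client or capture pipeline is actually in use. The sketch only reports that a response drifted from its recorded baseline. Deciding whether the drift matters is still the human part.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Response:
    """A captured response, reduced to the parts we compare."""
    status: int
    headers: Dict[str, str]
    body: bytes


@dataclass(frozen=True)
class Finding:
    """A previously confirmed finding (hypothetical record shape)."""
    name: str
    request: str          # opaque description of the request to replay
    baseline: Response    # what the target returned when the finding was confirmed


def fingerprint(resp: Response, watched_headers: Sequence[str]) -> Dict[str, str]:
    """Reduce a response to a small, comparable set of fields."""
    headers = {k.lower(): v for k, v in resp.headers.items()}
    fp = {
        "status": str(resp.status),
        "body_sha256": hashlib.sha256(resp.body).hexdigest(),
    }
    for name in watched_headers:
        fp[f"header:{name}"] = headers.get(name, "<absent>")
    return fp


def replay(
    finding: Finding,
    send: Callable[[str], Response],
    watched_headers: Sequence[str] = ("content-type", "server"),
) -> List[Tuple[str, str, str]]:
    """Replay the recorded request and list every field that changed.

    Returns (field, old_value, new_value) tuples; an empty list means
    the response still matches the baseline on the watched fields.
    """
    old = fingerprint(finding.baseline, watched_headers)
    new = fingerprint(send(finding.request), watched_headers)
    return [(key, old[key], new[key]) for key in old if old[key] != new[key]]
```

An empty result means nothing moved on the watched fields. A non-empty one is a reason for a person to go and read the new response, not a finding in itself.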

Why this is not a methodology

A methodology, in the sense the word usually carries in security, is a procedure. It produces an outcome if followed. The procedure is transferable. A new hire can read it and start producing findings.

What we do isn't a procedure. It's a set of habits applied with a particular kind of attention. The habits are easy to describe. The attention is what makes them work, and the attention is the part that doesn't transfer.

You can teach someone the habit of reading responses. You can't teach someone what "off" feels like. They get there by sitting with thousands of responses until the wrong ones become visible. There's no version of the training that compresses this. Our agents, including the frontier ones, don't have it. They get better at the habits. They don't yet get the attention.

What the industry sells instead

The security tooling market is full of products that promise to encode this kind of expertise. "Our scanner finds the bugs your researchers find." It mostly doesn't. It finds a subset, defined by patterns the authors knew about when they wrote it. The findings worth paying for are the ones that didn't match a pattern anyone had written down yet. That, in fact, is the definition of the findings worth paying for.

Tools that claim to be a methodology in a box are selling the survivor's narrative, dressed up. They produce floor-grade output. They don't produce the chain-grade findings, because chain-grade findings depend on attention that isn't codifiable.

We're not against tools. We use a lot of them. We build our own. We're against the framing that tools are a substitute for the practice the tools are supposed to support.

What we'd tell you

If you're hiring researchers, hire for the habits. The habits are interview-able. Ask the candidate to walk through a finding they're proud of. Listen for whether they describe the dead ends. The candidates who only describe the path that worked are telling you the survivor's narrative. The ones who describe the failures are showing you how they actually think.

If you're building a research practice, give it the conditions the attention requires. Long blocks of uninterrupted time. Permission to file primitives that don't pay yet. A norm of talking to each other about half-formed observations. Build the agents that handle the parts the attention doesn't need to be on. None of this looks like a methodology. All of it is what the methodology is for.

We're skeptical of anyone selling you a methodology. The thing that produces the findings isn't transferable as a document, and we don't believe the next generation of frontier models is going to change that, though part of what we're paid for is to keep checking.

The habits are real. Read responses. File primitives. Retest. Run agents on the parts the agents are good at. Talk to each other. Do those things with the kind of attention that comes from caring about the answer, and you'll eventually find what's there to find.

Boker Labs is built around that bet.