Below Minimum Wage: The System We Built and the System We’re Losing

Two years before the Berlin Wall fell. It was just after 3 in the morning. At my lab bench, I was preparing samples to calculate a blood glucose curve in one of the early brain imaging studies. Across from me, my grad student colleague was extracting DNA for his work on the molecular basis of neurodegeneration. We were working in the Neuroscience Laboratory Building (now long gone). It was the former student food services building for the University of Michigan, an irony I never really got over. It was mid-winter in Ann Arbor. Slush ruled the streets. When daylight arrived in four hours, we could be sure the skies would be gray.

Suddenly, K. slammed down his pipetter and exclaimed, “I’m going to talk to the Boss tomorrow! I just figured it out, we make less than minimum wage!”

The calculation was straightforward. Our stipend was maybe $7,000 a year, with tuition covered. We worked—conservatively—60 hours a week, often more. Factor in the 3 AM sessions, the weekend tissue preparations, and the endless equipment maintenance that somehow became the grad students’ responsibility. Do the math: roughly 3,100 hours per year, $7,000 total. About $2.25 per hour—a third less than the 1987 minimum wage of $3.35—to do cutting-edge neuroscience in a converted cafeteria food prep building.
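K.'s back-of-the-envelope math is easy to reproduce. A few lines make the point; the inputs are the essay's own estimates, so treat the figures as illustrative:

```python
stipend = 7000          # annual stipend in dollars (tuition covered separately)
hours_per_week = 60     # the essay's conservative estimate
weeks_per_year = 52

hours_per_year = hours_per_week * weeks_per_year   # 3,120 hours
hourly_rate = stipend / hours_per_year

minimum_wage_1987 = 3.35
shortfall = 1 - hourly_rate / minimum_wage_1987

print(f"hourly rate: ${hourly_rate:.2f}")              # prints "hourly rate: $2.24"
print(f"below minimum wage by: {shortfall:.0%}")       # prints "below minimum wage by: 33%"
```

Rounded, that is the $2.25 per hour in the text, a third below the federal minimum.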

K. did talk to the Boss the next day. I don’t know exactly what he expected—acknowledgment, perhaps, or some explanation of how this was a temporary sacrifice for future reward, or at minimum an expression of concern about the system we were trapped in.

What he got was simpler: “Why should I worry? I’ve got a nice car. I’ve got nice clothes.”

The Divergence

K. and I responded to that moment differently, though we both understood its implications with perfect clarity.

K. finished his PhD. Then he left research entirely. He’s now a practicing radiologist—work that pays substantially more than minimum wage, has defined schedules rather than 3 AM obligations, and doesn’t require pretending that exploitation is training.

I stayed. Not because I had some moral superiority or different principles. I stayed because I was too committed to getting my doctorate at that point. I’d tried blue-collar work before graduate school, and I didn’t want to do that. The sunk costs were real—years invested, experiments underway, a thesis taking shape. Walking away would mean admitting those years at $2.25 per hour had purchased nothing.

The samples I was preparing that night at 3 AM were for quantifying local cerebral glucose utilization using autoradiography. The data would contribute to my PhD thesis on cerebral metabolic variability. It was genuinely interesting work—understanding how the brain’s energy consumption varies spatially could inform everything from imaging diagnostics to our understanding of neurological disorders.

But it was work being done by someone making $2.25 per hour, with no leverage, no bargaining power, and no alternative but quitting. The quality of the science didn’t change the economics. If anything, the importance of the work made the exploitation easier to rationalize: we were suffering for something that mattered.

K. made a rational choice. He extracted himself from a system that valued his labor at below minimum wage and found work that valued it appropriately. He’s probably had a better work-life balance, made more money, had more control over his time, and still contributed to human welfare through medical practice.

I made a different calculation. I stayed in the system, finished the PhD, did a postdoc (at similarly exploitative wages), and eventually built a career as an academic that culminated in serving as NSF’s Assistant Director for Biological Sciences. I went from the bottom of the exploitation to administering the funding system that perpetuated it.

The View from the Other Side

Fast forward to 2014-2018. I’m now at NSF, overseeing hundreds of millions in biological research funding. I visit grant review panels regularly—watching as distinguished scientists evaluate proposals, debate scientific merit, and argue about which projects deserve support in a constrained budget environment.

And the panelists complain. Not about the science—they’re excited about the research. They complain about the funding decisions: why do we fund these projects and not others? Why these amounts? Why can’t we support more graduate students? Why are stipend levels what they are?

I’m sitting there thinking about 3 AM in 1987, about K.’s calculation, about the Boss’s nice car and nice clothes. And I’m the one explaining the constraints now. Limited budgets. Many worthy proposals. Tough choices. The same justifications, delivered more professionally than “why should I worry,” but fundamentally the same message: the system is what it is.

The irony wasn’t lost on me. I remembered pipetting at 3 AM. I remembered the calculation. I remembered the casual indifference to exploitation. And now I was administering fundamentally the same system, just with better rhetoric.

Here’s what hadn’t changed in those 27 years: the basic model of graduate STEM training still rested on extracting maximum labor at minimum cost, justified as “training” rather than employment. Stipends had risen nominally but not dramatically in real terms. The hours hadn’t decreased—if anything, competitive pressure had intensified. The power imbalance remained: PIs controlled everything, and students had no recourse.

If I could have redesigned the system from scratch, I would have created something different: fewer graduate students, higher wages, and much better mentoring. Quality over quantity. Living wages over exploitation. Professional development over just-in-time labor.

But that’s not what happened. Instead, the system expanded. More grad students, more postdocs, more soft-money positions, all built on the same below-minimum-wage foundation, just scaled up. We produced more PhDs chasing fewer permanent positions, intensifying competition at every level.

Why did it persist? Because it worked—not for the individuals trapped in it, but for the system itself. The model produced science. Papers got published. Grants got renewed. PIs advanced. Institutions collected overhead. The fact that it ran on exploitation was a feature, not a bug. It selected for people willing to accept it (like me) and filtered out those who wouldn’t (like K.).

And those of us who accepted it, who succeeded despite it, who rose through it—we administered it. We knew it was broken. We’d done the math ourselves. But we had competing obligations: limited budgets to allocate, scientific priorities to balance, and institutional constraints to navigate. Fixing the exploitation model wasn’t in our remit. Our job was to distribute resources within the system as it existed.

The System’s Logic

The defense of graduate student stipends—if anyone bothered to make one explicitly—would go something like this:

“It’s training, not employment.” Students are learning, not working. The stipend is support to enable education, not compensation for labor. Never mind that the “training” produces publishable research, grant-supported data, and intellectual property that belongs to the institution. Never mind that without graduate student labor, most academic research would halt.

“Everyone goes through it.” This is the initiation ritual, the paying of dues, the sacrifice that earns you entry to the profession. I suffered at $2.25 per hour; the Boss probably suffered at similar rates, and you suffer too. The hazing justifies itself through tradition.

“The payoff comes later.” Yes, current compensation is terrible, but you’re investing in future earnings. The PhD opens doors. Except that it doesn’t, not reliably. The academic job market is brutal. Industry positions often don’t require a PhD. And many of those doors lead to postdocs—more exploitation at slightly higher rates.

“You’re doing what you love.” This is the passion tax: because you find the work intrinsically rewarding, because you’re intellectually engaged, because you care about the science, you should accept compensation far below market value. Your enthusiasm is exploitable.

“The alternative is worse.” No funding means no graduate programs means no research training means no next generation of scientists. We’re doing the best we can with limited resources. Which might be true, but doesn’t change the mathematical reality: $2.25 per hour is exploitation regardless of budget constraints.

None of these arguments would satisfy an outside observer. They barely satisfied those of us inside the system. But they were sufficient to maintain the equilibrium because both sides had reasons to accept it. Students needed credentials. PIs needed labor. Institutions needed productivity. Everyone was complicit.

The system persisted because it was stable—not fair, not optimal, but stable. An equilibrium based on asymmetric power: PIs had alternatives (they could recruit new students), students didn’t (switching programs meant losing years of work). That asymmetry meant PIs could extract labor at $2.25 per hour, and students would accept it.

K.’s confrontation with the Boss revealed this clearly. The Boss wasn’t defending the system or explaining its necessity. He was simply observing that it didn’t affect him negatively. Nice car. Nice clothes. Why should he worry? The graduate students’ misery wasn’t his problem.

That’s the logic of exploitation: those who benefit from it don’t experience its costs, so they have no incentive to change it. And those who bear the costs have no power to change it. The system perpetuates.

The International Contrast

It’s worth noting that this isn’t how all countries approach graduate STEM training.

In Germany, PhD students are employees with contracts, salaries, and benefits. They’re part of the research staff and are compensated as such. The fiction of “training not employment” doesn’t work there—if you’re doing research work, you’re paid for research work.

When I’d present at international conferences during my NSF tenure, European colleagues would sometimes ask about American graduate training. When I explained the stipend levels and working conditions, the response was consistent surprise. “How do your students survive?” they’d ask.

The answer: barely, and many don’t.

The American model—long programs, low stipends, no benefits, complete PI control—isn’t universal. It’s a choice, defended by inertia and rationalized by those who succeeded within it. Other countries produce excellent science without requiring graduate students to work for sub-minimum wages. We could too, if we wanted to.

The Berlin Wall Moment

Two years before the Berlin Wall fell, K. and I were pipetting at 3 AM. The Wall seemed permanent then—an ugly fact of geopolitics, stable if not good. Systems that appear unshakable can collapse suddenly when their contradictions become unsustainable.

We’re in that moment now with American science.

The 2025 funding cuts aren’t routine budget tightening. They’re not temporary political fluctuations that will reverse with the next election. They represent something different: a fundamental questioning of the compact between government and science that has sustained American research since Vannevar Bush’s endless frontier.

More than 7,800 grants canceled or suspended at NIH and NSF. Billions in unspent funds frozen. Thousands of researchers terminated or leaving the country. Universities cutting graduate admissions, eliminating postdoc positions, restructuring programs. The infrastructure we spent 75 years building is being dismantled.

And here’s the uncomfortable question: Should we fight to rebuild it exactly as it was?

That system—the one now under assault—was the system where graduate students made $2.25 per hour, where the Boss had a nice car and nice clothes and didn’t worry, where exploitation was rationalized as training, where we produced too many PhDs for too few jobs and called it a pipeline problem rather than a design flaw.

The system produced important science. My thesis work on cerebral metabolic variability contributed to understanding brain function. K.’s work on neurodegeneration might have led somewhere if he’d stayed. The research mattered. But it was built on exploitation, and everyone involved understood and accepted that.

Now external force is breaking the system. Not because we collectively decided to reform it. Not because we recognized its flaws and chose differently. But because political power decided that science funding was a convenient target for leverage and cuts.

The question facing us isn’t whether the cuts are bad—they are. It’s not whether we should oppose them—we should. The question is: when we argue for restoration of science funding, what are we arguing to restore?

The System We Could Build

If we’re going to rebuild American science from this moment of crisis, we could choose differently.

Fewer graduate students, better compensation. Instead of admitting cohorts of 20 students to work as cheap labor, admit cohorts of 10 and pay them living wages. Fund fewer projects but fund them properly. This would require PIs to do more of their own work or hire professional staff, which would be appropriate, since it’s their research program.

Limited time-to-degree with guaranteed support. If a PhD genuinely takes five years, fund all five years from admission. No scrambling for RA positions. No anxiety about whether your PI’s grant will renew. No leverage for PIs to extract extra years of cheap labor by withholding degrees.

Professional development as a core mission. Graduate programs should be about training the next generation of scientists, not just producing data for current PIs. That means mentoring, career development, and skill-building beyond bench work. It means treating students as early-career professionals, not disposable labor.

Portable funding. Rather than money going to PIs who then allocate it to students, fund students directly through fellowships and training grants. This shifts power dynamics—students choose labs based on training quality, not desperation for any funding source.

Employment status with benefits. Stop the fiction that graduate students are just students. They’re researchers doing work that produces value. Compensate them as such, with real salaries, health insurance, retirement contributions, and labor protections.

Honest accounting of opportunity costs. A PhD takes 5-7 years, which are prime earning years. The compensation should reflect that cost. If we can’t afford to pay graduate students fairly, maybe we shouldn’t be running programs that require exploiting them.

This isn’t radical. It’s how many other countries already operate. It’s what we could build if we chose to prioritize quality over quantity, people over productivity, and sustainability over short-term extraction.

But building this requires admitting that the old system was fundamentally flawed, not just under-resourced. It requires PIs to accept they can’t run labs of 15 people on the cheap. It requires universities to acknowledge that graduate programs shouldn’t be profit centers via overhead. It requires funding agencies to insist on fair labor practices as grant conditions.

Most of all, it requires breaking the cycle where those of us who succeeded by enduring exploitation then administer systems that perpetuate it. The fact that we survived at $2.25 per hour doesn’t make it acceptable. The fact that we built careers despite the system doesn’t mean others should have to do the same.

The Reckoning

I’m still in touch with K. He’s doing fine—radiologists make good money, have reasonable schedules, and contribute meaningfully to patient care. He saw the system clearly, did the math, confronted the Boss, got an honest answer, and made a rational choice to exit.

I made a different choice. I stayed. I succeeded. I administered. And now I’m watching the system I succeeded within face potential collapse, and I’m wrestling with complicated feelings about that.

There’s grief—genuine grief—for what’s being lost: brilliant research programs shut down mid-stream, talented scientists leaving the country, graduate students whose training is disrupted. The accumulated infrastructure of American scientific excellence is under assault.

But there’s also—if I’m honest—something else. A recognition that the system we’re grieving was deeply flawed. That its excellence was built on exploitation. That those of us who rose through it had obligations to fix it, and we didn’t. We knew better—K.’s calculation proved we knew better—but knowing better didn’t translate to doing better.

When we fight to restore science funding—and we should fight—we need to be clear about what we’re fighting for. Not restoration of the exploitation model. Not rebuilding the $2.25-per-hour wage. Not recreating the power imbalances that let PIs accumulate nice cars and nice clothes while graduate students pipetted at 3 AM.

We should be fighting for something better: a system that produces excellent science while treating the people who produce it as valuable professionals rather than exploitable labor. A system where the next generation doesn’t have to choose between career aspirations and basic dignity. A system where doing the math doesn’t lead to the conclusion that you’re being exploited, because the math actually works out fairly.

What I Would Tell My Younger Self

If I could go back to that 3 AM moment in 1987, what would I say?

I wouldn’t tell younger-me to quit. The PhD mattered. The work mattered. The career I built was meaningful. I don’t regret staying.

But I would tell younger-me that K. was right. Not just about the $2.25 per hour—that was obviously correct mathematically. But about the fundamental point: the system was designed to extract maximum value while providing minimum compensation, and that design wasn’t accidental or temporary or likely to change through individual complaints.

I would tell younger-me that succeeding within an exploitative system doesn’t validate the system. That making it to the other side doesn’t mean the journey was necessary or appropriate. That responsibility comes with having survived—responsibility to change things for those who come after.

I would tell younger-me to remember that moment, that calculation, that casual indifference, and to let it inform every decision about how science should be organized and funded and sustained. That when you have power later, you use it differently than it was used against you.

And I would tell younger-me that systems that seem permanent—like the Berlin Wall, like the graduate student exploitation model—can collapse suddenly when their contradictions become unsustainable. That the question is always what we build afterward, whether we repeat the same mistakes or choose differently.

The Choice Ahead

We’re at that Berlin Wall moment now. The old system is breaking. What comes next is undetermined.

We could fight to restore exactly what we had: the funding levels of 2024, the program structures we’re familiar with, the career paths we know. We could rebuild the $2.25-per-hour model, just with better marketing and more rhetoric about the nobility of sacrifice for science.

Or we could acknowledge that the crisis creates opportunity. That when systems break, we can build better ones. That American science doesn’t have to rest on exploitation to produce excellence.

Nearly four decades after K. slammed down his pipetter and did the math, the system he priced out is facing its reckoning. Those of us who survived it, who succeeded within it, who administered it—we bear responsibility for what comes next.

We can rebuild exploitation with better PR. Or we can build something actually better.

K. figured out we made less than minimum wage. The Boss explained why that didn’t matter to him. And the system rolled on for nearly four decades.

It won’t roll on much longer. The question is what replaces it.

When we rebuild American science—and we will rebuild it—we should build it for people like K. and younger-me, not for people like the Boss. We should build it so the math works out differently. So the response to “we make less than minimum wage” is horror and reform, not nice cars and nice clothes.

The Berlin Wall fell. The system breaks. What we build next is our choice.

Let’s choose better.

The Hypothesis Trap

When Scientists Fall in Bad Love With Their Own Ideas

Approximately four decades ago, I became a witness in a scientific misconduct case. The charges had been brought by an international postdoc in the lab where I had also worked before moving on, and I cannot remember many of the details, except that my written testimony stated that I knew nothing. But I do remember, in the context of more recent high-profile cases, that the essence of the accusation then was the same as it is now: altering experimental data to support the ‘party line’.

The recent disruption to American science has been extensively documented. Given how deeply intertwined government research dollars are with the budget models for R1 universities and the large academic medical centers, it’s not surprising that those funds were chosen for their leverage, and that the consequence of their being in jeopardy will profoundly alter the course of pursuing Vannevar Bush’s version of the endless frontier.

But I want to explore a different question raised by that long-ago case. When I recall that the essence involved “altering data to support the party line,” I need to ask: whose party line was it? In that case, and in many since, the party line wasn’t imposed by some external authority. It was the PI’s own hypothesis, their pet theory, the idea they’d invested years in developing and defending. The fraud wasn’t about serving power—it was about rescuing a cherished belief from contradictory evidence.

This raises uncomfortable questions about how we organize biomedical research. The current system—hypothesis-driven projects led by individual PIs who develop deep attachments to specific ideas—contains structural flaws that push even honest scientists toward motivated reasoning and occasionally push the dishonest ones past the line into fraud.

The Romantic Model of Science

Our funding system enshrines a particular vision of how science works. A brilliant investigator conceives a hypothesis. They design clever experiments to test it. They write a compelling grant proposal. If funded, they spend 3-5 years testing their idea. Success means publishing papers that confirm the hypothesis, which leads to more grants to extend the work.

This model has romantic appeal. It positions the PI as the creative genius whose insight drives discovery. It makes science a battle of ideas where the best hypotheses prevail. It creates clear narratives: an investigator proposes a theory, designs experiments to test it, and demonstrates it is correct. This is how we teach science, how we write about it in popular accounts, how we celebrate it in awards and prizes.

The problem is that this romantic model creates precisely the conditions under which fraud becomes tempting and honest self-deception becomes nearly inevitable.

When a Hypothesis Becomes an Identity

Here’s what happened in numerous misconduct cases from the 1980s onward: A researcher develops a hypothesis. It’s not just any hypothesis—it’s their hypothesis, the idea that defines their research program, the theory that distinguishes them from competitors. They build a laboratory around it, recruit students and postdocs to test it, and write grants that promise to extend it.

The hypothesis becomes their professional identity. Colleagues know them as “the person who works on that theory.” Graduate students join their lab specifically to work on that problem. Papers in high-impact journals describe their unique contribution. Tenure committees evaluate whether the hypothesis has generated sufficient publications. Grant review panels judge whether the approach is likely to continue producing results.

Then experiments start yielding contradictory data. Not every experiment—if every experiment failed, the researcher might abandon the hypothesis. But enough experiments yield ambiguous or contradictory results that a careful scientist should begin to question the core idea.

This is where the system’s design creates problems. Walking away from the hypothesis means walking away from professional identity, from grants that depend on that research program, from students and postdocs whose projects are built on that framework. It means admitting that years of work may have been directed toward the wrong question. It means watching competitors promote alternative theories.

The pressure isn’t external—nobody is ordering the researcher to maintain their hypothesis. The pressure is structural, built into how we organize careers and evaluate success. When your identity, your lab’s funding, and your scientific reputation all depend on a particular idea being correct, it takes extraordinary intellectual honesty to acknowledge that idea might be wrong.

On the Spectrum: From Delusion to Fraud

Most scientists don’t fabricate data. But many engage in practices that fall short of fraud while still distorting the scientific record. These practices stem from the same structural problem: excessive investment in a specific hypothesis.

Selective reporting occurs when experiments yielding inconvenient results are dismissed as “technical problems,” whereas experiments supporting the hypothesis are published. The researcher isn’t fabricating data—they’re making judgments about which data are “good.” But those judgments are biased by investment in the hypothesis.

Data massaging occurs when researchers make analytical decisions that favor their theory. Which outliers to exclude? How to set cutoffs? Which statistical tests to use? Each decision seems defensible individually, but collectively, they bias results toward the preferred outcome. Again, this isn’t fabrication—it’s motivated reasoning dressed up as methodological choice.

Hypothesis rescue manifests as increasingly elaborate explanations for why experiments that should have supported the theory failed. Maybe the conditions weren’t quite right. Maybe there’s an additional factor we didn’t control for. Maybe the effect is context-dependent. Some auxiliary hypotheses are legitimate scientific refinements. Others are epicycles added to save a failing theory.

Selective collaboration and citation appear when researchers preferentially cite papers supporting their view while ignoring contradictory work. They collaborate with scientists who share their hypothesis, while avoiding those who promote alternatives. This creates echo chambers where a contested theory looks like a consensus because the believers only talk to each other.

These practices aren’t fraud in the legal sense. They’re what happens when intelligent, well-meaning scientists become too invested in particular ideas. The investment doesn’t require conscious dishonesty—it just requires the normal human tendency to see what we expect to see, to value evidence confirming our beliefs more highly than evidence challenging them.
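The statistical effect of selective reporting is easy to demonstrate with a toy simulation. The sketch below is illustrative only (the threshold and sample sizes are arbitrary choices, not drawn from any real study): run many experiments on a true effect of exactly zero, "publish" only the runs that clear a favorable cutoff, and the published literature shows an effect that was never there.

```python
import random
import statistics

random.seed(1)

def run_experiment(true_effect=0.0, n=30):
    """Simulate one experiment: the mean of n noisy measurements."""
    return statistics.mean(random.gauss(true_effect, 1.0) for _ in range(n))

# Run 1,000 experiments where the true effect is zero.
results = [run_experiment() for _ in range(1000)]

# Selective reporting: only runs whose observed effect clears an
# arbitrary positive threshold make it into the "published" record.
published = [r for r in results if r > 0.3]

print(f"mean effect, all runs:       {statistics.mean(results):+.3f}")
print(f"mean effect, published runs: {statistics.mean(published):+.3f}")
```

No single data point is fabricated; the bias lives entirely in which runs are reported. The full set of runs averages near zero, while the "published" subset averages well above it.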

The Cases We Remember

The 1980s wave of misconduct cases illuminates this pattern. Take John Darsee at Harvard Medical School. His fraudulent cardiology research wasn’t random fabrication—it was data manufactured to support his ongoing research program. He was so invested in demonstrating that his approach worked that he fabricated results when experiments didn’t cooperate. His extraordinary productivity should have raised red flags, but it fit the romantic model: the brilliant investigator producing breakthrough after breakthrough.

The Baltimore affair involved Thereza Imanishi-Kari’s immunology data that Margot O’Toole couldn’t replicate. The decade-long controversy ended in 1996 when an appeals board cleared Imanishi-Kari of all misconduct charges. But the case revealed how competing interpretations of the same data can arise when different investigators bring different assumptions to their analysis, and how difficult it becomes to distinguish between legitimate scientific disagreement and potential misconduct when researchers are deeply invested in their theories.

Eric Poehlman’s obesity research fraud—falsifying data in 17 grant applications and 10 publications—followed the same pattern. He had a research program, a reputation, and a stream of funding dependent on showing that his hypotheses about aging and obesity were correct. When data didn’t cooperate, he made them cooperate.

The common thread isn’t that these individuals were uniquely evil. It’s that they were operating in a system where too much depended on specific hypotheses being correct. The same pressures that led them to commit fraud push others into questionable practices and drive everyone toward motivated reasoning.

The Structural Alternative: Team Science

Consider how differently science works in fields that have moved away from the PI-centered hypothesis-driven model.

Large-scale genomics operates with diverse teams interrogating datasets rather than testing specific hypotheses. The question isn’t “Is my theory correct?” but “What patterns exist in these data?” Multiple investigators with different backgrounds and biases analyze the same datasets. Results require replication across labs. The data-sharing infrastructure enables other groups to independently verify findings.

Nobody’s career depends on a specific gene being associated with a particular disease. If your analysis suggests gene X matters but another team’s analysis contradicts that, there’s no professional catastrophe. You’re contributing to collective understanding rather than defending personal theories.

The BRAIN Initiative that I helped launch during my tenure at NSF was designed in part to avoid the hypothesis trap. Rather than funding individual PIs to test specific theories about brain function, it funded tool development, data collection, and infrastructure that multiple investigators could use. The bet was that understanding the brain required comprehensive data and analytical capabilities, not just clever hypotheses.

This doesn’t eliminate all bias—researchers still have preferences about which tools to develop or which brain regions to map. But it reduces the intense personal investment in any particular theory about how the brain works. The focus shifts from testing hypotheses to building shared resources.

Particle physics has worked this way for decades. Nobody at CERN builds a career on predicting a specific particle will or won’t be found. The infrastructure supports collective inquiry. Results require consensus across large collaborations. Data are shared immediately. Multiple teams analyze the same detector output.

Can you imagine a particle physicist fabricating Higgs boson data? The system makes it nearly impossible—not because particle physicists are more ethical, but because the organizational structure distributes both credit and accountability across large teams working with shared data.

The Biomedical Research Counterfactual

What would biomedical research look like if we designed it to minimize the hypothesis trap?

Separation of hypothesis generation from testing. One team develops theories and predictions. A different team, with no stake in the theory’s success, conducts the experiments. The testing team is rewarded for rigorous methods and clear results, not for confirming or refuting specific hypotheses. This isn’t unprecedented—clinical trials often use this model, with statisticians who haven’t seen interim results conducting final analyses.

Registered reports and pre-registration. Require researchers to specify hypotheses, methods, and analyses before collecting data. Journals commit to publishing based on methodological quality, not results. This removes the temptation to massage data because publication is already guaranteed. The researcher benefits from doing careful work, not from obtaining specific results.

Adversarial collaboration. When competing theories exist, fund collaborations between proponents to design jointly agreed-upon decisive tests. Each side specifies in advance what results would falsify their theory. The collaboration is rewarded for clarity and rigor, not for one side winning.

Collective attribution and team leadership. Move away from the PI model toward team leadership with distributed authority. Make it normal for multiple investigators to share senior authorship without hierarchical ordering. Reward contributions to collective projects, not just defending personal theories. This reduces the intensity of individual investment in specific hypotheses.

Diverse parallel approaches. Rather than funding one investigator to test one hypothesis over five years, fund multiple teams to simultaneously test competing hypotheses. Make this explicit: “We think question X is important but don’t know which of three theories is correct, so we’re funding all three approaches.” The field benefits from comparative testing; individual investigators aren’t catastrophically invested in one answer.

The Objections

These suggestions will provoke immediate resistance, much of it justified. The romantic model of science—brilliant individual investigator pursuing visionary ideas—isn’t entirely fiction. Great insights do come from individuals. Breakthrough theories do require conviction to pursue against skepticism. Hypothesis-driven research has produced genuine discoveries.

Moreover, team science and collective approaches have their own challenges. Large collaborations can become bureaucratic. Consensus-building can delay needed action. Distributing credit across many people may reduce individual incentive for excellence. Pre-registration can be gamed by enrolling multiple studies and selectively reporting which ones to complete.

The adversarial collaboration model assumes good faith from competing investigators, which isn’t always present. Separating hypothesis generation from testing may slow progress if the best experiments require an intimate understanding of the theory. Distributed leadership creates coordination problems.

These are real concerns. I’m not arguing for the complete abandonment of hypothesis-driven research or the PI model. But I am arguing that we’ve over-indexed on one way of organizing science—a way that creates predictable problems around motivated reasoning and hypothesis attachment—without seriously considering alternatives that might mitigate those problems.

The Incentive Redesign

The deeper issue is incentive structure. We reward:

  • Publications in high-impact journals (which prefer dramatic confirmations of interesting hypotheses)
  • Grant funding (which requires convincing reviewers you’re pursuing important ideas likely to yield results)
  • Citations (which accumulate for papers making strong claims, not for careful null results)
  • Awards and prizes (which celebrate breakthroughs, not rigorous refutations)
  • Tenure and promotion (based on establishing an independent research program—meaning a distinctive hypothesis)

Each incentive encourages researchers to develop strong attachments to specific theories. The scientist who carefully tests a hypothesis, finds ambiguous results, and concludes, “This is more complicated than we thought,” doesn’t thrive under these incentives. The scientist who generates a provocative theory, designs experiments to support it, and publishes dramatic results thrives—even if the theory is ultimately wrong.

We could design different incentives:

  • Reward rigorous replication attempts
  • Fund adversarial collaborations that test competing theories
  • Celebrate careful negative results that prevent the field from pursuing dead ends
  • Promote scientists who change their minds when evidence demands it
  • Value contributions to infrastructure and methods that enable collective progress

None of this is unprecedented. Clinical trial statisticians build careers on methodological rigor, not therapeutic breakthroughs. Methods developers in genomics gain recognition for creating tools others use. Psychology researchers are valued for independently testing whether published findings hold up.

The question is whether biomedical research, more broadly, is willing to diversify its incentive structures and organizational models. The field is enormously successful—NIH funding, breakthrough therapeutics, extended lifespans. Why change a winning formula?

Back to That 1980s Case

The postdoc who brought misconduct charges understood something important: when data are being altered to support “the party line,” someone needs to object. That takes courage—postdocs are vulnerable, whistleblowers face retaliation, and questioning senior scientists is risky.

But here’s what I’ve come to understand that I didn’t fully appreciate forty years ago: the party line wasn’t imposed from outside. It emerged from structural features of how we organize research. The PI who allegedly manipulated data wasn’t serving some external master. They were serving their own hypothesis, the idea they’d built a career around, the theory their lab existed to develop.

That makes the problem both worse and better than simple corruption. Worse, because it means well-meaning scientists with good intentions can slide into questionable practices without recognizing it. The same motivated reasoning that drives fraud also drives less dramatic but equally problematic biases in how we collect, analyze, and report data.

Better, because it means organizational redesign might help. We can’t eliminate human fallibility or the emotional attachment scientists develop to their ideas. However, we can design systems that reduce the extent to which outcomes depend on any particular hypothesis being correct. We can create structures where admitting you were wrong is professionally survivable. We can reward rigor over drama, collective progress over individual breakthroughs.

The Path Forward

I’m not optimistic about radical transformation. The biomedical research enterprise is vast, successful, and institutionally entrenched. The romantic model of the lone investigator testing brilliant hypotheses is deeply embedded in how we tell science stories, train graduate students, and allocate prestige.

But incremental change is possible:

Funding agencies can require pre-registration for hypothesis-driven research while also funding more exploratory, team-based approaches. NIH’s BRAIN Initiative and precision medicine programs already point in this direction. Expanding these models would diversify how research gets organized.

Journals can mandate data sharing and the use of registered reports. Some journals already do this; others resist for fear of losing exciting submissions to competitors. But collective action could shift norms. If high-impact journals required rigorous transparency, researchers would adapt.

Universities can broaden tenure criteria to value methodological rigor, replication, infrastructure development, and collaborative contributions, alongside traditional metrics of independent research. This requires courage because it means promoting faculty who don’t fit the standard template, but it’s feasible.

Training programs can teach critical evaluation of one’s own hypotheses. Rather than just training students to design clever experiments and write compelling grants, we can teach them to actively look for ways they might be wrong, to value evidence against their theories, and to see changing one’s mind as a strength rather than a weakness. This is partly cultural, partly structural.

Funders can experiment with alternative models. Fund some research explicitly as adversarial collaboration. Fund some as team science with distributed leadership. Fund some as infrastructure development. Create parallel tracks so researchers can build careers through multiple pathways, reducing the pressure to develop intense attachment to specific hypotheses.

None of this will eliminate fraud—there will always be individuals who cheat. However, it might reduce the structural pressures that push honest scientists toward motivated reasoning and, in some cases, toward outright fabrication.

Integrity Is More Than Honesty

That 1980s case I barely remember continues to inform my thinking, not because I have clear memories of it but because it captures something essential: scientific integrity requires more than individual honesty. It requires organizational structures that don’t push even honest people toward biased reasoning.

The postdoc filing charges was practicing integrity. But they were fighting against a system where a PI’s attachment to their hypothesis created pressure—probably unconscious, probably rationalized, but pressure nonetheless—to make the data fit the theory. One brave postdoc can’t fix structural problems alone.

We’ve built an enormously productive research enterprise. Biomedical science has achieved genuine miracles. The hypothesis-driven, PI-centered model has generated breakthrough after breakthrough. I’m not arguing it’s failed—clearly it hasn’t.

However, I argue it’s flawed in predictable ways. The same features that make it successful—individual investigators developing strong convictions about important ideas and pursuing them relentlessly—also create conditions for motivated reasoning, questionable research practices, and occasional fraud.

Acknowledging those flaws doesn’t diminish the achievements. It opens space for experimentation with alternative models that might reduce the problematic incentives while preserving the creative energy that drives discovery. The question is whether we’re willing to diversify how we organize research or whether we’ll continue over-relying on a single model because it’s familiar and has worked in the past.

The endless frontier that Vannevar Bush envisioned shouldn’t be endless in just one direction. It should include exploring different ways of pursuing knowledge, different structures for organizing inquiry, and different incentives for rewarding contributions to collective understanding.

That’s the real challenge: not just preventing fraud but creating systems where the pressures toward fraud—and toward less dramatic but equally problematic biases—are reduced. Where changing your mind based on evidence is professionally rewarded rather than punished. Where attachment to ideas is balanced by commitment to collective truth-seeking.

The party line that worries me most isn’t imposed by political power. It’s the party line we impose on ourselves when we become too attached to our own hypotheses, when our professional identities become too entangled with specific theories, when the systems we’ve built make admitting error too costly. We need to extend that understanding, objecting not only to individual fraud but also to the organizational structures that make such fraud more likely, and building not just oversight systems but alternative models of how to pursue science.

That’s the integrity challenge for the next forty years.

When Agencies Collaborate: What EEID Teaches Us About Pandemic Preparedness

The research team moved carefully through the forest canopy platform at dusk, nets ready. In Gabon and the Republic of Congo during the mid-2000s, international ecologists were hunting for the reservoir host of Ebola virus. They targeted fruit bat colonies—hammer-headed bats, Franquet’s epauletted bats, little collared fruit bats—collecting blood samples and oral swabs.

By December 2005, they had their answer, published in Nature. They’d found Ebola RNA and antibodies in three species of fruit bats across Central Africa. For years, scientists had known Ebola emerged periodically, but couldn’t identify where the virus persisted between human epidemics. This research provided the answer: fruit bats, widely distributed and increasingly in contact with humans as deforestation pushed people deeper into forests.

Thanks for reading sciencepolicyinsider! Subscribe for free to receive new posts and support my work.

That discovery triggered a wave of follow-up research, much of it funded through the Ecology and Evolution of Infectious Diseases program—EEID—a joint NSF-NIH-USDA initiative I would later help oversee. EEID-funded teams documented how human activities created spillover opportunities: bushmeat hunting, agricultural expansion into bat habitat, mining operations bringing workers into forests. They identified cultural practices that facilitated transmission: burial traditions, preparation of bushmeat, children playing with dead animals. They built mathematical models of how Ebola moved from bats to humans and then through human populations. The science showed where Ebola lived, how it spilled over, and which human behaviors created risk.

Yet nine years after that initial Nature paper—after years of EEID-funded research mapping Ebola ecology—the virus emerged in Guinea in late 2013 and was identified in March 2014. A two-year-old boy, likely exposed through contact with bats, became patient zero. Within months, the outbreak had spread to Liberia and Sierra Leone. By 2016, more than 28,000 people were infected and 11,000 died. The economic impact exceeded $2.8 billion.

I was leading NSF’s Biological Sciences Directorate at the time, overseeing NSF’s role in EEID. We had funded years of follow-up research. We knew fruit bats harbored Ebola. We had models for predicting transmission. We had mapped high-risk regions. And yet 11,000 people died anyway. All of this foreshadowed what would happen later, on a much larger scale, with SARS-CoV-2.

Here is the uncomfortable question I’ve been wrestling with ever since: If we funded the right science and had years of warning, why were we not better prepared?

What EEID Was Supposed to Do

EEID launched in 2000 because infectious disease ecology fell between agency missions. NSF supported ecology but wasn’t focused on disease. NIH funded disease research but wasn’t equipped for field ecology. USDA cared about agricultural diseases but not the broader ecological context. The program brought all three together: NSF’s ecological expertise, NIH’s disease knowledge, and USDA’s understanding of agricultural-wildlife interfaces.

The administrative structure was elegant on paper. All proposals submitted through NSF underwent joint review by all three agencies, and then any agency could fund meritorious proposals based on mission fit. For Ebola research, this meant NSF might fund the bat ecology, NIH’s Fogarty International Center might support the human health surveillance component, and USDA might fund work on bushmeat practices—different pieces of the same puzzle, coordinated through a single program.

The program typically made 6-10 awards per year, totaling $15-25 million across agencies. Not huge money, but enough to support interdisciplinary teams working across continents. And it worked—EEID funded excellent science at the intersection of ecology and disease that no single agency could have supported alone.

Why Interagency Collaboration Is Genuinely Hard

When I arrived at NSF in 2014 with the outbreak at its peak, I inherited EEID oversight and quickly discovered that elegant-on-paper doesn’t mean easy-in-practice. The deepest challenges weren’t administrative—they were cultural.

NSF and NIH approach science from fundamentally different starting points. NSF’s mission is discovery-driven basic research. When NSF reviewers evaluate proposals, they ask: Is this important science? Will it advance the field? NIH’s mission is health-focused and translational. NIH reviewers want to know: Will this help prevent or treat disease? What’s the public health significance?

I saw this play out in a particularly contentious panel meeting around 2016. Our panelists were reviewing a proposal on rodent-borne hantaviruses in the southwestern U.S.—excellent ecology, good epidemiology, solid modeling. The NSF reviewers loved it: beautiful natural history, important insights about how environmental variability affects transmission. The NIH reviewers were skeptical: where was the preliminary data on human infection? How would this lead to intervention?

We spent an hour debating what constituted “good preliminary data.” For NSF reviewers, the PI’s previous work establishing field sites was sufficient—it showed feasibility. NIH reviewers wanted preliminary data on the virus itself, on infection rates. They weren’t being unreasonable—they were applying NIH’s standards. But we were talking past each other.

That debate crystallized the challenge. Two agencies with different cultures had to agree on the same proposals. Sometimes it created productive tension. Sometimes it just meant frustration.

The administrative burden on investigators was worse than we acknowledged. When NIH selected a proposal for funding instead of NSF, the PI had to completely reformat everything for NIH’s system—different page limits, different budget structures, different reporting requirements. This could add 3-6 months to award start dates. Try explaining to a collaborator in Guinea why you don’t know which U.S. agency will fund your project or when you’ll actually get money.

For program officers, EEID meant constant coordination overhead—meetings to discuss priorities, coordinating review panel schedules across agencies, negotiating which agency would fund which proposals. This work wasn’t counted in official program costs, but it was real. Hours we could have spent on other portfolio management.

Despite all this friction, EEID succeeded at its core mission. It funded research that advanced both fundamental science and disease understanding. When the 2014 Ebola outbreak hit, epidemiologists reached for transmission models developed through EEID grants. The program had trained a generation of researchers in genuinely interdisciplinary work.

What the 2014 Outbreak Exposed

But here’s what haunts me: we funded the science but not the systems. By 2014, nearly a decade of research had confirmed fruit bats as Ebola reservoirs, mapped their distribution across Africa, and identified high-risk human-bat contact zones. Papers were published in top journals. And then… nothing. No one built surveillance systems in West African villages where contact with bats was common. No one established early warning networks. No one created mechanisms to translate “we found Ebola in these bats” into “we’re monitoring for spillover in Guinea.”

EEID funded research, not surveillance. That’s appropriate—it’s a research program, not an operational public health system. But there was no mechanism to bridge the gap. When EEID-funded scientists discovered important findings, those findings stayed in academic papers. They didn’t flow to CDC, didn’t trigger surveillance efforts, didn’t inform preparedness planning.

During our quarterly coordination calls with NIH and USDA program officers, the question would occasionally arise: Who’s responsible for acting on what we’re learning? If EEID research identifies high-risk pathogen reservoirs, whose job is it to establish surveillance? The answer was usually silence, then acknowledgment that it wasn’t our job—we fund research—but uncertainty about whose job it was.

The missing infrastructure was organizational, not intellectual. We knew enough to be better prepared. The problem was lack of systems to act on knowledge. No agency was responsible for translating academic research into surveillance systems. CDC focuses on domestic diseases. NIH funds research but doesn’t run operations overseas. USAID’s PREDICT program did fund surveillance but didn’t have coverage in Guinea. We had pieces of the puzzle but no mechanism to assemble them.

I remember discussions about whether EEID should become more operational—perhaps requiring funded projects to include surveillance components. The response was always that this would fundamentally change the program’s character. NSF resists mission-directed research. My former agency’s strength is supporting investigator-driven discovery. Making EEID operational would require new authorities spanning multiple agencies and, most importantly, substantially more funding. A research program can’t solve an operational preparedness gap.

The scale problem was obvious. At $15-25 million per year, EEID could support excellent science but not comprehensive surveillance. Think about what that would require: ongoing monitoring in multiple countries, relationships with local health systems, rapid response capacity, and laboratory infrastructure. This requires hundreds of millions annually, not tens of millions.
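The scale gap can be made concrete with a rough back-of-envelope sketch. All the per-country cost figures below are hypothetical assumptions chosen for illustration, not program data; the point is only the order of magnitude.

```python
# Illustrative comparison of EEID's research budget with what
# operational surveillance might cost. Cost figures are hypothetical
# assumptions, not actual program or public-health budget data.

eeid_budget_per_year = 20_000_000  # midpoint of the $15-25M range

# Hypothetical annual cost components for one country's surveillance system
surveillance_costs = {
    "field monitoring teams": 5_000_000,
    "laboratory infrastructure": 8_000_000,
    "health-system liaison and training": 3_000_000,
    "rapid-response capacity": 4_000_000,
}
cost_per_country = sum(surveillance_costs.values())  # $20M per country

n_hotspot_countries = 15  # assumed number of priority regions
total_surveillance = cost_per_country * n_hotspot_countries  # $300M

print(f"EEID research budget: ${eeid_budget_per_year / 1e6:.0f}M/year")
print(f"Surveillance estimate: ${total_surveillance / 1e6:.0f}M/year")
print(f"Gap factor: {total_surveillance / eeid_budget_per_year:.0f}x")
```

Under these assumptions, operating surveillance in just fifteen hotspot countries would cost roughly fifteen times the entire EEID budget—hundreds of millions annually, not tens of millions.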

The timeline mismatch was equally frustrating. Research operates on slow timescales—EEID grants ran five years, and from proposal to publication might take 6-7 years. The initial bat reservoir discovery was published in 2005. If that had immediately triggered surveillance in West Africa, we’d have had nearly nine years before the 2014 outbreak. But triggering surveillance takes decisions, funding, international coordination—processes that themselves take years. By the time anyone might have acted, attention had moved elsewhere.

What This Means for Pandemic Preparedness

The most troubling insight: we knew enough to be better prepared for Ebola, and later for COVID-19, but knowledge alone wasn’t enough. EEID succeeds at advancing knowledge but can’t create surveillance systems, can’t fund operational preparedness, can’t bridge the gap between discovering threats and preventing epidemics. That gap is organizational and political, not scientific.

Should we expand EEID? More funding would support more projects, but it wouldn’t solve the fundamental problem. You could triple EEID’s budget and still have the research-to-surveillance gap. More papers about bat reservoirs don’t automatically create early warning systems. The limitation isn’t insufficient research funding—it’s absence of operational systems to act on research findings.

We need something structurally different. Here’s what I’d do:

First, create a rapid-response funding mechanism within EEID. When Ebola emerged in 2014, imagine if researchers could have gotten funding within weeks to investigate transmission dynamics and surveillance in surrounding regions, rather than waiting for the next annual competition. Model this on NSF’s RAPID program—streamlined review, modest awards ($100-200K for one year), quick deployment—but create an entirely different pocket of money for it from all the participating funders.

Second, establish formal connections between EEID and operational agencies. This is the biggest gap. Require EEID-funded researchers to submit one-page “surveillance implications” memos with final reports, which program officers share with CDC, USAID, and WHO. Better yet, have CDC or BARDA co-fund some EEID proposals with clear surveillance applications. Create visiting scholar programs where CDC epidemiologists spend time with EEID research teams and vice versa.

Third, strengthen international partnerships with genuine co-leadership. The 2014 outbreak showed the cost of inadequate surveillance infrastructure in West Africa. Expand EEID to include more disease hotspot regions—India, Brazil, Indonesia, DRC, West African nations—where foreign investigators can be lead PIs, foreign institutions receive and administer funds, and research priorities reflect host country needs. This isn’t altruism—it’s pragmatic self-interest.

The Larger Lesson

Interagency collaboration is genuinely hard—the friction I described isn’t fixable through better management. It’s inherent when bringing together organizations with different missions and cultures. EEID proves such collaboration can work and produce excellent science. But it requires sustained effort, goodwill, and tolerance for complexity.

The alternative—each agency in its silo—is worse. Infectious disease ecology requires expertise no single agency possesses. Complex problems require complex solutions. EEID demonstrated this is possible. The challenge is making it sufficient.

What haunts me is that we’re probably going to repeat the pattern. Right now, post-COVID, pandemic preparedness has political salience. But history suggests this won’t last. After the 2014-2016 Ebola outbreak, there was similar urgency. Within a few years, budgets declined and attention shifted. USAID’s PREDICT program was terminated in 2019—just months before COVID—due to budget constraints. We cut surveillance funding during a quiet period, then paid an enormous price when the next pandemic hit.

Prevention is invisible. We never know which pandemics we successfully prevented. There’s no constituency defending preparedness funding when cuts loom. That’s the structural problem we haven’t solved.

What Needs to Happen

Will we learn from EEID’s experience and build the infrastructure we need? Or will we fund the right research but lack systems to act on it—again?

The answer depends on recognizing that pandemic preparedness isn’t primarily a scientific challenge—we know enough—but an organizational and political one. Can we create structures spanning research and operations? Can we sustain funding between crises? Can we build systems robust enough to survive political leadership changes?

EEID succeeded at what a research program can do: funding excellent science that advanced understanding. The larger failure—inadequate pandemic preparedness—requires solutions at different organizational levels. But EEID’s experience provides a foundation: proof that interagency collaboration can work, that we can identify threats before they become catastrophes.

The team in Central African forests collecting bat samples did their job. They found the virus, mapped the threat, advanced our understanding. The question for the rest of us—program officers, policymakers, public health officials, citizens who fund this through taxes—is whether we’ll do our job: building systems that turn knowledge into prevention.

Science can identify threats. But preventing pandemics requires more than science. It requires sustained organizational commitment, interagency coordination, international cooperation, and political will—especially during quiet periods when threats seem distant. EEID demonstrated the scientific component is feasible.

The rest is up to us. And based on what I’ve seen, I’m not optimistic we’ll get it right before the next one hits.

Three Things Aviation Teaches Us About Science Funding

A trip to Long Beach Airport reveals something deep about policy

The LA Uber driver dropped me off at the small passenger terminal at Long Beach Airport, and it took some serious trial and error with Google Maps to find the old Douglas Aircraft hangar where JetZero had set up shop. Their admirable goal: completely disrupt the commercial aviation market by building a wide-body blended wing aircraft that would carry a 787 Dreamliner load of passengers across the country for half the fuel cost.

The hangar was open to the air, the ramp and runway fully active, yet the ethos inside was pure early-2000s Google—when anything seemed possible. The enormous space was filled with a full-size cabin mock-up, engineers at workstations, cinema-size screens streaming CAD imagery of the new plane sporting various well-known airline liveries, and a collection of flying scale model drones. The plane itself looked like it had flown off a science fiction set.


The engineering team was equally striking: veterans from Boeing, Embraer, and McDonnell Douglas, each bringing decades of experience from very different aviation cultures. I met one of the chief designers—the inventor of sharklets, those upturned wingtips that reduce drag and improve fuel efficiency, now ubiquitous on commercial aircraft. Another engineer had come from Embraer, where he’d designed the popular 2×2 cabin configuration that passengers overwhelmingly prefer on narrow-body aircraft. Now he was tackling the challenge of designing a completely new kind of airplane cabin that would maximize comfort in a blended wing configuration.

These engineers had learned their craft in established organizations with very different approaches to decision-making, risk assessment, and innovation. The wave of consolidation in aviation—most notably Boeing’s merger with McDonnell Douglas and its subsequent shift from an engineer-driven culture to one focused on shareholder returns—had left many veteran engineers looking for something different. The 737 MAX crisis highlighted how far Boeing had drifted from its engineering roots. JetZero represented a chance to get back to what they loved: solving hard technical problems without the constraints of quarterly earnings calls and legacy infrastructure.

They were attempting something none of their former employers would touch: a radical departure from the tube-and-wing design that has dominated commercial aviation for seventy years. This raised a question that goes far beyond aircraft design: Why can radical innovation happen at a startup like JetZero but not at Boeing, Airbus, or Embraer?

This isn’t just about airplanes. It’s about how organizations—whether aircraft manufacturers or science funding agencies—decide what’s worth building, who gets to decide, and how they balance proven approaches against risky bets. Aviation and science funding face the same fundamental challenge: how to organize technical innovation.

Studying how Boeing, Airbus, and Embraer make these decisions has revealed patterns that apply directly to science funding. Here are three lessons from aviation that illuminate how research gets funded—and why some innovations happen while others never get off the ground.

Lesson 1: How Organizations Assess and Manage Technical Risk

The Aviation Pattern

Boeing, in its traditional engineer-driven culture, approached risk through data and testing. Engineers made decisions based on technical feasibility. They’d prove something worked, then seek regulatory approval. The 787 Dreamliner exemplified this: Boeing pushed carbon-composite technology to unprecedented levels while keeping the basic configuration conventional. The cultural assumption: engineers know best, prove it works, get approval, move forward.

Airbus operates from a completely different framework. As a consortium involving multiple governments, labor unions, and industry stakeholders, risk assessment includes political, economic, and social factors alongside technical ones. Workers’ councils have a voice in production decisions. Safety regulators participate earlier in the design process. The A380 Superjumbo was technically conservative—four engines, conventional configuration—but represented enormous manufacturing and political risks, requiring coordination across nations. The cultural assumption: technical decisions affect many stakeholders, and all deserve input.

Embraer’s approach reflects its position as a state development tool for Brazil (the government holds a golden share giving it veto power over the company’s strategic direction). They can’t compete head-to-head with Boeing and Airbus, so their risk calculus focuses on market positioning. Find niches, develop partnerships, move quickly. The E-Jet family succeeded by targeting the underserved regional market. The cultural assumption: innovation means finding white space in a market dominated by established players.

Same engineering principles. Same physics. The same goal of building safe, efficient aircraft. But fundamentally different risk assessment frameworks.

The Parallel to Science Funding

The American system, through NSF and NIH, operates remarkably like Boeing’s traditional approach. Peer review is engineer-driven decision-making translated to science. Data—preliminary results, track record—drives decisions. The central question reviewers ask is Boeing’s question: “Can this PI deliver with taxpayer money?” Merit review happens after the proposal is submitted. The system rewards incremental progress from established investigators, just as Boeing refined the 737 through successive iterations.

European research funding embeds more stakeholder involvement. Horizon Europe’s missions approach brings policymakers, industry representatives, and public voices into the priority-setting process. Risk assessment explicitly includes societal benefit and economic impact. Clinical translation gets emphasized earlier in the research pipeline. Scientists remain central but aren’t the sole decision-makers.

Emerging science powers like China take yet another approach. Strategic national priorities drive funding decisions. The question isn’t “What’s the best science?” but “Where can we compete globally?” This enables leapfrog strategies: massive focused investments in AI, quantum computing, and biotechnology designed to establish leadership in emerging fields rather than catching up in established ones. This top-down approach is now also emerging within the US science ecosystem.

For researchers, understanding which risk framework you’re operating in helps you frame proposals effectively. The American system rewards demonstrated competence and incremental progress. Other systems may value societal impact, strategic positioning, or rapid deployment. No framework is inherently better or worse—each reflects different cultural assumptions about how to allocate risk in technical innovation.

Lesson 2: Who Gets to Decide What Gets Built

The Aviation Pattern

At Boeing, engineers and program managers traditionally drove major decisions. Shareholders and the board provided financial constraints. Airlines shaped requirements. But core technical choices were the engineers’ responsibility. This produced technically sophisticated aircraft, sometimes disconnected from market realities. The 747-8, the final iteration of the classic jumbo jet, was an engineer’s dream—but the market for it was lukewarm.

Airbus engages multiple stakeholders from day one. National governments in France, Germany, the UK, and Spain have seats at the table. Workers’ councils negotiate production methods. Industry partners across Europe collaborate on components. Customers get involved earlier. The result is more consensus-driven and sometimes slower, but with broader buy-in. The A350’s long development process reflected extensive consultation but yielded strong market acceptance.

At Embraer, Brazil’s development goals set the broad direction, but the company maintains a partnership model with established players and responds quickly to market signals. Less hierarchical decision-making enables nimble adaptation. The attempted Embraer-Boeing partnership, which ultimately fell apart, starkly illustrated the two companies’ different decision-making speeds.

JetZero represents something entirely different. A small team iterates rapidly. Engineers from different aviation cultures bring different assumptions. Venture capital’s risk tolerance differs fundamentally from corporate risk aversion. They can attempt radical innovation precisely because they’re not constrained by established stakeholder expectations or legacy infrastructure.

The Parallel to Science Funding

American peer review puts scientists in the decision-making seat. On its face, this seems ideal: who better to judge scientific merit than other scientists? But peer review favors known researchers using proven methods. Peers can become conservative gatekeepers. The result is high quality and incremental progress, but potentially missed breakthroughs.

European models bring more voices into the room. The European Research Council maintains scientific independence but operates within frameworks emphasizing societal missions and grand challenges. Policymakers, industry representatives, and public stakeholders help set priorities. Scientists remain central but aren’t the sole arbiters. This creates stronger connections to societal needs, though sometimes at the cost of researcher autonomy.

Directed research models flip the equation. Governments or funding agencies set priorities; researchers respond to calls for proposals. This is top-down rather than bottom-up. The advantage is alignment with national priorities. The risk is missing unexpected discoveries that don’t fit predetermined categories.

I’ve seen these differences firsthand, reviewing for both American and international funding agencies. The questions panels ask reveal cultural assumptions about whose judgment matters. American panels debate scientific rigor and PI capability. International panels I’ve participated in spend more time on broadening participation and strategic fit with national priorities.

For researchers, understanding who has a voice in funding decisions is crucial for navigating the system. American researchers working internationally need to recognize that peer review isn’t universal—other countries organize scientific decision-making to reflect different values about expertise, accountability, and public benefit.

Lesson 3: The Tension Between Incremental Improvement and Radical Innovation

The Aviation Pattern

Established aircraft manufacturers favor incremental improvement for sound reasons. The tube-and-wing design has been refined for seventy years. Every iteration builds on accumulated knowledge. Existing manufacturing facilities, pilot training programs, maintenance infrastructure, and regulatory pathways all assume this configuration. Airlines understand the operating economics. Risk is manageable, returns are predictable. The 737 MAX—an incremental update to a 1960s design—still makes economic sense despite its troubles.

JetZero’s blended wing body has been studied since the 1940s. Its technical advantages are clear: dramatic improvements in fuel efficiency, reduced noise, and potential for entirely new cabin configurations. But it requires new manufacturing processes, new pilot training, and new regulatory frameworks. The risk isn’t primarily technical—it’s organizational and systemic. There’s no clear path from prototype to profitable, scalable production. Established players, accountable to shareholders and constrained by quarterly earnings expectations, can’t justify the investment.

Startups like JetZero can attempt radical innovation because they have no legacy infrastructure to protect. They can accept higher technical risk. The venture capital model tolerates failure in ways public corporations cannot. They don’t need to satisfy existing stakeholders or worry about cannibalizing current product lines. They can focus on long-term disruption rather than next quarter’s earnings.

But we should be clear: most aviation innovation is incremental for good reason. Lives depend on safety. Capital requirements are enormous. Development timelines span 10-15 years. Regulatory burden is intense. Incremental improvement has delivered extraordinary gains—modern aircraft are unimaginably more efficient, safe, and capable than those of fifty years ago.

The Parallel to Science Funding

Science funding faces the same tension. Established PIs using proven methods dominate for sound reasons. Track records reduce risk. Incremental progress is predictable, publishable, and fundable. Infrastructure investments favor established approaches—if your university has a state-of-the-art imaging facility, proposals that use it have an advantage. Peer reviewers understand and can evaluate proven methods. The “preliminary data” requirement inherently favors ongoing work over genuinely new directions. The system is designed to minimize taxpayer waste through careful risk management.

Truly novel approaches struggle in this environment. High-risk/high-reward programs exist but represent a tiny fraction of overall funding. Early career investigators face a chicken-and-egg problem: “How will you do this?” reviewers ask, but gathering preliminary data requires resources they don’t yet have. Reviewers are more comfortable funding known quantities. Paradigm shifts are rare and unpredictable—there’s no clear “return on investment” for genuinely radical ideas.

Consider the BRAIN Initiative. The vision was bold: transform neuroscience through new technologies and approaches. But implementation favored established neuroscientists with proven track records. The system worked as designed: minimizing risk by funding demonstrated competence. As I’ve written earlier, BRAIN fell short of its stated delivery goal: curing brain diseases. ARPA-H was explicitly created to escape the incremental trap, but it’s still finding its model. The European Research Council’s advanced grants show somewhat higher tolerance for risk, but even there, track record matters enormously.

For researchers pursuing truly novel approaches, it’s crucial to understand you’re working against system design, not just reviewer bias. The system is optimized for reliable incremental progress, not moonshots. Radical innovation in science, like radical innovation in aviation, may require different funding models—something more like venture capital, tolerant of high failure rates in pursuit of occasional transformative breakthroughs.

This raises a deeper question: Should science funding favor incremental or radical innovation? Or do we need both, in different proportions? Aviation supports both Boeing’s incremental refinements and JetZero’s radical rethinking. Should science funding do the same—and if so, in what balance?

What This Means for Science Policy

These aviation patterns reveal a fundamental feature of how societies organize technical innovation. The choices Boeing, Airbus, and Embraer make about risk assessment, decision-making authority, and the balance between incremental and radical innovation aren’t purely business decisions. They’re cultural choices embedded in what Sheila Jasanoff calls civic epistemologies—different assumptions about how knowledge should be produced, who should decide, and what goals matter most.

American science funding has historically reflected American cultural values: individual merit and achievement drive peer review by scientific peers. Data-driven decision-making shows up in preliminary data requirements. Risk minimization operates through proven track records. Incremental progress represents the reliable path. This isn’t accidental—it’s deeply cultural.

Other countries organize differently because they value different things. European systems emphasize societal benefit and stakeholder input. Asian systems prioritize strategic national development goals. Different countries strike different balances between discovery and application, between researcher autonomy and national priorities, between tolerance for failure and demands for accountability.

For all researchers, understanding these cultural patterns helps you work more effectively within the system. Know what the system optimizes for—reliable incremental progress from established investigators. If you’re pursuing radical innovation, recognize you’re working against the grain. International collaborations require understanding that your partners may operate within fundamentally different funding cultures with different assumptions about what science is for and how it should be organized.

For science policy, we should be explicit about what our funding systems optimize for. There’s no “best” system—only different tradeoffs reflecting different values. Maybe we need multiple models, as aviation has both Boeing and JetZero. Comparing systems reveals assumptions we don’t normally question.

In future posts, I’ll explore specific country comparisons: How does the European Research Council actually work? What can we learn from how other countries fund AI research? How do different countries handle the tension between researcher autonomy and national priorities?

A Final Thought

Visiting JetZero and seeing engineers from Boeing, Embraer, and McDonnell Douglas collaborate on something radical that couldn’t happen within their former companies crystallized something I’d been observing in science policy work: innovation doesn’t just require good ideas and talented people. It requires organizational structures and cultural assumptions that allow certain kinds of ideas to be pursued.

The JetZero engineers didn’t suddenly become more creative or capable. They remained the same engineers who’d designed sharklets at Boeing or cabin configurations at Embraer. What changed was the organizational context—the risk tolerance, decision-making authority, and freedom from legacy constraints. That shift in context enabled them to attempt what had been impossible in their former roles.

Science funding works the same way. Researchers operating within NSF’s peer review system are no less creative than those pursuing radical ideas through ARPA-style agencies or venture-backed biotechs. But the organizational context shapes which ideas can be pursued and which innovations are possible.

Understanding how different countries organize technical innovation—whether building aircraft or funding research—helps us see our own system more clearly. And maybe, just maybe, it helps us imagine how we might do things differently.

What examples have you seen where organizational culture shaped what research got pursued? Have you experienced different funding cultures working internationally? Share in the comments.

A Grant Reviewer’s New Year Advice to Proposers: What I’d Tell My Younger Self

Happy New Year! Below is Science Policy Insider’s first posting of 2026:

We were reviewing a proposal with gorgeous preliminary data: confocal microscopy images from what was, at the time, cutting-edge two-channel laser-scanning technology. The images were crisply in focus and colored green and red to mark the locations of two different sub-cellular fluorescent molecular probes. On the strength of the images alone, the proposal felt extraordinary. Never mind that it contained no working hypothesis and gave no technical consideration to autofluorescence, a phenomenon in which the cell’s own biomolecules produce signals that can be confused with those coming from the two probes.


The panel discussion revealed the problem. Some reviewers were ready to fund based solely on the images. Others raised the autofluorescence issue, the missing hypothesis. But even the skeptics prefaced their concerns with “The data are beautiful, but…” Those pictures had done their job—they made weak science look compelling.

That’s when I learned: awesome preliminary data can cloud objectivity. After reviewing thousands of grants at NIH and NSF over three decades, I’ve seen it happen repeatedly.

So, as you plan your 2026 submissions, here’s what I wish I’d known from the start—lessons that might save you the same learning curve.

Lesson 1: Clarity Beats Cleverness

In my early days, I thought impressive vocabulary and complex sentences demonstrated sophistication. Surely reviewers would appreciate nuanced, academic writing that showcased the full complexity of my thinking. I was wrong.

Clarity wins every time. Reviewers are overwhelmed, often reading up to 15 proposals a week while managing their own labs, teaching loads, and grant deadlines. Simple, direct writing isn’t dumbing down your science—it’s respecting your reviewers’ cognitive bandwidth and making your research accessible to the non-specialists who might be reading it.

Several decades ago, I learned the “grandmother” test. If you can’t explain your research clearly and simply to someone outside your immediate field (like maybe your grandmom), it’s probably not clear enough for a review panel where only one or two people are genuine specialists in your exact area.

Here’s my practical advice: Read your overall proposal goals (or aims) out loud. If you stumble over your own sentences, reviewers will too. Remember that if you can’t explain it, you probably don’t understand it well enough yourself. Make the first paragraph of each section a roadmap for what follows. And use jargon only when necessary—when there’s genuinely no more straightforward way to say it.

I once reviewed a proposal with brilliant research that was nearly incomprehensible to anyone outside the PI’s subspecialty. The same panel reviewed another proposal that explained equally complex ideas with straightforward language. Guess which one got funded?

Lesson 2: Preliminary Data Is About Trust, Not Volume

Early in my career, I believed more data equaled a stronger proposal. Fill those pages with figures! Show them everything you’ve got! Every additional graph strengthens your case, right?

Wrong. It’s the quality of the data that counts.

Here’s what preliminary data does: it answers the question “Can this PI execute what they’re proposing?” It’s not about impressing reviewers with how much you’ve already accomplished. It’s about building trust that you can deliver on your promises. And here’s the thing that surprised me most: including the wrong preliminary data raises more questions than having no data at all.

Show that you can execute the specific methods you’re proposing. Demonstrate the feasibility of your key innovation—the part that’s novel and risky. If you don’t have the correct preliminary data yet, address that gap head-on rather than papering over it with tangentially related work.

The deeper insight here is that reviewers are assessing risk. They’re not asking, “Do you have data?” They’re asking, “Do I trust you can deliver what you’re promising with taxpayer money?” Those are fundamentally different questions.

Lesson 3: Broader Impacts Require Situational Awareness

I initially treated broader impacts as a required checkbox. Standard language about societal benefits and outreach seemed perfectly adequate—everyone writes similar things, right? Just describe some plausible activities and move on.

Reviewers can spot boilerplate instantly. We’ve read hundreds of proposals with identical broader impacts sections, and they all blur together into meaningless noise.

The best broader impacts sections connect to who you are and what you’re genuinely already doing in ways that align with the nation’s best interests. Integration with your research and your actual life matters far more than ambitious plans that sound good on paper.

Scaling is essential: build on what you’re already doing rather than inventing entirely new programs you’ll never have time to implement. Be specific rather than grandiose. If you already mentor undergrads in your lab, explain how this project will train them in new techniques. If you have existing connections to a local K-12 program, describe how you’ll use them—don’t manufacture new partnerships from whole cloth.

Here’s the tell: “We will develop outreach materials” raises immediate skepticism. But “I teach a summer workshop at Lincoln High School’s science program—this research will provide three new hands-on modules on climate modeling” builds trust. One is a vague promise. The other is a concrete plan rooted in existing relationships.

Lesson 4: Budget Justification Actually Matters

I used to think budgets were purely administrative. Surely reviewers barely glanced at them—they cared about the science, not the accounting, right? Standard rates and percentages seemed perfectly sufficient.

Reviewers absolutely read budget justifications. We look for alignment between what you’re proposing to do and what you’re proposing to spend. Misalignment raises immediate red flags. And here’s something that surprised many junior faculty I’ve mentored: over-budgeting is just as problematic as under-budgeting.

Every major budget line should connect clearly to a specific aim in your proposal. Justify why you need that piece of equipment—what will it do that your existing infrastructure can’t? Personnel effort should match the work described. If you’re requesting 50% effort for a postdoc, reviewers should see that postdoc playing a central role in half your aims.

Red flags I’ve seen repeatedly: proposing ambitious international fieldwork with minimal travel budget or requesting full postdoc salary when the proposal’s narrative gives that postdoc almost nothing to do. These inconsistencies make reviewers wonder whether you’ve really thought through how the work will get done.

Lesson 5: How You Handle Weaknesses Reveals Everything

I once believed you should never acknowledge limitations. Defend every choice—project confidence at all costs. Any admission of weakness would be seized upon by reviewers looking for reasons to reject your proposal.

This might be the lesson I wish I’d learned earliest. Reviewers already see the weaknesses in your proposal. Pretending problems don’t exist destroys your credibility far more than the limitations themselves.

How you address limitations reveals your scientific maturity. Acknowledge real problems early and directly. Then explain your mitigation strategy: “If plan A fails, we will try plan B because…” Show you’ve thought through alternatives and have realistic contingency plans.

This lesson became even clearer when I started seeing resubmissions. The response letter matters as much as the revised proposal itself. A defensive tone—arguing with reviewers, insisting they misunderstood you—equals instant rejection. But a response that says “We appreciate the panel’s insights. We have substantially revised Section 2.3 to address concerns about statistical power. New preliminary data (Figure 3) demonstrates feasibility of the alternative approach” shows growth and responsiveness.

Panels respect PIs who demonstrate scientific judgment far more than those who claim perfection. We know perfect proposals don’t exist. We want to see that you can identify problems and solve them.

Lesson 6: The Human Element of Review

I believed grant review was a purely objective, data-driven process where careful reviewers gave equal attention to every proposal, systematically evaluating each against clear criteria.

Reviewers are human. They’re tired. They’re distracted. They have bad days. Panel dynamics matter—who speaks up first, who’s respected, who’s combative. Your proposal isn’t evaluated in isolation; it competes with the others in that review cycle, and comparison effects are real even if program officers say they shouldn’t be.

Here’s the practical reality: reviewers read proposals at night, on weekends, while traveling. They’re squeezing this work into already overwhelming schedules. If they’re confused by page two, they may never fully engage with your brilliant idea on page eight. Your first page matters disproportionately.

Make your innovation immediately clear. Give reviewers ammunition to advocate for you in panel discussions—clear summary statements they can quote, compelling preliminary data they can point to. The discussant’s job is to convince other panelists to fund your work. Make their job easy.

This isn’t unfair. It’s simply reality. Design your proposal for the actual conditions of review, not the idealized version where everyone reads every word with perfect attention on a quiet Sunday morning with fresh coffee.

Lesson 7: Resubmissions Are About Demonstrating You Listened

I initially thought resubmissions were second chances to explain myself better. The reviewers had clearly misunderstood my brilliant idea. Now I’d show them what I really meant, with more precise explanations and stronger arguments.

Resubmissions are about showing scientific growth. They demonstrate whether you can receive criticism, integrate feedback, and improve your work. The reviewers weren’t wrong—or at least, whether they were wrong doesn’t matter. What matters is whether you can respond constructively to their concerns.

Start your response letter with genuine gratitude, not perfunctory politeness. Group your responses to criticisms thematically rather than addressing them line by line, which makes you look defensive. Show clearly what you changed and where reviewers can find those changes in the revised proposal. If you genuinely disagree with a criticism, do so respectfully and support your position with data, not rhetoric.

The successful resubmissions I’ve seen follow a pattern: acknowledge the feedback, explain the changes, demonstrate improvement with new evidence. The unsuccessful ones argue, defend, and explain why the reviewers didn’t understand the first time.

What These Lessons Reveal About Science Funding

These aren’t just tips for better grant writing. They reveal something more profound about how American science funding works. As I’ve written before, the current system prioritizes risk mitigation over bold ideas. It values clear communication and demonstrated competence over theoretical brilliance. It rewards incremental progress from established investigators more readily than moonshots from newcomers.

I’m not criticizing (this time)—it’s how the system is designed. When you’re allocating hundreds of millions in taxpayer dollars, trust and deliverability matter. Understanding this cultural logic helps you work within the system more effectively.

And it raises an interesting question I’m exploring in my new work on international science policy: Do other countries fund science differently because they assess risk differently? Do European or Asian funding systems embed different assumptions about what science should accomplish? That’s a topic for future posts.

These lessons came from mistakes, from failed proposals, from thousands of hours in review rooms watching good science get rejected for preventable reasons. I wish I’d understood them earlier in my career. I’m offering them now to help you avoid the same learning curve.

What hard-won lessons have you learned about grant writing? What advice would you give your younger self? Share in the comments.

As you prepare your 2026 submissions, remember there’s a human being on the other side of your proposal. Make their job easier. Help them advocate for your science. Give them reasons to say yes.

Why I’m Taking Science Policy Insider International

A View from Abroad

Mid-competition week for a panel reviewing proposals on genes and cells: the fifteen-minute clock starts, and the five of us assigned to this proposal dive in. We consider factors such as whether the proposer is early in their career and how the COVID pandemic might have affected their laboratory’s productivity. We carefully assess their plan for mentoring trainees, including their track record. The proposer’s excellence is evaluated not by raw bibliometric measures such as the h-index, but by substantive contributions to the field. And we take a very close look at the proposal itself—not only in terms of intellectual merit, but also to make sure that it is distinct from the investigator’s other supported science. Is this an NIH study section? Nope. Is this an NSF panel? Again, no. This is a peer review for another G7 nation, to be unnamed in this post.

What struck me wasn’t that this country did peer review differently than NSF or NIH. What struck me was how similar it was. Same careful attention to mentoring. Same suspicion of bibliometrics. Same concern about overlaps with existing funding. I could have been in any panel room I’d sat in over three decades in Washington. And that’s when it hit me: among the wealthy nations that fund science, we’re all running variations on the same basic system. We argue about details – overhead rates, review criteria, funding durations – but we share fundamental assumptions about how science should work.


Or so I thought. Until I stepped outside the world of science funding and began looking at how other countries organize technical knowledge. My second book project examines how Boeing, Airbus, and Embraer design commercial aircraft – and that research has revealed something I’d missed in all my years in government and academia.

Civic Epistemologies

The scholar Sheila Jasanoff has a concept called ‘civic epistemologies’ – the idea that different societies have fundamentally different ways of producing and validating knowledge. It’s not about organizational charts or funding mechanisms. It’s deeper than that. It’s about cultural assumptions: What questions are worth asking? What counts as evidence? Who gets to decide? How do we measure success?

When Americans design an airplane, we assume that technical decisions should be made by engineers based on data, with regulators checking compliance after the fact. Europeans embed social and labor concerns directly into the design process – workers’ councils have a say in production methods, and safety regulators are involved earlier. Brazilians organize around different assumptions entirely, shaped by their position as a developing economy entering a market dominated by established players.

Same engineering principles. Same physics. The same goal of building a safe, efficient aircraft. But fundamentally different answers to the question: Who should decide how this gets done?

I saw the same pattern as a working neuroscientist. American neuroscience tends to bet on fundamental discovery—map the circuits, understand the mechanisms, and applications will follow. Recording sea slug neurons during my training embodied this approach: study simpler systems, find conserved principles, apply them to humans. Europeans start closer to the clinic, organizing major research programs around disease categories and patient needs. Japanese neuroscience builds unusually tight links between academic labs and industry, with electronics and engineering companies actively embedded in research networks and clear paths toward commercialization. Same neurons, same biology—but different assumptions about how knowledge should flow from laboratory to society.

My new book project

So, where is this taking me? The short answer is I’m working on a new book about how American, European, and Brazilian cultures (think Boeing, Airbus, and Embraer) shape commercial aviation technology. Why planes? In my lifetime, I experienced firsthand the jet revolution: I started on the Comet, went on to the Pan Am 707s, and these days still enjoy the grandeur of the big twin aisle giants that connect us across oceans.

In the new book, I’m interested in comparing technical cultures through the lens of those jets (as technical artifacts). But beyond my lifetime fascination with aviation, the same questions apply to science policy itself: why do different countries organize technological knowledge differently? What can we learn from how other G7 nations fund science? And what cultural assumptions shape what gets built (airplanes OR research programs)?

Science Policy Insider Expands Its Scope

This brings me back to Science Policy Insider and where we’re headed. We are broadening our remit. In the future, we’ll expand to include a comparative analysis of research funding systems—both public agencies and private industry—drawing on insights from my aviation research. We’ll examine how different countries handle current challenges: AI governance, climate research, and research security.

On the practical side, we’ll provide insights for American researchers who work internationally or plan to—from navigating different grant systems to understanding why collaborations succeed or fail across cultural boundaries. And above all, we’ll consider what viewing American science policy from the outside reveals about our own system.

We’ll maintain our bi-weekly publishing schedule.

Science Policy Insider started with my promise to explain how American science policy really works from someone who was inside the system. Now we’re also going to explore what it looks like from the outside and what that perspective reveals about our own system.

I continue to invite readers’ questions, now not only about how things work in our own American discovery machine, but also about international science policy.

Grades posted: another semester in the books

Fall semester 2025 is now complete. Both my undergraduates and my grad students performed very well. When we begin all this again in the spring, I’ll be teaching my favorite class on space governance, using Kim Stanley Robinson’s Mars Trilogy as our text. And we’ll be teaching our crisis management class again — this time to a new cohort of Schar School master’s students.

But for now, it’s time for the winter break. I’m two-thirds of the way through my reread of Moby Dick, and James Joyce’s Ulysses is on deck. To relieve the excess of literary fiction, I’ve also got a ton of grant proposals headed my way to review for Canada’s NSERC.