When Scientists Fall in Bad Love With Their Own Ideas
Approximately four decades ago, I became a witness in a scientific misconduct case. The charges had been brought by an international postdoc in the lab where I had also worked before moving on, and I cannot remember many of the details, except that my written testimony stated that I knew nothing. But I do remember, in the context of more recent high-profile cases, that the essence of the accusation then was the same as it is now: altering experimental data to support the ‘party line’.
The recent disruption to American science has been extensively documented. Given how deeply intertwined government research dollars are with the budget models of R1 universities and large academic medical centers, it’s not surprising that those funds were chosen for their leverage, and that putting them in jeopardy will profoundly alter the pursuit of Vannevar Bush’s vision of the endless frontier.
But I want to explore a different question raised by that long-ago case. When I recall that the essence involved “altering data to support the party line,” I need to ask: whose party line was it? In that case, and in many since, the party line wasn’t imposed by some external authority. It was the PI’s own hypothesis, their pet theory, the idea they’d invested years in developing and defending. The fraud wasn’t about serving power—it was about rescuing a cherished belief from contradictory evidence.
This raises uncomfortable questions about how we organize biomedical research. The current system—hypothesis-driven projects led by individual PIs who develop deep attachments to specific ideas—contains structural flaws that push even honest scientists toward motivated reasoning and occasionally push the dishonest ones past the line into fraud.
The Romantic Model of Science
Our funding system enshrines a particular vision of how science works. A brilliant investigator conceives a hypothesis. They design clever experiments to test it. They write a compelling grant proposal. If funded, they spend 3-5 years testing their idea. Success means publishing papers that confirm the hypothesis, which leads to more grants to extend the work.
This model has romantic appeal. It positions the PI as the creative genius whose insight drives discovery. It makes science a battle of ideas where the best hypotheses prevail. It creates clear narratives: an investigator proposes a theory, designs experiments to test it, and demonstrates it is correct. This is how we teach science, how we write about it in popular accounts, how we celebrate it in awards and prizes.
The problem is that this romantic model creates precisely the conditions under which fraud becomes tempting and honest self-deception becomes nearly inevitable.
When the Hypothesis Becomes One’s Identity
Here’s what happened in numerous misconduct cases from the 1980s onward: A researcher develops a hypothesis. It’s not just any hypothesis—it’s their hypothesis, the idea that defines their research program, the theory that distinguishes them from competitors. They build a laboratory around it, recruit students and postdocs to test it, and write grants that promise to extend it.
The hypothesis becomes their professional identity. Colleagues know them as “the person who works on that theory.” Graduate students join their lab specifically to work on that problem. Papers in high-impact journals describe their unique contribution. Tenure committees evaluate whether the hypothesis has generated sufficient publications. Grant review panels judge whether the approach is likely to continue producing results.
Then experiments start yielding contradictory data. Not every experiment—if every experiment failed, the researcher might abandon the hypothesis. However, when enough experiments yield ambiguous or contradictory results, the careful scientist should begin to question the core idea.
This is where the system’s design creates problems. Walking away from the hypothesis means walking away from professional identity, from grants that depend on that research program, from students and postdocs whose projects are built on that framework. It means admitting that years of work may have been directed toward the wrong question. It means watching competitors promote alternative theories.
The pressure isn’t external—nobody is ordering the researcher to maintain their hypothesis. The pressure is structural, built into how we organize careers and evaluate success. When your identity, your lab’s funding, and your scientific reputation all depend on a particular idea being correct, it takes extraordinary intellectual honesty to acknowledge that idea might be wrong.
On the Spectrum: From Delusion to Fraud
Most scientists don’t fabricate data. But many engage in practices that fall short of fraud while still distorting the scientific record. These practices stem from the same structural problem: excessive investment in a specific hypothesis.
Selective reporting occurs when experiments yielding inconvenient results are dismissed as “technical problems,” whereas experiments supporting the hypothesis are published. The researcher isn’t fabricating data—they’re making judgments about which data are “good.” But those judgments are biased by investment in the hypothesis.
Data massaging occurs when researchers make analytical decisions that favor their theory. Which outliers to exclude? How to set cutoffs? Which statistical tests to use? Each decision seems defensible individually, but collectively, they bias results toward the preferred outcome. Again, this isn’t fabrication—it’s motivated reasoning dressed up as methodological choice.
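To see how quickly those individually defensible choices add up, consider a small simulation sketch (constructed purely for illustration; the cutoffs, tests, and sample sizes are arbitrary assumptions, not drawn from any real study). Two groups are drawn from the same distribution, so there is no true effect, yet an analyst who tries a few outlier rules and a couple of tests and reports whichever analysis “worked” will find a “significant” difference noticeably more often than the nominal five percent.

```python
# Illustrative only: how analytic flexibility inflates false positives.
# Both groups come from the SAME distribution (no true effect); an analyst
# who tries several outlier cutoffs and two tests, then keeps the best
# p-value, "finds" an effect more often than the nominal 5%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_simulations, n_per_group = 2000, 30
hits = 0

for _ in range(n_simulations):
    a = rng.normal(0, 1, n_per_group)
    b = rng.normal(0, 1, n_per_group)  # same distribution: the null is true
    p_values = []
    for cutoff in (None, 2.5, 2.0):    # "defensible" outlier rules, in std units
        a_k = a if cutoff is None else a[np.abs(a - a.mean()) < cutoff * a.std()]
        b_k = b if cutoff is None else b[np.abs(b - b.mean()) < cutoff * b.std()]
        p_values.append(stats.ttest_ind(a_k, b_k).pvalue)     # parametric test
        p_values.append(stats.mannwhitneyu(a_k, b_k).pvalue)  # nonparametric test
    if min(p_values) < 0.05:           # report whichever analysis "worked"
        hits += 1

print(f"False-positive rate with flexible analysis: {hits / n_simulations:.1%}")
# Typically well above the nominal 5%, even though no effect exists.
```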
Hypothesis rescue manifests as increasingly elaborate explanations for why experiments that should have supported the theory failed. Maybe the conditions weren’t quite right. Maybe there’s an additional factor we didn’t control for. Maybe the effect is context-dependent. Some auxiliary hypotheses are legitimate scientific refinements. Others are epicycles added to save a failing theory.
Selective collaboration and citation appear when researchers preferentially cite papers supporting their view while ignoring contradictory work. They collaborate with scientists who share their hypothesis, while avoiding those who promote alternatives. This creates echo chambers where a contested theory looks like a consensus because the believers only talk to each other.
These practices aren’t fraud in the legal sense. They’re what happens when intelligent, well-meaning scientists become too invested in particular ideas. The investment doesn’t require conscious dishonesty—it just requires the normal human tendency to see what we expect to see, to value evidence confirming our beliefs more highly than evidence challenging them.
The Cases We Remember
The 1980s wave of misconduct cases illuminates this pattern. Take John Darsee at Harvard Medical School. His fraudulent cardiology research wasn’t random fabrication—it was data manufactured to support his ongoing research program. He was so invested in demonstrating that his approach worked that he fabricated results when experiments didn’t cooperate. His extraordinary productivity should have raised red flags, but it fit the romantic model: the brilliant investigator producing breakthrough after breakthrough.
The Baltimore affair involved Thereza Imanishi-Kari’s immunology data that Margot O’Toole couldn’t replicate. The decade-long controversy ended in 1996 when an appeals board cleared Imanishi-Kari of all misconduct charges. But the case revealed how competing interpretations of the same data can arise when different investigators bring different assumptions to their analysis, and how difficult it becomes to distinguish between legitimate scientific disagreement and potential misconduct when researchers are deeply invested in their theories.
Eric Poehlman’s obesity research fraud—falsifying data in 17 grant applications and 10 publications—followed the same pattern. He had a research program, a reputation, and a stream of funding dependent on showing that his hypotheses about aging and obesity were correct. When data didn’t cooperate, he made them cooperate.
The common thread isn’t that these individuals were uniquely evil. It’s that they were operating in a system where too much depended on specific hypotheses being correct. The same pressures that led them to commit fraud push others into questionable practices and drive everyone toward motivated reasoning.
The Structural Alternative: Team Science
Consider how differently science works in fields that have moved away from the PI-centered hypothesis-driven model.
Large-scale genomics operates with diverse teams interrogating datasets rather than testing specific hypotheses. The question isn’t “Is my theory correct?” but “What patterns exist in these data?” Multiple investigators with different backgrounds and biases analyze the same datasets. Results require replication across labs. The data-sharing infrastructure enables other groups to independently verify findings.
Nobody’s career depends on a specific gene being associated with a particular disease. If your analysis suggests gene X matters but another team’s analysis contradicts that, there’s no professional catastrophe. You’re contributing to collective understanding rather than defending personal theories.
The BRAIN Initiative that I helped launch during my tenure at NSF was designed in part to avoid the hypothesis trap. Rather than funding individual PIs to test specific theories about brain function, it funded tool development, data collection, and infrastructure that multiple investigators could use. The bet was that understanding the brain required comprehensive data and analytical capabilities, not just clever hypotheses.
This doesn’t eliminate all bias—researchers still have preferences about which tools to develop or which brain regions to map. But it reduces the intense personal investment in any particular theory about how the brain works. The focus shifts from testing hypotheses to building shared resources.
Particle physics has worked this way for decades. Nobody at CERN builds a career on predicting a specific particle will or won’t be found. The infrastructure supports collective inquiry. Results require consensus across large collaborations. Data are shared immediately. Multiple teams analyze the same detector output.
Can you imagine a particle physicist fabricating Higgs boson data? The system makes it nearly impossible—not because particle physicists are more ethical, but because the organizational structure distributes both credit and accountability across large teams working with shared data.
The Biomedical Research Counterfactual
What would biomedical research look like if we designed it to minimize the hypothesis trap?
Separation of hypothesis generation from testing. One team develops theories and predictions. A different team, with no stake in the theory’s success, conducts the experiments. The testing team is rewarded for rigorous methods and clear results, not for confirming or refuting specific hypotheses. This isn’t unprecedented—clinical trials often use this model, with statisticians who haven’t seen interim results conducting final analyses.
Registered reports and pre-registration. Require researchers to specify hypotheses, methods, and analyses before collecting data. Journals commit to publishing based on methodological quality, not results. This removes the temptation to massage data because publication is already guaranteed. The researcher benefits from doing careful work, not from obtaining specific results.
Adversarial collaboration. When competing theories exist, fund collaborations between proponents to design jointly agreed-upon decisive tests. Each side specifies in advance what results would falsify their theory. The collaboration is rewarded for clarity and rigor, not for one side winning.
Collective attribution and team leadership. Move away from the PI model toward team leadership with distributed authority. Make it normal for multiple investigators to share senior authorship without hierarchical ordering. Reward contributions to collective projects, not just defending personal theories. This reduces the intensity of individual investment in specific hypotheses.
Diverse parallel approaches. Rather than funding one investigator to test one hypothesis over five years, fund multiple teams to simultaneously test competing hypotheses. Make this explicit: “We think question X is important but don’t know which of three theories is correct, so we’re funding all three approaches.” The field benefits from comparative testing; individual investigators aren’t catastrophically invested in one answer.
The Objections
These suggestions will provoke immediate resistance, much of it justified. The romantic model of science—brilliant individual investigator pursuing visionary ideas—isn’t entirely fiction. Great insights do come from individuals. Breakthrough theories do require conviction to pursue against skepticism. Hypothesis-driven research has produced genuine discoveries.
Moreover, team science and collective approaches have their own challenges. Large collaborations can become bureaucratic. Consensus-building can delay needed action. Distributing credit across many people may reduce individual incentive for excellence. Pre-registration can be gamed by registering many studies and selectively completing and reporting only the ones that work out.
The adversarial collaboration model assumes good faith from competing investigators, which isn’t always present. Separating hypothesis generation from testing may slow progress if the best experiments require an intimate understanding of the theory. Distributed leadership creates coordination problems.
These are real concerns. I’m not arguing for the complete abandonment of hypothesis-driven research or the PI model. But I am arguing that we’ve over-indexed on one way of organizing science—a way that creates predictable problems around motivated reasoning and hypothesis attachment—without seriously considering alternatives that might mitigate those problems.
The Incentive Redesign
The deeper issue is the incentive structure. We reward:
- Publications in high-impact journals (which prefer dramatic confirmations of interesting hypotheses)
- Grant funding (which requires convincing reviewers you’re pursuing important ideas likely to yield results)
- Citations (which accumulate for papers making strong claims, not for careful null results)
- Awards and prizes (which celebrate breakthroughs, not rigorous refutations)
- Tenure and promotion (based on establishing an independent research program—meaning a distinctive hypothesis)
Each incentive encourages researchers to develop strong attachments to specific theories. The scientist who carefully tests a hypothesis, finds ambiguous results, and concludes, “This is more complicated than we thought,” doesn’t thrive under these incentives. The scientist who generates a provocative theory, designs experiments to support it, and publishes dramatic results thrives—even if the theory is ultimately wrong.
We could design different incentives:
- Reward rigorous replication attempts
- Fund adversarial collaborations that test competing theories
- Celebrate careful negative results that prevent the field from pursuing dead ends
- Promote scientists who change their minds when evidence demands it
- Value contributions to infrastructure and methods that enable collective progress
None of this is unprecedented. Clinical trial statisticians build careers on methodological rigor, not therapeutic breakthroughs. Methods developers in genomics gain recognition for creating tools others use. In psychology, replication researchers are valued for independently testing whether published findings hold up.
The question is whether biomedical research, more broadly, is willing to diversify its incentive structures and organizational models. The field is enormously successful—NIH funding, breakthrough therapeutics, extended lifespans. Why change a winning formula?
Back to That 1980s Case
The postdoc who brought misconduct charges understood something important: when data are being altered to support “the party line,” someone needs to object. That takes courage—postdocs are vulnerable, whistleblowers face retaliation, and questioning senior scientists is risky.
But here’s what I’ve come to understand that I didn’t fully appreciate forty years ago: the party line wasn’t imposed from outside. It emerged from structural features of how we organize research. The PI who allegedly manipulated data wasn’t serving some external master. They were serving their own hypothesis, the idea they’d built a career around, the theory their lab existed to develop.
That makes the problem both worse and better than simple corruption. Worse, because it means well-meaning scientists can slide into questionable practices without recognizing it. The same motivated reasoning that drives fraud also drives less dramatic but equally problematic biases in how we collect, analyze, and report data.
Better because it means organizational redesign might help. We can’t eliminate human fallibility or the emotional attachment scientists develop to their ideas. However, we can design systems that reduce the extent to which outcomes depend on any particular hypothesis being correct. We can create structures where admitting you were wrong is professionally survivable. We can reward rigor over drama, collective progress over individual breakthroughs.
The Path Forward
I’m not optimistic about radical transformation. The biomedical research enterprise is vast, successful, and institutionally entrenched. The romantic model of the lone investigator testing brilliant hypotheses is deeply embedded in how we tell science stories, train graduate students, and allocate prestige.
But incremental change is possible:
Funding agencies can require pre-registration for hypothesis-driven research while also funding more exploratory, team-based approaches. NIH’s BRAIN Initiative and precision medicine programs already point in this direction. Expanding these models would diversify how research gets organized.
Journals can mandate data sharing and the use of registered reports. Some journals already do this; others resist for fear of losing exciting submissions to competitors. But collective action could shift norms. If high-impact journals required rigorous transparency, researchers would adapt.
Universities can broaden tenure criteria to value methodological rigor, replication, infrastructure development, and collaborative contributions, alongside traditional metrics of independent research. This requires courage because it means promoting faculty who don’t fit the standard template, but it’s feasible.
Training programs can teach critical evaluation of one’s own hypotheses. Rather than just training students to design clever experiments and write compelling grants, we can teach them to actively look for ways they might be wrong, to value evidence against their theories, and to see changing one’s mind as a strength rather than a weakness. This is partly cultural, partly structural.
Funders can experiment with alternative models. Fund some research explicitly as adversarial collaboration. Fund some as team science with distributed leadership. Fund some as infrastructure development. Create parallel tracks so researchers can build careers through multiple pathways, reducing the pressure to develop intense attachment to specific hypotheses.
None of this will eliminate fraud—there will always be individuals who cheat. However, it might reduce the structural pressures that push honest scientists toward motivated reasoning and, in some cases, push others toward outright fabrication.
Integrity Is More Than Honesty
That 1980s case I barely remember continues to inform my thinking, not because I have clear memories of it but because it captures something essential: scientific integrity requires more than individual honesty. It requires organizational structures that don’t push even honest people toward biased reasoning.
The postdoc filing charges was practicing integrity. But they were fighting against a system where a PI’s attachment to their hypothesis created pressure—probably unconscious, probably rationalized, but pressure nonetheless—to make the data fit the theory. One brave postdoc can’t fix structural problems alone.
We’ve built an enormously productive research enterprise. Biomedical science has achieved genuine miracles. The hypothesis-driven, PI-centered model has generated breakthrough after breakthrough. I’m not arguing it’s failed—clearly it hasn’t.
However, I argue it’s flawed in predictable ways. The same features that make it successful—individual investigators developing strong convictions about important ideas and pursuing them relentlessly—also create conditions for motivated reasoning, questionable research practices, and occasional fraud.
Acknowledging those flaws doesn’t diminish the achievements. It opens space for experimentation with alternative models that might reduce the problematic incentives while preserving the creative energy that drives discovery. The question is whether we’re willing to diversify how we organize research or whether we’ll continue over-relying on a single model because it’s familiar and has worked in the past.
The endless frontier that Vannevar Bush envisioned shouldn’t be endless in just one direction. It should include exploring different ways of pursuing knowledge, different structures for organizing inquiry, and different incentives for rewarding contributions to collective understanding.
That’s the real challenge: not just preventing fraud but creating systems where the pressures toward fraud—and toward less dramatic but equally problematic biases—are reduced. Where changing your mind based on evidence is professionally rewarded rather than punished. Where attachment to ideas is balanced by commitment to collective truth-seeking.
The party line that worries me most isn’t imposed by political power. It’s the party line we impose on ourselves when we become too attached to our own hypotheses, when our professional identities become too entangled with specific theories, when the systems we’ve built make admitting error too costly. We need to extend that understanding: to object not only to individual fraud but also to the organizational structures that make such fraud more likely, and to build not just oversight systems but alternative models of how to pursue science.
That’s the integrity challenge for the next forty years.