How Will You Know You’ve Succeeded? A BRAIN story

August 2008: A summer day in Mountain View, California. The previous year, the Krasnow Institute for Advanced Study, which I was leading at George Mason University, had developed a proposal to invest tons of money in figuring out how mind emerges from brains, and now I had to make the case that it deserved to be a centerpiece of a new administration’s science agenda. Three billion dollars is not a small ask, especially in the context of the accelerating 2008 financial crisis.

Before this moment, the project had evolved organically: a kickoff meeting at the Krasnow Institute near D.C., a joint manifesto published in Science Magazine, and then follow-on events in Des Moines, Berlin and Singapore to emphasize the broader aspects of such a large neuroscience collaboration. There even had been a radio interview with Oprah.

When I flew out to Google’s Mountain View headquarters in August 2008 for the SciFoo conference, I didn’t expect to be defending the future of neuroscience over lunch. But the individual who was running the science transition for the Obama presidential campaign had summoned me for what he described as a “simple” conversation: defend our idea for investing $3 billion over the next decade in neuroscience with the audacious goal of explaining how “mind” emerges from “brains.” It was not the kind of meeting I was ready for.

I was nervous. As an institute director, I’d pitched for million-dollar checks. This was a whole new scale of fundraising for me. And though California was my native state, I’d never gone beyond being a student body president out there. Google’s headquarters in the summer of 2008 was an altar to Silicon Valley power.

SciFoo itself was still in its infancy then – the whole “unconference” concept felt radical and exciting, a fitting backdrop for pitching transformational science. But the Obama campaign wasn’t there for the unconventional meeting format. Google was a convenient meeting spot. And they wanted conventional answers.

I thought I made a compelling case: this investment could improve the lives of millions of patients with brain diseases. Neuroscience was on the verge of delivering cures. (I was wrong about that, but I believed it at the time.) The tools were ready. The knowledge was accumulating. We just needed the resources to put it all together.

Then I was asked the question that killed my pitch: “How will we know we have succeeded? What’s the equivalent of Kennedy’s moon landing – a clear milestone that tells us we’ve achieved what we set out to do?” You could see those astronauts come down the ladder of the lunar module. You could see that American flag on the moon. No such prospects with a large neuroscience initiative.

I had no answer.

I fumbled through some vague statements about understanding neural circuits and developing new therapies, but even as the words left my mouth, I knew they were inadequate. The moon landing worked as a political and scientific goal because it was binary: either we put a man on the moon or we didn’t. Either the flag was planted or it wasn’t.

But “explaining how mind emerges from brains”? When would we know we’d done that? What would success even look like?

The lunch ended politely. I flew back to DC convinced it had been an utter failure.

But that wasn’t the end of it. Five years later, at the beginning of Obama’s second presidential term, we began to hear news of a large initiative driven by the White House called the Brain Activity Map or BAM for short. The idea was to comprehensively map the functional activity of brains at high spatial and temporal resolution beyond that available at the time. It was like my original pitch both in scale (dollars) and in the notion that it was important to understand how mind emerges from brain function. The goal for the new BAM project was to be able to map between the activity and the brain’s emergent “mind”-like behavior, both in the healthy and pathological cases. But the BAM project trial balloon, even coming from the White House, was not an immediate slam dunk.

There was immediate pushback from large segments of the neuroscience community that felt excluded from BAM, but with a quick top-down recalibration from the White House Office of Science and Technology Policy and a whole-of-government approach that included multiple science agencies, BRAIN (Brain Research through Advancing Innovative Neurotechnologies) was born in April of 2013.

A year later, in April of 2014, I was approached to head Biological Sciences at the US National Science Foundation. When I took the job that October, I was leading a directorate with a budget of $750 million annually that supported research across the full spectrum of the life sciences – from molecular biology to ecosystems. I would also serve as NSF’s co-lead for the Obama Administration’s BRAIN Initiative—an acknowledgement of the failed pitch in Mountain View, I guess.

October 2014: sworn in and meeting with my senior management team. Here I was, a little more than a year into BRAIN. I had gotten what I’d asked for in Mountain View. Sort of. We had the funding, we had the talent, we had review panels evaluating hundreds of proposals. But I kept thinking about the question, the one I couldn’t answer then and still struggled with now. We had built this entire apparatus for funding transformational research, yet we were asking reviewers to apply the same criteria that would have rejected Einstein’s miracle year. How do you evaluate research when you can’t articulate clear success metrics? How do you fund work that challenges fundamental assumptions when your review criteria reward preliminary data and well-defined hypotheses?

Several months later, testifying before Congress about the BRAIN project, I remember fumbling again at the direct question of when we would deliver cures for dreaded brain diseases like ALS and schizophrenia. I punted: that was an NIH problem (even though the original pitch had been about delivering revolutionary treatments). At NSF, we were about understanding the healthy brain. In fact, how could you ever understand brain disease without a deep comprehension of the non-pathological condition?

It was a reasonable bureaucratic answer. NIH does disease; NSF does basic science. Clean jurisdictional boundaries. But sitting there in that hearing room, I realized I was falling into the same trap that had seemingly doomed our pitch in 2008: on being asked for the delivery date of a clear criterion for success, I was waffling. Only this time, I was the agent for the funder: the American taxpayer.

The truth was uncomfortable. We had launched an initiative explicitly designed to support transformational research – research that would “show us how individual brain cells and complex neural circuits interact” in ways we couldn’t yet imagine. But when it came time to evaluate proposals, we fell back on the same criteria that favored incrementalism: preliminary data, clear hypotheses, established track records, well-defined deliverables. We were asking Einstein for preliminary data on special relativity.

And we weren’t unique. This was the system. This was how peer review worked across federal science funding. We had built an elaborate apparatus designed to be fair, objective, and accountable to Congress and taxpayers. What we had built was a machine that systematically filtered out the kind of work that might transform neuroscience.

All of this was years before the “neuroscience winter,” when massive scientific misconduct was unearthed in neurodegenerative disease research, including Alzheimer’s. But the modus operandi of BRAIN foreshadowed it.

Starting in 2022, a series of investigations revealed that some of the most influential research on Alzheimer’s disease—work that had shaped the field for nearly two decades and guided billions in research funding—was built on fabricated data. Images had been manipulated. Results had been doctored. And this work had sailed through peer review at top journals, had been cited thousands of times, and had successfully competed for grant funding year after year. The amyloid hypothesis, which this fraudulent research had bolstered, had become scientific orthodoxy not because the evidence was overwhelming, but because it fit neatly into the kind of clear, well-defined research program that review panels knew how to evaluate.

Here was the other side of the Einstein problem that I’ve mentioned in previous posts. The same system that would have rejected Einstein’s 1905 papers for lack of preliminary data and institutional support had enthusiastically funded research that looked rigorous but was fabricated. Because the fraudulent work had all the elements that peer review rewards: clear hypotheses, preliminary data, incremental progress building on established findings, well-defined success metrics. It looked like good science. It checked all the boxes.

Meanwhile, genuinely transformational work—the kind that challenges fundamental assumptions, that crosses disciplinary boundaries, that can’t provide preliminary data because the questions are too new—struggles to get funded. Not because reviewers are incompetent or malicious, but because we’ve built a system that is literally optimized to make these mistakes. We’ve created an apparatus that rewards the appearance of rigor over actual discovery, that favors consensus over challenge, that funds incrementalism and filters out transformation.

So, what’s the real function of peer review? It’s supposed to be about identifying transformative research, but I don’t think that’s the real purpose. To my mind, the real purpose of peer review panels at NSF, and of study sections at NIH, is to make inherently flawed funding decisions defensible, both to Congress and to the American taxpayer. The criteria (intellectual merit and broader impacts at NSF) exist because they make awarding grant dollars auditable and fair-seeming, not because they identify breakthrough work.

But honestly, there’s a real dilemma here: if you gave out NSF’s annual budget based on a program officer’s feeling that “this seems promising”, you’d face legitimate questions about cronyism, waste and arbitrary decision-making. The current system’s flaws aren’t bad policy accidents; they are the price we pay for other values we also care about.

So, did the BRAIN Initiative deliver on that pitch I made in Mountain View in 2008? Did we figure out how ‘mind’ emerges from ‘brains’? In retrospect, I remain super impressed by NSF’s NeuroNex program: we got impressive technology – better ways to record from more neurons, new imaging techniques, sophisticated tools. We trained a generation of neuroscientists. But that foundational question – the one that made the political case, the one that justified the investment – we’re not meaningfully closer to answering it. We made incremental progress on questions we already knew how to ask. Which is exactly what peer review is designed to deliver. Oh, and one other thing that was produced: NIH’s parent agency, the Department of Health and Human Services, got a trademark issued on the name of the initiative itself, BRAIN.

I spent four years as NSF’s co-lead on BRAIN trying to make transformational neuroscience happen within this system. I believed in it. I still believe in federal science funding. But I’ve stopped pretending the tension doesn’t exist. The very structure that makes BRAIN funding defensible to Congress made the transformational science we promised nearly impossible to deliver.

That failed pitch at Google’s headquarters in 2008? Turns out the question was spot on. We just never answered it.

What Grant Reviewers Actually Look For (and What They Ignore)

A close colleague of mine at a major US research university begins the process of preparing a grant proposal by creating something he calls a “storyboard”. When I was growing up in LA, the concept of a storyboard was very familiar to me. Many of my high school friends at the time aspired to careers in the locally dominant entertainment industry. The storyboard, developed at Walt Disney’s studio in the early 1930s, used pictures to visualize a movie’s plot flow before production, often even before a screenplay was complete. In the LA movie business, you could look at a storyboard and pretty much get right away what a movie is about.

Back to the colleague of mine who uses storyboards to create grant proposals: his key idea is that you’re done making the storyboard when someone outside the group can come in, look at it, and come away with a good understanding of what the grant is all about. If the storyboard is coherent, then it’s easy to make the proposal coherent as well. Further, the storyboard often gets reused in a modified fashion as the grant’s central graphic. Yes, a picture is worth several thousand words.

My colleague is onto something profound about how grant review works, across all funders, including those in the private sector. But for this issue of Science Policy Insider, we’re going to consider the agency where I headed up Biological Sciences, the NSF. What about NIH, you may ask? A lot of the principles here go for both agencies. But here, we’re going to focus, laser-like, on the National Science Foundation, even as it undergoes drastic changes.

The Brutal Reality of NSF Panel Review

After sitting through too many grant panels at NSF, I can tell you this: most proposals get 15-20 minutes of discussion time in a panel that’s reviewing 30-50 proposals over three days. Your carefully crafted 15-page research plan? The primary reviewer read it thoroughly. The other two panelists skimmed it. Everyone else glanced at the summary.

This isn’t because reviewers are lazy. They’re exhausted, brilliant researchers who read proposals outside their immediate expertise, often late at night, while also worrying about their own grants, their trainees, and the paper referee statements they owe.

The storyboard approach works because it acknowledges this reality: reviewers are looking for a straightforward narrative they can grasp quickly and defend to the panel.

What Actually Happens in Review Panels

Here’s how it typically unfolds:

9:00 AM, Day Two of panel: The primary reviewer presents your proposal. They have 5 minutes to summarize your aims, approach, and why it matters. If they struggle to articulate your story coherently, you’re in trouble—not because your proposed science is bad, but because they can’t effectively advocate for you.

The secondary and tertiary reviewers add their perspectives. Then the panel discusses. The program officers watch for enthusiasm, coherence of the argument, and whether anyone is deeply opposed.

The proposals that succeed have champions: reviewers who “get it” immediately and can explain why it matters to others. The storyboard method makes a proposal easier to champion.

What Reviewers Actually Look For

After watching this process play out thousands of times, here’s what I learned reviewers truly care about:

1. Can I explain this to the panel in 3 minutes?

If your research plan requires a flowchart to understand, the primary reviewer will simplify it—possibly incorrectly. Better to give them the simplified version yourself.

2. Is the question worth answering?

Not “is this interesting?” but “will anyone care about the answer?” Reviewers need to justify spending taxpayer money. Give them that justification explicitly.

3. Can this person actually do this?

No matter what is written down in the solicitation, preliminary data matters enormously, but not for the reason applicants think. It’s not about proving the hypothesis—it’s about proving you have the technical capability and haven’t missed an obvious problem.

4. Is this the right approach?

Reviewers are surprisingly forgiving about whether your specific hypothesis is correct. They’re much less forgiving about whether you’re using appropriate methods or have thought through alternatives.

5. Will this move the field forward?

Notice: not “revolutionize” or “transform”—just move forward. Incremental progress from a well-designed study beats a transformative idea with unclear methods. But doesn’t the call state that the proposed work should change the world? Sure, but from a practical standpoint, what counts for the reviewers is steady progress. And here’s the tricky part: while steady is key for the reviewers, transformative really is important for the program officers who make the penultimate decision. So, a balance is necessary.

What Reviewers Ignore (Even Though You Spent Weeks on It)

The extensive literature review: They skim it to see if you know the field. The 47 citations demonstrating your comprehensive knowledge? They checked that you cited the key papers and moved on.

Your detailed budget justification: Unless something looks wildly off, reviewers assume you know what your research costs. The line-by-line explanation of why you need that particular microscope? Skimmed.

Your publication list: They look at: Do you publish in good journals? Are you productive? Have you published on this topic before? That’s it. The distinction between your 47th and 52nd paper doesn’t matter.

The broader impacts section that you agonized over: I feel guilty about this because I’ve often harped on broader impacts as a central criterion. Truth: most reviewers read this quickly to verify you addressed it competently. Unless it’s either exceptional or terrible, it rarely drives funding decisions. And these days, broader impacts means how the work will benefit all American citizens (think public health) or US national security.

The Elements That Actually Drive Decisions

Clarity of the research goals: Can the reviewer recite your three main questions without looking at the proposal? If not, rewrite.

Logical flow: Does each aim build on the previous one? Or are they three unrelated projects stapled together? Reviewers can tell.

Feasibility signals: Preliminary data, established collaborations, access to necessary resources, realistic timeline. These say, “this person will actually complete this work.”

Positioning: Is this filling a real gap, or are you slightly tweaking someone else’s approach? Reviewers want to fund work that moves us somewhere new, even if incrementally.

The writing quality: Clear, direct prose suggests clear thinking. Dense, jargon-heavy writing suggests unclear thinking (even if that’s unfair).

The Most Common Mistake

Applicants try to impress reviewers with complexity and comprehensiveness. They want to show they’ve thought of everything, considered every alternative, read every paper.

But reviewers are looking for clarity and confidence. They want to understand quickly what you’re proposing and why it matters. They want to feel confident you’ll succeed.

The storyboard method works because it forces simplicity. If you can’t draw a simple picture of your proposal that an outsider immediately understands, you don’t have a fundable story yet.

But Wait, There’s More

As hinted at above, at NSF, that panel review is strictly advisory. I’ve personally seen proposals with excellent reviews get declined, and the reverse. The key decisional person? That’s the cognizant program officer for the solicitation. These days, there’s an additional vetting to look for alignment with the Administration’s political goals, but that’s a topic for a future newsletter.

What This Means for Your Proposal

Before you write a single word:

  • Can you explain your project in three sentences?
  • Can someone outside your subfield understand why it matters?
  • Do you have a clear narrative arc from question to approach to impact?

If not, you’re not ready to write. You’re ready to storyboard.

Build the simple, clear story first. Then elaborate carefully, making sure every detail serves that core narrative.

Reviewers are smart, busy people trying to identify good science under time pressure. Don’t make them work to understand your brilliance. Give them a story they can grasp, defend, and champion.

That’s what my colleague understood. And based on his funding success rate, the reviewers appreciate it.

The Replication Crisis Is a Market Failure (And We Designed It That Way)

Also published on my newsletter

The replication crisis isn’t a mystery. After presiding over the review of thousands of grants at NSF’s Biological Sciences Directorate, I can tell you exactly why science struggles to reproduce its own findings: we built incentives that reward novelty and punish verification.

A 2016 Nature survey found that over 70% of scientists have failed to reproduce another researcher’s experiments. But this isn’t about sloppy science or bad actors. It’s straightforward economics.

The Researcher’s Optimization Problem

You have limited time and resources. You can either:

  1. Pursue novel findings → potential Nature paper, grant funding, tenure
  2. Replicate someone’s work → maybe a minor publication, minimal funding, colleagues questioning your creativity

The expected value calculation is obvious. Replication is a public good with privatized costs.
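One way to make that calculation concrete is a small sketch. Everything in it is an assumption made for illustration: the `Project` class, the success probabilities, and the payoffs in arbitrary "career units" are invented, not drawn from NSF data or any survey.

```python
from dataclasses import dataclass


@dataclass
class Project:
    """A research option as a researcher might (informally) weigh it."""
    name: str
    p_success: float      # hypothetical chance of a publishable outcome
    career_payoff: float  # hypothetical reward if it succeeds (arbitrary units)
    cost: float           # time and resources spent whether or not it succeeds

    def expected_value(self) -> float:
        # Expected private return: probability-weighted payoff minus the sunk cost.
        return self.p_success * self.career_payoff - self.cost


# Illustrative numbers only: novelty is risky but richly rewarded,
# replication is likely to "work" but earns little credit.
novel = Project("novel finding", p_success=0.2, career_payoff=100.0, cost=10.0)
replication = Project("replication", p_success=0.8, career_payoff=5.0, cost=10.0)

for project in (novel, replication):
    print(f"{project.name}: {project.expected_value():+.1f}")
```

With these made-up inputs the novel project has a positive expected private return and the replication a negative one, even though the replication is four times more likely to succeed. The benefit of replication goes to everyone who relies on the result, which is why it does not appear in the replicator's own payoff.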

How NSF Review Panels Work

At NSF, I watched this play out in every review panel. Proposals to replicate existing work faced an uphill battle. Reviewers—themselves successful researchers who got there by publishing novel findings—naturally favor creative, untested ideas over verification work.

We tried various fixes. Some programs explicitly funded replication studies. Some review criteria emphasized robustness over novelty. But the core incentive remained: breakthrough science gets you the next grant; careful verification doesn’t.

The problem runs deeper than any single agency. Universities want prestigious publications. Journals want citations. Researchers want tenure. Nobody’s optimization function includes “produces reliable knowledge that someone else can build on.”

The Information Market Is Broken

Even when researchers try to replicate, they’re working with incomplete information. Methods sections in papers are sanitized versions of what actually happened in the lab. “Cells were cultured under standard conditions” means something different in every lab. One researcher’s gentle mixing is another’s vigorous shaking.

This information asymmetry makes replication attempts inherently inefficient. You’re trying to reproduce a result while missing critical details that the original researcher might not even realize mattered.

The Time Horizon Problem

NSF grants run 3-5 years. Tenure clocks run 6-7 years. But scientific truth emerges over decades. We’re optimizing for the wrong timescale.

During my time at NSF, I saw brilliant researchers make pragmatic choices: publish something surprising now (even if it might not hold up) rather than spend two years carefully verifying it. That’s not a moral failing—it’s responding rationally to the incentives we created.

What Would Actually Fix This

Make replication profitable:

  • Count verification studies equally with novel findings in grant review and tenure decisions
  • Fund researchers whose job is rigorous replication—make it a legitimate career path
  • Require data and detailed methods sharing as a funding condition, not an afterthought
  • Make failed replications as publishable as successful ones

The challenge isn’t technical. It’s institutional. We designed a market that overproduces flashy results and underproduces reliable knowledge. Until we fix the incentives, we’ll keep getting exactly what we’re paying for.

On Reproducibility: Physics versus Life Sciences

Scientific reproducibility—the ability of researchers to obtain consistent results when repeating an experiment—sits at the heart of the scientific method. During my years at the bench and later as the leader of an Institute, it became clear that not all sciences struggle equally with this fundamental principle. Physics experiments tend to be more reproducible than those in life sciences, where researchers grapple with what many call a “reproducibility crisis.” Understanding why reveals something profound about the nature of these disciplines.

The State of Reproducibility Across Sciences

A 2016 Nature survey of over 1,500 researchers revealed the scope of the challenge: more than 70% of scientists have failed to reproduce another researcher’s experiments. The rates varied by field—87% of chemists, 77% of biologists, and 69% of physicists and engineers reported such failures. Notably, 52% of respondents agreed that a significant reproducibility crisis exists.

These numbers tell us something important: reproducibility challenges exist across all scientific disciplines, but they manifest with different severity. Physics hasn’t been immune to these issues, but it has been affected less severely than fields like psychology, clinical medicine, and biology. This isn’t a story of success versus failure—it’s a story of different sciences confronting different kinds of complexity.

The Physics Advantage

When a physicist measures the speed of light or the charge of an electron, they’re studying fundamental constants of nature. These values don’t change based on the lab, the researcher, or the day of the week. A particle collision at a given energy obeys the same physics in Geneva as it does in Illinois. The laws governing pendulum motion work identically whether you’re in Cambridge or Kyoto.

This consistency extends beyond fundamental constants. Physics experiments typically involve controlled, isolated systems where researchers can eliminate or account for confounding variables. A physics experiment might study a single particle in a vacuum, far removed from the messy complexity of the real world. Precise measurement tools, refined over centuries, allow astonishing accuracy. NSF’s LIGO, for instance, can detect gravitational waves by measuring changes smaller than one ten-thousandth the width of a proton, equivalent to noticing a hair’s width change in the distance to the nearest star. The centuries of theoretical understanding that physics has developed make the field less susceptible to reproducibility failures.

The Life Sciences Challenge

Life sciences researchers face a fundamentally different landscape. They’re not studying isolated particles obeying immutable laws; they’re investigating complex, adaptive systems shaped by evolution, environment, and chance.

Consider a seemingly simple experiment: testing how a drug affects cancer cells. Those cells aren’t uniform entities like electrons. Research has revealed extensive genetic variation across supposedly identical cancer cell lines. The same cell line obtained from different sources can show staggering differences: studies have found that at least 75% of compounds that strongly inhibit some strains of a cell line are completely inactive in others. Each cell line accumulates unique mutations through genetic drift as it is independently passaged in different laboratories.

The cells’ behavior changes based on how many times they’ve been cultured, what nutrients they receive, even the material of the culture dish. Research has documented profound variability even in highly standardized experiments, with factors like cell density, passage number, temperature, and medium composition all significantly affecting results. The researcher’s technique in handling the cells matters. Countless variables play roles that are difficult or impossible to fully control.

This complexity manifests in several ways:

Biological variability is the norm, not the exception. No two mice are identical, even if they’re genetically similar. Human patients are wildly variable. A treatment that works brilliantly for one person may fail completely for another with the “same” disease.

Emergent properties mean that biological systems exhibit behaviors that can’t be predicted simply by understanding their components. You can’t predict consciousness by studying individual neurons, just as you can’t predict ecosystem dynamics by studying single organisms.

Context dependence is paramount. A gene doesn’t have a single function—its effects depend on the organism, developmental stage, tissue type, and environmental conditions. The same protein can play entirely different roles in different contexts.

Reframing the “Crisis”

It’s worth questioning whether “crisis” is the right word for what’s happening in life sciences. Some researchers argue that the apparent reproducibility problem may be partly a statistical phenomenon. When fields explore bold, uncertain hypotheses—as life sciences often do—a certain rate of non-replication is expected and even healthy. A hypothesis that’s unlikely to be true a priori may still test positive, and subsequent studies revealing the truth represent science’s self-correcting mechanisms at work rather than a failure.
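The statistical argument can be sketched with the standard positive predictive value formula: the share of "positive" findings that are actually true, given how plausible the hypothesis was beforehand. The function name and the default power (0.8) and significance threshold (0.05) are conventional textbook choices assumed here for illustration, not figures from any particular study.

```python
def positive_predictive_value(prior: float, power: float = 0.8, alpha: float = 0.05) -> float:
    """Fraction of statistically significant results that reflect a true effect.

    prior: assumed probability that the hypothesis is true before testing
    power: assumed probability of detecting a true effect
    alpha: assumed false-positive rate when there is no effect
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical priors: a safe, incremental hypothesis versus increasingly bold ones.
for prior in (0.5, 0.1, 0.01):
    ppv = positive_predictive_value(prior)
    print(f"prior = {prior:<4} -> share of positive findings that are true: {ppv:.2f}")
```

Under these assumptions, the bolder the hypothesis, the smaller the share of positive results that hold up, even when every study is run honestly and competently. A field that tests long shots should expect more of its initial findings to fail replication.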

The complexity of biological systems means that two experiments may differ in ways researchers don’t fully understand, leading to different results not because of poor methodology but because of hidden variables or context sensitivity. This doesn’t excuse sloppy work, but it does suggest we should expect life sciences to have inherently lower replication rates than physics due to the nature of what’s being studied.

The Methodological Gap

These fundamental differences create practical challenges. Physics papers often provide enough detail for precise replication: “We used a 532nm laser with 10mW power at normal incidence…” Life sciences papers might say “cells were cultured under standard conditions”—but what’s “standard” varies between labs. One lab’s “gentle mixing” is another’s vigorous shaking.

The statistical approaches differ too. Physics can often work with small sample sizes because measurement precision is high and variability is low. Life sciences need larger samples to overcome biological variability, yet often work with small sample sizes due to cost, time, or ethical constraints. This makes studies underpowered and results less reliable.
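A rough sketch shows how quickly variability drives up the sample size a study needs. It uses the common normal-approximation formula for comparing two group means; the function name, the effect size, and the two noise levels are assumptions chosen for illustration, and a real power analysis would use a t-distribution and field-specific estimates.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect: float, sd: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate n per group to detect a mean difference `effect`
    when measurements have standard deviation `sd` (two-sided test)."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = standard_normal.inv_cdf(power)           # quantile matching desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)


# Same hypothetical effect size, two hypothetical noise levels.
low_noise = samples_per_group(effect=1.0, sd=0.5)   # precise, low-variability measurement
high_noise = samples_per_group(effect=1.0, sd=2.0)  # noisy, biologically variable measurement

print(f"low variability:  {low_noise} samples per group")
print(f"high variability: {high_noise} samples per group")
```

Because the required sample size grows with the square of the noise-to-effect ratio, quadrupling the variability multiplies the needed samples roughly sixteenfold. A study that cannot afford that many animals or patients ends up underpowered.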

Moving Forward

Recognition of reproducibility challenges has sparked essential reforms. Pre-registration of studies, open data sharing, more rigorous statistical practices, and standardized protocols all help. Some fields are developing reference cell lines and model organisms to reduce variability between labs. Journals are implementing checklists to ensure critical details are reported. These efforts are making a real difference.

Yet we must also accept that perfect reproducibility may be neither achievable nor always desirable in life sciences. Biological variability is a feature, not a bug—it’s the raw material of evolution and the reason life adapts to changing environments. The goal shouldn’t be to make biology as reproducible as physics, but to develop methods appropriate for studying complex, variable systems and to be transparent about the limitations and uncertainties inherent in this work.

Understanding the Divide

The reproducibility divide between physics and life sciences doesn’t reflect a failure in the life sciences. It reflects the reality that living systems are profoundly different from the physical systems that physicists study. Both approaches to science are valid and necessary; they’re simply tackling different kinds of problems with appropriately different tools.

Even physics, with all its advantages, sees nearly 70% of researchers unable to reproduce some experiments. The difference is one of degree, not kind. All science involves uncertainty, iteration, and gradual convergence on truth through many studies rather than single definitive experiments.

Understanding these differences helps us appreciate both the elegant precision of physics and the challenging complexity of life. And perhaps most importantly, it reminds us that the scientific method must be flexible enough to accommodate the full diversity of natural phenomena we seek to understand—from the fundamental particles that never change to the living systems that are constantly evolving.

The Unsung Hero: Why Exploratory Science Deserves Equal Billing with Hypothesis-Driven Research

For decades, the scientific method taught in classrooms has followed a neat, linear path: observe, hypothesize, test, conclude. This hypothesis-driven approach has become so deeply embedded in our understanding of “real science” that research proposals without clear hypotheses often struggle to secure funding. Yet some of the most transformative discoveries in history emerged not from testing predictions, but from simply looking carefully at what nature had to show us.

It’s time we recognize exploratory science—sometimes called discovery science or descriptive science—as equally valuable to its hypothesis-testing counterpart.

What Makes Exploratory Science Different?

Hypothesis-driven science starts with a specific question and a predicted answer. You think protein X causes disease Y, so you design experiments to prove or disprove that relationship. It’s focused, efficient, and satisfyingly definitive when it works.

Exploratory science takes a different approach. It asks “what’s out there?” rather than “is this specific thing true?” Researchers might sequence every gene in an organism, catalog every species in an ecosystem, or map every neuron in a brain region. They’re generating data and looking for patterns without knowing exactly what they’ll find.

The Case for Exploration

The history of science is filled with examples where exploration led to revolutionary breakthroughs. One of my lab chiefs at NIH was Craig Venter, famous for his exploratory project: sequencing the human genome. The Human Genome Project didn’t test a hypothesis—it mapped our entire genetic code, creating a foundation for countless subsequent discoveries. Darwin’s theory of evolution emerged from years of cataloging specimens and observing patterns, not from testing a pre-formed hypothesis. The periodic table organized elements based on exploratory observations before anyone understood atomic structure.

More recently, large-scale exploratory efforts have transformed entire fields. The Sloan Digital Sky Survey mapped millions of galaxies, revealing unexpected structures in the universe. CRISPR technology was discovered through exploratory studies of bacterial immune systems, not because anyone was looking for a gene-editing tool. The explosive growth of machine learning has been fueled by massive exploratory datasets that revealed patterns no human could have hypothesized in advance.

Why Exploration Matters Now More Than Ever

We’re living in an era of unprecedented technological capability. We can sequence genomes for hundreds of dollars, image living brains in real time, and collect environmental data from every corner of the planet. These tools make exploration more powerful and more necessary than ever.

Exploratory science excels at revealing what we don’t know we don’t know. When you’re testing a hypothesis, you’re limited by your current understanding. You can only ask questions you’re smart enough to think of. Exploratory approaches let the data surprise you, pointing toward phenomena you never imagined.

This is particularly crucial in complex systems—ecosystems, brains, economies, climate—where interactions are so intricate that predicting specific outcomes is nearly impossible. In these domains, careful observation and pattern recognition often outperform narrow hypothesis testing.

The Complementary Relationship

None of this diminishes the importance of hypothesis-driven science. Testing specific predictions remains essential for establishing causation, validating mechanisms, and building reliable knowledge. The most powerful scientific progress often comes from the interplay between exploration and hypothesis testing.

Exploratory work generates observations and patterns that inspire hypotheses. Hypothesis testing validates or refutes these ideas, often raising new questions that require more exploration. It’s a virtuous cycle, not a competition.

Overcoming the Bias

Despite its value, exploratory science often faces skepticism. It’s sometimes dismissed as “fishing expeditions” or “stamp collecting”—mere data gathering without intellectual rigor. This bias shows up in grant reviews, promotion decisions, and journal publications.

This prejudice is both unfair and counterproductive. Good exploratory science requires tremendous rigor in experimental design, data quality, and analysis. It demands sophisticated statistical approaches to avoid false patterns and careful validation of findings. The difference isn’t in rigor but in starting point.

We need funding mechanisms that support high-quality exploratory work without forcing researchers to shoehorn discovery-oriented projects into hypothesis-testing frameworks. We need to train scientists who can move fluidly between both modes. And we need to celebrate exploratory breakthroughs with the same enthusiasm we reserve for hypothesis confirmation.

Looking Forward

As science tackles increasingly complex challenges—understanding consciousness, predicting climate change, curing cancer—we’ll need every tool in our methodological toolkit. Exploratory science helps us map unknown territory, revealing features of reality we didn’t know existed. Hypothesis-driven science helps us understand the mechanisms behind what we’ve discovered.

Both approaches are essential. Both require creativity, rigor, and insight. And both deserve recognition as legitimate, valuable paths to understanding our world.

The next time you hear about a massive dataset, a comprehensive catalog, or a systematic survey, don’t dismiss it as “just descriptive.” Remember that today’s exploration creates the foundation for tomorrow’s breakthroughs. In science, as in geography, you can’t know where you’re going until you know where you are.

How America Built Its Science Foundation Before the War Changed Everything

Most people think America’s scientific dominance began with the Manhattan Project or the space race. That’s not wrong, but it misses the real story. By the time World War II arrived, we’d already spent decades quietly building the infrastructure that would make those massive wartime projects possible.

The foundation was laid much earlier, and in ways that might surprise you. What’s more surprising is how close that foundation came to crumbling—and what we nearly lost along the way.

The Land-Grant Revolution

The story really starts in 1862 with the Morrill Act—arguably the most important piece of science policy legislation most Americans have never heard of. While the Civil War was tearing the country apart, Congress was simultaneously creating a network of universities designed to teach “agriculture and the mechanic arts.”

This wasn’t just about farming. The land-grant universities were America’s first systematic attempt to connect higher education with practical problem-solving. Schools like Cornell, Penn State, and the University of California weren’t just teaching Latin and philosophy—they were training engineers, studying crop diseases, and developing new manufacturing techniques.

But here’s what’s remarkable: this almost didn’t happen. The 1857 version of Morrill’s bill faced heavy opposition from Southern legislators who viewed it as federal overreach and Western states who objected to the population-based allocation formula. It passed both houses by narrow margins, only to be vetoed by President Buchanan. The legislation succeeded in 1862 primarily because Southern opponents had left Congress to join the Confederacy.

Private Money Fills a Critical Gap

What’s fascinating—and telling—is how much of early American scientific investment came from private philanthropy rather than government funding. The industrial fortunes of the late 1800s flowed into research, but this created a system entirely dependent on individual wealth and personal interest.

The Carnegie Institution of Washington, established in 1902, essentially functioned as America’s first NSF decades before the actual NSF existed. Andrew Carnegie’s $10 million endowment was enormous—equal to Harvard’s entire endowment and vastly more than what all American universities spent on basic research combined. The Rockefeller Foundation transformed medical education and research on a similar scale.

But imagine if Carnegie had been less interested in science, or if the robber baron fortunes had flowed entirely into art collections and European estates instead. This mixed ecosystem worked, but it was inherently unstable. When economic conditions tightened, private funding could vanish. When wealthy patrons died, research priorities shifted with their successors’ interests.

Corporate Labs: Innovation with Built-In Vulnerabilities

By the 1920s, major corporations were establishing research laboratories. General Electric’s lab, founded in 1900 as the first industrial research facility in America, became the model. Bell Labs, created in 1925 through the consolidation of AT&T and Western Electric research, would later become legendary for discoveries that shaped the modern world.

These corporate labs solved an important problem, bridging the gap between scientific discovery and commercial application. But they also created troubling dependencies. Research priorities followed profit potential, not necessarily national needs. Breakthrough discoveries in fundamental physics might be abandoned if they didn’t promise immediate commercial returns.

More concerning, these labs were vulnerable to economic cycles. During the Great Depression, even well-established research programs faced significant budget cuts and staffing reductions.

Government Stays Reluctantly on the Sidelines

Through all of this, the federal government remained a hesitant, minor player. The National Institute of Health, created in 1930 with a modest $750,000 for building construction, was one of the few exceptions—and even then, the federal government rarely funded medical research outside its own laboratories before 1938.

Most university science departments survived on whatever they could patch together from donors, industry partnerships, and minimal federal grants. The system worked, but precariously. During the Depression, university budgets were slashed, enrollment dropped, and research programs had to be scaled back or eliminated. The National Academy of Sciences saw its operating and maintenance funds drop by more than 15 percent each year during the early 1930s.

The Foundation That Held—Barely

By 1940, America had assembled what looked like a robust scientific infrastructure, but it was actually a precarious arrangement held together by fortunate timing and individual initiative: strong universities teaching practical skills, generous private funding that could shift with economic conditions, corporate labs vulnerable to business cycles, and minimal federal involvement.

When the war suddenly demanded massive scientific mobilization, the infrastructure held together long enough to support the Manhattan Project, radar development, and other crucial innovations. But it was a closer thing than most people realize. The Depression had already demonstrated the system’s vulnerabilities—funding cuts, program reductions, and the constant uncertainty that came with depending on private largesse.

What We Nearly Lost

Looking back, what’s remarkable isn’t just how much America invested in science before 1940, but how easily much of it could have been lost to economic downturns, shifting private interests, or political opposition. That decentralized mix of public and private initiatives created innovation capacity, but it also created significant vulnerabilities.

The war didn’t just expand American science—it revealed how unstable our previous funding system had been and demonstrated what sustained, coordinated investment could accomplish. The scientific breakthroughs that defined the next half-century emerged not from the patchwork system of the 1930s, but from the sustained federal commitment that followed.

Today’s scientific leadership isn’t an accident of American ingenuity. It’s the direct result of lessons learned from a system that worked despite its fragility—and the decision to build something more reliable in its place. The question is whether we remember why that change was necessary, and what we might lose if we return to depending on unstable, decentralized funding for our most critical research needs.

Post-lunch conversation with a colleague: trust in science

Yesterday, I had lunch with a colleague at a favorite BBQ spot in Arlington. Both of us work in science communication, so naturally our conversation drifted to the question that’s been nagging at many of us: why has public trust in scientific institutions declined in recent years? By the time we finished our (actually healthy) food, we’d both come to the same conclusion: the current way scientists communicate with the public might be contributing to the problem.

From vaccine hesitancy to questions about research reliability, the relationship between science and society has grown more complex. To understand this dynamic, we need to examine not only what people think about science but also how different cultures approach the validation of knowledge itself.

Harvard scholar Sheila Jasanoff offers valuable insights through her concept of “civic epistemologies”—the cultural practices societies use to test and apply knowledge in public decision-making. These practices vary significantly across nations and help explain why scientific controversies unfold differently in different places.

American Approaches to Knowledge Validation

Jasanoff’s research identifies distinctive features of how Americans evaluate scientific claims:

Public Challenge: Americans tend to trust knowledge that has withstood open debate and questioning. This reflects legal traditions where competing arguments help reveal the truth.

Community Voice: There’s a strong expectation that affected groups should participate in discussions about scientific evidence that impacts them, particularly in policy contexts.

Open Access: Citizens expect transparency in how conclusions are reached, including access to underlying data and reasoning processes.

Multiple Perspectives: Rather than relying on single authoritative sources, Americans prefer hearing from various independent institutions and experts.

How This Shapes Science Communication

These cultural expectations help explain some recent communication challenges. When public health recommendations changed during the COVID-19 pandemic, this appeared to violate expectations for thorough prior testing of ideas. Similarly, when social platforms restricted specific discussions, this conflicted with preferences for open debate over gatekeeping.

In scientific fields like neuroscience, these dynamics have actually driven positive reforms. When research reliability issues emerged, the American response emphasized transparency solutions: open data sharing, study preregistration, and public peer review platforms. Major funding agencies now require data management plans that promote accountability.

Interestingly, other countries have addressed similar scientific quality concerns in different ways. European approaches have relied more on institutional reforms and expert committees, while American solutions have emphasized broader participation and transparent processes.

Digital Platforms and Knowledge

Online platforms have both satisfied and complicated American expectations. They provide the transparency and diverse voices people want, but the sheer volume of information makes careful evaluation difficult. Platforms like PubPeer enable post-publication scientific review that aligns with cultural preferences for ongoing scrutiny; however, the same openness can also amplify misleading information.

Building Better Science Communication

Understanding these cultural patterns suggests more effective approaches:

Acknowledge Uncertainty: Present science as an evolving process rather than a collection of final answers. This matches realistic expectations about how knowledge develops.

Create Meaningful Participation: Include affected communities in research priority-setting and policy discussions, following successful models in patient advocacy and environmental research.

Increase Transparency: Share reasoning processes and data openly. Open science practices align well with cultural expectations for accountability.

Recognize Broader Concerns: Understand that skepticism often reflects deeper questions about who participates in knowledge creation and whose interests are served.

Moving Forward

Public skepticism toward science isn’t simply a matter of misunderstanding—it often reflects tensions between scientific institutions and cultural expectations about legitimate authority. Rather than dismissing these expectations, we might develop communication approaches that honor both scientific rigor and democratic values.

The goal isn’t to eliminate all skepticism, which serves essential functions in healthy societies. Instead, it is to channel critical thinking in ways that strengthen our collective ability to address complex challenges that require scientific insight.

Zero-based budgeting experiment: US STEM

At research universities, zero-based budgeting is pretty rare. It means starting from zero expenditures and then justifying each budget line to reach an annual budget. It is frowned upon for long-term R&D projects for the apparent reason that it’s pretty challenging to predict a discovery that could be exploited to produce a measurable outcome.

Nevertheless, it’s worth considering using the process to optimize the entire US STEM/Biomedical enterprise from scratch.
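To make the definition above concrete, here is a minimal sketch contrasting incremental budgeting with a zero-based approach. Everything in it is an assumption for illustration: the budget lines, the dollar amounts, the 3% growth rate, and the rule that a line with no written justification gets nothing. Real zero-based reviews are far more involved than an empty-string check.

```python
# Hypothetical budget lines and amounts, invented purely for illustration.
LAST_YEAR = {
    "core_facility": 400_000,
    "admin_support": 250_000,
    "legacy_program": 150_000,
}

# Each request pairs an amount with a justification (also hypothetical).
REQUESTS = {
    "core_facility": (380_000, "shared instruments used by 12 labs"),
    "admin_support": (200_000, "grant compliance staffing"),
    "legacy_program": (150_000, ""),  # no case made for this line
}


def incremental_budget(last_year, growth=0.03):
    """Incremental budgeting: carry every line forward with a uniform adjustment."""
    return {line: round(amount * (1 + growth)) for line, amount in last_year.items()}


def zero_based_budget(requests):
    """Zero-based budgeting: start from zero and fund only justified lines."""
    budget = {}
    for line, (amount, justification) in requests.items():
        if justification:  # an unjustified line stays at zero
            budget[line] = amount
    return budget


incremental = incremental_budget(LAST_YEAR)
zero_based = zero_based_budget(REQUESTS)
print("incremental total:", sum(incremental.values()))
print("zero-based total:", sum(zero_based.values()))
```

The point of the toy is the default: under the incremental rule every existing line survives automatically, while under the zero-based rule nothing survives unless someone argues for it.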

Why Research Resists Zero-Based Budgeting

The resistance to zero-based budgeting in research environments stems from legitimate concerns. Academic institutions seldom adhere to a zero-based budget model for three reasons. As I stated above, scientific discovery is inherently unpredictable. Zero-based budgets also require a significant amount of time and labor from units and university administrators to prepare. And the model can seriously encumber long-term planning.

Research requires substantial upfront investments in equipment, facilities, and human capital that only pay dividends over extended periods. The peer review system, while imperfect, has evolved as a way to allocate resources based on scientific merit rather than easily quantifiable metrics.

The Case for a National Reset

Despite these concerns, there’s a compelling argument for applying zero-based budgeting principles to the broader American STEM enterprise. Not at the individual project level, but at the systemic level—questioning fundamental assumptions about how we organize, fund, and conduct research.

Addressing Systemic Inefficiencies

Our current research ecosystem has evolved organically over decades, creating layers of bureaucracy, redundant administrative structures, and misaligned incentives. Universities compete for the same federal funding while maintaining parallel administrative infrastructures. A zero-based approach would force examination of whether these patterns serve our ultimate goals of scientific progress and national competitiveness.

Responding to Global Competition

The US still retains a healthy lead, spending $806 billion on R&D, both public and private, in 2021, but China is rapidly closing the gap. The Chinese government recently announced a massive $52 billion investment in research and development for 2024 — a 10% surge over the previous year, while the U.S. cut total investment in research and development for fiscal 2024 by 2.7%.

China has significantly increased its R&D investment, contributing over 24 percent of total global funding, according to data from the Congressional Research Service. While the U.S. total remains strong, CRS data show that its share of total global expenditure dropped to just under 31 percent in 2020, down from nearly 40 percent in 2000.
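For readers who want to see the size of that shift, the share figures quoted above can be turned into a quick back-of-the-envelope calculation. The inputs are my rounding of the text ("nearly 40 percent" and "just under 31 percent"), not exact CRS values, so treat the result as approximate.

```python
# Rounded approximations of the shares quoted in the text, not exact CRS data.
US_SHARE_2000 = 40.0  # percent of global R&D spending ("nearly 40 percent")
US_SHARE_2020 = 31.0  # percent of global R&D spending ("just under 31 percent")


def share_change(start, end):
    """Return (change in percentage points, relative change in percent)."""
    points = end - start
    relative = 100.0 * points / start
    return points, relative


points, relative = share_change(US_SHARE_2000, US_SHARE_2020)
print(f"{points:+.1f} percentage points, {relative:+.1f}% relative change")
```

Note that a falling share of the global total is not the same as falling spending: the US total can keep growing while other countries grow faster.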

Realigning with National Priorities

AI, pandemic preparedness, cybersecurity, and advanced manufacturing require coordinated, interdisciplinary approaches that don’t always fit neatly into existing departmental structures or funding categories. Starting from zero would allow us to design funding mechanisms that better align with strategic priorities while preserving fundamental research.

A Practical Framework

Implementing zero-based budgeting for the STEM enterprise could be approached systematically:

Phase 1: Comprehensive Mapping. Begin by mapping the current research ecosystem—funding flows, personnel, infrastructure, outputs, and outcomes. This alone would be valuable, as we currently lack a complete picture of resource allocation.

Phase 2: Goal Setting. Involve stakeholders in defining desired outcomes. What should American STEM research accomplish in the next 10-20 years? How do we balance basic research with applied research?

Phase 3: Pilot Implementation. Rather than overhauling everything at once, implement zero-based approaches in specific domains or regions to identify what works while minimizing disruption.

Potential Benefits and Risks

A thoughtful application could yield improved efficiency by eliminating redundant processes, better alignment with national priorities, enhanced collaboration across institutional silos, and increased agility to respond to emerging threats.

However, any major reform involves significant risks. There’s danger of disrupting productive research programs, alienating talented researchers, or creating unintended bureaucratic complications. The political and logistical challenges would be immense.

Moreover, China has now surpassed the US in “STEM talent production, research publications, patents, and knowledge- and technology-intensive manufacturing,” suggesting that while spending matters, other factors are equally important.

Preserving What Works

Zero-based budgeting shouldn’t mean discarding what has made American research successful. The peer review system has generally identified quality research. The tradition of investigator-initiated research has fostered creativity and serendipitous discoveries. The partnership between universities, government, and industry has created a dynamic innovation ecosystem.

The goal isn’t elimination but examination of whether these elements are being implemented most effectively.

Conclusion

The idea of applying zero-based budgeting to American STEM research deserves serious consideration. By questioning assumptions, eliminating inefficiencies, and realigning priorities, we can create a research enterprise better positioned to tackle 21st-century challenges.

The process itself, a careful examination of how we conduct and fund research, could be as valuable as specific reforms. In an era when China is projected, based on current enrollment patterns, to produce more than 77,000 STEM PhD graduates per year by 2025, compared to approximately 40,000 in the United States (nearly double the US output), the ability to thoughtfully reimagine our institutions may be our greatest asset.

The question isn’t whether we can afford to undertake such a comprehensive review. The question is whether we can afford not to.

Reproducibility redux…

The crisis of reproducibility in academic research is a troubling trend that deserves more scrutiny. I’ve blogged and written about this before, but as 2024 begins, it’s worth returning to the issue. Anecdotally, I’ve noticed that most of my scientist colleagues have experienced the inability to reproduce published results on at least one occasion. For a good review of the actual numbers, see here. Why are the findings from prestigious universities and journals seemingly so unreliable?

There are likely multiple drivers behind the reproducibility meme. Scientists face immense pressure to publish groundbreaking positive results. Null findings and replication studies are less likely to be accepted by high-impact journals. This incentivizes scientists to follow flashier leads before they are thoroughly vetted. Researchers must also chase funding, which increasingly goes to bold proposals touting novel discoveries over incremental confirmations. The high competition induces questionable practices to get an edge.

The institutional incentives seem to actively select against rigor and verification. But individual biases also contaminate research integrity. Remembering back to my postdoctoral experiences at NIH, it was clear even then that scientists get emotionally invested in their hypotheses and may unconsciously gloss over contrary signals. Or they may succumb to confirmation bias, doing experiments in ways that stack the deck to validate their assumptions. This risk seems to increase as the prominence of the researcher increases. It’s no surprise that findings thus tainted turn out to be statistical flukes unlikely to withstand outside scrutiny.

More transparency, data sharing, and independent audits of published research could quantify the scale of irreproducibility across disciplines. Meanwhile, funders and academics should alter incentives to emphasize rigor as much as innovation. Replication studies verifying high-profile results deserve higher status and support. Journals can demand stricter methodological reporting to catch questionable practices before publication. Until the institutions that produce, fund, publish and consume academic research value rigor and replication as much as novelty, the problem may persist. There seem to be deeper sociological and institutional drivers at play than any single solution can address overnight. But facing the depth of the reproducibility crisis is the first step.

Happy New Year!