Cell identity, QC, Reproducibility

Your Cells Passed QC. That May Be the Problem.

Standard QC can confirm what cells look like, but not what they are doing. That gap gets expensive downstream.

May 2, 2026 | 7 min read | CapyBio Team
[Illustration: cell populations and gene-regulatory programs.]

Have you ever worked with cells that didn’t behave as expected, even after they passed the usual checks?

  • The differentiation worked
  • The panel looked right

Then the next experiment failed. Or worse, the data looked convincing but led to the wrong decision.

This gap, which standard quality checks often miss, can be the reason a project fails. It is not a theoretical issue but a real flaw in how we validate cell identity, and its effects on later experiments are rarely traced back to the source.

A positive result for specific markers does not guarantee true cell identity.

In every cell engineering process, there’s a moment when everything seems to make sense. The differentiation protocol is done, the marker panel looks good, and the cell population appears correct. It feels natural to move forward.

But this confidence often comes from limited information. Markers and transcription factor panels indicate which genes are expressed, but not whether the broader regulatory program is coordinated, stable, and appropriate for the experiment[1]. That coordination happens in the gene regulatory network (GRN), which controls cell behavior. The GRN is mostly hidden from standard checks.

This matters because cell identity isn’t just an on/off switch. It’s a regulatory program that needs to be broken down, rebuilt, and stabilized over time. As cells change identity, they go through a range of transcriptional states, and this process is often incomplete.

So, even if a few canonical markers look right, the regulatory state might still be different. The cells do not always look wrong; sometimes they look good enough to move forward. This gap is real, measurable, and important.

Cells with unresolved identity can be present in populations that seem successful.

Research from CapyBio’s scientific founders looked at 56 independent cell engineering studies and found that reprogrammed and differentiated cells often kept molecular signatures from their original identity, even when standard tests said conversion was successful[1].

Building on this, CapyBio’s team created a single-cell identity-scoring system that measures cell identity as a spectrum, not just a fixed label. With this approach, they saw that many “differentiated” cells are not fully one type or another but instead are in hybrid states, running transcriptional programs from more than one identity at the same time[2].
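To make "identity as a spectrum" concrete, here is a minimal sketch of one way to score a single cell against several reference identities. It is not the published Capybara algorithm[2]; it is a simplified non-negative least-squares fit written for illustration, and the reference profiles, gene values, function names, and the 0.2 hybrid threshold are all assumptions.

```python
from typing import Dict, List, Tuple


def identity_scores(cell: List[float],
                    references: Dict[str, List[float]],
                    iterations: int = 2000) -> Dict[str, float]:
    """Fit cell ~ sum_k w[k] * references[k] with w >= 0, then scale w to sum to 1."""
    names = list(references)
    profiles = [references[name] for name in names]
    # 1 / trace(R^T R) is at most 1 / (largest eigenvalue of R^T R),
    # which keeps projected gradient descent on the squared error stable.
    step = 1.0 / sum(x * x for profile in profiles for x in profile)
    weights = [0.0] * len(names)
    for _ in range(iterations):
        fitted = [sum(w * profile[g] for w, profile in zip(weights, profiles))
                  for g in range(len(cell))]
        residual = [f - c for f, c in zip(fitted, cell)]
        gradient = [sum(r * x for r, x in zip(residual, profile))
                    for profile in profiles]
        # Gradient step, then clip negatives to zero (projection onto w >= 0).
        weights = [max(0.0, w - step * g) for w, g in zip(weights, gradient)]
    total = sum(weights) or 1.0  # avoid dividing by zero if nothing fits
    return {name: w / total for name, w in zip(names, weights)}


def call_state(scores: Dict[str, float],
               hybrid_threshold: float = 0.2) -> Tuple[str, List[str]]:
    """Label a cell 'hybrid' when its second-strongest identity is still substantial."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    if len(ranked) > 1 and scores[ranked[1]] >= hybrid_threshold:
        return "hybrid", ranked[:2]
    return "discrete", ranked[:1]


# Hypothetical reference profiles over six genes (made-up values).
references = {
    "fibroblast": [9, 8, 1, 0, 0, 1],
    "hepatocyte": [0, 1, 9, 8, 0, 1],
    "neuron":     [1, 0, 0, 1, 9, 8],
}
# A made-up cell built as 70% hepatocyte program plus 30% fibroblast program.
cell = [0.7 * h + 0.3 * f
        for h, f in zip(references["hepatocyte"], references["fibroblast"])]
scores = identity_scores(cell, references)
print(call_state(scores))
```

In this toy case the cell is labeled hybrid, with hepatocyte as the dominant identity and fibroblast as the second. A marker-style check that only asks "is the hepatocyte program on?" would report a pass for the same cell, which is the gap the continuous score is meant to expose.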

These cells are not technical noise; they are real biological states, and they can pass standard quality control unnoticed. The population that advances to the next step in the pipeline may therefore look uniform while being functionally mixed.

Even a small identity error can become a drug screening error.

This is where the problem becomes expensive, not just theoretical. In drug screening, unresolved cell identity does more than add random noise; it can bias results in systematic ways[3].

For example:

  • A compound that seems inactive may be targeting a pathway that was never properly set up in the model.
  • A compound that looks promising may be acting on leftover programs from the original cell type.
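The two cases above can be illustrated with simple mixture arithmetic. The sketch below assumes a well-level assay that reports only the population average, a population made of just two states (resolved and unresolved), and a hypothetical hit threshold of 0.5; all response values are invented for illustration.

```python
def apparent_response(fraction_unresolved: float,
                      response_resolved: float,
                      response_unresolved: float) -> float:
    """Population-average readout for a two-state mixture of cells."""
    if not 0.0 <= fraction_unresolved <= 1.0:
        raise ValueError("fraction_unresolved must be between 0 and 1")
    return ((1.0 - fraction_unresolved) * response_resolved
            + fraction_unresolved * response_unresolved)


HIT_THRESHOLD = 0.5  # hypothetical cutoff for calling a compound active

# Case 1: the compound works in resolved cells (0.8), but 40% of the
# population never set up the target pathway and does not respond (0.0).
false_negative = apparent_response(0.4, 0.8, 0.0)

# Case 2: the compound barely affects resolved cells (0.1) but acts
# strongly on leftover programs in the unresolved 60% (0.9).
false_positive = apparent_response(0.6, 0.1, 0.9)

print(false_negative < HIT_THRESHOLD)   # active compound is scored inactive
print(false_positive >= HIT_THRESHOLD)  # off-identity effect is scored as a hit
```

Because the shift depends on the fraction of unresolved cells rather than on measurement error, repeating the assay on the same population reproduces the same wrong answer. That is what makes the bias systematic rather than random.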

Sometimes, a reproducibility issue is not caused by the assay but by the biological state of the cells[4]. In other words, the assay might work, the readout might be accurate, but the model itself could be wrong.

What makes this even harder is that these errors are rarely obvious when you first see your data. The screen gives you numbers; you analyze them and then decide which results to follow up on or which compounds to drop. All of this relies on an assumption about cell identity made weeks earlier. By the time you notice something isn’t reproducible, the original cell identity problem is often forgotten.

The dangerous part is that none of this necessarily announces itself as a cell identity problem. It may show up as:

  • Noisy data
  • Weak signal
  • Inconsistent potency
  • Failed validation
  • Confusing biology

Repeated cell passage can cause identity drift, even when standard markers remain unchanged.

Passage number brings another challenge. Even if early differentiation looks consistent, cells can start to change before most checkpoints would notice.

CapyBio’s research tracked individual cells during reprogramming and found that cells likely to fail conversion were already different at the transcriptional level by day 3, even though they looked the same as successful cells by standard tests[5]. The problem isn’t that the cells look wrong; it’s that they keep looking right after they have already diverged.

Serial passage adds more complexity. In iPSC-derived sensory neurons, pluripotency markers stayed statistically similar across low, intermediate, and high passage numbers, but differentiation outcomes and functional properties changed substantially[6].

While standard QC metrics stayed the same, the properties that actually affect experimental results did not. The cells may look stable, but their underlying biology has already changed.

This matters because iPSC-derived sensory neurons are used in neurotoxicity and peripheral neuropathy screening. Cantor et al.[6] did not directly test drug response across passage numbers, but they showed that standard pluripotency QC can remain unchanged while sodium channel function and neuronal identity shift. These are exactly the kinds of properties that can influence how a drug assay is interpreted.

The most challenging identity issues are not always the result of obvious authentication failures.

The field is starting to tackle the most obvious version of this problem: clear cases of cell line misidentification. The International Cell Line Authentication Committee tracks hundreds of contaminated or misidentified cell lines, and studies show that many human cell lines used in research are misidentified[7].

But these are the issues we can see and name. The harder problem is more subtle: cells that seem to be the right type but are still unresolved at the transcriptional level.

These cells pass authentication, aren’t flagged in peer review, and may pass standard QC. However, they still retain traces of their original identity or harbor mixed regulatory programs, which can later manifest as failed replication, inconsistent screening, or unexplained variability.

There isn’t yet a standard system to measure this problem, but evidence of its existence and importance has been growing for years.

Quality control should assess the regulatory programs active within the cell.

Single-cell transcriptional profiling changes the whole approach. There are three different questions a QC step can answer, and most pipelines stop at the first two:

  • Authentication asks whether the cell line is what it claims to be.
  • Marker panels ask whether expected genes or proteins are present.
  • Cell-state analysis asks whether the cell is running the gene programs needed to recapitulate in vivo biology in a dish.
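One way to keep the three questions separate in a pipeline is to record each as its own field, so that "not checked" is never confused with "passed". The sketch below is a hypothetical record structure; the class, field, and function names are assumptions, not part of any existing tool.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class QCRecord:
    """QC status for one batch of cells. None means the check was never run."""
    authentication: Optional[bool] = None  # Is the line what it claims to be?
    marker_panel: Optional[bool] = None    # Are the expected genes or proteins present?
    cell_state: Optional[bool] = None      # Are the expected gene programs active?


def unanswered(record: QCRecord) -> List[str]:
    """Names of the QC questions that have no result yet."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


def cleared_for_decision(record: QCRecord) -> bool:
    """True only when all three questions were asked and all three passed."""
    return all(getattr(record, f.name) is True for f in fields(record))


batch = QCRecord(authentication=True, marker_panel=True)
print(unanswered(batch))            # ['cell_state']
print(cleared_for_decision(batch))  # False
```

A batch that passed authentication and the marker panel still shows up as having an open question, which is the situation the list above describes.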

This shift in benchmarking perspective matters most when cells move from production to decision-critical experiments, because after that point identity is rarely rechecked.

The question to answer is not just what the markers show or what the protocol aimed for, but what the cells are doing at the regulatory level on the day they’re used in the experiment. Right now, most pipelines don’t have a clear answer to this.

For teams building cell models, organoids, engineered cells, or differentiation protocols, the question is not just whether the cells passed QC; it is what kind of QC they passed. Before the next screen, validation study, or protocol decision, know what your QC is capturing and what it is missing.

References

  1. Cahan P, Li H, Morris SA, Lummertz da Rocha E, Daley GQ, Collins JJ. CellNet: Network Biology Applied to Stem Cell Engineering. Cell. 2014;158(4):903–915. doi:10.1016/j.cell.2014.07.020

  2. Kong W, Fu YC, Holloway EM, et al. Capybara: A computational tool to measure cell identity and fate transitions. Cell Stem Cell. 2022;29(4):635–649.e11. doi:10.1016/j.stem.2022.03.001

  3. Eckers JC, Swick AD, Kimple RJ. Identity Crisis - rigor and reproducibility in human cell lines. Radiat Res. 2018;189(6):551–552. doi:10.1667/RR15086.1

  4. Six factors affecting reproducibility in life science research and how to handle them. Nature Index. Accessed May 1, 2026.

  5. Jindal K, Adil MT, Yamaguchi N, et al. Single-cell lineage capture across genomic modalities with CellTag-multi reveals fate-specific gene regulatory changes. Nat Biotechnol. 2024;42(6):946–959. doi:10.1038/s41587-023-01931-4

  6. Cantor EL, Shen F, Jiang G, et al. Passage number affects differentiation of sensory neurons from human induced pluripotent stem cells. Sci Rep. 2022;12:15869. doi:10.1038/s41598-022-19018-6

  7. Weiskirchen R. Misidentified cell lines: failures of peer review, varying journal responses to misidentification inquiries, and strategies for safeguarding biomedical research. Res Integr Peer Rev. 2025;10:12. doi:10.1186/s41073-025-00170-2

Want this kind of QC on your cells?

Capybara™ measures cell identity at the regulatory level, not just at the marker panel. Talk to us about benchmarking your model before the next screen.