Peer review versus quantitative metrics in ranking research quality
Both approaches have strengths and blind spots. Combining them thoughtfully produces better assessments than either alone.
The two traditions of research assessment
Research assessment in university rankings draws on two distinct traditions. The first is peer review, in which experts in a field evaluate the quality and significance of research outputs. This is the model used by national research assessment exercises, such as the Research Excellence Framework in the United Kingdom, and it underlies the reputation surveys that many rankings use. The second tradition is quantitative metrics, primarily bibliometric indicators such as citation counts, journal impact factors, and h-indices. Most rankings use some mix of both.
These two approaches have fundamentally different strengths and weaknesses. Peer review can capture dimensions of quality that numbers cannot: originality, rigor, significance, and contribution to the field. It can assess outputs in their proper context, recognizing that a paper that advances a niche subfield may be more important than one that attracts broad but superficial attention. However, peer review is expensive, slow, subject to reviewer bias, and difficult to scale to thousands of institutions. It works best at the level of individual researchers or small departments, not entire universities.
The seductive precision of metrics
Quantitative metrics offer speed, scalability, and the appearance of objectivity. They can be calculated automatically from large databases, compared across institutions and countries, and tracked over time. This makes them attractive to ranking publishers who need to process data on thousands of institutions. The risk is that the convenience of metrics leads to over-reliance. A citation count is a proxy for research impact, not a direct measure of it. A paper may attract many citations because it is controversial, because it contains a useful method, or because it is the default reference that everyone cites without reading.
Moreover, the availability of metrics shapes what gets measured and valued. Because citations are easy to count, they become central to research assessment. Because creative works, policy impacts, and teaching materials are hard to count, they are marginalized. Over time, this creates an incentive structure that pushes researchers toward producing citable journal articles rather than other valuable outputs. The metric becomes the target, and the target distorts the behavior it is supposed to measure—a phenomenon known as Goodhart's law.
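To make the "proxy" point concrete, here is a minimal sketch in Python of two common bibliometric indicators: a total citation count and the h-index (the largest number h such that h papers have at least h citations each). The two researchers and their citation counts are invented for illustration and do not come from any real database.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citations per paper (illustrative numbers only).
researcher_a = [90, 4, 3, 2, 1]      # one heavily cited paper
researcher_b = [20, 20, 20, 20, 20]  # steady attention across all papers

for name, papers in [("A", researcher_a), ("B", researcher_b)]:
    print(name, "total citations:", sum(papers), "h-index:", h_index(papers))
```

Both hypothetical researchers have 100 citations in total, yet their h-indices are 3 and 5. Neither number says why the papers were cited or whether the work is original or rigorous, which is the kind of judgment the indicators leave out.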
Combining peer review and metrics responsibly
The most thoughtful approaches to research assessment combine peer review and quantitative metrics in a structured way. In the United Kingdom's Research Excellence Framework, expert panels review a selection of research outputs submitted by each department, and some panels use citation data as supplementary information rather than as a primary determinant. This approach preserves the contextual judgment of peer review while using metrics to inform rather than replace that judgment.
Some ranking publishers are moving toward similar hybrid models. Rather than relying solely on citation counts, they are experimenting with indicators that capture the diversity of research outputs, the proportion of research that involves international collaboration, and the influence of research beyond academia. These approaches are promising but remain at an early stage. For now, university rankings remain heavily reliant on metrics that are easy to compute rather than those that are most meaningful.
What this means for ranking users
The practical implication for ranking users is that research quality is inherently difficult to measure and every ranking indicator is an imperfect approximation. A high research score does not guarantee that a university is producing important, rigorous, or original work; it indicates that the institution performs well on the particular proxy measures the ranking has chosen. To get a fuller picture, look beyond the composite research score to the underlying indicators. Check whether the ranking relies primarily on surveys, citations, or a mix. If possible, find subject-level assessments from professional bodies or government agencies that use more thorough methods.
Also consider the type of research environment you want. A university that scores highly on citation metrics but relies heavily on a small number of superstar researchers may offer fewer opportunities for graduate students to engage in research than a university with a broader base of active researchers. Metrics can guide you toward institutions with strong research profiles, but only direct investigation—reading faculty profiles, looking at recent publications, speaking to current students—can tell you whether the research environment is one in which you will thrive.
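The "superstar" effect can be illustrated with a small Python sketch. It assumes you have, or can roughly estimate, citation totals per researcher in a department. The function name top_share, the choice of the top two researchers, and all the numbers are hypothetical and chosen only for illustration.

```python
def top_share(citations_per_researcher, top_n=2):
    """Fraction of a department's citations earned by its top_n researchers."""
    total = sum(citations_per_researcher)
    if total == 0:
        return 0.0
    top = sorted(citations_per_researcher, reverse=True)[:top_n]
    return sum(top) / total


# Hypothetical departments with the same total of 1000 citations each.
dept_x = [450, 350, 50, 50, 40, 30, 20, 10]          # two superstars
dept_y = [150, 140, 130, 130, 120, 120, 110, 100]    # broad base

for name, dept in [("X", dept_x), ("Y", dept_y)]:
    print(name, "total:", sum(dept), "share held by top 2:", round(top_share(dept), 2))
```

In this made-up example the two departments would look identical on a total-citation indicator, but the top two researchers account for 80 percent of the citations in department X and 29 percent in department Y. A ranking score alone would not reveal that difference.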
For students selecting a university, the distinction between peer review and metrics carries practical implications. If you are looking for a strong research environment in a specific discipline, look for evidence that the department's work is respected by peers—through invitations to present at conferences, editorial board memberships, and collaborative projects with leading scholars. These signals of peer esteem are often more telling than citation counts, which can be inflated by the dynamics described above. Use rankings as a starting filter, but evaluate research quality through the judgment of the scholarly community itself.