Methodology

How the rankings are built

This page documents how the Top 100 list is constructed, what's in the data, and what's deliberately out. The Riemann-hypothesis ranking uses the same three-source composite design (arXiv preprint output, OpenAlex topical citations, and zbMATH MSC classifications) as the sister Goldbach site. The Riemann hypothesis is a heavily arXiv-based field, so the current ranking is built mainly from the arXiv signal, with OpenAlex contributing where it overlaps; the zbMATH MSC layer is still being integrated.

Data sources

Source	What it gives	Limitations
arXiv (math.NT)	Preprint-level: titles, abstracts, authors, dates, co-author graph	Biased toward people who post preprints. Senior figures who publish only in journals are undercounted.
OpenAlex	Author-level: paper count, citations, affiliations, country	Concept tagging is noisy in math; surname-only matching can misidentify. The phrase `random matrix` pulls in wireless and signal-processing engineers, so it is excluded from the OpenAlex queries.
zbMATH Open	Curated math review database; canonical author codes; editor-assigned MSC classification (we use the two Riemann-hypothesis-core classes)	The REST API is gated behind a one-time Terms-of-Use acceptance, and this layer is still being folded into the ranking. (Its coverage of older non-Western mathematicians is the best of the three sources.)

Pipeline

Title-weighting. A paper can mention the Riemann hypothesis without being about it, for example a paper that cites it in its introduction as a famous open problem. To separate genuine work from passing mentions, the arXiv and OpenAlex pipelines weight a keyword match by where it appears: a match in the paper title counts at full weight, and a match only in the abstract counts at half (a factor of 0.5). zbMATH is not title-weighted, because its documents are classified by human editors, so the subject class itself is the relevance signal.
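The title-weighting rule can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual code: the function name `title_weight` is hypothetical, and case-insensitive substring matching is an assumption, since the page does not say how a keyword match is detected.

```python
from typing import Iterable


def title_weight(title: str, abstract: str, terms: Iterable[str]) -> float:
    """Return 1.0 for a title match, 0.5 for an abstract-only match, else 0.0.

    Assumption: a "match" is a case-insensitive substring hit.
    """
    title_l = title.lower()
    abstract_l = abstract.lower()
    lowered = [t.lower() for t in terms]

    if any(t in title_l for t in lowered):
        return 1.0  # keyword in the title: full weight
    if any(t in abstract_l for t in lowered):
        return 0.5  # keyword only in the abstract: half weight
    return 0.0  # no match: the paper does not count


terms = ["Riemann hypothesis", "critical line"]
w_title = title_weight("Zeros on the critical line", "We study zeros.", terms)
w_abstract = title_weight("A note on primes", "Assuming the Riemann hypothesis, ...", terms)
w_none = title_weight("A note on primes", "We count primes.", terms)
```

A paper that names the hypothesis in both title and abstract still counts once, at full weight, because the title check runs first.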

arXiv pull: 15 search terms (Riemann hypothesis, Riemann zeta function, Dirichlet L-functions, critical line, critical strip, pair correlation, zero-density, zeros of the zeta function, nontrivial zeros, Selberg class, Mertens function, moments of the zeta function, Lindelof, Beurling-Nyman, and random matrix) restricted to the math.NT category. Each paper's contribution to an author is title-weighted as above. A co-authorship graph is built and eigenvector centrality is the second factor in an arXiv composite of 0.60 * pr(weighted papers) + 0.40 * pr(eigen). Authors with at least 3 topical papers qualify.
OpenAlex pull: 14 phrase queries (the arXiv terms minus random matrix, which without a category guard floods the results with wireless and random-matrix-theory engineers), with an author cap of 10 per work to remove physics megapapers. Works and their citations are title-weighted as above. Composite: 0.60 * pr(weighted works) + 0.40 * pr(weighted citations). Result: 137 qualifying authors. Because the Riemann hypothesis is so strongly an arXiv-preprint field, OpenAlex overlaps with only a handful of the top-ranked researchers directly; the others are carried by their arXiv signal.
zbMATH pull: documents tagged with either of the two Riemann-hypothesis-core MSC classes, 11M26 (zeros of zeta and L-functions and the Riemann hypothesis) or 11M50 (relations of the zeta function with random matrices and physics). The editor-assigned MSC classes correct a systematic gap in the other sources, which undercount pre-1995 number theorists and specialists who publish in journals and have little arXiv presence. This pull is in progress and is being folded into the merged ranking.
Merge and scoring: the rankings are surname-deduplicated and joined. The available ranks are combined with a weighted order statistic: each researcher's ranks are sorted and weighted 0.70 on the best, 0.20 on the middle, and 0.10 on the worst. Sorting before weighting means the method rewards excellence in any one Riemann-hypothesis pipeline (a researcher who is top in zbMATH but absent from arXiv would still score well), while a researcher strong across all of them still finishes ahead. Lower combined score ranks higher. An earlier design simply summed the ranks, which punished anyone outstanding in one source but weak in another; the weighted order statistic fixes that.
Estimating a missing rank (interpolation): a researcher ranked by only one of the pipelines is not given a flat penalty. To estimate a missing rank, we order the whole pool by a pipeline the researcher does appear in, then walk outward to the two nearest researchers above and the two nearest below who carry a real rank in the missing pipeline, and average those (up to four) values. One rule protects the scoring: the 0.70 top weight may only land on a measured rank, so an estimate can support a researcher's score but can never be their headline signal. Estimated ranks show in [square brackets] on the Top 100 table; measured ranks show plain.
Recency weighting: each paper is weighted by how recent it is, so currently active researchers are favoured over those long inactive, with no hard cutoff. Relative to the current year, a paper from the last five years counts at full weight, one six to ten years old counts at 0.6, one eleven to twenty years old counts at 0.3, and anything older still counts, at 0.1. The weight is applied to each source signal before ranks are formed: the arXiv paper count, the OpenAlex work and citation counts, and the zbMATH document count. Older work is never discarded; it simply contributes less. A paper still counts toward whether a researcher qualifies for a pipeline; recency only changes the order. The arXiv and OpenAlex layers carry a year for every paper, so recency applies in full there. The zbMATH layer does not yet record a per-document year, so its documents are all weighted equally until that field is added.
Hand-curated edits: an exclusions file removes researchers the automated pipeline surfaced in error (see Audit decisions). The merge does not hand-place any researcher; everyone earns their rank from the pipeline scores.
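The recency, interpolation, and merge steps above can be sketched as follows. This is a hedged illustration under stated assumptions, not the production pipeline. All function names and the data layout (a dict mapping each researcher to their measured ranks per pipeline) are hypothetical. The sketch also assumes one particular reading of the rule that the 0.70 weight may only land on a measured rank: the best measured rank takes 0.70, and the two remaining ranks, measured or estimated, are sorted and take 0.20 and 0.10.

```python
from typing import Dict, List, Optional, Tuple

# Recency bands from the text: (maximum age in years, weight).
RECENCY_BANDS = [(5, 1.0), (10, 0.6), (20, 0.3)]
OLDEST_WEIGHT = 0.1  # anything older than twenty years still counts


def recency_weight(paper_year: int, current_year: int) -> float:
    """Weight a paper by its age relative to the current year."""
    age = current_year - paper_year
    for max_age, weight in RECENCY_BANDS:
        if age <= max_age:
            return weight
    return OLDEST_WEIGHT


def estimate_missing_rank(
    pool: Dict[str, Dict[str, float]], name: str, known: str, missing: str
) -> Optional[float]:
    """Average the measured `missing` ranks of up to four neighbours.

    The pool is ordered by the `known` pipeline; we walk outward from
    `name` and take the two nearest researchers above and the two nearest
    below who have a measured rank in the `missing` pipeline.
    Raises ValueError if `name` has no rank in the `known` pipeline.
    """
    ordered = sorted(
        (n for n in pool if known in pool[n]), key=lambda n: pool[n][known]
    )
    i = ordered.index(name)
    above = [pool[n][missing] for n in reversed(ordered[:i]) if missing in pool[n]][:2]
    below = [pool[n][missing] for n in ordered[i + 1:] if missing in pool[n]][:2]
    neighbours = above + below
    if not neighbours:
        return None  # nobody nearby carries a measured rank
    return sum(neighbours) / len(neighbours)


def combined_score(ranks: List[Tuple[float, bool]]) -> float:
    """Weighted order statistic over three (rank, is_measured) pairs.

    Assumption: the best measured rank gets 0.70; the other two ranks are
    sorted and get 0.20 and 0.10. Lower score ranks higher.
    """
    measured = [r for r, is_measured in ranks if is_measured]
    if len(ranks) != 3 or not measured:
        raise ValueError("need three ranks, at least one of them measured")
    best = min(measured)
    rest = sorted(r for r, _ in ranks)
    rest.remove(best)  # drop one copy of the headline rank
    return 0.70 * best + 0.20 * rest[0] + 0.10 * rest[1]


# Hypothetical pool: researcher "C" has no measured OpenAlex rank.
pool = {
    "A": {"arxiv": 1, "openalex": 4},
    "B": {"arxiv": 2, "openalex": 2},
    "C": {"arxiv": 3},
    "D": {"arxiv": 4, "openalex": 6},
    "E": {"arxiv": 5, "openalex": 8},
    "F": {"arxiv": 6, "openalex": 1},
}
estimate = estimate_missing_rank(pool, "C", known="arxiv", missing="openalex")
```

In this toy pool the estimate for "C" averages the OpenAlex ranks of B and A (above) and D and E (below); F is ignored because two measured neighbours below have already been found. Note that `combined_score` keeps an estimate out of the 0.70 slot even when the estimate is numerically better than every measured rank.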

Audit decisions

Excluded

A small number of authors surfaced by the automated pipeline are removed by hand. Some work in unrelated fields, for example signal processing, wireless communications, or coding theory, and were pulled in by surname collisions or by the noisy random matrix topic before it was dropped from the OpenAlex queries. A few others are self-published authors whose output is not part of mainstream research. The specific names are kept internal: listing them here would only give them visibility, which is the opposite of the point.

What's not in this list

Researchers without a strong digital footprint. The pipeline indexes arXiv well and OpenAlex moderately, so figures who publish mainly in journals are undercounted until the zbMATH MSC layer is fully integrated.
Subjective importance. A theorist whose entire body of Riemann-hypothesis work is one influential paper may rank lower than a productive researcher with many adjacent papers. We rank by output, not by depth.
Adjacent topics. The list covers the Riemann hypothesis and adjacent problems, so some of the 100 work mainly on related questions (pair correlation of zeros, moments of the zeta function, Dirichlet L-functions, the Selberg class, the random-matrix connection) rather than on the Riemann hypothesis directly. Title-weighting reduces, but does not eliminate, the appearance of researchers whose connection to the hypothesis itself is incidental.

Sources of error

Surname matching is fragile. Mathematicians with common surnames may be conflated. The pipeline handles the worst cases but misses are possible.
Citation counts are influenced by adjacent fields. Riemann-hypothesis work overlaps with random matrix theory, analytic number theory, and mathematical physics, so a researcher whose work bridges those fields may rank higher purely because those fields cite more. The OpenAlex term random matrix in particular pulls in wireless and signal-processing engineers, so it is excluded from the OpenAlex queries.
Coverage is biased toward digital publishing. The Riemann hypothesis is a heavily arXiv-based field, which the pipeline indexes well, but some senior figures who publish mainly in journals are undercounted, and the zbMATH MSC layer that corrects for this is still being integrated.
The Top 100 is not a verdict. It is a starting point. Use it alongside MathSciNet, your advisor, and your own reading.

Acknowledgments

Data sources: arXiv, OpenAlex, zbMATH Open.