We are having a new downstairs bathroom fitted and, with the concrete floor freshly screeded, I decided to inscribe into it the equation of the R statistic from Stekel, Git and Falciani 2000. This is the second time I have inscribed an equation somewhere unusual: the first was the T-cell recirculation equation from Stekel 1997, written in the snow at Annapurna Base Camp (more on that story later). The R statistic equation will soon be covered by some nice new cushioned floor, and will remain covered for many years.
When the bathroom fitter saw it he was somewhat bemused: he was more used to seeing people’s names in such situations! Interestingly, he asked, “Is this the cure for cancer?” I said, well, not exactly, but it can be used to compare cancer cells with healthy cells to find genes that could help with a cure. He and his colleague looked suitably impressed! Actually, I wrote this equation because I now recognize that it is, so far, my most impactful piece of original research. According to Google Scholar, that paper has now been cited 247 times; only my book has been cited more often. Funnily enough, although the original application was in cancer, the most prominent citations are mostly in plant and crop science, where the use of EST libraries has been particularly valuable.
Of course, EST libraries are now rarely used, and next-generation sequencing (NGS) has taken their place. The R statistic isn’t really appropriate for comparing gene expression with NGS: it is derived on the assumption that sequence counts follow a Poisson distribution, which does not hold for NGS, where the counts are better described by negative binomial distributions. Estimating good parameters for negative binomials is tougher, and several groups have written good software for comparing gene expression with NGS, including DESeq and edgeR.
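To make the Poisson versus negative binomial contrast concrete, here is a minimal sketch in Python. It only compares the variance each model implies for a count with a given mean, using the usual mean-dispersion form of the negative binomial (variance = mean + dispersion × mean²). The function names and the dispersion value of 0.25 are illustrative assumptions of mine, not anything taken from the paper, DESeq or edgeR.

```python
def poisson_variance(mean: float) -> float:
    """Under a Poisson model the variance of a count equals its mean."""
    return mean


def negative_binomial_variance(mean: float, dispersion: float) -> float:
    """Negative binomial variance in the mean-dispersion form.

    With dispersion = 0 this reduces to the Poisson variance.
    """
    return mean + dispersion * mean ** 2


def variance_ratio(mean: float, dispersion: float) -> float:
    """How many times larger the negative binomial variance is than the Poisson one."""
    return negative_binomial_variance(mean, dispersion) / poisson_variance(mean)


if __name__ == "__main__":
    dispersion = 0.25  # hypothetical value, chosen only for illustration
    for mean in (10.0, 100.0, 1000.0):
        ratio = variance_ratio(mean, dispersion)
        print(f"mean={mean:>7.1f}  NB/Poisson variance ratio={ratio:.1f}")
```

The ratio is 1 + dispersion × mean, so the gap between the two models widens as counts grow. For highly expressed genes in particular, a test that assumes Poisson variance would treat ordinary sample-to-sample scatter as if it were a real difference.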
Nonetheless, the R statistic remains popular. The paper was not perfect (I most regret not being aware of Benjamini and Hochberg’s false discovery rate at the time), but I still think that, where the Poisson assumption is appropriate, the use of this statistic remains the best possible approach. So I am proud of my contribution and wrote the equation into the concrete.
Now to the Annapurna story. In April 2001 I went trekking in Nepal, reaching Annapurna Base Camp on a fairly classic walk. Feeling frustrated about my contributions to science while working in the pharmaceutical / biotechnology sector, I inscribed into the snow the equation from my PhD that described the interaction between recirculating T-cells and dendritic cells. At the time this was the work I was most proud of. Of course, as I have written previously, that mechanism has since been proven wrong. What is even more ironic is that I had already published the R statistic, although it had not yet received many citations. Moreover, the work for the R statistic was carried out while Francesco Falciani and I were working at Glaxo, and Yoav Git was working for an investment bank! Such is hindsight.