Why BetterScholar

Antonio Norelli
11 min read · May 7, 2024

BetterScholar aims to be a better alternative to Google Scholar to quickly get an overview of a scientist.

Here’s a typical Google Scholar profile page:

These pages are widely used and have a major impact on science today. We believe this impact is not as positive as it could be. Starting from a few principles, we built a new scholar profile page from scratch.

BetterScholar design principles.

  1. For the significant works of a researcher, ask the researcher.
  2. Breakthroughs are like earthquakes, they should be measured with a logarithmic scale.
  3. We value the brightness of a showcase, not the size of a warehouse.
  4. The co-authors of a paper play different roles, which should be transparent.
  5. For the research area of a researcher, look where they like to publish.

For the significant works of a researcher, ask the researcher.

In BetterScholar, papers are not forcibly ranked by citations or recency. Instead, the researchers themselves rank their papers and curate a limited selection. They can choose the works they are most proud of, which only sometimes coincide with their most cited.

We want to empower researchers, in contrast to modern academia, where even successful researchers often feel like gears in a mechanism that spins beyond their control. Funding constraints push them to write papers they themselves do not believe in, and the increasingly sloppy review process fails more and more often to recognize their best works.

We want researchers to take back control, starting from their public scholar profile. We want them to tell us what their best works are and which ideas they really believe in. In a researcher profile, we want to see the works that best represent them:

I am interested in this researcher; let’s go to their Google Scholar profile to find their works. What papers should I read?

Shall I go for the most cited? But there they appear as one of the many authors.

In the second most cited, they are also the first author, but this work is from 20 years ago…

If only the researcher could tell me!

In BetterScholar, they can. This not only liberates researchers, but also creates information of remarkable value to navigate the ocean full of insignificant papers published today: which papers we should pay attention to, in the opinion of their authors.

On BetterScholar profiles, the authors who decided to showcase a paper are immediately visible: they appear in bold in the list of co-authors. A PhD student can be proud of having a paper with their senior coauthors in bold: this means the senior decided to spend a spot of their limited showcase on their work!

Breakthroughs are like earthquakes, they should be measured with a logarithmic scale.

BetterScholar measures the impact of papers and researchers using the log2 of the citations received. This is the only metric shown in BetterScholar, which we call level. A paper reaches level 10 when it hits 2**10=1024 citations, level 11 when it doubles these citations: 2**11=2048, and so on.
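The definition can be illustrated with a short Python sketch. The function name is chosen for illustration only, and two details are assumptions rather than something stated here: the level is rounded down to an integer (consistent with a paper "reaching" level 10 at exactly 1024 citations), and a paper with no citations sits at level 0.

```python
import math


def level(citations: int) -> int:
    """Level of a paper: log2 of its citation count, rounded down (assumed)."""
    if citations < 1:
        # Assumption: log2 is undefined at 0, so uncited papers get level 0.
        return 0
    return math.floor(math.log2(citations))


# Each extra level requires doubling the citations.
for cites in (1, 100, 1024, 2047, 2048):
    print(cites, "citations -> level", level(cites))
```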

Scientific ideas resemble an earthquake more than the formation of a stalactite. They are disruptive, and we are interested in how much they change the landscape (whether they caused a few cracks or a city has to be rebuilt), not in every single drop that grows them. Indeed, the significance of citations is not linear: 1000 citations are extremely meaningful when comparing two papers with 6 and 1006 citations (an imperceptible tremor versus cracks in your ceiling), while they are essentially noise when the papers have 82k and 83k citations (the city has to be rebuilt either way). This is why we need a logarithmic scale.
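The two comparisons can be checked numerically. The Python snippet below, with illustrative names and with 82k and 83k taken as 82,000 and 83,000, counts how many doublings separate each pair of citation counts.

```python
import math


def doublings_apart(a: int, b: int) -> float:
    """Number of doublings separating two citation counts (difference of log2)."""
    return abs(math.log2(b) - math.log2(a))


# 6 vs 1006 citations: the same 1000-citation gap spans more than 7 doublings.
small_papers_gap = doublings_apart(6, 1006)

# 82,000 vs 83,000 citations: the gap is a tiny fraction of one doubling.
huge_papers_gap = doublings_apart(82_000, 83_000)

print(f"{small_papers_gap:.2f} doublings vs {huge_papers_gap:.3f} doublings")
```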

Using log2, all papers are squeezed in a range between 0 and ~20, where every step is significant: a difference of one level means the citations have doubled.

In fact, the level is far more readable and easier to parse than the raw number of citations currently shown by Google Scholar:

This metric starkly contrasts with the h-index, the competing easy number provided by Google Scholar, which is also the main driver of its popularization and widespread adoption.

It is rare to find a researcher who likes the h-index, and there is a reason: it is a misleading metric for scientific impact, with multiple flaws, as we will discuss in the next section.

We value the brightness of a showcase, not the size of a warehouse.

In BetterScholar, the number of papers appearing in a profile is limited. We call on researchers to curate a showcase of their best work, to consider what to include and leave out. The showcase size is intended to be tight for everyone and corresponds to the log2 of the total citations the researcher has received. Early career researchers can choose only a few, but even the most influential scientists do not reach twenty.
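As a rough sketch of how tight this budget is, the Python function below computes a showcase size from per-paper citation counts. The function name, the rounding down to an integer, and the size of 0 for a researcher with no citations are assumptions made for illustration.

```python
import math


def showcase_size(citations_per_paper: list[int]) -> int:
    """Number of showcase slots: log2 of total citations, rounded down (assumed)."""
    total = sum(citations_per_paper)
    if total < 1:
        return 0  # assumption: no citations means an empty showcase
    return math.floor(math.log2(total))


# An early career researcher with 100 citations overall gets 6 slots.
early_career = showcase_size([60, 30, 10])

# Even one million total citations yields fewer than twenty slots.
very_influential = showcase_size([1_000_000])
```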

This design choice follows the principle that in scientific works, what really matters is quality over quantity, and that a scientist is better represented by the impact of their best work rather than the paper weight of articles published under their name.

In contrast, Google Scholar follows the opposite principle, and its design choices reward the raw number of authored papers in different ways: by showing all the papers rather than a selection, and especially by adopting the h-index as the main metric, presenting it as the easy number to take home.

We believe that the h-index* is a flawed and harmful metric to assess scholars, for two reasons. First, it fails to measure how much the new knowledge created by a researcher is used. Second, it is easy to manipulate, and optimizing for it produces terrible science.

The h-index is blind to the most impactful works. It sees them as an average paper collecting a few dozen citations. The h-index sees a seminal paper with 10,000 citations that ignited a new subfield as just a standard paper with a modest number of citations.

What is worse is that the h-index is easy to manipulate –by crafting many papers that cite each other– and optimizing for it not only produces insignificant works that increase the noise in the system and hide the gems, but it also pushes bright minds away from working on what they find interesting, which should be the main drive of a scientist.

In BetterScholar, we replace the simple number provided by the h-index with the log2 of the total citations of the works a researcher has decided to showcase, which we call the overall level. The color of this number reflects the age of the featured papers.

This metric is sensitive to the full impact of a researcher’s best works, while it is blind to the raw number of authored papers. Optimizing for this number is not easy: to increase it, a researcher must produce a few works that garner as many citations as their current best ones.
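A small Python sketch makes this behavior concrete. The function name and the citation counts are invented for illustration, and the value is left unrounded, which is an assumption.

```python
import math


def overall_level(showcased_citations: list[int]) -> float:
    """log2 of the total citations of the showcased works (unrounded, assumed)."""
    total = sum(showcased_citations)
    return math.log2(total) if total > 0 else 0.0


showcase = [4096, 2048, 2048]            # 8192 citations in total
base = overall_level(showcase)           # exactly 13.0

# Adding a modestly cited paper barely moves the number...
with_minor = overall_level(showcase + [100])

# ...while gaining one full level requires doubling the showcased citations.
with_major = overall_level(showcase + [8192])
```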

*A scholar has an h-index of x if they have at least x papers with at least x citations each. For example, a researcher who authored 5 papers with 10,000 citations each has an h-index of 5, while a researcher who authored 10 papers with 10 citations each has an h-index of 10.
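The two examples in this footnote can be reproduced with a few lines of Python implementing the standard h-index definition (the largest x such that x papers have at least x citations each); the function name is illustrative.

```python
def h_index(citations: list[int]) -> int:
    """Largest x such that at least x papers have at least x citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # the top `rank` papers all have at least `rank` citations
        else:
            break      # later papers have even fewer citations
    return h


five_seminal_papers = h_index([10_000] * 5)   # 50,000 citations in total
ten_minor_papers = h_index([10] * 10)         # 100 citations in total
```

The second researcher ends up with twice the h-index of the first despite having 500 times fewer citations.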

The co-authors of a paper play different roles, which should be transparent.

A BetterScholar profile shows the contribution of the author in each of their research papers; this appears as an emoji in the column labeled “Role.”

There are four possible roles:

  • 🦁 Project lead
  • 🐬 Key contributor
  • 🐜 Supportive contributor
  • 🐢 Advisor

The author’s contribution is automatically guessed from the order of names in the paper and the number of coauthors, but the project lead can adjust it on the edit page.

We believe that providing information about a researcher’s contribution to the papers appearing in their profile is essential to represent the researcher’s work fairly. Indeed, many scientific papers indicate the authors’ contribution, either with typographic signs or extensive descriptions. Even when the indication does not appear explicitly, in many fields, the order of the authors represents their role in the paper.

In this regard, Google Scholar is completely lacking. It does not indicate the author’s contribution, and even adding simple asterisks, e.g., to signal co-first authorships, is impossible. Furthermore, Google Scholar is opaque also with respect to the number of coauthors. A paper with 8 authors and one with 400 authors are indistinguishable on a scholar profile page.

In contrast, BetterScholar shows the author’s contribution to each paper, and the number of coauthors when there are more than a few. We believe that our four roles strike the right balance between informativity and simplicity, which is essential for pages often skimmed.

The four role labels are clear enough, and we do not want to specify them further to leave a convenient space for interpretation, useful to fit different cases.

Our roles may also come in handy in lab discussions about attribution, offer a standard to signal author contribution on papers, and make implicit attributions –missed by many profile visitors– more transparent.

The roles of a paper can be adjusted by any project lead 🦁. To preserve informativity, each role comes with a weight, and the sum of the weights of all coauthors of a paper cannot exceed 1: 🦁=0.3, 🐬=0.1, 🐜=0, 🐢=0.2. For instance, it is not possible to make everyone a project lead in a paper with more than three authors, or to have more than seven key contributors in a paper with a project lead. A project lead is not mandatory; when a paper has none, the first author can adjust the roles.
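The weight budget above can be checked with a short Python sketch. The role names and the function are hypothetical, and the weights are stored in tenths as integers, a choice made here to avoid floating-point rounding when summing values like 0.1.

```python
# Role weights from the text, expressed in tenths (0.3 -> 3) to keep sums exact.
ROLE_WEIGHT_TENTHS = {"lead": 3, "key": 1, "support": 0, "advisor": 2}
BUDGET_TENTHS = 10  # the weights of all coauthors cannot exceed 1.0


def roles_are_valid(roles: list[str]) -> bool:
    """True if the role assignment for one paper stays within the weight budget."""
    return sum(ROLE_WEIGHT_TENTHS[role] for role in roles) <= BUDGET_TENTHS


print(roles_are_valid(["lead"] * 3))            # three leads fit: 0.9
print(roles_are_valid(["lead"] * 4))            # four leads do not: 1.2
print(roles_are_valid(["lead"] + ["key"] * 7))  # one lead, seven key: 1.0
print(roles_are_valid(["lead"] + ["key"] * 8))  # one lead, eight key: 1.1
```

Supportive contributors weigh nothing, so a paper can list any number of them without affecting the check.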

For the research area of a researcher, look where they like to publish.

A popular piece of information sought about scholars is their area of research. BetterScholar shows the venues of the showcased works on top to fulfill this need. The most frequent venues appear first and are larger.
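Ordering venues by frequency is straightforward to sketch in Python. The function name and the venue list are invented for illustration, and how BetterScholar maps counts to font sizes is not specified here, so the sketch stops at the ranking.

```python
from collections import Counter


def rank_venues(showcased_venues: list[str]) -> list[tuple[str, int]]:
    """Venues of the showcased works, most frequent first, with their counts."""
    return Counter(showcased_venues).most_common()


# Hypothetical showcase of six papers.
ranking = rank_venues(["NeurIPS", "ICLR", "NeurIPS", "ICML", "NeurIPS", "ICLR"])
print(ranking)
```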

In contrast, Google Scholar places in the same position the areas of interest that researchers declare by filling in a free-text form. These areas are clickable and lead to a page with the most cited researchers in that area.

The problem with this choice is that the areas may not be faithful to the actual works of the researcher. For instance, consider the profile of one of the world’s top 20 researchers in Deep Learning today according to Google Scholar:

They are clearly doing particle physics, as can be recognized from the article titles and venues.

It may well be possible that a researcher has changed interests and today is focused on another area, like deep learning. Indeed, in BetterScholar, this researcher can choose to showcase their latest works in the new area, and the corresponding deep learning venues will appear on top. Doing so would also affect their overall level: it would be lower if the new works have fewer citations than the old ones, but also bluer, signaling that their showcased research is recent.

BetterScholar also introduces pages for venues. They closely resemble researcher profiles, showcasing the most cited papers published in the venue. As with scholars, the number of papers shown corresponds to the log2 of the total citations of papers published in the venue, while the overall level is the log2 of the total citations of the showcased works. In these pages, we have the authors who appear most in the showcased works instead of the venues. These authors are the heroes of that venue and have it emphasized on their profile.
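A venue page as described above could be assembled along these lines. This is only a sketch under assumptions: the paper records are plain dictionaries with hypothetical `citations` and `authors` keys, the showcase size is rounded down, and the "heroes" are simply the authors counted over the showcased papers.

```python
import math
from collections import Counter


def venue_page(papers: list[dict]) -> tuple[list[dict], list[tuple[str, int]]]:
    """Return the showcased papers of a venue and its most recurring authors."""
    total = sum(p["citations"] for p in papers)
    size = math.floor(math.log2(total)) if total > 0 else 0
    # Showcase the most cited papers, up to the log2-based size.
    showcase = sorted(papers, key=lambda p: p["citations"], reverse=True)[:size]
    # Count how often each author appears among the showcased papers.
    heroes = Counter(a for p in showcase for a in p["authors"]).most_common()
    return showcase, heroes


# Toy data: 7 total citations give a showcase of floor(log2(7)) = 2 papers.
toy_papers = [
    {"title": "A", "citations": 4, "authors": ["X", "Y"]},
    {"title": "B", "citations": 2, "authors": ["X"]},
    {"title": "C", "citations": 1, "authors": ["Z"]},
]
showcase, heroes = venue_page(toy_papers)
```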

Google Scholar receives over 4 million daily visitors, with a significant fraction involving Scholar profiles. These pages shape the public opinion on researchers, and condition critical decisions:

  • Future PhD students use them to figure out who could be a good advisor.
  • Researchers use them to look for collaborations.
  • The media use them to choose the experts to interview.
  • Companies use them to decide who to hire.
  • Academic committees use them to assess the best candidate for a professorship position, an award, or a million-dollar grant.

Google Scholar also influences your behavior: you may find yourself frequently checking your profile, contemplating your public academic persona, and strategizing ways to enhance it. This platform subtly guides your focus, suggesting topics to explore and methods to employ in your research.

However, these profile pages are not a neutral measuring tool. They make deliberate choices on what to show and what to hide, what information should be asked of researchers, and what should be out of their control. They craft some numbers into catchy nuggets and put others behind a click.

These design choices reflect a precise conception of how science should function and how researchers ought to be represented. We propose an alternative vision, embodied by five guiding principles, and have developed a new platform based on these ideas, offering a different approach from Google Scholar.

We are not against the very concept of Google Scholar profiles. We believe that such a tool is needed, and its usage demonstrates this. But we also think that today, this concept is not serving science in the best way, and we have an idea on how to do it better.

Try BetterScholar.

BetterScholar is powered by Semantic Scholar data, which is free to use and allows us to build our vision, whereas Google Scholar does not allow its data to be reused. The information provided by Semantic Scholar is complete and mostly accurate, but any inaccuracy there is also reflected in BetterScholar.

Most of these inaccuracies result from ambiguities left unresolved by their crawler, such as the presence of multiple profiles for the same author, and can be fixed manually. Any fix on Semantic Scholar will also propagate to BetterScholar.

BetterScholar is a project by Antonio Norelli and Bardh Prenkaj. We are two postdocs who decided to turn many lunch discussions into something useful. We started it as a side project and learned web development while doing it. BetterScholar, in its current state, is not polished, but it is ready enough to share our vision.

If you want to help us make BetterScholar something more than a demo, please get in touch.

We will present BetterScholar in person in Vienna during ICLR 24, the International Conference on Learning Representations. If you are there, join us on Thursday, May 9th, and Friday, May 10th, at 12:45 p.m. in the social “Your new Scholar profile”!

The little owl with a graduation hat in our logo is the owl of Minerva, the Roman goddess of wisdom. She is often depicted with her sacred creature, the owl, one of the most ancient symbols of “knowledge, wisdom, perspicacity and erudition.” We also chose it as a tribute to Rome, where this project was born, and to our alma mater, La Sapienza, which features a legendary Minerva statue at the center of the campus.

Written by Antonio Norelli

Postdoc in AI and ML at the University of Oxford, creator of BetterScholar • I love teaching, especially to machines.