lemmy/docs/src/about_ranking.md
Dessalines 464ea862b1
2020-08-05 12:03:46 -04:00


Trending / Hot / Best Sorting algorithm

Goals

  • During the day, new posts and comments should be near the top, so they can be voted on.
  • After a day or so, the time factor should go away.
  • Use a log scale, since votes tend to snowball, and so the first 10 votes are just as important as the next hundred.
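
The log-scale goal can be checked with a quick calculation. The sketch below assumes a base-10 logarithm (the text does not state the base) and ignores any offset added to the score. It compares how much the log term grows from 1 vote to 10 votes with how much it grows over the next hundred votes.

```python
import math

def log_gain(start_votes: int, end_votes: int) -> float:
    """Increase in log10(votes) when a post goes from start_votes to end_votes."""
    return math.log10(end_votes) - math.log10(start_votes)

first_ten = log_gain(1, 10)       # from 1 vote to 10 votes
next_hundred = log_gain(10, 110)  # the hundred votes after that

print(f"first ten: {first_ten:.3f}, next hundred: {next_hundred:.3f}")
```

Both gains come out close to 1, which is the sense in which the early votes count as much as the later, much larger batch.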

Reddit Sorting

Reddit's comment sorting algorithm, the Wilson confidence sort, is inadequate because it completely ignores time. Especially in smaller subreddits, the early comments get upvoted while newer comments stay at the bottom, never to be seen. Research showed that nearly all top comments are just the first ones posted.
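
To make the "ignores time" point concrete, here is a minimal sketch of a Wilson-style score: the lower bound of the Wilson score interval on the upvote fraction. The function name and the `z = 1.96` value are assumptions for illustration, not Reddit's actual code or parameters.

```python
import math

def wilson_lower_bound(upvotes: int, downvotes: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the upvote fraction.

    z = 1.96 corresponds to a 95% interval; it is an illustrative choice.
    """
    n = upvotes + downvotes
    if n == 0:
        return 0.0
    p = upvotes / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)

# The function takes no timestamp: an hour-old and a week-old comment with
# the same votes receive exactly the same score.
early_comment = wilson_lower_bound(upvotes=50, downvotes=5)
late_comment = wilson_lower_bound(upvotes=3, downvotes=0)
print(early_comment > late_comment)
```

Because only vote counts enter the score, a late comment with a perfect but short record has no way to outrank an early comment that has already accumulated votes.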

Hacker News Sorting

Hacker News' ranking algorithm is great, but it doesn't use a log scale for the scores.

My Algorithm

Rank = ScaleFactor * log(Max(1, 3 + Score)) / (Time + 2)^Gravity

Score = Upvotes - Downvotes
Time = time since submission (in hours)
Gravity = Decay gravity, 1.8 is default
  • Lemmy uses the same Rank algorithm above, in two sorts: Active, and Hot.
    • Active uses the post votes, and latest comment time (limited to two days).
    • Hot uses the post votes, and the post published time.
  • Use Max(1, 3 + Score), as in the formula above, to make sure all comments are affected by time decay.
  • Add 3 to the score, so that everything that has fewer than 3 downvotes will seem new. Otherwise all new comments would stay at zero, near the bottom.
  • The sign and abs of the score are necessary for dealing with the log of negative scores.
  • A scale factor of 10k gets the rank in integer form.
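
The formula and the Active/Hot distinction above can be sketched as follows. This is an illustration under stated assumptions, not Lemmy's actual implementation: the logarithm is taken as base 10, the result is truncated to an integer, the helper names (`rank`, `hot_rank`, `active_rank`, `hours_between`) are hypothetical, and "limited to two days" is read as clamping the latest comment time to at most two days after publication.

```python
import math
from datetime import datetime, timedelta, timezone

SCALE_FACTOR = 10_000  # the "10k" scale factor from the text
GRAVITY = 1.8          # default gravity from the text

def rank(score: int, hours: float) -> int:
    """Rank = ScaleFactor * log(Max(1, 3 + Score)) / (Time + 2)^Gravity.

    Assumptions: base-10 log, result truncated to an integer.
    """
    return int(SCALE_FACTOR * math.log10(max(1, 3 + score)) / (hours + 2) ** GRAVITY)

def hours_between(earlier: datetime, later: datetime) -> float:
    return (later - earlier).total_seconds() / 3600

def hot_rank(score: int, published: datetime, now: datetime) -> int:
    """Hot: time is measured from the post's published time."""
    return rank(score, hours_between(published, now))

def active_rank(score: int, published: datetime,
                latest_comment: datetime, now: datetime) -> int:
    """Active: time is measured from the latest comment.

    Assumed reading of "limited to two days": the comment time is clamped
    to no later than two days after the post was published.
    """
    reference = min(latest_comment, published + timedelta(days=2))
    return rank(score, hours_between(reference, now))

now = datetime(2020, 8, 5, 12, 0, tzinfo=timezone.utc)
published = now - timedelta(hours=10)
latest_comment = now - timedelta(hours=1)

print("hot:", hot_rank(10, published, now))
print("active:", active_rank(10, published, latest_comment, now))
```

With the same score, a recent comment gives the post a higher Active rank than Hot rank, because less time has elapsed since the reference point.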

A plot of rank over 24 hours, of scores of 1, 5, 10, 100, 1000, with a scale factor of 10k.
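
The values behind such a plot can be regenerated with a few lines. The sketch below assumes a base-10 logarithm and samples every six hours; both choices are illustrative and not taken from the original figure.

```python
import math

def rank(score: int, hours: float,
         scale_factor: int = 10_000, gravity: float = 1.8) -> float:
    # Base-10 log is an assumption; the text does not state the base.
    return scale_factor * math.log10(max(1, 3 + score)) / (hours + 2) ** gravity

scores = [1, 5, 10, 100, 1000]
hours = [0, 6, 12, 18, 24]

# One row per score, one column per sampled hour.
table = {s: [round(rank(s, h)) for h in hours] for s in scores}

print("score " + " ".join(f"{h:>6}h" for h in hours))
for s, row in table.items():
    print(f"{s:>5} " + " ".join(f"{r:>7}" for r in row))
```

Each row falls steadily as the hours pass, and at any given hour the rows stay ordered by score.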