[Translation] How Hacker News ranking algorithm works

Original post: http://amix.dk/blog/post/19574

All rights belong to Amir Salihefendic; I only did the translation.

====================================================

How Hacker News ranking algorithm works

In this post I’ll try to explain how the Hacker News ranking algorithm works and how you can reuse it in your own applications. It’s a very simple ranking algorithm, and it works surprisingly well when you want to highlight hot or new stuff.

Digging into news.arc code

Hacker News is implemented in Arc, a Lisp dialect coded by Paul Graham. Hacker News is open source and the code can be found at arclanguage.org. Digging through the news.arc code, you can find the ranking algorithm.

In essence, the ranking performed by Hacker News looks like this:

Score = (P-1) / (T+2)^G

where:

P = points of an item (the -1 negates the submitter's own vote)

T = time since submission (in hours)

G = gravity, which defaults to 1.8 in news.arc

As you can see, the algorithm is rather trivial to implement. In the upcoming section we’ll see how the algorithm behaves.

Effects of gravity (G) and time (T)

Gravity and time have a significant impact on the score of an item. Generally these things hold true:

  • the score decreases as T increases, meaning that older items will get lower and lower scores (this keeps the listing fresh)
  • the score decreases much faster for older items if gravity is increased

To see this visually we can plot the algorithm in Wolfram Alpha.

How score is behaving over time

[Figure: Score 24 hours]

As you can see, the score decreases a lot as time goes by. For example, a 24-hour-old item will have a very low score regardless of how many votes it got.
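The decay over time can also be checked numerically. The following is a minimal sketch based only on the formula Score = (P-1) / (T+2)^G; the function names `hn_score` and `score_timeline`, the sample ages, and the point counts (30, 60, 200) are assumptions chosen for illustration, not values taken from news.arc.

```python
def hn_score(points, age_hours, gravity=1.8):
    """Score = (P-1) / (T+2)^G, with gravity defaulting to 1.8."""
    return (points - 1) / (age_hours + 2) ** gravity


def score_timeline(points, hours=(0, 1, 3, 6, 12, 24)):
    """Return (age_in_hours, score) pairs for one item at the given ages."""
    return [(t, hn_score(points, t)) for t in hours]


# Hypothetical items with 30, 60 and 200 points, sampled over 24 hours.
for points in (30, 60, 200):
    line = ", ".join(f"{t}h: {s:.2f}" for t, s in score_timeline(points))
    print(f"{points:>3} points -> {line}")
```

With these sample values, even the 200-point item at 24 hours scores below the 30-point item at 0 hours, which is the effect the paragraph above describes.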


How gravity parameter behaves

[Figure: Gravity effects]

As the graph shows, the larger the gravity, the faster the score decreases.
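To make the effect of gravity concrete, the sketch below computes what fraction of an item's initial score (at T = 0) remains after a given age, for a few gravity values. It assumes only the formula Score = (P-1) / (T+2)^G; the helper name `decay_ratio` and the gravity values 0.5, 1.8 and 2.0 are illustrative choices.

```python
def hn_score(points, age_hours, gravity=1.8):
    """Score = (P-1) / (T+2)^G."""
    return (points - 1) / (age_hours + 2) ** gravity


def decay_ratio(age_hours, gravity):
    """Fraction of the T=0 score that remains after age_hours.

    The ratio does not depend on the number of points, so a
    2-point item (P-1 = 1) is used as the reference.
    """
    return hn_score(2, age_hours, gravity) / hn_score(2, 0, gravity)


# Illustrative gravity values; only 1.8 is the documented default.
for gravity in (0.5, 1.8, 2.0):
    row = ", ".join(f"{t}h: {decay_ratio(t, gravity):.3f}" for t in (1, 6, 24))
    print(f"G={gravity}: {row}")
```

For any fixed age greater than zero, a higher gravity leaves a smaller fraction of the initial score.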


Python implementation

As already stated, it’s rather simple to implement the score function.
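A minimal sketch of such a function, written directly from the formula Score = (P-1) / (T+2)^G, could look like the following. The function and parameter names are assumptions made for this example.

```python
def calculate_score(votes, item_hour_age, gravity=1.8):
    """Return (votes - 1) / (item_hour_age + 2) ** gravity.

    votes         -- points of the item (the submitter's own vote is subtracted)
    item_hour_age -- time since submission, in hours
    gravity       -- decay exponent; 1.8 is the default mentioned for news.arc
    """
    return (votes - 1) / (item_hour_age + 2) ** gravity


# Example: a hypothetical item with 100 votes, posted 2 hours ago.
print(calculate_score(100, 2))
```

An item that has only its submitter's vote scores 0, since P - 1 is 0.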

The most crucial aspect is understanding how the algorithm behaves and how you can customize it for your application, and I hope I have contributed that knowledge 🙂
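As one possible way to customize the algorithm for your own application, the sketch below ranks a list of items and lets you tune gravity. The `Item` class, the `rank` function and the sample data are hypothetical; only the scoring formula comes from the post.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    points: int
    age_hours: float


def rank(items, gravity=1.8):
    """Sort items by descending score = (P-1) / (T+2)^G."""
    def score(item):
        return (item.points - 1) / (item.age_hours + 2) ** gravity
    return sorted(items, key=score, reverse=True)


# Made-up sample data for illustration.
items = [
    Item("old but popular", points=300, age_hours=20),
    Item("fresh and rising", points=40, age_hours=1),
    Item("brand new", points=2, age_hours=0),
]

print([item.title for item in rank(items)])               # default gravity
print([item.title for item in rank(items, gravity=0.5)])  # weaker decay
```

With the default gravity the fresh item ranks first, while with a gravity of 0.5 the older, heavily voted item stays on top. Lower gravity favors accumulated votes and higher gravity favors recency.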

Happy hacking!

Edit:
You can view comments to this post and a lot more thoughts on HN’s ranking here:

Edit:
Paul Graham has shared the updated HN ranking algorithm:

 
