mirror of
https://github.com/gnosygnu/xowa.git
synced 2026-03-02 03:49:30 +00:00
c
@@ -451,7 +451,7 @@
The new XOWA Search Engine uses PageRank to rate pages by importance. Although this works well for Wikipedia, it sometimes overrates pages that exist for encyclopedic bookkeeping.
</p>
<p>
For example, many Wikipedia pages have a small box called "Authority Control" at the bottom of the page. This box links to other pages such as <a href="/site/en.wikipedia.org/wiki/Integrated_Authority_Control">https://en.wikipedia.org/wiki/Integrated_Authority_Control</a>. If a million pages carry this Integrated Authority Control link, then PageRank rates the linked page highly ("1 million pages link to it!"). However, the page itself is fairly short and is not really one of the most important articles in Wikipedia, even though PageRank alone would score it higher than India, Insect, Italy, etc.
For example, many Wikipedia pages have a small box called "Authority Control" at the bottom of the page. This box links to other pages such as <a href="https://en.wikipedia.org/wiki/Integrated_Authority_Control" rel="nofollow" class="external free">https://en.wikipedia.org/wiki/Integrated_Authority_Control</a>. If a million pages carry this Integrated Authority Control link, then PageRank rates the linked page highly ("1 million pages link to it!"). However, the page itself is fairly short and is not really one of the most important articles in Wikipedia, even though PageRank alone would score it higher than India, Insect, Italy, etc.
</p>
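The over-rating described above can be reproduced on a toy link graph. The following is a minimal sketch, not XOWA's actual implementation: the function names (`pagerank`, `discount_short_pages`), the damping factor, the `min_length` threshold, the linear discount formula, and all page lengths are assumptions chosen for illustration. It runs a plain power-iteration PageRank, then scales down the score of any page shorter than the threshold as one possible way to correct the ranking.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict of page -> outbound links."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page (no outbound links): spread its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


def discount_short_pages(rank, lengths, min_length=2000):
    """Hypothetical heuristic: scale scores of pages shorter than min_length."""
    adjusted = {}
    for page, score in rank.items():
        length = lengths.get(page, min_length)
        factor = min(1.0, length / min_length)
        adjusted[page] = score * factor
    return adjusted


# Toy graph: every article links to the authority-control page,
# but only a few link to "India". All lengths are made-up byte counts.
links = {
    f"Article_{i}": ["Integrated_Authority_Control"] + (["India"] if i < 5 else [])
    for i in range(20)
}
links["India"] = ["Integrated_Authority_Control"]

lengths = {f"Article_{i}": 5000 for i in range(20)}
lengths["India"] = 50000
lengths["Integrated_Authority_Control"] = 200

raw = pagerank(links)
adjusted = discount_short_pages(raw, lengths)

for page in ("Integrated_Authority_Control", "India"):
    print(f"{page}: raw={raw[page]:.4f} adjusted={adjusted[page]:.4f}")
```

In this toy graph the short authority-control page outranks "India" on raw PageRank, and the length discount reverses that order. A linear discount is only one option; a hard cutoff or a logarithmic scale would express the same idea with different trade-offs.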
<p>
v3.6.3 tries to reduce the importance of these pages when they are "short". This heuristic was already present in previous versions of the search engine, but has been tweaked further.