mirror of
				https://github.com/papers-we-love/papers-we-love.git
				synced 2025-06-13 12:54:28 +00:00 
			
		
		
		
	Merge pull request #364 from Khalian/master
Adding the paper which introduced the bm25 similarity measure
		
						commit
						450632415c
					
				| @ -18,3 +18,19 @@ The included documents are | ||||
|   paper won an honorable mention at CIKM 2013. | ||||
| 
 | ||||
| * [The Anatomy of a Large-Scale Hypertextual Web Search Engine](http://infolab.stanford.edu/~backrub/google.html) | ||||
| 
 | ||||
| * [:scroll:](ocapi-trec3.pdf) [Okapi System](http://trec.nist.gov/pubs/trec3/papers/city.ps.gz) - Stephen E. Robertson, Steve Walker, Susan Jones, Micheline Hancock-Beaulieu, and Mike Gatford | ||||
| 
 | ||||
|   This paper describes the now-famous Okapi information retrieval system, | ||||
|   which introduced the BM25 ranking function for ranked retrieval. It is | ||||
|   one of the first implementations of the probabilistic retrieval | ||||
|   framework in the literature. BM25 is a bag-of-words retrieval function. | ||||
|   Its IDF (inverse document frequency) term can be interpreted via | ||||
|   information theory. If a query term q appears in n(q) documents, the | ||||
|   probability that a randomly picked document contains that term is | ||||
|   p(q) = n(q) / D, where D is the total number of documents. The | ||||
|   information content of that event, in Shannon's sense, is | ||||
|   -log(p(q)) = log(D / n(q)). Smoothing by adding a constant to both the | ||||
|   numerator and the denominator leads to the IDF term used in BM25. BM25 | ||||
|   has been shown to be one of the best probabilistic weighting schemes. | ||||
|   The paper was originally distributed as PostScript; the committer | ||||
|   converted it to PDF with ps2pdf, per the Papers We Love guidelines. | ||||
|  | ||||
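The IDF interpretation in the description above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: the function names and the default constant `k = 0.5` are assumptions, and the smoothed form `log((D - n(q) + k) / (n(q) + k))` is the commonly cited BM25 IDF rather than a formula quoted in the description.

```python
import math


def information_content_idf(num_docs: int, doc_freq: int) -> float:
    """Plain IDF as information content: -log(p(q)) = log(D / n(q)).

    num_docs is D, doc_freq is n(q); both names are illustrative.
    """
    if not 0 < doc_freq <= num_docs:
        raise ValueError("doc_freq must be in the range 1..num_docs")
    return math.log(num_docs / doc_freq)


def smoothed_idf(num_docs: int, doc_freq: int, k: float = 0.5) -> float:
    """Commonly cited BM25-style IDF: log((D - n(q) + k) / (n(q) + k)).

    The constant k = 0.5 is an assumption; the description above only
    says that a constant is added to numerator and denominator.
    """
    if not 0 <= doc_freq <= num_docs:
        raise ValueError("doc_freq must be in the range 0..num_docs")
    return math.log((num_docs - doc_freq + k) / (doc_freq + k))


if __name__ == "__main__":
    # A rare term (10 of 1000 docs) versus a common one (500 of 1000 docs).
    for n_q in (10, 500):
        print(n_q, information_content_idf(1000, n_q), smoothed_idf(1000, n_q))
```

Rarer terms receive a larger weight under both variants. Note that the smoothed form drops to zero or below once a term appears in half or more of the collection, which is why some implementations clamp or shift it.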
							
								
								
									
										
											BIN
										
									
								
								information_retrieval/ocapi-trec3.pdf
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
										
									
								
							
										
											Binary file not shown.
										
									
								
							