mirror of
https://github.com/papers-we-love/papers-we-love.git
synced 2024-10-27 20:34:20 +00:00
Merge pull request #72 from shriphani/master
added tw-idf paper and description
commit
78e2d7ecc7
19
information_retrieval/README.md
Normal file
@@ -0,0 +1,19 @@
## Information Retrieval

Information retrieval is the activity of obtaining information resources relevant to an information need from a collection of information resources. (Says Wikipedia).

The included documents are

* [Graph of Word and TW-IDF](http://www.lix.polytechnique.fr/~rousseau/papers/rousseau-cikm2013.pdf) - Francois Rousseau & Michalis Vazirgiannis

The traditional IR system stores term-specific statistics (typically
a term's frequency in each document - which we call TF) in an index.
Such a model ignores dependencies between terms and considers a
document's terms to occur independently of each other (and is aptly
called the bag-of-words model). In this paper the authors use a
graph representation of a document to encode dependencies between
terms, replace the TF statistic with a new TW statistic based on
that graph, and achieve significantly better results than popular
existing models. This paper won an honorable mention at CIKM 2013.
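To make the contrast between TF and a graph-based weight concrete, here is a minimal sketch. It assumes one plausible construction: a directed co-occurrence graph built with a sliding window over the token sequence, with a term's weight taken as its indegree (the number of distinct terms that precede it within the window). The function names, the window size of 4, and the choice of indegree are illustrative assumptions, not details stated in this description; consult the paper for the authors' exact definitions.

```python
from collections import Counter, defaultdict


def term_frequencies(tokens):
    """Bag-of-words statistic: how often each term occurs in the document."""
    return dict(Counter(tokens))


def term_weights(tokens, window=4):
    """Graph-based statistic (illustrative): indegree of each term.

    A directed edge u -> v is added whenever v appears after u inside a
    sliding window of `window` consecutive tokens. The window size and the
    use of indegree are assumptions made for this sketch.
    """
    predecessors = defaultdict(set)
    for i, term in enumerate(tokens):
        predecessors[term]  # register the term even if nothing precedes it
        for prev in tokens[max(0, i - window + 1):i]:
            if prev != term:  # ignore self-loops
                predecessors[term].add(prev)
    return {term: len(preds) for term, preds in predecessors.items()}


if __name__ == "__main__":
    doc = "information retrieval is the activity of obtaining information resources"
    tokens = doc.split()
    print("TF:", term_frequencies(tokens))
    print("TW:", term_weights(tokens))
```

Unlike TF, the graph-based weight does not grow just because a term is repeated; it grows only when the term appears in new contexts, which is one way a graph can capture dependencies that the bag-of-words model ignores.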
BIN
information_retrieval/graph_of_word_and_tw_idf.pdf
Normal file
Binary file not shown.