  • Lucene search formula


    Hello,
    I was recently looking through the Lucene in Action book and came across the
    scoring formula. I was wondering if the formula has changed since the book
    was written?
    Also, I was wondering if someone could briefly explain what the IDF(t) term
    in the formula means? The book says that it's the inverse document
    frequency of the term but doesn't explain beyond that.
    Thanks,
    rajiv
  • No.1

    : I was recently looking through the Lucene in Action book and came across the
    : scoring formula. I was wondering if the formula has changed since the book
    : was written?

    No, but the book has some mistakes, and the scoring formula is one of
    them.

    : Also was wondering if someone can briefly explain what the IDF(t) term in
    : the formula means? In the book it says that it's the inverse document
    : frequency of the term but doesn't explain beyond that?

    1) Google is your friend.
    2) It's pretty much exactly what it sounds like: the inverse of the
    document frequency for that term. The more frequent the term is (that
    is, the more documents it appears in), the lower the value is.
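
To make the "more documents, lower value" point concrete, here is a minimal sketch in Python. It assumes the log-based form commonly attributed to Lucene's default similarity, idf = ln(numDocs / (docFreq + 1)) + 1; treat that exact form as an assumption and check the Similarity source for your Lucene version. The function name `idf` and the sample counts are made up for illustration.

```python
import math


def idf(doc_freq: int, num_docs: int) -> float:
    """Inverse document frequency for one term.

    Assumed form: ln(num_docs / (doc_freq + 1)) + 1.
    doc_freq = number of documents containing the term,
    num_docs = total number of documents in the index.
    """
    return math.log(num_docs / (doc_freq + 1)) + 1.0


if __name__ == "__main__":
    num_docs = 1000  # hypothetical index size
    # A rare term gets a high IDF; a term found in most documents gets a low one.
    for doc_freq in (1, 10, 100, 900):
        print(f"docFreq={doc_freq:4d}  idf={idf(doc_freq, num_docs):.3f}")
```

The printed values shrink as docFreq grows, so a match on a rare term contributes more to a document's score than a match on a common one.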

    -Hoss

  • No.2

    I have written a paper about Topic Detection and Tracking, in which I also
    explain the TF-IDF scheme. If you like, I can send you the paper.

    Aleksander

    Fri, 07 Jul 2006 04:46:52 +0200, Rajiv Roopan <rajiv.roopan (AT) gmail (DOT) com>
    wrote:

    Hello,
    I was recently looking through the Lucene in Action book and came across
    the scoring formula. I was wondering if the formula has changed since the
    book was written?

    Also was wondering if someone can briefly explain what the IDF(t) term
    in the formula means? In the book it says that it's the inverse document
    frequency of the term but doesn't explain beyond that?

    thanks,
    rajiv
