Apache

  • modify existing non-indexed field

    Is it possible to modify a field that is stored but not indexed? For
    example, if I have a field like this:
    new Field("address", address, Field.Store.YES, Field.Index.NO)
    and I want to modify it like this:
    hits.doc(i).getField("address").set("11 Diana Street");
    Is it possible?
  • No.1 | | 755 bytes | |

    I don't think you've done anything to the index. This code is really
    equivalent to something like

    Field field = hits.doc(i).getField("address");
    field.set("11 Diana Street");

    You've changed the value of the field instance, but that is essentially a
    local variable (even though not explicit in your original snippet).

    It's been discussed often that Lucene doesn't allow in-place modifications
    of a document; you have to delete it and re-add it. If your snippet worked,
    it'd allow in-place modifications, and I'd be surprised if the experts knew
    that but never mentioned it to us common folk <G>

    Of course, I've been totally wrong before.

    Best
    Erick
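
    To make the point above concrete, here is a minimal toy model in plain
    Java. It is not Lucene code: ToyIndex and its methods are hypothetical
    names invented for illustration. It assumes, as described above, that a
    retrieved document is a detached copy, so editing it leaves the index
    untouched, and that an update means deleting the old document and adding
    a new version.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for an index (hypothetical, NOT the Lucene API).
class ToyIndex {
    private final List<Map<String, String>> docs = new ArrayList<>();

    // Stores a private copy of the document.
    void add(Map<String, String> doc) {
        docs.add(new HashMap<>(doc));
    }

    // Returns a copy, so changes made by the caller never reach the index.
    Map<String, String> doc(int i) {
        return new HashMap<>(docs.get(i));
    }

    // "Update" = delete every document whose key field matches, then re-add.
    void update(String keyField, String keyValue, Map<String, String> newDoc) {
        docs.removeIf(d -> keyValue.equals(d.get(keyField)));
        add(newDoc);
    }

    int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        Map<String, String> doc = new HashMap<>();
        doc.put("id", "42");
        doc.put("address", "1 Old Road");
        index.add(doc);

        // Editing the retrieved copy is only a local change.
        index.doc(0).put("address", "11 Diana Street");
        System.out.println(index.doc(0).get("address")); // prints "1 Old Road"

        // Delete and re-add to actually change the stored value.
        Map<String, String> changed = new HashMap<>(doc);
        changed.put("address", "11 Diana Street");
        index.update("id", "42", changed);
        System.out.println(index.doc(0).get("address")); // prints "11 Diana Street"
    }
}
```

    The update relies on each document carrying its own unique key field
    ("id" here, an assumption of this sketch) so the old version can be found
    and deleted.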
  • No.2 | | 217 bytes | |

    But if you remove it and add it again, you'll need to re-index it, won't
    you? When you re-index, you'll have to close the reader, which would pause
    the search. Is there any better way of doing it?
  • No.3 | | 251 bytes | |

    On Saturday 08 July 2006 00:03, dan2000 wrote:
    When you do re-index, you'll have to close the reader, which
    would pause the search. Any better way of doing it?
    Try using IndexModifier (added in Lucene 1.9).
    Regards
    Daniel
  • No.4 | | 1263 bytes | |

    Thanks a lot Doron.

    I thought I had to close all readers and searchers before creating a new
    IndexWriter; otherwise I keep getting a lock timeout exception in a
    multithreaded environment. In my case, modifying/adding/deleting only
    happens occasionally, so I don't have an IndexWriter that is open all the
    time. To avoid the lock timeout exception, I had to put a read lock in
    my search method and a write lock in my add/modify method. I know this is
    not efficient, since all the readers/searchers are re-opened each time
    an add or modify happens.

    From what you said, I'm thinking of switching to IndexModifier. Before that,
    I have a few more questions:
    1. Without using an exclusive lock, is there any way to create an
    IndexModifier or IndexWriter while IndexReaders are serving search queries?
    Or do I have to create an IndexWriter when the system starts and keep it
    open all the time?
    2. When adding a new document and calling optimize(), do I have to close
    the readers/searchers?
    3. In my case, updating non-indexed fields happens a lot. Can you explain
    the side data store in more detail? I'm not sure about this approach. Do you
    mean creating a separate database that stores Lucene's IDs?
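
    The reply that proposed the side data store is not reproduced in this
    thread, so the following is only one possible reading of the idea,
    sketched in plain Java. SideStore, putAddress and addressFor are
    hypothetical names. The sketch assumes each indexed document also stores
    an application-level key, and uses that key rather than Lucene's internal
    document numbers, since those can change when documents are deleted and
    segments are merged.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical side store: keeps frequently changing, non-searchable values
// outside the index, keyed by an application-level ID that is assumed to be
// stored in each indexed document as well.
class SideStore {
    private final ConcurrentMap<String, String> addressByKey = new ConcurrentHashMap<>();

    // Changing the address touches only this map; the index is not rewritten.
    void putAddress(String docKey, String address) {
        addressByKey.put(docKey, address);
    }

    // Look up the current address for a key read from a search hit.
    // Returns null when no address has been recorded for that key.
    String addressFor(String docKey) {
        return addressByKey.get(docKey);
    }

    public static void main(String[] args) {
        SideStore store = new SideStore();
        store.putAddress("42", "1 Old Road");
        store.putAddress("42", "11 Diana Street"); // in-place update, no re-indexing
        System.out.println(store.addressFor("42")); // prints "11 Diana Street"
    }
}
```

    An in-memory map is used here only to keep the sketch short; a separate
    database table keyed the same way would play the same role.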
  • No.5 | | 336 bytes | |

    Thanks a lot, Doron. I'm going to give it a try now. The problem I had
    before was that I set my writer to null right after closing it. That's why
    I got the lock timeout exception when I tried to create the writer again. I
    guess I just need to close it and re-open it to avoid the locking problems.

    Thanks.
  • No.6 | | 2626 bytes | |

    Here is the simplified code that causes the problem (Lock obtain timed out).
    MyIndexer is used for indexing and searching. IndexTest starts 5 threads for
    indexing and 100 threads for searching.

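    MyIndexer.java
    import java.io.File;
    import java.io.IOException;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.Searcher;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class MyIndexer
    {
        File m_IndexFile;
        IndexReader m_IndexReader;
        Directory m_Directory;
        Searcher m_Searcher;
        Analyzer m_Analyzer;
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        public MyIndexer(File a_IndexFile) throws IOException
        {
            m_IndexFile = a_IndexFile;
            m_Analyzer = new StandardAnalyzer();
            // Second argument is the "create" flag: false opens an existing index.
            m_Directory = FSDirectory.getDirectory(m_IndexFile, false);
            m_IndexReader = IndexReader.open(m_Directory);
            m_Searcher = new IndexSearcher(m_IndexReader);
        }

        public void addAndIndex(String a_Name) throws IOException
        {
            // lock.writeLock().lock();
            try
            {
                // A new writer is opened, and the index optimized, on every call.
                IndexWriter indexWriter = new IndexWriter(m_Directory,
                                                          m_Analyzer,
                                                          false);
                Document doc = new Document();
                doc.add(new Field("name", a_Name, Field.Store.YES,
                                  Field.Index.TOKENIZED));
                indexWriter.addDocument(doc);
                indexWriter.optimize();
                indexWriter.close();
            }
            finally
            {
                // lock.writeLock().unlock();
            }
        }

        public Hits search(String a_Query)
        {
            try
            {
                // lock.readLock().lock();
                Query query = new QueryParser("name",
                                              m_Analyzer).parse(a_Query);
                Hits hits = m_Searcher.search(query);
                return hits;
            }
            catch (Exception e)
            {
                e.printStackTrace();
            }
            finally
            {
                // lock.readLock().unlock();
            }
            return null;
        }
    }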

    IndexTest.java
    import java.io.File;

    public class IndexTest extends Thread
    {
        private boolean m_DoIndex = true;
        static MyIndexer m_IndexEntity;

        public IndexTest(boolean a_DoIndex)
        {
            m_DoIndex = a_DoIndex;
        }

        public static void main(String[] args) throws Exception
        {
            m_IndexEntity = new MyIndexer(new File(
                "C:/mytest/myindex"));

            // 5 indexing threads
            for (int i = 0; i < 5; i++)
            {
                new IndexTest(true).start();
            }

            // 100 searching threads
            for (int i = 0; i < 100; i++)
            {
                new IndexTest(false).start();
            }
        }

        @Override
        public void run()
        {
            for (int i = 0; i < 50; i++)
            {
                try
                {
                    if (m_DoIndex)
                    {
                        String name = "Nick" + i;

                        System.out.println("Add and Indexing");
                        m_IndexEntity.addAndIndex(name);
                        Thread.sleep(100);
                    }
                    else
                    {
                        System.out.println("Searching");
                        m_IndexEntity.search("name:Nick" + i);
                        Thread.sleep(20);
                    }
                }
                catch (Exception e)
                {
                    e.printStackTrace();
                }
            }
        }
    }
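
    The commented-out lock calls in MyIndexer correspond to the read lock /
    write lock arrangement described in No.4. A minimal sketch of that
    pattern in plain Java follows. GuardedSearcher, search and writeAndReopen
    are hypothetical names, and the generic type S merely stands in for
    whatever searcher object is shared between threads; no Lucene calls are
    made. Each lock is acquired before the try block so that the finally
    clause never unlocks a lock that was not obtained.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical wrapper: many threads may search at once, while a writer
// gets exclusive access to modify the index and swap in a fresh searcher.
class GuardedSearcher<S> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private S searcher;

    GuardedSearcher(S initial) {
        this.searcher = initial;
    }

    // Runs a query under the shared read lock.
    <R> R search(Function<S, R> query) {
        lock.readLock().lock();
        try {
            return query.apply(searcher);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Runs the write under the exclusive lock; the supplier is expected to
    // perform the modification and return the re-opened searcher.
    void writeAndReopen(Supplier<S> writeThenOpen) {
        lock.writeLock().lock();
        try {
            searcher = writeThenOpen.get();
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        GuardedSearcher<String> guarded = new GuardedSearcher<>("searcher-1");
        System.out.println(guarded.search(s -> "query ran on " + s));
        guarded.writeAndReopen(() -> "searcher-2");
        System.out.println(guarded.search(s -> "query ran on " + s));
    }
}
```

    As No.4 notes, this serializes every write against all searches, so it
    trades throughput for simplicity.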
  • No.7 | | 170 bytes | |

    Thanks for your advice, Doron. I've tried changing to one indexing thread
    (instead of 5) but still get the same problem. I can't figure out why this
    happens.
  • No.8 | | 133 bytes | |

    I did clean everything but I'm still getting the same problem. I'm using
    Lucene 2.0. Do you get the same problem on your machine?
  • No.9 | | 228 bytes | |

    I can't access the file:
    Forbidden
    Remote Host: [62.172.205.164]
    You do not have permission to access
    Data files must be stored on the same site they are linked from.
    Thank you for using 20m.com
  • No.10 | | 360 bytes | |

    Thanks, Doron. The site you provided doesn't support Firefox; that's why I
    had a problem with downloading.

    Your code works fine, and I've just noticed I didn't change the create
    parameter to false when I cleaned the index directory. Sorry for my
    mistake.

    Thanks a lot for your help, Doron. Your advice helped me a lot.
