  • mod_jk 1.2.17+ Recover time

    29 answers - 1311 bytes

    I've been testing the 1.2.17 (soon to be 1.2.18) release and have
    noticed a problem in worker recovery.
    If I restart a Tomcat instance and mod_jk notices that it went down,
    mod_jk waits a 60-second recovery time before it tries again (see
    WAIT_BEFORE_RECOVER in jk_lb_worker.h and recover_wait_time in
    struct jk_shm_worker).
    However, Tomcat typically recovers in just a handful of seconds, so
    this results in nearly a minute of downtime when downtime should be
    only about 10 seconds.
    Compounding the problem, it doesn't appear to be possible to override
    this behavior through the worker configuration, and the status
    module forces a minimum of 60 seconds.
    The only workaround I see is to set up a Tomcat cluster, but this
    isn't feasible in all cases.
    I'm more than happy to help work up a patch to make this parameter
    configurable in the workers file and to allow a lower minimum
    recover_wait_time, if such a patch would be accepted.
    Of course, if a mod_jk developer can cook it up quickly, that is fine
    too; I will test it.
    -Dave
    To unsubscribe, e-mail: dev-unsubscribe (AT) tomcat (DOT) apache.org
    For additional commands, e-mail: dev-help (AT) tomcat (DOT) apache.org
  • No.1 | | 1764 bytes | |

    Is the 60 seconds hard-coded?

    I'd hope not. If you have some interesting web apps in Tomcat, a restart
    often takes a bit longer than 10 seconds. On my laptop one just took a
    full 60 seconds, but that is rather unusual (a restart thereafter took
    only 18).

  • No.2 | | 568 bytes | |

    On 7/18/06, Jess Holle <jessh (AT) ptc (DOT) com> wrote:
    > Is the 60 seconds hard-coded?

    Yes, it's hard-coded. See my references in my first post.
    -Dave

  • No.3 | | 835 bytes | |

    Well a new show stopper for 1.2.18 ;(

  • No.4 | | 117 bytes | |

    Henri Gomez wrote:
    > Well a new show stopper for 1.2.18 ;(

    Why would it be?
    JK 1.2.18 is still not tagged.
  • No.5 | | 2623 bytes | |

    No, I think it's not:

    1) This is not a regression, it was always implemented like that.

    2) The recover feature is used in the load balancer. The first way of
    avoiding errors is meant to be retries, the second is failover; only
    then comes recovery.

    3) A worker that goes into error state is something
    serious/heavy-weight. Timeouts leading to the error state should not
    be chosen so small that workers go into error just because of regular
    long-running requests.

    4) Recovering a worker is not lightweight, because a stuck
    Tomcat might mean that every recovery attempt times out at full length.
    Remember: we are doing recovery with real requests. I think it's not a
    good idea to try recovering with real requests very often. That's the
    reason for only trying to recover rarely.

    5) If we had separate management threads, as we might in mod_proxy_,
    it would make sense to probe failed workers more often.

    6) We could make the interval configurable, but there is a real danger
    of users thinking that a low recovery interval, like 10 seconds, would
    make things better, whereas it is very likely that it would make their
    whole system oscillate.

    Regards

    Rainer

  • No.6 | | 461 bytes | |

    Henri Gomez wrote:
    > Well a new show stopper for 1.2.18 ;(

    Committed a fix that allows worker.name.recover_time
    to be set lower than 60 seconds.
    Previously the minimum value was 60 seconds;
    now it is 1 second.
    The default is still the same (60 seconds).

    Regards,
    Mladen.

  • No.7 | | 695 bytes | |

    Rainer Jung wrote:
    > 6) We could make the interval configurable, but there is a real danger
    > of users thinking that a low recovery interval, like 10 seconds, would
    > make things better, whereas it is very likely that it would make their
    > whole system oscillate.

    Well, I don't wish to limit the users.
    If someone thinks that it should retry at lower intervals,
    let him do that.
    Anyhow, why would 60 seconds be the optimal value?
    It could as well be 90, 100, 180, etc.

    Regards,
    Mladen.

  • No.8 | | 1379 bytes | |

    Mladen Turk wrote:
    > Well, I don't wish to limit the users.
    > If someone thinks that it should retry at lower intervals,
    > let him do that.

    Yes, but we are doing recovery centrally; that is, not every process
    does it on its own. We moved it to global maintenance, so to profit
    reliably from a decreased recover_time a user would
    also need to lower worker.maintain. I'm going to add a few doc lines
    about that later today (at least if you are not faster :) ).
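
One way to see why both settings matter is to model the global maintenance pass as a periodic tick that is the only place where recovery is considered. The sketch below is a simplified model, not mod_jk's actual code: the function name, the assumption that passes fall at multiples of the maintain interval counted from the moment the worker failed, and the single-worker setup are all illustrative assumptions.

```python
def first_recovery_attempt(maintain: int, recover_time: int) -> int:
    """Return seconds after a failure at which recovery is first attempted.

    Simplified model (an assumption, not mod_jk's real logic): a global
    maintenance pass runs every `maintain` seconds, and a worker in error
    state becomes eligible for a retry only once `recover_time` seconds
    have elapsed since it failed at t = 0.
    """
    if maintain <= 0 or recover_time <= 0:
        raise ValueError("intervals must be positive")
    t = maintain  # first maintenance pass after the failure
    while t < recover_time:
        t += maintain  # not eligible yet; wait for the next pass
    return t


if __name__ == "__main__":
    for maintain, recover_time in [(60, 60), (60, 1), (10, 10), (10, 25)]:
        wait = first_recovery_attempt(maintain, recover_time)
        print(f"maintain={maintain}s recover_time={recover_time}s "
              f"-> first retry after {wait}s")
```

In this model, lowering recover_time alone while leaving the maintenance interval at 60 seconds does not bring the first retry any earlier than the first maintenance pass.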

    > Anyhow, why would 60 seconds be the optimal value?
    > It could as well be 90, 100, 180, etc.

    Increasing is something totally different. I just want to avoid people
    ending up with a system that changes error/ok states with a high
    frequency, so that the whole system gets unstable.

  • No.9 | | 770 bytes | |

    Rainer Jung wrote:
    > No, I think it's not:
    >
    > 1) This is not a regression, it was always implemented like that.

    Right, and the reason it was never changed is that
    no one gave any reason to change it.
    As I said, the 60-second recover timeout was probably chosen
    because someone thought it should be 'fine'.
    OTOH, the proper value would be 240, because that's the
    common value of the TCP_CLOSE_WAIT timeout.

    Anyhow, leaving the current 60-second default value
    but allowing lower intervals will change nothing for
    99.99% of the users.

    Regards,
    Mladen.

  • No.10 | | 968 bytes | |

    Rainer Jung wrote:
    > Increasing is something totally different. I just want to avoid people
    > ending up with a system that changes error/ok states with a high
    > frequency, so that the whole system gets unstable.

    Right, but why would 60 seconds be an optimal value?
    It could be 10 seconds as well. I don't know, because I
    never did (and I doubt anyone has done) any profiling
    that would tell how that impacts overall performance.
    We can only presume and guess, but until someone
    gives some real figures, why limit something that we
    only presume might cause problems?

    Regards,
    Mladen.

  • No.11 | | 1391 bytes | |

    I'm OK with your change.

    I think we should try to educate the users via the docs that they need
    to be careful when lowering these values to very small numbers. I don't
    know if that's the right term, but the system needs some damping to keep
    it from switching very frequently between states. At least in most cases.

  • No.12 | | 403 bytes | |

    Henri Gomez wrote:
    > Well a new show stopper for 1.2.18 ;(

    Why? With the current implementation, low values will have extremely
    bad behavior in some cases. You should actually configure long
    intervals, without retries.

    R

  • No.13 | | 1478 bytes | |

    Rainer Jung wrote:
    > No, I think it's not:
    >
    > 1) This is not a regression, it was always implemented like that.
    > [...]
    > 6) We could make the interval configurable, but there is a real danger
    > of users thinking that a low recovery interval, like 10 seconds, would
    > make things better, whereas it is very likely that it would make their
    > whole system oscillate.

    I completely agree with everything here.

    R

  • No.14 | | 886 bytes | |

    Rainer Jung wrote:
    > I think we should try to educate the users via the docs that they need
    > to be careful when lowering these values to very small numbers. I don't
    > know if that's the right term, but the system needs some damping to keep
    > it from switching very frequently between states. At least in most cases.

    Correct. 'Be careful' is the proper term.
    If there is a direct link between mod_jk and Tomcat,
    most OSes will detect the broken link immediately.
    However, if you have a faulty firewall that does not
    pass the FIN packets, then even the default 60 seconds
    is lower than the 240-second system default.

    Regards,
    Mladen.

  • No.15 | | 3332 bytes | |

    Rainer Jung wrote:
    > 5) If we had separate management threads, as we might in mod_proxy_,
    > it would make sense to probe failed workers more often.

    I am preparing a health checker, a process separate from httpd, to
    health-check the workers. If a worker is not healthy, there are no
    retries and we fail over directly; recovery will occur only once the
    worker is marked healthy again by the health checker process.

    > 6) We could make the interval configurable, but there is a real danger
    > of users thinking that a low recovery interval, like 10 seconds, would
    > make things better, whereas it is very likely that it would make
    > their whole system oscillate.

    The next problem is to find a way to tell TC that its connections have
    been closed (by a stupid firewall that eats the closes, for example).
    It is nice to recover, but how do we make sure the TC side knows that
    something has gone wrong?

    Cheers

    Jean-Frederic

  • No.16 | | 443 bytes | |

    > The next problem is to find a way to tell TC that its connections have
    > been closed (by a stupid firewall that eats the closes, for example).
    > It is nice to recover, but how do we make sure the TC side knows that
    > something has gone wrong?

    A firewall that eats the FIN-CLOSE?

  • No.17 | | 750 bytes | |

    Henri Gomez wrote:
    > A firewall that eats the FIN-CLOSE?

    Yes, or something like that.

  • No.18 | | 1469 bytes | |

    I'd say that it's a feature enhancement, not a show stopper.
    And not a regression nor a bug :)

  • No.19 | | 1751 bytes | |

    ok :)

  • No.20 | | 598 bytes | |

    On 7/19/06, Mladen Turk <mturk (AT) apache (DOT) org> wrote:
    > Committed a fix that allows worker.name.recover_time to be set
    > lower than 60 seconds. Previously the minimum value was 60 seconds;
    > now it is 1 second. The default is still the same (60 seconds).

    Thanks, that should work around my issue quite nicely. I'll check out
    SVN and give it a whirl (unless a new tag is to be rolled again shortly?)
    -Dave

  • No.21 | | 1531 bytes | |

    On 7/19/06, Rainer Jung <rainer.jung (AT) kippdata (DOT) de> wrote:
    > 1) This is not a regression, it was always implemented like that.

    Really? I know it's been like this for a few releases now, but I
    remember some very old versions of mod_jk (from a couple of years ago?)
    that used to recover nearly instantly when Tomcat became available again.
    So it may not be a new regression, but at one time it did seem to work
    as I expected.

    > 3) A worker that goes into error state is something
    > serious/heavy-weight. Timeouts leading to the error state should not
    > be chosen so small that workers go into error just because of regular
    > long-running requests.

    True, but the ping/pong feature avoids this problem quite nicely.

    > 4) Recovering a worker is not lightweight, because a stuck
    > Tomcat might mean that every recovery attempt times out at full length.
    > Remember: we are doing recovery with real requests. I think it's not a
    > good idea to try recovering with real requests very often. That's the
    > reason for only trying to recover rarely.

    But when all your workers are down, what is the harm in trying to
    recover more quickly? If you have at least one good worker available,
    then yes, I see no point in trying to rush recovery, but if none are
    available
    -Dave

  • No.22 | | 2117 bytes | |

    David Rees wrote:
    > But when all your workers are down, what is the harm in trying to
    > recover more quickly?

    Because the TC on the other side is probably busy, and you may cause a
    huge increase in threads (x2), and that will not help the recovery.

    Cheers

    Jean-Frederic

  • No.23 | | 666 bytes | |

    On 7/19/06, Jean-frederic Clere <jfclere (AT) gmail (DOT) com> wrote:
    > Because the TC on the other side is probably busy, and you may cause a
    > huge increase in threads (x2), and that will not help the recovery.

    Like I mentioned earlier, using the ping/pong feature nearly
    completely works around any issues with Apache processes/threads
    getting stuck waiting for Tomcat to recover.
    -Dave
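
For readers unfamiliar with the ping/pong feature: assuming it refers to the AJP13 CPing/CPong exchange, the connector sends a tiny probe packet and expects a tiny reply before trusting the connection with a real request. The sketch below only builds and checks those packets. The names `is_cpong` and `probe`, and the host, port, and timeout parameters, are hypothetical, and this is not how mod_jk itself is implemented.

```python
import socket
import struct

# AJP13 CPing as sent by the web server: magic 0x1234, payload length 1, type 10.
CPING = struct.pack(">HHB", 0x1234, 1, 10)
# AJP13 CPong as sent back by the container: magic "AB", payload length 1, type 9.
CPONG = b"AB" + struct.pack(">HB", 1, 9)


def is_cpong(reply: bytes) -> bool:
    """True if `reply` is exactly a CPong packet."""
    return reply == CPONG


def probe(host: str, port: int = 8009, timeout: float = 2.0) -> bool:
    """Hypothetical helper: send CPing and wait up to `timeout` for CPong."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(CPING)
            reply = b""
            while len(reply) < len(CPONG):
                chunk = sock.recv(len(CPONG) - len(reply))
                if not chunk:
                    break  # peer closed the connection early
                reply += chunk
            return is_cpong(reply)
    except OSError:
        # A refused connection, a timeout, or a reset all count as "not healthy".
        return False
```

Because the probe fails fast on a short timeout, a stuck backend costs the caller a couple of seconds instead of a full request timeout, which is the effect described above.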

  • No.24 | | 744 bytes | |

    On 7/19/06, Mladen Turk <mturk (AT) apache (DOT) org> wrote:
    > Committed a fix that allows worker.name.recover_time to be set
    > lower than 60 seconds. Previously the minimum value was 60 seconds;
    > now it is 1 second. The default is still the same (60 seconds).

    While the change you made allows you to configure the worker to a
    recover_time lower than 60 seconds, it doesn't let you change it to a
    value lower than 60 using the status worker.

    Still investigating, but it looks like there are a number of other
    places it should be changed.
    -Dave

  • No.25 | | 492 bytes | |

    David Rees wrote:
    > While the change you made allows you to configure the worker to a
    > recover_time lower than 60 seconds, it doesn't let you change it to a
    > value lower than 60 using the status worker.
    >
    > Still investigating, but it looks like there are a number of other
    > places it should be changed.

    Done.

  • No.26 | | 348 bytes | |

    David Rees wrote:
    > Thanks, that should work around my issue quite nicely. I'll check out
    > SVN and give it a whirl (unless a new tag is to be rolled again shortly?)

    Try 1.2.18.

  • No.27 | | 468 bytes | |

    On 7/19/06, Rainer Jung <rainer.jung (AT) kippdata (DOT) de> wrote:
    > Try 1.2.18.

    1.2.18 works much better, thanks!
    -Dave

  • No.28 | | 749 bytes | |

    On 7/24/06, David Rees <drees76 (AT) gmail (DOT) com> wrote:
    > 1.2.18 works much better, thanks!

    I spoke too soon. I've been testing 1.2.18 further, and while the
    recover time appears to change, I cannot get mod_jk to actually recover
    any faster than 60 seconds, even when recover time is set to 1.
    -Dave

  • No.29 | | 1339 bytes | |

    On 7/25/06, David Rees <drees76 (AT) gmail (DOT) com> wrote:
    > I spoke too soon. I've been testing 1.2.18 further, and while the
    > recover time appears to change, I cannot get mod_jk to actually recover
    > any faster than 60 seconds, even when recover time is set to 1.

    OK, it seems that the docs could use some clarification (it can't be
    good if the devs seem to be a bit confused as to what the settings do),
    but you also need to set worker.maintain to a low value. mod_jk
    acts as if the larger of worker.maintain and the lb_worker's
    recover_time determines the minimum amount of time a worker stays down.

    By setting worker.maintain to 10 and the lb_worker's recover_time to 10,
    I can get a reasonable recovery time for a worker.
    -Dave
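
The combination described above can be written down as a workers.properties fragment and sanity-checked with a few lines of Python. This is a sketch under assumptions: the worker names `lb` and `node1` and the host and port values are placeholders, the 60-second fallbacks are the defaults mentioned in this thread, and treating the larger of the two values as a rough lower bound on downtime is one reading of the observations reported here (a recover_time of 1 still gave 60 seconds), not a statement about mod_jk internals.

```python
WORKERS_PROPERTIES = """\
# Hypothetical example; worker names, host and port are placeholders.
worker.list=lb
worker.maintain=10

worker.lb.type=lb
worker.lb.balance_workers=node1
worker.lb.recover_time=10

worker.node1.type=ajp13
worker.node1.host=localhost
worker.node1.port=8009
"""


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def rough_min_downtime(props: dict, lb_name: str) -> int:
    """Larger of worker.maintain and the lb worker's recover_time, in seconds.

    Both settings fall back to 60, the default discussed in this thread.
    """
    maintain = int(props.get("worker.maintain", 60))
    recover = int(props.get(f"worker.{lb_name}.recover_time", 60))
    return max(maintain, recover)


if __name__ == "__main__":
    props = parse_properties(WORKERS_PROPERTIES)
    print(rough_min_downtime(props, "lb"))
```

Setting only one of the two values leaves the other at its default, so the rough bound stays at 60 seconds until both are lowered.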
