Databases

NAVIGATION
CATEGORIES
REFERRENCE
LINKS
  • Nested loops are killing throughput

    1 answers - 2532 bytes - related search similar search Add To My Delicious Add To My Stumble Upon Add To My Google Mark Add To My Facebook Add To My Digg Add To My Reddit

    Postgresql 8.1
    I've tried turning off nested loops to see what the query planner would choose instead of nested loops. It chose a hash join, and it still had a nested loop in the mix! How can I entice the query planner to use a simpler join scheme? What criteria is used to determine whether or not a merge join will be used?
    Here's an explain analyze
    Subquery Scan foo (cost=762.21762.50 rows=1 width=12) (actual time=10784.84910786.992 rows=9 loops=1)
    -GroupAggregate (cost=762.21762.24 rows=1 width=30) (actual time=10767.70210768.539 rows=9 loops=1)
    -Sort (cost=762.21762.21 rows=1 width=30) (actual time=10767.52510767.989 rows=360 loops=1)
    Sort Key: (dpdl.field_value)::timestamp with time zone
    -Nested Loop (cost=2.01762.20 rows=1 width=30) (actual time=283.82410749.007 rows=360 loops=1)
    -Nested Loop (cost=0.00744.28 rows=1 width=16) (actual time=31.2276980.765 rows=5436 loops=1)
    -Nested Loop (cost=0.00135.29 rows=101 width=16) (actual time=25.514273.660 rows=5436 loops=1)
    -Index Scan using ut_company_name_idx on user_table ut (cost=0.0021.96 rows=5 width=11) (actual time=6.5966.649 rows=9 loops=1)
    Index Cond: ((company_name)::text = Acme, Inc.'::text)
    -Index Scan using packet_user_idx on packet dp (cost=0.0019.89 rows=222 width=27) (actual time=16.93928.025 rows=604 loops=9)
    Index Cond: ((dp.username)::text = ("outer".username)::text)
    Filter: (trans_date (date_trunc('month'::text, (now() - '1 mon'::interval)) - '1 year 6 mons'::interval))
    -Index Scan using packet_status_puuid_pkey on packet_status dps (cost=0.006.02 rows=1 width=16) (actual time=1.2261.228 rows=1 loops=5436)
    Index Cond: (dps.packet_uuid = "outer".packet_uuid)
    Filter: ((status & 2) = 0)
    -Bitmap Heap Scan on packet_datalink dpdl (cost=2.0117.87 rows=4 width=30) (actual time=0.6850.688 rows=0 loops=5436)
    Recheck Cond: (dpdl.packet_uuid = "outer".packet_uuid)
    Filter: (((field_name)::text = 'event_date'::text) AND ((field_value)::date >= ('now'::text)::date))
    -Bitmap Index Scan on packet_dl_puuid_idx (cost=0.002.01 rows=4 width=0) (actual time=0.4020.402 rows=4 loops=5436)
    Index Cond: (dpdl.packet_uuid = "outer".packet_uuid)
    Total runtime: 10787.198 ms
    Also, no one here can figure out why the row count prediction is off on the packet table either. It's vacuumed and analyzed.
    (end of broadcast)
    TIP 6: explain analyze is your friend
  • No.1 | | 1017 bytes | |

    CG <cgg007 (AT) yahoo (DOT) comwrites:
    -Nested Loop (cost=0.00744.28 rows=1 width=16) (actual time=31.2276980.765 rows=5436 loops=1)
    -Nested Loop (cost=0.00135.29 rows=101 width=16) (actual time=25.514273.660 rows=5436 loops=1)
    -Index Scan using packet_status_puuid_pkey on packet_status dps (cost=0.006.02 rows=1 width=16) (actual time=1.2261.228 rows=1 loops=5436)
    Index Cond: (dps.packet_uuid = "outer".packet_uuid)
    Filter: ((status & 2) = 0)

    problem you've got is that the planner has no stats about the
    selectivity of that status condition. My advice would be to forget the
    cute bitmask and store one or more plain boolean columns. Easier to
    write queries against and way more tractable statistically.

    regards, tom lane

    (end of broadcast)
    TIP 1: if posting/reading through Usenet, please send an appropriate
    subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
    message can get through to the mailing list cleanly

Re: Nested loops are killing throughput


max 4000 letters.
Your nickname that display:
In order to stop the spam: 9 + 8 =
QUESTION ON "Databases"

EMSDN.COM