reload getting the mode of (subreg:HI (reg:QI pseudo)) wrong
7 answers - 4556 bytes -

Hi.
In my i8086 backend experiments, I seem to have come across a corner case
which confuses reload. While building libgcc2 for one of the multilib
variants, GCC crashes:
libgcc2.c: In function '__muldc3':
libgcc2.c:1854: error: insn does not satisfy its constraints:
(insn:HI 2485 2483 2604 39 libgcc2.c:1825 (set (reg:HI 5 dh)
(ashiftrt:HI (reg:HI 2 a [641])
(const_int 15 [0xf]))) 31 {*ashrhi3_const15}
(insn_list:REG_DEP_TRUE 2480 (nil))
(nil))
libgcc2.c:1854: internal compiler error: in reload_cse_simplify_operands, at postreload.c:393
The "*ashrhi3_const15" instruction is defined like this:
(define_insn "*ashrhi3_const15"
[(set (match_operand:HI 0 "single_register_operand" "=d")
(ashiftrt:HI (match_operand:HI 1 "single_register_operand" "a")
(const_int 15)))]
)
The constraints are not met because the constraint "d" is register class
DX_REGS consisting of register 4 and 5, and (reg:HI 5 dh) spans register 5
and 6. Also, HARD_REGNNREGS (5, HImode) returns 0.
The lreg and greg dumps provide the clue that reload got the mode wrong:
(insn:HI 2485 2483 2486 39 libgcc2.c:1825 (set (subreg:HI (reg:QI 178) 0)
(ashiftrt:HI (reg:HI 641)
(const_int 15 [0xf]))) 31 {*ashrhi3_const15}
(insn_list:REG_DEP_TRUE 2480 (nil))
(expr_list:REG_DEAD (reg:HI 641)
(nil)))
Reloads for insn # 2485
Reload 0: reload_out (QI) = (reg:QI 178)
DX_REGS, RELAD_FRUTPUT (opnum = 0)
reload_out_reg: (reg:QI 178)
reload_reg_rtx: (reg:QI 5 dh)
Reload of course need to provide a HImode (or larger) register for a
HImode operand. Notice the paradoxical subreg as output operand before
reload.
At around reload.c line 1083, in push_reload(), we have:
if (out != 0 && GET_CDE (out) == SUBREG
&& (subreg_lowpart_p (out) || strict_low)
[skipping the rest of the if() condition until line 1125]
{
out_subreg_loc = outloc;
outloc = &SUBREG_REG (out);
out = *outloc;
#if ! defined (LAD_EXTENDP) && ! defined (WRD_REGISTERPERATINS)
gcc_assert (!MEM_P (out)
|| GET_MDE_SIZE (GET_MDE (out))
<= GET_MDE_SIZE (outmode));
#endif
outmode = GET_MDE (out);
}
IW, we set outmode to QImode. Later on, from around line 1298, we have:
i = n_reloads;
rld[i].in = in;
rld[i].out = out;
rld[i].class = class;
rld[i].inmode = inmode;
rld[i].outmode = outmode;
rld[i].reg_rtx = 0;
rld[i].optional = optional;
rld[i].inc = 0;
rld[i].nocombine = 0;
rld[i].in_reg = inloc ? *inloc : 0;
rld[i].out_reg = outloc ? *outloc : 0;
rld[i].opnum = opnum;
rld[i].when_needed = type;
rld[i].secondary_in_reload = secondary_in_reload;
rld[i].secondary_out_reload = secondary_out_reload;
rld[i].secondary_in_icode = secondary_in_icode;
rld[i].secondary_out_icode = secondary_out_icode;
rld[i].secondary_p = 0;
We still have a few uninitialized fields. When returning to find_reloads(),
from around line 4410, this happens:
/* Compute reload_mode and reload_nregs. */
for (i = 0; i < n_reloads; i++)
{
rld[i].mode
= (rld[i].inmode == VIDmode
|| (GET_MDE_SIZE (rld[i].outmode)
GET_MDE_SIZE (rld[i].inmode)))
? rld[i].outmode : rld[i].inmode;
rld[i].nregs = CLASS_MAX_NREGS (rld[i].class, rld[i].mode);
}
Both rld[i].outmode and rld[i].mode are now set to QImode and rld[i].nregs
becomes 1 instead of 2 because CLASS_MAX_NREGS() gets the wrong mode.
Later, we get to choose_reload_regs(), which calls allocate_reload_regs(),
which has some sort of spill reg round-robing caching mechanism and by pure
fluke picks register 5 the first time around. There are tests to filter out
invalid registers:
&& TEST_HARD_REG_BIT (reg_class_contents[class], regnum)
&& HARD_REGNMDEK (regnum, rld[r].mode)
HARD_REGNMDEK (5, HImode) returns 0, but we got the mode wrong
previously, so the test becomes HARD_REGNMDEK (5, QImode) which
returns 1. Finally, allocate_reload_regs() call set_reload_reg() to commit
the register choice.
It seems necessary to recover the mode of the operand when computing
rld[i].mode. There is operand_mode[rld[i].opnum] and in case of a "0"
constraint, I'm sure it is possible to find the matching operand too
somehow. Then that information can be used to ensure rld[i].mode is large
enough. Is there any reason that operand_mode[] would not be reliable?
No.1 | | 1166 bytes |
| 
Rask Ingemann Lambertsen <rask (AT) sygehus (DOT) dkwrites:
In my i8086 backend experiments, I seem to have come across a corner case
which confuses reload.
Interesting case.
It seems necessary to recover the mode of the operand when computing
rld[i].mode. There is operand_mode[rld[i].opnum] and in case of a "0"
constraint, I'm sure it is possible to find the matching operand too
somehow. Then that information can be used to ensure rld[i].mode is large
enough. Is there any reason that operand_mode[] would not be reliable?
In general, operand_mode[] will be unreliable in cases where it is not
specified. This is frowned upon but more or less permitted, and a few
backends take advantage of it for relatively nefarious purposes.
Still, I can't think of a better approach for this problem. It's
worth a try to set rld[i].mode to operand_mode[] when operand_mode[]
is larger. I don't see that you will have to worry about matching
constraints. A big-endian system might be an issue; hopefully
reload_adjust_reg_for_mode is called whereever necessary.
Ian
No.2 | | 936 bytes |
| 
Rask Ingemann Lambertsen wrote:
The constraints are not met because the constraint "d" is register class
DX_REGS consisting of register 4 and 5, and (reg:HI 5 dh) spans register 5
and 6. Also, HARD_REGNNREGS (5, HImode) returns 0.
The lreg and greg dumps provide the clue that reload got the mode wrong:
(insn:HI 2485 2483 2486 39 libgcc2.c:1825 (set (subreg:HI (reg:QI 178) 0)
(ashiftrt:HI (reg:HI 641)
(const_int 15 [0xf]))) 31 {*ashrhi3_const15}
(insn_list:REG_DEP_TRUE 2480 (nil))
(expr_list:REG_DEAD (reg:HI 641)
(nil)))
Probably the compiler doesn't in general like a paradoxical subreg that
can take more hard regs than its SUBREG_REG. I think this is probably
something that can be worked around with a proper combination of
MDES_TIEABLE_P, CANNT_CHANGE_MDE_CLASS, and maybe
REG_CANNT_CHANGE_MDE_P. What are your definitions of those macros?
Bernd
No.3 | | 347 bytes |
| 
Fri, Aug 04, 2006 at 02:30:34AM +0200, Rask Ingemann Lambertsen wrote:
The constraints are not met because the constraint "d" is register class
DX_REGS consisting of register 4 and 5, and (reg:HI 5 dh) spans register 5
and 6. Also, HARD_REGNNREGS (5, HImode) returns 0.
I meant to say that HARD_REGNMDEK (5, HImode) returns 0.
No.4 | | 1182 bytes |
| 
Fri, Aug 04, 2006 at 11:21:05AM +0200, Bernd Schmidt wrote:
Probably the compiler doesn't in general like a paradoxical subreg that
can take more hard regs than its SUBREG_REG. I think this is probably
something that can be worked around with a proper combination of
MDES_TIEABLE_P, CANNT_CHANGE_MDE_CLASS, and maybe
REG_CANNT_CHANGE_MDE_P. What are your definitions of those macros?
#define MDES_TIEABLE_P(MDE1, MDE2) \
(GET_MDE_SIZE(MDE1) 1 && GET_MDE_SIZE(MDE2) 1)
There are registers for which HARD_REGNMDEK (regno, QImode) returns
false but HARD_REGNMDEK (regno, HImode) (or larger) returns true.
#define CANNT_CHANGE_MDE_CLASS(FRM, T, CLASS) \
((FRM == QImode || (T) == QImode) \
&& reg_classes_intersect_p (HI_REGS, (CLASS)))
HI_REGS is a class containing registers which can't hold QImode values. They
are 16-bit only (%si, %di, %bp and %sp). Is there an undocumented assumption
here that if CANNT_CHANGE_MDE_CLASS() returns true, then the number of hard
regs used is unchanged?
I haven't heard of REG_CANNT_CHANGE_MDE_P before, and it isn't mentioned
in the documentation.
No.5 | | 2792 bytes |
| 
Thu, Aug 03, 2006 at 11:55:49PM -0700, Ian Lance Taylor wrote:
In general, operand_mode[] will be unreliable in cases where it is not
specified. This is frowned upon but more or less permitted, and a few
backends take advantage of it for relatively nefarious purposes.
I only have modeless operands in "call_value" and "setmemhi" and I don't
know how that can be avoided. But presumably, GET_MDE_BITSIZE (VIDmode)
returns something sensible like 0, so it shouldn't be a problem.
Still, I can't think of a better approach for this problem. It's
worth a try to set rld[i].mode to operand_mode[] when operand_mode[]
is larger. I don't see that you will have to worry about matching
constraints. A big-endian system might be an issue; hopefully
reload_adjust_reg_for_mode is called whereever necessary.
I tried this patch to find_reloads(), and it did solve the problem.
Index: gcc/reload.c
gcc/reload.c(revision 115810)
gcc/reload.c(working copy)
@@ -4416,6 +4428,12 @@
GET_MDE_SIZE (rld[i].inmode)))
? rld[i].outmode : rld[i].inmode;
+ /* Don't be fooled by e.g. paradoxical subreg operands.
+ We need a register which is at least as wide as the operand. */
+ if (GET_MDE_BITSIZE (rld[i].mode)
+ < GET_MDE_BITSIZE (operand_mode[rld[i].opnum]))
+rld[i].mode = operand_mode[i];
+
rld[i].nregs = CLASS_MAX_NREGS (rld[i].class, rld[i].mode);
}
Before reload:
(insn:HI 2485 2483 2486 39 (set (subreg:HI (reg:QI 178) 0)
(ashiftrt:HI (reg:HI 641)
(const_int 15 [0xf]))) 31 {*ashrhi3_const15} (insn_list:REG_DEP_TRUE 2480 (nil))
(expr_list:REG_DEAD (reg:HI 641)
(nil)))
After reload:
Reloads for insn # 2485
Reload 0: reload_out (QI) = (reg:QI 178)
DX_REGS, RELAD_FRUTPUT (opnum = 0)
reload_out_reg: (reg:QI 178)
reload_reg_rtx: (reg:HI 4 d)
(insn:HI 2485 2483 2604 39 (set (reg:HI 4 d)
(ashiftrt:HI (reg:HI 2 a [641])
(const_int 15 [0xf]))) 31 {*ashrhi3_const15} (insn_list:REG_DEP_TRUE 2480 (nil))
(nil))
(insn 2604 2485 2486 39 (set (mem/c:QI (plus:HI (reg/f:HI 10 bp)
(const_int -42 [0xffffffd6])) [10 S1 A8])
(reg:QI 4 d)) 34 {movqi} (nil)
(nil))
This is close to what it looked like before combine (in the life1 dump):
(insn 2483 2480 2485 40 (set (reg:HI 643)
(ashiftrt:HI (reg:HI 641)
(const_int 15 [0xf]))) 31 {*ashrhi3_const15} (insn_list:REG_DEP_TRUE 2480 (nil))
(expr_list:REG_DEAD (reg:HI 641)
(nil)))
(insn 2485 2483 2486 40 (set (reg:QI 178)
(subreg:QI (reg:HI 643) 0)) 34 {movqi} (insn_list:REG_DEP_TRUE 2483 (nil))
(expr_list:REG_DEAD (reg:HI 643)
(nil)))
But nice try anyway, combine. :-)
No.6 | | 1114 bytes |
| 
Bernd Schmidt <bernds_cb1 (AT) t-online (DOT) dewrites:
Rask Ingemann Lambertsen wrote:
The constraints are not met because the constraint "d" is register class
DX_REGS consisting of register 4 and 5, and (reg:HI 5 dh) spans register 5
and 6. Also, HARD_REGNNREGS (5, HImode) returns 0.
The lreg and greg dumps provide the clue that reload got the mode
wrong:
(insn:HI 2485 2483 2486 39 libgcc2.c:1825 (set (subreg:HI (reg:QI
178) 0)
(ashiftrt:HI (reg:HI 641)
(const_int 15 [0xf]))) 31 {*ashrhi3_const15}
(insn_list:REG_DEP_TRUE 2480 (nil))
(expr_list:REG_DEAD (reg:HI 641)
(nil)))
Probably the compiler doesn't in general like a paradoxical subreg
that can take more hard regs than its SUBREG_REG. I think this is
probably something that can be worked around with a proper combination
of MDES_TIEABLE_P, CANNT_CHANGE_MDE_CLASS, and maybe
REG_CANNT_CHANGE_MDE_P. What are your definitions of those macros?
This also suggests the possibility of writing a QImode version of the
insn which uses a scratch register.
Ian
No.7 | | 1962 bytes |
| 
Fri, Aug 04, 2006 at 11:21:05AM +0200, Bernd Schmidt wrote:
Probably the compiler doesn't in general like a paradoxical subreg that
can take more hard regs than its SUBREG_REG. I think this is probably
something that can be worked around with a proper combination of
MDES_TIEABLE_P, CANNT_CHANGE_MDE_CLASS, and maybe
REG_CANNT_CHANGE_MDE_P.
This should be documented! It is not obvious that this may be the case.
I backed out the patch I put in and changed my definition of
CANNT_CHANGE_MDE_CLASS() to this:
#define CANNT_CHANGE_MDE_CLASS(FRM, T, CLASS)\
(GET_MDE_SIZE(T) GET_MDE_SIZE(FRM)\
|| (((T) == QImode || (FRM) == QImode)\
&& reg_classes_intersect_p (HI_REGS, (CLASS))))
This disallows any mode change which will require more hard regs. Now, the
insn still looks the same before reload:
(insn:HI 2485 2483 2486 39 libgcc2.c:1825 (set (subreg:HI (reg:QI 178) 0)
(ashiftrt:HI (reg:HI 641)
(const_int 15 [0xf]))) 31 {*ashrhi3_const15}
(insn_list:REG_DEP_TRUE 2480 (nil))
(expr_list:REG_DEAD (reg:HI 641)
(nil)))
But reload now reloads the whole subreg instead of just the inner reg:
Reloads for insn # 2485
Reload 0: reload_out (HI) = (subreg:HI (reg:QI 178) 0)
DX_REGS, RELAD_FRUTPUT (opnum = 0)
reload_out_reg: (subreg:HI (reg:QI 178) 0)
reload_reg_rtx: (reg:HI 4 d)
Reload 1: reload_in (HI) = (reg:HI 0 c [641])
AX_REGS, RELAD_FR_INPUT (opnum = 1)
reload_in_reg: (reg:HI 0 c [641])
reload_reg_rtx: (reg:HI 2 a)
The resulting insns are correct:
(insn:HI 2485 2606 2605 39 libgcc2.c:1825 (set (reg:HI 4 d)
(ashiftrt:HI (reg:HI 2 a)
(const_int 15 [0xf]))) 31 {*ashrhi3_const15}
(insn_list:REG_DEP_TRUE 2480 (nil))
(nil))
(insn 2605 2485 2486 39 libgcc2.c:1825 (set (mem/c:QI (plus:HI (reg/f:HI 10 bp)
(const_int -42 [0xffffffd6])) [10 S1 A8])
(reg:QI 4 d)) 34 {movqi} (nil)
(nil))