Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Central CMakeLists, at root? #1

Open
AlecTaylor opened this issue Nov 8, 2013 · 0 comments
Open

Central CMakeLists, at root? #1

AlecTaylor opened this issue Nov 8, 2013 · 0 comments

Comments

@AlecTaylor
Copy link

Would it be possible to get some mechanism for building all projects, built in?

E.g.: a shell script; or preferably what is needed by cmake?

Thanks for considering

chapuni pushed a commit that referenced this issue Nov 19, 2013
order of slices of the alloca which have exactly the same size and other
properties. This was found by a perniciously unstable sort
implementation used to flush out buggy uses of the algorithm.

The fundamental idea is that findCommonType should return the best
common type it can find across all of the slices in the range. There
were two bugs here previously:

1) We would accept an integer type smaller than a byte-width multiple,
   and if there were different bit-width integer types, we would accept
   the first one. This caused an actual failure in the testcase updated
   here when the sort order changed.
2) If we found a bad combination of types or a non-load, non-store use
   before an integer typed load or store we would bail, but if we found
   the integere typed load or store, we would use it. The correct
   behavior is to always use an integer typed operation which covers the
   partition if one exists.

While a clever debugging sort algorithm found problem #1 in our existing
test cases, I have no useful test case ideas for #2. I spotted in by
inspection when looking at this code.
chapuni pushed a commit that referenced this issue Nov 21, 2013
------------------------------------------------------------------------
r195118 | chandlerc | 2013-11-19 01:03:18 -0800 (Tue, 19 Nov 2013) | 22 lines

Fix an issue where SROA computed different results based on the relative
order of slices of the alloca which have exactly the same size and other
properties. This was found by a perniciously unstable sort
implementation used to flush out buggy uses of the algorithm.

The fundamental idea is that findCommonType should return the best
common type it can find across all of the slices in the range. There
were two bugs here previously:

1) We would accept an integer type smaller than a byte-width multiple,
   and if there were different bit-width integer types, we would accept
   the first one. This caused an actual failure in the testcase updated
   here when the sort order changed.
2) If we found a bad combination of types or a non-load, non-store use
   before an integer typed load or store we would bail, but if we found
   the integere typed load or store, we would use it. The correct
   behavior is to always use an integer typed operation which covers the
   partition if one exists.

While a clever debugging sort algorithm found problem #1 in our existing
test cases, I have no useful test case ideas for #2. I spotted in by
inspection when looking at this code.
------------------------------------------------------------------------
chapuni pushed a commit that referenced this issue Dec 19, 2013
In those set of patches, Ashok changed Module::ResolveSymbolContextForAddress
so that if it failed to find a symbol for a pc, it could back up
the pc value by 1 and re-search for a symbol.

His change to RegisterContextLLDB.cpp partially duplicates that
behavior but it also removes the separate case where we find a
Symbol for the pc address but it's the wrong symbol -- we need to
handle this as well as the lookup-by-pc-finds-no-symbol case.

The most obvious fallout from this regression was that lldb on
Mac OS X couldn't backtrace past __assert_rtn() which tail-calls
abort().  e.g.

(lldb) bt
* thread #1: tid = 0x5d6ea1, 0x00007fff8ee80866 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00007fff8ee80866 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff8eb5835c libsystem_pthread.dylib`pthread_kill + 92
    frame #2: 0x00007fff8852ab1a libsystem_c.dylib`abort + 125
    frame #3: 0x00007fff884f49bf libsystem_c.dylib`__assert_rtn + 321
    frame #4: 0x0000000100000f2c a.out`main + 124

(lldb) dis -c 3 -s 0x7fff884f49b3
libsystem_c.dylib`__assert_rtn + 309:
   0x7fff884f49b3:  movq   %rax, -0x11b96242(%rip)   ; gCRAnnotations + 8
   0x7fff884f49ba:  callq  0x7fff8854fd2c            ; symbol stub for: abort

libsystem_c.dylib`basename:
   0x7fff884f49bf:  pushq  %rbp
(lldb)

in this case, __assert_rtn() is immediately followed by basename() and
the changes in r190812 didn't back up the pc value to get the correct
function name / unwind info.

<rdar://problem/15367233>
chapuni pushed a commit that referenced this issue Jan 7, 2014
This commit adds the pre-UAL aliases of fconsts and fconstd for
vmov.f32 and vmov.f64. They use an InstAlias rather than a
MnemonicAlias to properly support the predicate operand.

We need to support encoded 8-bit constants in order to implement the
pre-UAL fconsts/fconstd aliases for vmov.f32/vmov.f64, so this
commit also fixes parsing of encoded floating point constants used
in vmov.f32/vmov.f64 instructions. Now we can support assembly code
like this:

  fconsts s0, #0x70

which is equivalent to vmov.f32 s0, #1.0.

Most of the code was already in place to support this feature.
Previously the code was trying to accept encoded 8-bit float
constants for the vmov.f32/vmov.f64 instructions.  It looks like the
support for parsing encoded floats was lost in a refactoring in
commit r148556 and we did not have any tests in place to catch it.

The change in this commit is to keep the parsed value as a 32-bit
float instead of a 64-bit double because that is what the isFPImm()
function expects to find. There is no loss of precision by using a
32-bit float here because we are still limited to an 8-bit encoded
value in the end.

Additionally, we explicitly reject encoded 8-bit floats for
vmovf.32/64. This is the same as the current behavior, but we now do
it explicitly rather than accidently.
chapuni pushed a commit that referenced this issue Feb 14, 2014
1) Fix a specific bug when certain conversion functions are called in a program compiled as mips16 with hard float and
the program is linked as c++. There are two libraries that are reversed in the link order with gcc/g++ and clang/clang++ for
mips16 in this case and the proper stubs will then not be called. These stubs are normally handled in the Mips16HardFloat pass
but in this case we don't know at that time that we need to generate the stubs. This must all be handled later in code generation
and we have moved this functionality to MipsAsmPrinter. When linked as C (gcc or clang) the proper stubs are linked in from libc.

2) Set up the infrastructure to handle 90% of what is in the Mips16HardFloat pass in this new area of MipsAsmPrinter. This is a more
logical place to handle this and we have known for some time that we needed to move the code later and not implement it using
inline asm as we do now but it was not clear exactly where to do this and what mechanism should be used. Now it's clear to us
how to do this and this patch contains the infrastructure to move most of this to MipsAsmPrinter but the actual moving will be done
in a follow on patch. The same infrastructure is used to fix this current bug as described in #1. This change was requested by the list
during the original putback of the Mips16HardFloat pass but was not practical for us do at that time.
chapuni pushed a commit that referenced this issue Feb 25, 2014
  Don't actually Halt in the Interrupt handler for the Process, just
  send an AsyncInterrupt.  That's actually not async-signal-clean, but
  it is a lot safer than Halt...

The underlying problem is actually a nested pthread_cond_wait from the
signal handler.  Note frames 4, 13, 18 in the backtrace of the aborting
path below.

    frame #1: 0x000000080715fff9 libc.so.7`abort + 73 at abort.c:65
    frame #2: 0x0000000805d20fda libthr.so.3`_thread_exit(fname=<unavailable>, lineno=<unavailable>, msg=<unavailable>) + 58 at thr_exit.c:182
    frame #3: 0x0000000805d1fdc8 libthr.so.3`cond_wait_common [inlined] cond_wait_user(mp=<unavailable>, abstime=<unavailable>, cancel=<unavailable>) + 936 at thr_cond.c:223
    frame #4: 0x0000000805d1fd5b libthr.so.3`cond_wait_common(cond=<unavailable>, mutex=<unavailable>, abstime=<unavailable>, cancel=<unavailable>) + 827 at thr_cond.c:311
    frame #5: 0x00000008013450b5 liblldb.so.3.5`lldb_private::Condition::Wait(lldb_private::Mutex&, lldb_private::TimeValue const*, bool*) + 117
    frame #6: 0x00000008013411e8 liblldb.so.3.5`lldb_private::Predicate<bool>::WaitForValueEqualTo(bool, lldb_private::TimeValue const*, bool*) + 200
    frame #7: 0x00000008013eb34c liblldb.so.3.5`lldb_private::Listener::WaitForEventsInternal(lldb_private::TimeValue const*, lldb_private::Broadcaster*, lldb_private::ConstString const*, unsigned int, unsigned int, std::__1::shared_ptr<lldb_private::Event>&) + 876
    frame #8: 0x00000008013eb751 liblldb.so.3.5`lldb_private::Listener::WaitForEvent(lldb_private::TimeValue const*, std::__1::shared_ptr<lldb_private::Event>&) + 81
    frame #9: 0x00000008017c5bcf liblldb.so.3.5`lldb_private::Process::Halt(bool) + 783
    frame #10: 0x00000008017def3a liblldb.so.3.5`IOHandlerProcessSTDIO::Interrupt() + 74
    frame #11: 0x00000008013823d3 liblldb.so.3.5`lldb_private::Debugger::DispatchInputInterrupt() + 115
    frame #12: 0x00000008011d69c5 liblldb.so.3.5`lldb::SBDebugger::DispatchInputInterrupt() + 69
    frame #13: 0x000000000040b254 lldb`sigint_handler(int) + 68
    frame #14: 0x0000000805d1b3da libthr.so.3`handle_signal(actp=<unavailable>, sig=<unavailable>, info=<unavailable>, ucp=<unavailable>) + 234 at thr_sig.c:240
    frame #15: 0x0000000805d1afc2 libthr.so.3`thr_sighandler(sig=<unavailable>, info=<unavailable>, _ucp=<unavailable>) + 306 at thr_sig.c:183
    frame #16: 0x00007ffffffff003
    frame #17: 0x0000000805d1fc7e libthr.so.3`cond_wait_common [inlined] cond_wait_user(mp=<unavailable>, abstime=<unavailable>, cancel=1) + 239 at thr_cond.c:255
    frame #18: 0x0000000805d1fb8f libthr.so.3`cond_wait_common(cond=<unavailable>, mutex=<unavailable>, abstime=0x0000000000000000, cancel=1) + 367 at thr_cond.c:311
    frame #19: 0x00000008013450d2 liblldb.so.3.5`lldb_private::Condition::Wait(lldb_private::Mutex&, lldb_private::TimeValue const*, bool*) + 146
chapuni pushed a commit that referenced this issue Mar 7, 2014
After hitting the malloc() breakpoint on FreeBSD our top frame is actually
an inlined function malloc_init.

  * frame #0: 0x0000000800dcba19 libc.so.7`malloc [inlined] malloc_init at malloc.c:5397
    frame #1: 0x0000000800dcba19 libc.so.7`malloc(size=1024) + 9 at malloc.c:5949
    frame #2: 0x00000000004006e5 test_step_out_of_malloc_into_function_b_with_dwarf`b(val=1) + 37 at main2.cpp:29

Add a heuristic to keep stepping out until we come to a non-malloc caller,
before checking if it is our desired caller from the test code.

llvm.org/pr17944
chapuni pushed a commit that referenced this issue Mar 10, 2014
…ify-libcall

optimize a call to a llvm intrinsic to something that invovles a call to a C
library call, make sure it sets the right calling convention on the call.

e.g.
extern double pow(double, double);
double t(double x) {
  return pow(10, x);
}

Compiles to something like this for AAPCS-VFP:
define arm_aapcs_vfpcc double @t(double %x) #0 {
entry:
  %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x)
  ret double %0
}

declare double @llvm.pow.f64(double, double) #1

Simplify libcall (part of instcombine) will turn the above into:
define arm_aapcs_vfpcc double @t(double %x) #0 {
entry:
  %__exp10 = call double @__exp10(double %x) #1
  ret double %__exp10
}

declare double @__exp10(double)

The pre-instcombine code works because calls to LLVM builtins are special.
Instruction selection will chose the right calling convention for the call.
However, the code after instcombine is wrong. The call to __exp10 will use
the C calling convention.

I can think of 3 options to fix this.

1. Make "C" calling convention just work since the target should know what CC
   is being used.

   This doesn't work because each function can use different CC with the "pcs"
   attribute.

2. Have Clang add the right CC keyword on the calls to LLVM builtin.

   This will work but it doesn't match the LLVM IR specification which states
   these are "Standard C Library Intrinsics".

3. Fix simplify libcall so the resulting calls to the C routines will have the
   proper CC keyword. e.g.
   %__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1

   This works and is the solution I implemented here.

Both solutions #2 and #3 would work. After carefully considering the pros and
cons, I decided to implement #3 for the following reasons.

1. It doesn't change the "spec" of the intrinsics.
2. It's a self-contained fix.

There are a couple of potential downsides.
1. There could be other places in the optimizer that is broken in the same way
   that's not addressed by this.
2. There could be other calling conventions that need to be propagated by
   simplify-libcall that's not handled.

But for now, this is the fix that I'm most comfortable with.
chapuni pushed a commit that referenced this issue Apr 23, 2014
LSan folks: LSan pointed to
    #0 0x4953e0 in operator new[](unsigned long) llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:64
    #1 0x7fb82af5372f in clang::RewriteRope::MakeRopeString(char const*, char const*) llvm/tools/clang/lib/Rewrite/Core/RewriteRope.cpp:796
instead of to the actual leak, which made tracking this down slower than
it could have been.
chapuni pushed a commit that referenced this issue May 1, 2014
LSan folks: LSan pointed to
    #0 0x4953e0 in operator new[](unsigned long) llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:64
    #1 0x7fb82af5372f in clang::RewriteRope::MakeRopeString(char const*, char const*) llvm/tools/clang/lib/Rewrite/Core/RewriteRope.cpp:796
instead of to the actual leak, which made tracking this down slower than
it could have been.
chapuni pushed a commit that referenced this issue May 1, 2014
For pattern like ((x >> C1) & Mask) << C2, DAG combiner may convert it
into (x >> (C1-C2)) & (Mask << C2), which makes pattern matching of ubfx
more difficult.
For example:
Given
  %shr = lshr i64 %x, 4
  %and = and i64 %shr, 15
  %arrayidx = getelementptr inbounds [8 x [64 x i64]]* @arr, i64 0, %i64 2, i64 %and
  %0 = load i64* %arrayidx
With current shift folding, it takes 3 instrs to compute base address:
  lsr x8, x0, #1
  and x8, x8, #0x78
  add x8, x9, x8

If using ubfx, it only needs 2 instrs:
  ubfx  x8, x0, #4, #4
  add x8, x9, x8, lsl #3

This fixes bug 19589
chapuni pushed a commit that referenced this issue May 30, 2014
… (PR19876)

http://llvm.org/bugs/show_bug.cgi?id=19876

The following C++1y code results in a crash:

struct X {
  int m = 10;
  int n = [this](auto) { return m; }(20);
};

When implicitly instantiating the generic lambda's call operator specialization body, Sema is unable to determine the current 'this' type when transforming the MemberExpr 'm' - since it looks for the nearest enclosing FunctionDeclDC - which is obviously null.

I considered two ways to fix this:

    1) In InstantiateFunctionDefinition, when the context is saved after the lambda scope info is created, retain the 'this' pointer.
    2) Teach getCurrentThisType() to recognize it is within a generic lambda within an NSDMI/default-initializer and return the appropriate this type.

I chose to implement #2 (though I confess I do not have a compelling reason for choosing it over #1).

Richard Smith accepted the patch:
http://reviews.llvm.org/D3935

Thank you!
chapuni pushed a commit that referenced this issue Jun 5, 2014
Due to what can only be described as a CRT bug, stdout and amazingly
even stderr are not always flushed upon process termination, especially
when the system is under high threading pressure.  I have found two
repros for this:

1) In lib\Support\Threading.cpp, change sys::Mutex to an
std::recursive_mutex and run check-clang.  Usually between 30 and 40
tests will fail.

2) Add OutputDebugStrings in code that runs during static initialization
and static shutdown.  This will sometimes generate similar failures.

After a substantial amount of troubleshooting and debugging, I found
that I could reproduce this from the command line without running
check-clang.  Simply make the mutex change described in #1, then
manually run the following command many times by running it once, then
pressing Up -> Enter very quickly:

D:\src\llvm\build\vs2013\Debug\bin\c-index-test.EXE -cursor-at=D:\src\llvm\tools\clang\test\Index\targeted-preamble.h:2:15 D:\src\llvm\tools\clang\test\Index\targeted-cursor.c -include D:\src\llvm\build\vs2013\tools\clang\test\Index\Output\targeted-cursor.c.tmp.h -Xclang -error-on-deserialized-decl=NestedVar1      -Xclang -error-on-deserialized-decl=TopVar    | D:\src\llvm\build\vs2013\Debug\bin\FileCheck.EXE D:\src\llvm\tools\clang\test\Index\targeted-cursor.c -check-prefix=PREAMBLE-CURSOR1

Sporadically they will fail, and attaching a debugger to a failed
instance indicates that stdin of FileCheck.exe is empty.

Note that due to the repro in #2, we can rule out a bug in the STL's
mutex implementation, and instead conclude that this is a real flake in
the windows test harness.

Test Plan:
Without patch: Ran check-clang 10 times and saw over 30 Unexpected failures on every run.
With patch: Ran check-clang 10 times and saw 0 unexpected failures across all runs.

Reviewers: rnk

Differential Revision: http://reviews.llvm.org/D4021

Patch by Zachary Turner!
chapuni pushed a commit that referenced this issue Jul 18, 2014
Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}

Before this change, the code is generated as:
  fcmeq.4s  v0, v0, v1
  movi.4s v1, #0x1        // Integer splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  scvtf.4s  v0, v0        // Convert each lane to f32.
  ret

After, the code is improved to:
  fcmeq.4s  v0, v0, v1
  fmov.4s v1, #1.00000000 // f32 splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  ret

The svvtf.4s has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via fmov.4s.

Rather than do the folding manually in the target code, teach getNode()
in the generic SelectionDAG to handle folding constant operands of
vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do
additional constant folding there as well, but I don't have test cases
for those operations, so leaving them for another time when it becomes
appropriate.

rdar://17693791
chapuni pushed a commit that referenced this issue Aug 20, 2014
…dvantage of

the isRegSequence property.

This is a follow-up of r215394 and r215404, which respectively introduces the
isRegSequence property and uses it for ARM.

Thanks to the property introduced by the previous commits, this patch is able
to optimize the following sequence:
vmov	d0, r2, r3
vmov	d1, r0, r1
vmov	r0, s0
vmov	r1, s2
udiv	r0, r1, r0
vmov	r1, s1
vmov	r2, s3
udiv	r1, r2, r1
vmov.32	d16[0], r0
vmov.32	d16[1], r1
vmov	r0, r1, d16
bx	lr

into:
udiv	r0, r0, r2
udiv	r1, r1, r3
vmov.32	d16[0], r0
vmov.32	d16[1], r1
vmov	r0, r1, d16
bx	lr

This patch refactors how the copy optimizations are done in the peephole
optimizer. Prior to this patch, we had one copy-related optimization that
replaced a copy or bitcast by a generic, more suitable (in terms of register
file), copy.

With this patch, the peephole optimizer features two copy-related optimizations:
1. One for rewriting generic copies to generic copies:
PeepholeOptimizer::optimizeCoalescableCopy.
2. One for replacing non-generic copies with generic copies:
PeepholeOptimizer::optimizeUncoalescableCopy.

The goals of these two optimizations are slightly different: one rewrite the
operand of the instruction (#1), the other kills off the non-generic instruction
and replace it by a (sequence of) generic instruction(s).

Both optimizations rely on the ValueTracker introduced in r212100.

The ValueTracker has been refactored to use the information from the
TargetInstrInfo for non-generic instruction. As part of the refactoring, we
switched the tracking from the index of the definition to the actual register
(virtual or physical). This one change is to provide better consistency with
register related APIs and to ease the use of the TargetInstrInfo.

Moreover, this patch introduces a new helper class CopyRewriter used to ease the
rewriting of generic copies (i.e., #1).

Finally, this patch adds a dead code elimination pass right after the peephole
optimizer to get rid of dead code that may appear after rewriting.

This is related to <rdar://problem/12702965>.

Review: http://reviews.llvm.org/D4874
chapuni pushed a commit that referenced this issue Sep 26, 2014
This patch makes the ARM backend transform 3 operand instructions such as
'adds/subs' to the 2 operand version of the same instruction if the first
two register operands are the same.

Example: 'adds r0, r0, #1' will is transformed to 'adds r0, #1'.

Currently for some instructions such as 'adds' if you try to assemble
'adds r0, r0, #8' for thumb v6m the assembler would throw an error message
because the immediate cannot be encoded using 3 bits.

The backend should be smart enough to transform the instruction to
'adds r0, #8', which allows for larger immediate constants.

Patch by Ranjeet Singh.
chapuni pushed a commit that referenced this issue Nov 1, 2014
…mbined.

This patch adds an optimization in CodeGenPrepare to move an extractelement
right before a store when the target can combine them.
The optimization may promote any scalar operations to vector operations in the
way to make that possible.


** Context **

Some targets use different register files for both vector and scalar operations.
This means that transitioning from one domain to another may incur copy from one
register file to another. These copies are not coalescable and may be expensive.
For example, according to the scheduling model, on cortex-A8 a vector to GPR
move is 20 cycles.


** Motivating Example **

Let us consider an example:
define void @foo(<2 x i32>* %addr1, i32* %dest) {
 %in1 = load <2 x i32>* %addr1, align 8
 %extract = extractelement <2 x i32> %in1, i32 1
 %out = or i32 %extract, 1
 store i32 %out, i32* %dest, align 4
 ret void
}

As it is, this IR generates the following assembly on armv7:
  vldr  d16, [r0]            @vector load
  vmov.32 r0, d16[1]  @ cross-register-file copy: 20 cycles
  orr r0, r0, #1           @ scalar bitwise or
  str r0, [r1]               @ scalar store
  bx  lr

Whereas we could generate much faster code:
  vldr  d16, [r0]               @ vector load
  vorr.i32  d16, #0x1     @ vector bitwise or
  vst1.32 {d16[1]}, [r1:32] @ vector extract + store
  bx  lr

Half of the computation made in the vector is useless, but this allows to get
rid of the expensive cross-register-file copy.


** Proposed Solution **

To avoid this cross-register-copy penalty, we promote the scalar operations to
vector operations. The penalty will be removed if we manage to promote the whole
chain of computation in the vector domain.
Currently, we do that only when the chain of computation ends by a store and the
target is able to combine an extract with a store.

Stores are the most likely candidates, because other instructions produce values
that would need to be promoted and so, extracted as some point[1]. Moreover,
this is customary that targets feature stores that perform a vector extract (see
AArch64 and X86 for instance).

The proposed implementation relies on the TargetTransformInfo to decide whether
or not it is beneficial to promote a chain of computation in the vector domain.
Unfortunately, this interface is rather inaccurate for this level of details and
although this optimization may be beneficial for X86 and AArch64, the inaccuracy
will lead to the optimization being too aggressive.
Basically in TargetTransformInfo, everything that is legal has a cost of 1,
whereas, even if a vector type is legal, usually a vector operation is slightly
more expensive than its scalar counterpart. That will lead to too many
promotions that may not be counter balanced by the saving of the
cross-register-file copy. For instance, on AArch64 this penalty is just 4
cycles.

For now, the optimization is just enabled for ARM prior than v8, since those
processors have a larger penalty on cross-register-file copies, and the scope is
limited to basic blocks. Because of these two factors, we limit the effects of
the inaccuracy. Indeed, I did not want to build up a fancy cost model with block
frequency and everything on top of that.

[1] We can imagine targets that can combine an extractelement with  other
instructions than just stores. If we want to go into that direction, the current
interfaces must be augmented and, moreover, I think this becomes a global isel
problem.

Differential Revision: http://reviews.llvm.org/D5921

<rdar://problem/14170854>
chapuni pushed a commit that referenced this issue Nov 1, 2014
Our internal test reveals such case should not be transformed:

  cmp x17, #3
  b.lt .LBB10_15
  ...
  subs x12, x12, #1
  b.gt .LBB10_1

where x12 is a liveout, becomes:

  cmp x17, #2
  b.le .LBB10_15
  ...
  subs x12, x12, #2
  b.ge .LBB10_1

Unable to provide test case as it's difficult to reproduce on community branch.

http://reviews.llvm.org/D6048
Patch by Zhaoshi Zheng <[email protected]>!
chapuni pushed a commit that referenced this issue Nov 2, 2014
This commit introduces heap-use-after-free detected by ASan. Here is the output
for one of several tests that detect it:

******************** TEST 'LLVM :: Linker/AppendingLinkage.ll' FAILED ********************
Command Output (stderr):
--
=================================================================
==2122==ERROR: AddressSanitizer: heap-use-after-free on address 0x60c00000b9c8 at pc 0x0000005d05d1 bp 0x7fff64ed27c0 sp 0x7fff64ed27b8
READ of size 4 at 0x60c00000b9c8 thread T0
    #0 0x5d05d0 in llvm::GlobalValue::setUnnamedAddr(bool) /usr/local/google/home/chandlerc/src/llvm/build/../include/llvm/IR/GlobalValue.h:115:35
    #1 0x69fff1 in (anonymous namespace)::ModuleLinker::linkGlobalValueProto(llvm::GlobalValue*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1041:5
    #2 0x697229 in (anonymous namespace)::ModuleLinker::run() /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1485:9
    #3 0x696542 in llvm::Linker::linkInModule(llvm::Module*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1621:10
    #4 0x4a2db7 in main /usr/local/google/home/chandlerc/src/llvm/build/../tools/llvm-link/llvm-link.cpp:116:9
    #5 0x7f4ae61e5ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
    #6 0x41eb71 in _start (/usr/local/google/home/chandlerc/src/llvm/build/bin/llvm-link+0x41eb71)

0x60c00000b9c8 is located 72 bytes inside of 128-byte region [0x60c00000b980,0x60c00000ba00)
freed by thread T0 here:
    #0 0x4a1e6b in operator delete(void*) /usr/local/google/home/chandlerc/src/llvm/opt-build/../projects/compiler-rt/lib/asan/asan_new_delete.cc:94:3
    #1 0x5d1a7a in llvm::iplist<llvm::GlobalVariable, llvm::ilist_traits<llvm::GlobalVariable> >::erase(llvm::ilist_iterator<llvm::GlobalVariable>) /usr/local/google/home/chandlerc/src/llvm/build/../inclu
de/llvm/ADT/ilist.h:466:5
    #2 0x5d1980 in llvm::GlobalVariable::eraseFromParent() /usr/local/google/home/chandlerc/src/llvm/build/../lib/IR/Globals.cpp:204:3
    #3 0x6a8a4d in (anonymous namespace)::ModuleLinker::linkAppendingVarProto(llvm::GlobalVariable*, llvm::GlobalVariable const*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.
cpp:980:3
    #4 0x6a7403 in (anonymous namespace)::ModuleLinker::linkGlobalVariableProto(llvm::GlobalVariable const*, llvm::GlobalValue*, bool) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkMod
ules.cpp:1074:11
    #5 0x69ff4e in (anonymous namespace)::ModuleLinker::linkGlobalValueProto(llvm::GlobalValue*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1028:13
    #6 0x697229 in (anonymous namespace)::ModuleLinker::run() /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1485:9
    #7 0x696542 in llvm::Linker::linkInModule(llvm::Module*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1621:10
    #8 0x4a2db7 in main /usr/local/google/home/chandlerc/src/llvm/build/../tools/llvm-link/llvm-link.cpp:116:9
    #9 0x7f4ae61e5ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287

previously allocated by thread T0 here:
    #0 0x4a192b in operator new(unsigned long) /usr/local/google/home/chandlerc/src/llvm/opt-build/../projects/compiler-rt/lib/asan/asan_new_delete.cc:62:35
    #1 0x61d85c in llvm::User::operator new(unsigned long, unsigned int) /usr/local/google/home/chandlerc/src/llvm/build/../lib/IR/User.cpp:57:19
    #2 0x6a7525 in (anonymous namespace)::ModuleLinker::linkGlobalVariableProto(llvm::GlobalVariable const*, llvm::GlobalValue*, bool) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkMod
ules.cpp:1100:3
    #3 0x69ff4e in (anonymous namespace)::ModuleLinker::linkGlobalValueProto(llvm::GlobalValue*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1028:13
    #4 0x697229 in (anonymous namespace)::ModuleLinker::run() /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1485:9
    #5 0x696542 in llvm::Linker::linkInModule(llvm::Module*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1621:10
    #6 0x4a2db7 in main /usr/local/google/home/chandlerc/src/llvm/build/../tools/llvm-link/llvm-link.cpp:116:9
    #7 0x7f4ae61e5ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287

SUMMARY: AddressSanitizer: heap-use-after-free /usr/local/google/home/chandlerc/src/llvm/build/../include/llvm/IR/GlobalValue.h:115 llvm::GlobalValue::setUnnamedAddr(bool)
Shadow bytes around the buggy address:
  0x0c187fff96e0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c187fff96f0: 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa fa
  0x0c187fff9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
  0x0c187fff9710: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c187fff9720: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
=>0x0c187fff9730: fd fd fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd
  0x0c187fff9740: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c187fff9750: fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa
  0x0c187fff9760: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c187fff9770: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c187fff9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  ASan internal:           fe
==2122==ABORTING
rnk pushed a commit to rnk/llvm-project-old that referenced this issue May 11, 2017
…merged"

This reverts r302712.

The change fails with ASAN enabled:

ERROR: AddressSanitizer: use-after-poison on address ... at ...
READ of size 2 at ... thread T0
  #0 ... in llvm::SDNode::getNumValues() const <snip>/include/llvm/CodeGen/SelectionDAGNodes.h:855:42
  chapuni#1 ... in llvm::SDNode::hasAnyUseOfValue(unsigned int) const <snip>/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7270:3
  chapuni#2 ... in llvm::SDValue::use_empty() const <snip> include/llvm/CodeGen/SelectionDAGNodes.h:1042:17
  chapuni#3 ... in (anonymous namespace)::DAGCombiner::MergeConsecutiveStores(llvm::StoreSDNode*) <snip>/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12944:7

Reviewers: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33081
rnk pushed a commit to rnk/llvm-project-old that referenced this issue May 17, 2017
There is often a lot of boilerplate code required to visit a type
record or type stream.  The chapuni#1 use case is that you have a sequence
of bytes that represent one or more records, and you want to
deserialize each one, switch on it, and call a callback with the
deserialized record that the user can examine.  Currently this
requires at least 6 lines of code:

  codeview::TypeVisitorCallbackPipeline Pipeline;
  Pipeline.addCallbackToPipeline(Deserializer);
  Pipeline.addCallbackToPipeline(MyCallbacks);

  codeview::CVTypeVisitor Visitor(Pipeline);
  consumeError(Visitor.visitTypeRecord(Record));

With this patch, it becomes one line of code:

  consumeError(codeview::visitTypeRecord(Record, MyCallbacks));

This is done by having the deserialization happen internally inside
of the visitTypeRecord function.  Since this is occasionally not
desirable, the function provides a 3rd parameter that can be used
to change this behavior.

Hopefully this can significantly reduce the barrier to entry
to using the visitation infrastructure.

Differential Revision: https://reviews.llvm.org/D33245
rnk pushed a commit to rnk/llvm-project-old that referenced this issue May 18, 2017
Summary:
The test being added in this patch used to cause an assertion failure:

/build/./bin/clang -cc1 -internal-isystem /build/lib/clang/5.0.0/include -nostdsysteminc -verify -fsyntax-only -std=c++11 -Wshadow-all /src/tools/clang/test/SemaCXX/warn-shadow.cpp
--
Exit Code: 134

Command Output (stderr):
--
clang: /src/tools/clang/lib/AST/ASTDiagnostic.cpp:424: void clang::FormatASTNodeDiagnosticArgument(DiagnosticsEngine::ArgumentKind, intptr_t, llvm::StringRef, llvm::StringRef, ArrayRef<DiagnosticsEngine::ArgumentValue>, SmallVectorImpl<char> &, void *, ArrayRef<intptr_t>): Assertion `isa<NamedDecl>(DC) && "Expected a NamedDecl"' failed.
#0 0x0000000001c7a1b4 PrintStackTraceSignalHandler(void*) (/build/./bin/clang+0x1c7a1b4)
chapuni#1 0x0000000001c7a4e6 SignalHandler(int) (/build/./bin/clang+0x1c7a4e6)
chapuni#2 0x00007f30880078d0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0xf8d0)
chapuni#3 0x00007f3087054067 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x35067)
#4 0x00007f3087055448 abort (/lib/x86_64-linux-gnu/libc.so.6+0x36448)
#5 0x00007f308704d266 (/lib/x86_64-linux-gnu/libc.so.6+0x2e266)
#6 0x00007f308704d312 (/lib/x86_64-linux-gnu/libc.so.6+0x2e312)
#7 0x00000000035b7f22 clang::FormatASTNodeDiagnosticArgument(clang::DiagnosticsEngine::ArgumentKind, long, llvm::StringRef, llvm::StringRef, llvm::ArrayRef<std::pair<clang::DiagnosticsEngine::ArgumentKind, long> >, llvm::SmallVectorImpl<char>&, void*, llvm::ArrayRef<long>) (/build/
./bin/clang+0x35b7f22)
#8 0x0000000001ddbae4 clang::Diagnostic::FormatDiagnostic(char const*, char const*, llvm::SmallVectorImpl<char>&) const (/build/./bin/clang+0x1ddbae4)
#9 0x0000000001ddb323 clang::Diagnostic::FormatDiagnostic(char const*, char const*, llvm::SmallVectorImpl<char>&) const (/build/./bin/clang+0x1ddb323)
#10 0x00000000022878a4 clang::TextDiagnosticBuffer::HandleDiagnostic(clang::DiagnosticsEngine::Level, clang::Diagnostic const&) (/build/./bin/clang+0x22878a4)
#11 0x0000000001ddf387 clang::DiagnosticIDs::ProcessDiag(clang::DiagnosticsEngine&) const (/build/./bin/clang+0x1ddf387)
#12 0x0000000001dd9dea clang::DiagnosticsEngine::EmitCurrentDiagnostic(bool) (/build/./bin/clang+0x1dd9dea)
#13 0x0000000002cad00c clang::Sema::EmitCurrentDiagnostic(unsigned int) (/build/./bin/clang+0x2cad00c)
#14 0x0000000002d91cd2 clang::Sema::CheckShadow(clang::NamedDecl*, clang::NamedDecl*, clang::LookupResult const&) (/build/./bin/clang+0x2d91cd2)

Stack dump:
0.      Program arguments: /build/./bin/clang -cc1 -internal-isystem /build/lib/clang/5.0.0/include -nostdsysteminc -verify -fsyntax-only -std=c++11 -Wshadow-all /src/tools/clang/test/SemaCXX/warn-shadow.cpp
1.      /src/tools/clang/test/SemaCXX/warn-shadow.cpp:214:23: current parser token ';'
2.      /src/tools/clang/test/SemaCXX/warn-shadow.cpp:213:26: parsing function body 'handleLinkageSpec'
3.      /src/tools/clang/test/SemaCXX/warn-shadow.cpp:213:26: in compound statement ('{}')
/build/tools/clang/test/SemaCXX/Output/warn-shadow.cpp.script: line 1: 15595 Aborted                 (core dumped) /build/./bin/clang -cc1 -internal-isystem /build/lib/clang/5.0.0/include -nostdsysteminc -verify -fsyntax-only -std=c++11 -Wshadow-all /src/tools/clang/test/SemaCXX/warn-shadow.cpp

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: krytarowski, cfe-commits

Differential Revision: https://reviews.llvm.org/D33207
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Jun 5, 2017
error C2971: 'llvm::ManagedStatic': template parameter 'Creator': 'CreateDefaultTimerGroup': a variable with non-static storage duration cannot be used as a non-type argument
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Jun 5, 2017
…ruction

Op1 (RHS) is a constant, so putting it on the LHS makes us churn through visitICmp
an extra time to canonicalize it:

INSTCOMBINE ITERATION chapuni#1 on cmpnot
IC: ADDING: 3 instrs to worklist
IC: Visiting:   %notx = xor i8 %x, -1
IC: Visiting:   %cmp = icmp sgt i8 %notx, 42
IC: Old =   %cmp = icmp sgt i8 %notx, 42
    New =   <badref> = icmp sgt i8 -43, %x
IC: ADD:   %cmp = icmp sgt i8 -43, %x
IC: ERASE   %1 = icmp sgt i8 %notx, 42
IC: ADD:   %notx = xor i8 %x, -1
IC: DCE:   %notx = xor i8 %x, -1
IC: ERASE   %notx = xor i8 %x, -1
IC: Visiting:   %cmp = icmp sgt i8 -43, %x
IC: Mod =   %cmp = icmp sgt i8 -43, %x
    New =   %cmp = icmp slt i8 %x, -43
IC: ADD:   %cmp = icmp slt i8 %x, -43
IC: Visiting:   %cmp = icmp slt i8 %x, -43
IC: Visiting:   ret i1 %cmp

If we create the swapped ICmp directly, we go faster:

INSTCOMBINE ITERATION chapuni#1 on cmpnot
IC: ADDING: 3 instrs to worklist
IC: Visiting:   %notx = xor i8 %x, -1
IC: Visiting:   %cmp = icmp sgt i8 %notx, 42
IC: Old =   %cmp = icmp sgt i8 %notx, 42
    New =   <badref> = icmp slt i8 %x, -43
IC: ADD:   %cmp = icmp slt i8 %x, -43
IC: ERASE   %1 = icmp sgt i8 %notx, 42
IC: ADD:   %notx = xor i8 %x, -1
IC: DCE:   %notx = xor i8 %x, -1
IC: ERASE   %notx = xor i8 %x, -1
IC: Visiting:   %cmp = icmp slt i8 %x, -43
IC: Visiting:   ret i1 %cmp
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Jun 5, 2017
atos is apparently not able to resolve symbol addresses properly on
i386-darwin reliably any more. This is causing bot flakiness:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/6841

There have not been any SDK changes on the bot as of late.

/Users/buildslave/jenkins/sharedspace/clang-stage1-cmake-RA_workspace/llvm/projects/compiler-rt/test/asan/TestCases/Darwin/atos-symbolizer.cc:20:12: error: expected string not found in input
 // CHECK: chapuni#1 0x{{.*}} in main {{.*}}atos-symbolizer.cc:[[@LINE-4]]
           ^
<stdin>:35:27: note: scanning from here
 #0 0x112f56 in wrap_free (/Users/buildslave/jenkins/sharedspace/clang-stage1-cmake-RA_workspace/clang-build/lib/clang/5.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib:i386+0x56f56)
                          ^
<stdin>:35:27: note: with expression "@LINE-4" equal to "16"
 #0 0x112f56 in wrap_free (/Users/buildslave/jenkins/sharedspace/clang-stage1-cmake-RA_workspace/clang-build/lib/clang/5.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib:i386+0x56f56)
                          ^
<stdin>:36:168: note: possible intended match here
 chapuni#1 0xb6f20 in main (/Users/buildslave/jenkins/sharedspace/clang-stage1-cmake-RA_workspace/clang-build/tools/clang/runtime/compiler-rt-bins/test/asan/I386DarwinConfig/TestCases/Darwin/Output/atos-symbolizer.cc.tmp:i386+0x1f20)
rnk pushed a commit to rnk/llvm-project-old that referenced this issue Jun 20, 2017
Consider:

struct MyClass {
  void f() {}
}
MyClass::f(){} // expected error redefinition of f. chapuni#1

Some clients (eg. cling) need to call removeDecl for the redefined (chapuni#1) decl.

This patch enables us to remove the lookup entry is registered in the semantic
decl context and not in the primary decl context of the lexical decl context
where we currently are trying to remove it from.

It is not trivial to test this piece and writing a full-blown unit test seems
too much.
rnk pushed a commit to rnk/llvm-project-old that referenced this issue Oct 11, 2017
Summary:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to `ret i32 2` with
this change

  source_filename = "sccp.c"
  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
  target triple = "x86_64-unknown-linux-gnu"

  ; Function Attrs: norecurse nounwind readnone uwtable
  define i32 @main() local_unnamed_addr #0 {
  entry:
    %call = tail call fastcc i32 @f(i32 1)
    %call1 = tail call fastcc i32 @f(i32 47)
    %add3 = add nsw i32 %call, %call1
    ret i32 %add3
  }

  ; Function Attrs: noinline norecurse nounwind readnone uwtable
  define internal fastcc i32 @f(i32 %x) unnamed_addr chapuni#1 {
  entry:
    %c1 = icmp sle i32 %x, 100

    %cmp = icmp sgt i32 %x, 300
    %. = select i1 %cmp, i32 1, i32 2
    ret i32 %.
  }

  attributes chapuni#1 = { noinline }



Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Oct 13, 2017
…SCCP."

This is r315288 & r315294, which were reverted due to stage2 bot
failures.

Summary:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to `ret i32 2` with
this change

  source_filename = "sccp.c"
  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
  target triple = "x86_64-unknown-linux-gnu"

  ; Function Attrs: norecurse nounwind readnone uwtable
  define i32 @main() local_unnamed_addr #0 {
  entry:
    %call = tail call fastcc i32 @f(i32 1)
    %call1 = tail call fastcc i32 @f(i32 47)
    %add3 = add nsw i32 %call, %call1
    ret i32 %add3
  }

  ; Function Attrs: noinline norecurse nounwind readnone uwtable
  define internal fastcc i32 @f(i32 %x) unnamed_addr chapuni#1 {
  entry:
    %c1 = icmp sle i32 %x, 100

    %cmp = icmp sgt i32 %x, 300
    %. = select i1 %cmp, i32 1, i32 2
    ret i32 %.
  }

  attributes chapuni#1 = { noinline }

Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Oct 27, 2017
This fixes bugzilla 26810
https://bugs.llvm.org/show_bug.cgi?id=26810

This is intended to prevent sequences like:
movl %ebp, 8(%esp) # 4-byte Spill
movl %ecx, %ebp
movl %ebx, %ecx
movl %edi, %ebx
movl %edx, %edi
cltd
idivl %esi
movl %edi, %edx
movl %ebx, %edi
movl %ecx, %ebx
movl %ebp, %ecx
movl 16(%esp), %ebp # 4 - byte Reload

Such sequences are created in 2 scenarios:

Scenario chapuni#1:
vreg0 is evicted from physreg0 by vreg1
Evictee vreg0 is intended for region splitting with split candidate physreg0 (the reg vreg0 was evicted from)
Region splitting creates a local interval because of interference with the evictor vreg1 (normally region spliiting creates 2 interval, the "by reg" and "by stack" intervals. Local interval created when interference occurs.)
one of the split intervals ends up evicting vreg2 from physreg1
Evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

Scenario chapuni#2
vreg0 is evicted from physreg0 by vreg1
vreg2 is evicted from physreg2 by vreg3 etc
Evictee vreg0 is intended for region splitting with split candidate physreg1
Region splitting creates a local interval because of interference with the evictor vreg1
one of the split intervals ends up evicting back original evictor vreg1 from physreg0 (the reg vreg0 was evicted from)
Another evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

As compile time was a concern, I've added a flag to control weather we do cost calculations for local intervals we expect to be created (it's on by default for X86 target, off for the rest).

Differential Revision: https://reviews.llvm.org/D35816

Change-Id: Id9411ff7bbb845463d289ba2ae97737a1ee7cc39
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Oct 27, 2017
Summary:
The code comments indicate that no effort has been spent on
handling load/stores when the size isn't a multiple of the
byte size correctly. However, the code only avoided types
smaller than 8 bits. So for example a load of an i28 could
still be considered as a candidate for vectorization.

This patch adjusts the code to behave according to the code
comment.

The test case used to hit the following assert when
trying to use "cast" an i32 to i28 using CreateBitOrPointerCast:

opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.
#0 PrintStackTraceSignalHandler(void*)
chapuni#1 SignalHandler(int)
chapuni#2 __restore_rt
chapuni#3 __GI_raise
#4 __GI_abort
#5 __GI___assert_fail
#6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value*, llvm::Type*, llvm::Twine const&, llvm::Instruction*)
#7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value*, llvm::Type*, llvm::Twine const&)
#8 (anonymous namespace)::Vectorizer::vectorizeLoadChain(llvm::ArrayRef<llvm::Instruction*>, llvm::SmallPtrSet<llvm::Instruction*, 16u>*)

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39295
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Oct 27, 2017
Summary:
We no longer add vectors of pointers as candidates for
load/store vectorization. It does not seem to work anyway,
but without this patch we can end up in asserts when trying
to create casts between an integer type and the pointer of
vectors type.

The test case I've added used to assert like this when trying to
cast between i64 and <2 x i16*>:
opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.
#0 PrintStackTraceSignalHandler(void*)
chapuni#1 SignalHandler(int)
chapuni#2 __restore_rt
chapuni#3 __GI_raise
#4 __GI_abort
#5 __GI___assert_fail
#6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value*, llvm::Type*, llvm::Twine const&, llvm::Instruction*)
#7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value*, llvm::Type*, llvm::Twine const&)
#8 Vectorizer::vectorizeStoreChain(llvm::ArrayRef<llvm::Instruction*>, llvm::SmallPtrSet<llvm::Instruction*, 16u>*)

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D39296
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Nov 17, 2017
…in IPSCCP.

This version of the patch includes a fix addressing a stage2 LTO buildbot
failure and addressed some additional nits.

Original commit message:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to ret i32 2 with
this change

    source_filename = "sccp.c"
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-unknown-linux-gnu"

    ; Function Attrs: norecurse nounwind readnone uwtable
    define i32 @main() local_unnamed_addr #0 {
    entry:
      %call = tail call fastcc i32 @f(i32 1)
      %call1 = tail call fastcc i32 @f(i32 47)
      %add3 = add nsw i32 %call, %call1
      ret i32 %add3
    }

    ; Function Attrs: noinline norecurse nounwind readnone uwtable
    define internal fastcc i32 @f(i32 %x) unnamed_addr chapuni#1 {
    entry:
      %c1 = icmp sle i32 %x, 100

      %cmp = icmp sgt i32 %x, 300
      %. = select i1 %cmp, i32 1, i32 2
      ret i32 %.
    }

    attributes chapuni#1 = { noinline }

Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Nov 17, 2017
…in IPSCCP.

This version of the patch includes a fix addressing a stage2 LTO buildbot
failure and addressed some additional nits.

Original commit message:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to ret i32 2 with
this change

    source_filename = "sccp.c"
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-unknown-linux-gnu"

    ; Function Attrs: norecurse nounwind readnone uwtable
    define i32 @main() local_unnamed_addr #0 {
    entry:
      %call = tail call fastcc i32 @f(i32 1)
      %call1 = tail call fastcc i32 @f(i32 47)
      %add3 = add nsw i32 %call, %call1
      ret i32 %add3
    }

    ; Function Attrs: noinline norecurse nounwind readnone uwtable
    define internal fastcc i32 @f(i32 %x) unnamed_addr chapuni#1 {
    entry:
      %c1 = icmp sle i32 %x, 100

      %cmp = icmp sgt i32 %x, 300
      %. = select i1 %cmp, i32 1, i32 2
      ret i32 %.
    }

    attributes chapuni#1 = { noinline }

Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Jan 12, 2018
This is a follow-on to D40724 (Wasm entrypoint changes chapuni#1,
add `--undefined` argument to LLD).

Patch by Nicholas Wilson

Differential Revision: https://reviews.llvm.org/D40739
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Jan 12, 2018
We want to do this for 2 reasons:
1. Value tracking does not recognize the ashr variant, so it would fail to match for cases like D39766.
2. DAGCombiner does better at producing optimal codegen when we have the cmp+sel pattern.

More detail about what happens in the backend:
1. DAGCombiner has a generic transform for all targets to convert the scalar cmp+sel variant of abs
   into the shift variant. That is the opposite of this IR canonicalization.
2. DAGCombiner has a generic transform for all targets to convert the vector cmp+sel variant of abs
   into either an ABS node or the shift variant. That is again the opposite of this IR canonicalization.
3. DAGCombiner has a generic transform for all targets to convert the exact shift variants produced by chapuni#1 or chapuni#2
   into an ISD::ABS node. Note: It would be an efficiency improvement if we had chapuni#1 go directly to an ABS node
   when that's legal/custom.
4. The pattern matching above is incomplete, so it is possible to escape the intended/optimal codegen in a
   variety of ways.
   a. For chapuni#2, the vector path is missing the case for setlt with a '1' constant.
   b. For chapuni#3, we are missing a match for commuted versions of the shift variants.
5. Therefore, this IR canonicalization can only help get us to the optimal codegen. The version of cmp+sel
   produced by this patch will be recognized in the DAG and converted to an ABS node when possible or the
   shift sequence when not.
6. In the following examples with this patch applied, we may get conditional moves rather than the shift
   produced by the generic DAGCombiner transforms. The conditional move is created using a target-specific
   decision for any given target. Whether it is optimal or not for a particular subtarget may be up for debate.

define i32 @abs_shifty(i32 %x) {
  %signbit = ashr i32 %x, 31
  %add = add i32 %signbit, %x
  %abs = xor i32 %signbit, %add
  ret i32 %abs
}

define i32 @abs_cmpsubsel(i32 %x) {
  %cmp = icmp slt i32 %x, zeroinitializer
  %sub = sub i32 zeroinitializer, %x
  %abs = select i1 %cmp, i32 %sub, i32 %x
  ret i32 %abs
}

define <4 x i32> @abs_shifty_vec(<4 x i32> %x) {
  %signbit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  %add = add <4 x i32> %signbit, %x
  %abs = xor <4 x i32> %signbit, %add
  ret <4 x i32> %abs
}

define <4 x i32> @abs_cmpsubsel_vec(<4 x i32> %x) {
  %cmp = icmp slt <4 x i32> %x, zeroinitializer
  %sub = sub <4 x i32> zeroinitializer, %x
  %abs = select <4 x i1> %cmp, <4 x i32> %sub, <4 x i32> %x
  ret <4 x i32> %abs
}

> $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=x86_64 -mattr=avx
> abs_shifty:
> 	movl	%edi, %eax
> 	negl	%eax
> 	cmovll	%edi, %eax
> 	retq
>
> abs_cmpsubsel:
> 	movl	%edi, %eax
> 	negl	%eax
> 	cmovll	%edi, %eax
> 	retq
>
> abs_shifty_vec:
> 	vpabsd	%xmm0, %xmm0
> 	retq
>
> abs_cmpsubsel_vec:
> 	vpabsd	%xmm0, %xmm0
> 	retq
>
> $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=aarch64
> abs_shifty:
> 	cmp	w0, #0                  // =0
> 	cneg	w0, w0, mi
> 	ret
>
> abs_cmpsubsel:
> 	cmp	w0, #0                  // =0
> 	cneg	w0, w0, mi
> 	ret
>
> abs_shifty_vec:
> 	abs	v0.4s, v0.4s
> 	ret
>
> abs_cmpsubsel_vec:
> 	abs	v0.4s, v0.4s
> 	ret
>
> $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=powerpc64le
> abs_shifty:
> 	srawi 4, 3, 31
> 	add 3, 3, 4
> 	xor 3, 3, 4
> 	blr
>
> abs_cmpsubsel:
> 	srawi 4, 3, 31
> 	add 3, 3, 4
> 	xor 3, 3, 4
> 	blr
>
> abs_shifty_vec:
> 	vspltisw 3, -16
> 	vspltisw 4, 15
> 	vsubuwm 3, 4, 3
> 	vsraw 3, 2, 3
> 	vadduwm 2, 2, 3
> 	xxlxor 34, 34, 35
> 	blr
>
> abs_cmpsubsel_vec:
> 	vspltisw 3, -16
> 	vspltisw 4, 15
> 	vsubuwm 3, 4, 3
> 	vsraw 3, 2, 3
> 	vadduwm 2, 2, 3
> 	xxlxor 34, 34, 35
> 	blr
>

Differential Revision: https://reviews.llvm.org/D40984
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Jan 18, 2018
Summary:
__builtin_clz used for Log calculation returns an undefined result
when argument is 0. I noticed that issue when was testing some fuzzers:

```
/src/libfuzzer/FuzzerTracePC.h:282:33: runtime error: shift exponent 450349 is too large for 32-bit type 'uint32_t' (aka 'unsigned int')
  #0 0x43d83f in operator() /src/libfuzzer/FuzzerTracePC.h:283:33
  chapuni#1 0x43d83f in void fuzzer::TracePC::CollectFeatures<fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*)::$_1>(fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*)::$_1) const /src/libfuzzer/FuzzerTracePC.h:290
  chapuni#2 0x43cbd4 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /src/libfuzzer/FuzzerLoop.cpp:445:7
  chapuni#3 0x43e5f1 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:706:5
  #4 0x43e9e1 in fuzzer::Fuzzer::Loop(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:739:3
  #5 0x432f8c in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:754:6
  #6 0x42ee18 in main /src/libfuzzer/FuzzerMain.cpp:20:10
  #7 0x7f17ffeb182f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
  #8 0x407838 in _start (/out/rotate_fuzzer+0x407838)

Reviewers: kcc

Reviewed By: kcc

Subscribers: llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D41457
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Jan 18, 2018
Summary:
I have been getting rather difficult to reproduce SIGBUS crashes when
compiling certain FreeBSD sources, and their stack traces pointed
squarely at `SelectionDAG::salvageDebugInfo()`:

```
Core was generated by `/usr/obj/share/dim/src/freebsd/clang600-import/amd64.amd64/tmp/usr/bin/cc -cc1 -'.
Program terminated with signal SIGBUS, Bus error.
#0  isInvalidated () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h:115
115       bool isInvalidated() const { return Invalid; }
(gdb) bt
#0  isInvalidated () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h:115
chapuni#1  salvageDebugInfo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7116
chapuni#2  0x00000000033b2516 in operator() () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3595
chapuni#3  __invoke<(lambda at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3593:59) &, llvm::SDNode *, llvm::SDNode *> () at /usr/include/c++/v1/type_traits:4323
#4  __call<(lambda at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3593:59) &, llvm::SDNode *, llvm::SDNode *> () at /usr/include/c++/v1/__functional_base:349
#5  operator() () at /usr/include/c++/v1/functional:1562
#6  0x00000000033b0817 in operator() () at /usr/include/c++/v1/functional:1916
#7  NodeDeleted () at /share/dim/src/freebsd/clang600-import/contrib/llvm/include/llvm/CodeGen/SelectionDAG.h:293
#8  0x0000000003529dde in RemoveDeadNodes () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:610
#9  0x00000000035556df in MorphNodeTo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:6794
#10 0x00000000033a9acc in MorphNode () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:2594
#11 0x00000000033ac80b in SelectCodeCommon () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3601
#12 0x00000000023d464b in SelectCode () at /usr/obj/share/dim/src/freebsd/clang600-import/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/X86GenDAGISel.inc:282902
#13 Select () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:3072
#14 0x00000000033a5afa in DoInstructionSelection () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:988
#15 0x00000000033a4e1a in CodeGenAndEmitDAG () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:868
#16 0x00000000033a2643 in SelectAllBasicBlocks () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1624
#17 0x000000000339f158 in runOnMachineFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:466
#18 0x00000000023d03c4 in runOnMachineFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:175
#19 0x00000000035cc8c2 in runOnFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/MachineFunctionPass.cpp:62
#20 0x00000000030dca9a in runOnFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1520
#21 0x00000000030dccf3 in runOnModule () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1541
#22 0x00000000030dd228 in runOnModule () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1597
#23 run () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1700
#24 0x00000000014db578 in EmitAssembly () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:815
#25 EmitBackendOutput () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:1181
#26 0x00000000014d5b26 in HandleTranslationUnit () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/CodeGenAction.cpp:292
#27 0x0000000001c4c332 in ParseAST () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Parse/ParseAST.cpp:159
#28 0x00000000015d546c in Execute () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Frontend/FrontendAction.cpp:897
#29 0x0000000001cec311 in ExecuteAction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Frontend/CompilerInstance.cpp:991
#30 0x00000000014b4f81 in ExecuteCompilerInvocation () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:252
#31 0x00000000014aa73f in cc1_main () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/cc1_main.cpp:221
#32 0x00000000014b2928 in ExecuteCC1Tool () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/driver.cpp:309
#33 main () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/driver.cpp:388
(gdb) frame 1
chapuni#1  salvageDebugInfo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7116
7116        if (DV->isInvalidated())
(gdb) disassemble
Dump of assembler code for function salvageDebugInfo():
[...]
   0x0000000003557348 <+744>:   nopl   0x0(%rax,%rax,1)
   0x0000000003557350 <+752>:   mov    (%r12),%r13
=> 0x0000000003557354 <+756>:   cmpb   $0x0,0x31(%r13)
   0x0000000003557359 <+761>:   jne    0x35573b0 <salvageDebugInfo()+848>
(gdb) info registers
[...]
r13            0x5a5a5a5a5a5a5a5a       6510615555426900570
```

The `0x5a5a5a5a5a5a5a5a` value in `r13` indicates the memory was either
uninitialized, or already freed.

Unfortunately I do not have a simple self-contained test case for this.
However, it seems pretty clear that the call to `AddDbgValue()` in
`salvageDebugInfo()` causes the problems, since it modifies
`SelectionDag::DbgInfo` while looping through one of its DenseMaps:

```
void SelectionDAG::salvageDebugInfo(SDNode &N) {
[...]
  for (auto DV : GetDbgValues(&N)) {
    if (DV->isInvalidated())
      continue;
[...]
        AddDbgValue(Clone, N0.getNode(), false);
[...]
  }
}
```

At least, if I comment out the `AddDbgValue()` call, the crashes go
away.  I propose to change this function slightly, similar to the
`SelectionDAG::transferDbgValues()` function just above it, to save the
cloned SDDbgValues in a separate SmallVector, and only call
AddDbgValue() on them after the for loop is done.

Reviewers: aprantl, bogner, bkramer, davide

Reviewed By: davide

Subscribers: davide, krytarowski, JDevlieghere, emaste, llvm-commits

Differential Revision: https://reviews.llvm.org/D41589
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Feb 13, 2018
…168)

This patch is the LLVM part of fixing the issues, described in
https://bugs.llvm.org/show_bug.cgi?id=36168

* The representation of enumerator values in the debug info metadata now
  contains a boolean flag isUnsigned, which determines how the bits of
  the value are interpreted.
* The DW_TAG_enumeration type DIE now always (for DWARF version >= 3)
  includes a DW_AT_type attribute, which refers to the underlying
  integer type, as suggested in DWARFv4 (5.7 Enumeration Type Entries).
* The debug info metadata for enumeration type contains (in flags)
  indication whether this is a C++11 "fixed enum".
* For C++11 enumeration with a fixed underlying type, the DIE also
  includes the DW_AT_enum_class attribute (for DWARF version >= 4).
* Encoding of enumerator constants uses DW_FORM_sdata for signed values
  and DW_FORM_udata for unsigned values, as suggested by DWARFv4 (7.5.4
  Attribute Encodings).

The changes should be backwards compatible:

* the isUnsigned attribute is optional and defaults to false.
* if the underlying type for the enumeration is not available, the
  enumerator values are considered signed.
* the FixedEnum flag defaults to clear.
* the bitcode format for DIEnumerator stores the unsigned flag bit chapuni#1 of
  the first record element, so the format does not change and the zero
  previously stored there is consistent with the false default for
  IsUnsigned.

Differential Revision: https://reviews.llvm.org/D42734
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Apr 18, 2018
This allows us to collect useful metrics about lldb debugging sessions.

I thought that an example would be better than a thousand words:

  Process 19705 stopped
  * thread chapuni#1, queue = 'com.apple.main-thread', stop reason = step in
      frame #0: 0x0000000100000fb4 blah`main at blah.c:3
     1    int main(void) {
     2      int a = 6;
  -> 3      return 0;
     4    }
  (lldb) statistics enable
  (lldb) frame var a
  (int) a = 6
  (lldb) expr a
  (int) $1 = 6
  (lldb) statistics disable
  (lldb) statistics dump
  Number of expr evaluation successes : 1
  Number of expr evaluation failures : 0
  Number of frame var successes : 1
  Number of frame var failures : 0

Future improvements might include:

1. Passing a file, or implementing categories. The way this patch has
been implemented is generic enough to allow this to be extended
easily without breaking the grammar.
2. Adding an SBAPI and Python API for use in scripts.

Thanks to Jim Ingham for discussing the design with me.

<rdar://problem/36555975>

Differential Revision:  https://reviews.llvm.org/D45547
pcc pushed a commit to pcc/llvm-project-old that referenced this issue May 22, 2018
Patch chapuni#3 from VPlan Outer Loop Vectorization Patch Series chapuni#1
(RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).

Expected to be NFC for the current inner loop vectorization path. It
introduces the basic algorithm to build the VPlan plain CFG (single-level
CFG, no hierarchical CFG (H-CFG), yet) in the VPlan-native vectorization
path using VPInstructions. It includes:
  - VPlanHCFGBuilder: Main class to build the VPlan H-CFG (plain CFG without nested regions, for now).
  - VPlanVerifier: Main class with utilities to check the consistency of a H-CFG.
  - VPlanBlockUtils: Main class with utilities to manipulate VPBlockBases in VPlan.

Reviewers: rengolin, fhahn, mkuper, mssimpso, a.elovikov, hfinkel, aprantl.

Differential Revision: https://reviews.llvm.org/D44338
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Jun 13, 2018
Summary:
Running sanitized 32-bit x86 programs on glibc 2.27 crashes at startup, with:

    ERROR: AddressSanitizer: SEGV on unknown address 0xf7a8a250 (pc 0xf7f807f4 bp 0xff969fc8 sp 0xff969f7c T16777215)
    The signal is caused by a WRITE memory access.
    #0 0xf7f807f3 in _dl_get_tls_static_info (/lib/ld-linux.so.2+0x127f3)
    chapuni#1 0xf7a92599  (/lib/libasan.so.5+0x112599)
    chapuni#2 0xf7a80737  (/lib/libasan.so.5+0x100737)
    chapuni#3 0xf7f7e14f in _dl_init (/lib/ld-linux.so.2+0x1014f)
    #4 0xf7f6eb49  (/lib/ld-linux.so.2+0xb49)

The problem is that glibc changed the calling convention for the GLIBC_PRIVATE
symbol that sanitizer uses (even when it should not, GLIBC_PRIVATE is exactly
for symbols that can change at any time, be removed etc.), see
https://sourceware.org/ml/libc-alpha/2017-08/msg00497.html

Fixes google/sanitizers#954

Patch By: Jakub Jelinek

Reviewed By: vitalybuka, Lekensteyn

Differential Revison: https://reviews.llvm.org/D44623
pcc pushed a commit to pcc/llvm-project-old that referenced this issue Jun 13, 2018
Don't provide the assembler source as the "root file" unless the user
asked to have debug info for the assembler source (with -g).

If the source doesn't provide an explicit ".file 0" then (a) use the
compilation directory as directory #0, and (b) use the file chapuni#1 info
for file #0 also.

Differential Revision: https://reviews.llvm.org/D48055
rnk pushed a commit to rnk/llvm-project-old that referenced this issue Jul 23, 2018
A DAG-NOT-DAG is a CHECK-DAG group, X, followed by a CHECK-NOT group,
N, followed by a CHECK-DAG group, Y.  Let y be the initial directive
of Y.  This patch makes the following changes to the behavior:

    1. Directives in N can no longer match within part of Y's match
       range just because y happens not to be the earliest match from
       Y.  Specifically, this patch withdraws N's search range end
       from y's match range start to Y's match range start.

    2. y can no longer match within X's match range, where a y match
       produced a reordering complaint, which is thus no longer
       possible.  Specifically, this patch withdraws y's search range
       start from X's permitted range start to X's match range end,
       which was already the search range start for other members of
       Y.

Both of these changes can only increase the number of test passes: chapuni#1
constrains the ability of CHECK-NOTs to match, and chapuni#2 expands the
ability of CHECK-DAGs to match without complaints.

These changes are based on discussions at:

   <http://lists.llvm.org/pipermail/llvm-dev/2018-May/123550.html>
   <https://reviews.llvm.org/D47106>

which conclude that:

    1. These changes simplify the FileCheck conceptual model.  First,
       it makes search ranges for DAG-NOT-DAG more consistent with
       other cases.  Second, it was confusing that y was treated
       differently from the rest of Y.

    2. These changes add theoretical use cases for DAG-NOT-DAG that
       had no obvious means to be expressed otherwise.  We can justify
       the first half of this assertion with the observation that
       these changes can only increase the number of test passes.

    3. Reordering detection for DAG-NOT-DAG had no obvious real
       benefit.

We don't have evidence from real uses cases to help us debate
conclusions chapuni#2 and chapuni#3, but chapuni#1 at least seems intuitive.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D48986
rnk pushed a commit to rnk/llvm-project-old that referenced this issue Nov 1, 2018
Summary:
As a bonus, this arguably improves the code by making it simpler.

gcc 8 on Ubuntu 18.10 reports the following:

==39667==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fffffff8ae0 at pc 0x555555dbfc68 bp 0x7fffffff8760 sp 0x7fffffff8750
WRITE of size 8 at 0x7fffffff8ae0 thread T0
    #0 0x555555dbfc67 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider::_Alloc_hider(char*, std::allocator<char>&&) /usr/include/c++/8/bits/basic_string.h:149
    chapuni#1 0x555555dbfc67 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&) /usr/include/c++/8/bits/basic_string.h:542
    chapuni#2 0x555555dbfc67 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&) /usr/include/c++/8/bits/basic_string.h:6009
    chapuni#3 0x555555dbfc67 in searchableFieldType /home/nha/amd/build/san/llvm-src/utils/TableGen/SearchableTableEmitter.cpp:168
    (...)

Address 0x7fffffff8ae0 is located in stack of thread T0 at offset 864 in frame
    #0 0x555555dbef3f in searchableFieldType /home/nha/amd/build/san/llvm-src/utils/TableGen/SearchableTableEmitter.cpp:148

Reviewers: fhahn, simon_tatham, kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53931
rnk pushed a commit to rnk/llvm-project-old that referenced this issue Nov 10, 2018
ModuleSummaryIndex::exportToDot crashes when linking the Linux kernel
under ThinLTO using LLVMgold.so. This is due to the exportToDot
function trying to get the GUID of an empty ValueInfo. The root cause
related to the fact that we attempt to get the GUID of an aliasee
via its OriginalGUID recorded in the aliasee summary, and that is not
always possible. Specifically, we cannot do this mapping when the value
is internal linkage and there were other internal linkage symbols with
the same name.

There are 2 fixes for the problem included here.

1) In all cases where we can currently print the dot file from the
command line (which is only via save-temps), we have a valid AliaseeGUID
in the AliasSummary. Use that when it is available, so that we can get
the correct aliasee GUID whenever possible.

2) However, if we were to invoke exportToDot from the debugger right
after it is built during the initial analysis step (i.e. the per-module
summary), we won't have the AliaseeGUID field populated. In that case,
we have a fallback fix that will simply print "@"+GUID when we aren't
able to get the GUID from the OriginalGUID. It simply checks if the VI
is valid or not before attempting to get the name. Additionally, since
getAliaseeGUID will assert that the AliaseeGUID is non-zero, guard the
earlier fix chapuni#1 by a new function hasAliaseeGUID().

Reviewers: pcc, tmroeder

Subscribers: evgeny777, mehdi_amini, inglorion, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D53986
rui314 pushed a commit to rui314/llvm-project that referenced this issue Nov 14, 2018
Flags variable was not initialized and later used (both isMBBSafeToOutlineFrom
implementations assume it's initialized), which breaks
test/CodeGen/AArch64/machine-outliner.mir. under memory sanitizer:
MemorySanitizer: use-of-uninitialized-value
    #0  in llvm::AArch64InstrInfo::getOutliningType(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&, unsigned int) const llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:5494:9
    chapuni#1  in (anonymous namespace)::InstructionMapper::convertToUnsignedVec(llvm::MachineBasicBlock&, llvm::TargetInstrInfo const&) llvm/lib/CodeGen/MachineOutliner.cpp:772:19
    chapuni#2  in (anonymous namespace)::MachineOutliner::populateMapper((anonymous namespace)::InstructionMapper&, llvm::Module&, llvm::MachineModuleInfo&) llvm/lib/CodeGen/MachineOutliner.cpp:1543:14
    chapuni#3  in (anonymous namespace)::MachineOutliner::runOnModule(llvm::Module&) llvm/lib/CodeGen/MachineOutliner.cpp:1645:3
    #4  in (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1744:27
    #5  in llvm::legacy::PassManagerImpl::run(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1857:44
    #6  in compileModule(char**, llvm::LLVMContext&) llvm/tools/llc/llc.cpp:597:8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant