//===- MemorySanitizer.cpp - detector of uninitialized reads --------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
/// \file
/// This file is a part of MemorySanitizer, a detector of uninitialized
/// reads.
///
/// The algorithm of the tool is similar to Memcheck
/// (https://static.usenix.org/event/usenix05/tech/general/full_papers/seward/seward_html/usenix2005.html).
/// We associate a few shadow bits with every byte of the application memory,
/// poison the shadow of the malloc-ed or alloca-ed memory, load the shadow
/// bits on every memory read, propagate the shadow bits through some of the
/// arithmetic instructions (including MOV), store the shadow bits on every
/// memory write, and report a bug on some other instructions (e.g. JMP) if the
/// associated shadow is poisoned.
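///
/// For illustration only (not part of the algorithm itself), a minimal,
/// hypothetical user-level example of the kind of bug this scheme reports:
///
///   int f() {
///     int x;        // local allocation; its shadow is poisoned
///     if (x > 0)    // branch on a value with poisoned shadow -> warning
///       return 1;
///     return 0;
///   }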
///
/// But there are differences too. The first and major one is
/// compiler instrumentation instead of binary instrumentation. This
/// gives us much better register allocation, possible compiler
/// optimizations and a fast start-up. But it also brings a major issue:
/// msan needs to see all program events, including system
/// calls and reads/writes in system libraries, so we either need to
/// compile *everything* with msan or use a binary translation
/// component (e.g. DynamoRIO) to instrument pre-built libraries.
/// Another difference from Memcheck is that we use 8 shadow bits per
/// byte of application memory and a direct shadow mapping. This
/// greatly simplifies the instrumentation code and avoids races on
/// shadow updates (Memcheck is single-threaded, so races are not a
/// concern there; Memcheck uses 2 shadow bits per byte with a slow-path
/// storage that uses 8 bits per byte).
///
/// The default value of shadow is 0, which means "clean" (not poisoned).
///
/// Every module initializer should call __msan_init to ensure that the
/// shadow memory is ready. On error, __msan_warning is called. Since
/// parameters and return values may be passed via registers, we have a
/// specialized thread-local shadow for return values
/// (__msan_retval_tls) and parameters (__msan_param_tls).
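///
/// As a rough sketch (details differ per target ABI and are handled by the
/// instrumentation below; "shadow(x)" is pseudocode), a call such as
///   int r = callee(a, b);
/// is conceptually instrumented as
///   __msan_param_tls[0] = shadow(a);   // 8-byte slots, kParamTLSSize total
///   __msan_param_tls[1] = shadow(b);
///   int r = callee(a, b);              // callee reads its parameter shadow
///   shadow(r) = __msan_retval_tls[0];  // and publishes the return shadow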
///
/// Origin tracking.
///
/// MemorySanitizer can track origins (allocation points) of all uninitialized
/// values. This behavior is controlled with a flag (msan-track-origins) and is
/// disabled by default.
///
/// Origins are 4-byte values created and interpreted by the runtime library.
/// They are stored in a second shadow mapping, one 4-byte value for every 4
/// bytes of application memory. Propagation of origins is basically a bunch of
/// "select" instructions that pick the origin of a dirty argument, if an
/// instruction has one.
///
/// Every aligned group of 4 consecutive bytes of application memory has one
/// origin value associated with it. If these bytes contain uninitialized data
/// coming from 2 different allocations, the last store wins. Because of this,
/// MemorySanitizer reports can show unrelated origins, but this is unlikely in
/// practice.
///
/// Origins are meaningless for fully initialized values, so MemorySanitizer
/// avoids storing origins to memory when a fully initialized value is stored.
/// This way it avoids needlessly overwriting the origin of the 4-byte region
/// on a short (i.e. 1-byte) clean store, and it is also good for performance.
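///
/// For example (illustrative pseudocode, not actual instrumentation):
///   char buf[4];         // one 4-byte-aligned region -> one origin slot
///   buf[0] = uninit_a;   // origin slot <- allocation site of uninit_a
///   buf[1] = uninit_b;   // origin slot <- allocation site of uninit_b
///                        // (last store wins)
///   buf[2] = 0;          // fully initialized store: origin slot untouched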
///
/// Atomic handling.
///
/// Ideally, every atomic store of an application value should update the
/// corresponding shadow location in an atomic way. Unfortunately, an atomic
/// store to two disjoint locations cannot be done without severe slowdown.
///
/// Therefore, we implement an approximation that may err on the safe side.
/// In this implementation, every atomically accessed location in the program
/// may only change from (partially) uninitialized to fully initialized, but
/// not the other way around. We load the shadow _after_ the application load,
/// and we store the shadow _before_ the app store. Also, we always store clean
/// shadow (if the application store is atomic). This way, if the store-load
/// pair constitutes a happens-before arc, the shadow store and load are
/// correctly ordered such that the load will get either the value that was
/// stored, or some later value (which is always clean).
///
/// This does not work very well with Compare-And-Swap (CAS) and
/// Read-Modify-Write (RMW) operations. To follow the above logic, CAS and RMW
/// must store the new shadow before the app operation, and load the shadow
/// after the app operation. Computers don't work this way. The current
/// implementation ignores the load aspect of CAS/RMW, always returning a clean
/// value. It implements the store part as a simple atomic store of a
/// clean shadow.
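///
/// Schematically (a simplified sketch in std::atomic-style pseudocode; the
/// real IR is emitted by the load/store visitors below):
///   shadow(x) = 0;                            // clean shadow stored first...
///   x.store(v, std::memory_order_release);    // ...then the app store
///   ...
///   T r = x.load(std::memory_order_acquire);  // app load first...
///   shadow(r) = shadow(x);                    // ...then the shadow load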
///
/// Instrumenting inline assembly.
///
/// For inline assembly code LLVM has little idea about which memory locations
/// become initialized depending on the arguments. It may be possible to figure
/// out which arguments are meant to point to inputs and outputs, but the
/// actual semantics can only be visible at runtime. In the Linux kernel it's
/// also possible that the arguments only indicate the offset for a base taken
/// from a segment register, so it's dangerous to treat any asm() arguments as
/// pointers. We take a conservative approach, generating calls to
///   __msan_instrument_asm_store(ptr, size)
/// which defer the memory unpoisoning to the runtime library.
/// The latter can perform more complex address checks to figure out whether
/// it's safe to touch the shadow memory.
/// Like with atomic operations, we call __msan_instrument_asm_store() before
/// the assembly call, so that changes to the shadow memory will be seen by
/// other threads together with the main memory initialization.
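///
/// Conceptually (a simplified sketch; the exact arguments depend on the asm
/// constraints), for
///   asm("..." : "=m"(out));
/// the pass emits, before the asm statement:
///   __msan_instrument_asm_store(&out, sizeof(out));
/// and the runtime decides whether it is safe to unpoison the shadow of that
/// range.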
///
/// KernelMemorySanitizer (KMSAN) implementation.
///
/// The major differences between KMSAN and MSan instrumentation are:
/// - KMSAN always tracks the origins and implies msan-keep-going=true;
/// - KMSAN allocates shadow and origin memory for each page separately, so
///   there are no explicit accesses to shadow and origin in the
///   instrumentation.
///   Shadow and origin values for a particular X-byte memory location
///   (X=1,2,4,8) are accessed through pointers obtained via the
///     __msan_metadata_ptr_for_load_X(ptr)
///     __msan_metadata_ptr_for_store_X(ptr)
///   functions. The corresponding functions check that the X-byte accesses
///   are possible and return the pointers to shadow and origin memory.
///   Arbitrarily sized accesses are handled with:
///     __msan_metadata_ptr_for_load_n(ptr, size)
///     __msan_metadata_ptr_for_store_n(ptr, size)
///   Note that the sanitizer code has to deal with how shadow/origin pairs
///   returned by these functions are represented in different ABIs. In
///   the X86_64 ABI they are returned in RDX:RAX, in PowerPC64 they are
///   returned in r3 and r4, and in the SystemZ ABI they are written to memory
///   pointed to by a hidden parameter.
/// - TLS variables are stored in a single per-task struct. A call to the
///   function __msan_get_context_state(), returning a pointer to that struct,
///   is inserted at the beginning of every instrumented function's entry
///   block;
/// - __msan_warning() takes a 32-bit origin parameter;
/// - local variables are poisoned with __msan_poison_alloca() upon function
///   entry and unpoisoned with __msan_unpoison_alloca() before leaving the
///   function;
/// - the pass doesn't declare any global variables or add global constructors
///   to the translation unit.
///
/// Also, KMSAN currently ignores uninitialized memory passed into inline asm
/// calls, making sure we're on the safe side wrt. possible false positives.
///
/// KernelMemorySanitizer only supports X86_64, SystemZ and PowerPC64 at the
/// moment.
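///
/// As an illustration (pseudocode; the metadata struct is declared below as
/// MsanMetadata, assumed here to hold the shadow pointer first and the origin
/// pointer second), a 4-byte KMSAN load of *p is handled roughly as:
///   meta = __msan_metadata_ptr_for_load_4(p);
///   s = *(u32 *)meta.shadow;   // shadow of the loaded value
///   o = *(u32 *)meta.origin;   // its origin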
///
//
// FIXME: This sanitizer does not yet handle scalable vectors
//
//===----------------------------------------------------------------------===//

#include "llvm/Transforms/Instrumentation/MemorySanitizer.h"
#include "llvm/ADT/APInt.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/DepthFirstIterator.h"
#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Analysis/GlobalsModRef.h"
#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Argument.h"
#include "llvm/IR/AttributeMask.h"
#include "llvm/IR/Attributes.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/CallingConv.h"
#include "llvm/IR/Constant.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/GlobalValue.h"
#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/InlineAsm.h"
#include "llvm/IR/InstVisitor.h"
#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Instruction.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/IntrinsicsAArch64.h"
#include "llvm/IR/IntrinsicsX86.h"
#include "llvm/IR/MDBuilder.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/Value.h"
#include "llvm/IR/ValueMap.h"
#include "llvm/Support/Alignment.h"
#include "llvm/Support/AtomicOrdering.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/DebugCounter.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/MathExtras.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/TargetParser/Triple.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/Transforms/Utils/Instrumentation.h"
#include "llvm/Transforms/Utils/Local.h"
#include "llvm/Transforms/Utils/ModuleUtils.h"
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <numeric>
#include <string>
#include <tuple>

using namespace llvm;

#define DEBUG_TYPE "msan"

DEBUG_COUNTER(DebugInsertCheck, "msan-insert-check",
              "Controls which checks to insert");

DEBUG_COUNTER(DebugInstrumentInstruction, "msan-instrument-instruction",
              "Controls which instruction to instrument");

static const unsigned kOriginSize = 4;
static const Align kMinOriginAlignment = Align(4);
static const Align kShadowTLSAlignment = Align(8);

// These constants must be kept in sync with the ones in msan.h.
static const unsigned kParamTLSSize = 800;
static const unsigned kRetvalTLSSize = 800;

// Access sizes are powers of two: 1, 2, 4, 8.
static const size_t kNumberOfAccessSizes = 4;

/// Track origins of uninitialized values.
///
/// Adds a section to MemorySanitizer report that points to the allocation
/// (stack or heap) the uninitialized bits came from originally.
static cl::opt<int> ClTrackOrigins(
    "msan-track-origins",
    cl::desc("Track origins (allocation sites) of poisoned memory"), cl::Hidden,
    cl::init(0));

static cl::opt<bool> ClKeepGoing("msan-keep-going",
                                 cl::desc("keep going after reporting a UMR"),
                                 cl::Hidden, cl::init(false));

static cl::opt<bool>
    ClPoisonStack("msan-poison-stack",
                  cl::desc("poison uninitialized stack variables"), cl::Hidden,
                  cl::init(true));

static cl::opt<bool> ClPoisonStackWithCall(
    "msan-poison-stack-with-call",
    cl::desc("poison uninitialized stack variables with a call"), cl::Hidden,
    cl::init(false));

static cl::opt<int> ClPoisonStackPattern(
    "msan-poison-stack-pattern",
    cl::desc("poison uninitialized stack variables with the given pattern"),
    cl::Hidden, cl::init(0xff));

static cl::opt<bool>
    ClPrintStackNames("msan-print-stack-names",
                      cl::desc("Print name of local stack variable"),
                      cl::Hidden, cl::init(true));

static cl::opt<bool>
    ClPoisonUndef("msan-poison-undef",
                  cl::desc("Poison fully undef temporary values. "
                           "Partially undefined constant vectors "
                           "are unaffected by this flag (see "
                           "-msan-poison-undef-vectors)."),
                  cl::Hidden, cl::init(true));

static cl::opt<bool> ClPoisonUndefVectors(
    "msan-poison-undef-vectors",
    cl::desc("Precisely poison partially undefined constant vectors. "
             "If false (legacy behavior), the entire vector is "
             "considered fully initialized, which may lead to false "
             "negatives. Fully undefined constant vectors are "
             "unaffected by this flag (see -msan-poison-undef)."),
    cl::Hidden, cl::init(false));

static cl::opt<bool> ClPreciseDisjointOr(
    "msan-precise-disjoint-or",
    cl::desc("Precisely poison disjoint OR. If false (legacy behavior), "
             "disjointedness is ignored (i.e., 1|1 is initialized)."),
    cl::Hidden, cl::init(false));

static cl::opt<bool>
    ClHandleICmp("msan-handle-icmp",
                 cl::desc("propagate shadow through ICmpEQ and ICmpNE"),
                 cl::Hidden, cl::init(true));

static cl::opt<bool>
    ClHandleICmpExact("msan-handle-icmp-exact",
                      cl::desc("exact handling of relational integer ICmp"),
                      cl::Hidden, cl::init(true));

static cl::opt<bool> ClHandleLifetimeIntrinsics(
    "msan-handle-lifetime-intrinsics",
    cl::desc(
        "when possible, poison scoped variables at the beginning of the scope "
        "(slower, but more precise)"),
    cl::Hidden, cl::init(true));

// When compiling the Linux kernel, we sometimes see false positives related to
// MSan being unable to understand that inline assembly calls may initialize
// local variables.
// This flag makes the compiler conservatively unpoison every memory location
// passed into an assembly call. Note that this may cause false positives.
// Because it's impossible to figure out the array sizes, we can only unpoison
// the first sizeof(type) bytes for each type* pointer.
static cl::opt<bool> ClHandleAsmConservative(
    "msan-handle-asm-conservative",
    cl::desc("conservative handling of inline assembly"), cl::Hidden,
    cl::init(true));

// This flag controls whether we check the shadow of the address
// operand of a load or store. Such bugs are very rare, since a load from
// a garbage address typically results in SEGV, but they still happen
// (e.g. only the lower bits of the address are garbage, or the access happens
// early at program startup where malloc-ed memory is more likely to
// be zeroed). As of 2012-08-28 this flag adds 20% slowdown.
static cl::opt<bool> ClCheckAccessAddress(
    "msan-check-access-address",
    cl::desc("report accesses through a pointer which has poisoned shadow"),
    cl::Hidden, cl::init(true));

static cl::opt<bool> ClEagerChecks(
    "msan-eager-checks",
    cl::desc("check arguments and return values at function call boundaries"),
    cl::Hidden, cl::init(false));

static cl::opt<bool> ClDumpStrictInstructions(
    "msan-dump-strict-instructions",
    cl::desc("print out instructions with default strict semantics, i.e., "
             "check that all the inputs are fully initialized, and mark "
             "the output as fully initialized. These semantics are applied "
             "to instructions that could not be handled explicitly nor "
             "heuristically."),
    cl::Hidden, cl::init(false));

// Currently, all the heuristically handled instructions are specifically
// IntrinsicInst. However, we use the broader "HeuristicInstructions" name
// to parallel 'msan-dump-strict-instructions', and to keep the door open to
// handling non-intrinsic instructions heuristically.
static cl::opt<bool> ClDumpHeuristicInstructions(
    "msan-dump-heuristic-instructions",
    cl::desc("Prints 'unknown' instructions that were handled heuristically. "
             "Use -msan-dump-strict-instructions to print instructions that "
             "could not be handled explicitly nor heuristically."),
    cl::Hidden, cl::init(false));

static cl::opt<int> ClInstrumentationWithCallThreshold(
    "msan-instrumentation-with-call-threshold",
    cl::desc(
        "If the function being instrumented requires more than "
        "this number of checks and origin stores, use callbacks instead of "
        "inline checks (-1 means never use callbacks)."),
    cl::Hidden, cl::init(3500));

static cl::opt<bool>
    ClEnableKmsan("msan-kernel",
                  cl::desc("Enable KernelMemorySanitizer instrumentation"),
                  cl::Hidden, cl::init(false));

static cl::opt<bool>
    ClDisableChecks("msan-disable-checks",
                    cl::desc("Apply no_sanitize to the whole file"), cl::Hidden,
                    cl::init(false));

static cl::opt<bool>
    ClCheckConstantShadow("msan-check-constant-shadow",
                          cl::desc("Insert checks for constant shadow values"),
                          cl::Hidden, cl::init(true));

// This is off by default because of a bug in gold:
// https://sourceware.org/bugzilla/show_bug.cgi?id=19002
static cl::opt<bool>
    ClWithComdat("msan-with-comdat",
                 cl::desc("Place MSan constructors in comdat sections"),
                 cl::Hidden, cl::init(false));

// These options allow specifying custom memory map parameters.
// See MemoryMapParams for details.
static cl::opt<uint64_t> ClAndMask("msan-and-mask",
                                   cl::desc("Define custom MSan AndMask"),
                                   cl::Hidden, cl::init(0));

static cl::opt<uint64_t> ClXorMask("msan-xor-mask",
                                   cl::desc("Define custom MSan XorMask"),
                                   cl::Hidden, cl::init(0));

static cl::opt<uint64_t> ClShadowBase("msan-shadow-base",
                                      cl::desc("Define custom MSan ShadowBase"),
                                      cl::Hidden, cl::init(0));

static cl::opt<uint64_t> ClOriginBase("msan-origin-base",
                                      cl::desc("Define custom MSan OriginBase"),
                                      cl::Hidden, cl::init(0));

static cl::opt<int>
    ClDisambiguateWarning("msan-disambiguate-warning-threshold",
                          cl::desc("Define threshold for number of checks per "
                                   "debug location to force origin update."),
                          cl::Hidden, cl::init(3));

const char kMsanModuleCtorName[] = "msan.module_ctor";
const char kMsanInitName[] = "__msan_init";

namespace {

// Memory map parameters used in application-to-shadow address calculation.
// Offset = (Addr & ~AndMask) ^ XorMask
// Shadow = ShadowBase + Offset
// Origin = OriginBase + Offset
struct MemoryMapParams {
  uint64_t AndMask;
  uint64_t XorMask;
  uint64_t ShadowBase;
  uint64_t OriginBase;
};
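
// For reference, the mapping above corresponds to the following sketch (not
// used by the pass itself, which emits the equivalent IR elsewhere in this
// file; origin addresses are additionally aligned down to 4 bytes there):
//
//   uint64_t shadowFor(uint64_t Addr, const MemoryMapParams &P) {
//     uint64_t Offset = (Addr & ~P.AndMask) ^ P.XorMask;
//     return P.ShadowBase + Offset;
//   }
//   uint64_t originFor(uint64_t Addr, const MemoryMapParams &P) {
//     return P.OriginBase + ((Addr & ~P.AndMask) ^ P.XorMask);
//   }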

struct PlatformMemoryMapParams {
  const MemoryMapParams *bits32;
  const MemoryMapParams *bits64;
};

} // end anonymous namespace

// i386 Linux
static const MemoryMapParams Linux_I386_MemoryMapParams = {
    0x000080000000, // AndMask
    0,              // XorMask (not used)
    0,              // ShadowBase (not used)
    0x000040000000, // OriginBase
};

// x86_64 Linux
static const MemoryMapParams Linux_X86_64_MemoryMapParams = {
    0,              // AndMask (not used)
    0x500000000000, // XorMask
    0,              // ShadowBase (not used)
    0x100000000000, // OriginBase
};
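
// Worked example with the x86_64 Linux parameters above, using the formula
// from MemoryMapParams (the address is illustrative):
//   Addr   = 0x700000001000
//   Offset = Addr ^ 0x500000000000 = 0x200000001000   // shadow address
//   Origin = 0x100000000000 + Offset = 0x300000001000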

// mips32 Linux
// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
// after picking good constants

// mips64 Linux
static const MemoryMapParams Linux_MIPS64_MemoryMapParams = {
    0,              // AndMask (not used)
    0x008000000000, // XorMask
    0,              // ShadowBase (not used)
    0x002000000000, // OriginBase
};

// ppc32 Linux
// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
// after picking good constants

// ppc64 Linux
static const MemoryMapParams Linux_PowerPC64_MemoryMapParams = {
    0xE00000000000, // AndMask
    0x100000000000, // XorMask
    0x080000000000, // ShadowBase
    0x1C0000000000, // OriginBase
};

// s390x Linux
static const MemoryMapParams Linux_S390X_MemoryMapParams = {
    0xC00000000000, // AndMask
    0,              // XorMask (not used)
    0x080000000000, // ShadowBase
    0x1C0000000000, // OriginBase
};

// arm32 Linux
// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
// after picking good constants

// aarch64 Linux
static const MemoryMapParams Linux_AArch64_MemoryMapParams = {
    0,               // AndMask (not used)
    0x0B00000000000, // XorMask
    0,               // ShadowBase (not used)
    0x0200000000000, // OriginBase
};

// loongarch64 Linux
static const MemoryMapParams Linux_LoongArch64_MemoryMapParams = {
    0,              // AndMask (not used)
    0x500000000000, // XorMask
    0,              // ShadowBase (not used)
    0x100000000000, // OriginBase
};

// riscv32 Linux
// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
// after picking good constants

// aarch64 FreeBSD
static const MemoryMapParams FreeBSD_AArch64_MemoryMapParams = {
    0x1800000000000, // AndMask
    0x0400000000000, // XorMask
    0x0200000000000, // ShadowBase
    0x0700000000000, // OriginBase
};

// i386 FreeBSD
static const MemoryMapParams FreeBSD_I386_MemoryMapParams = {
    0x000180000000, // AndMask
    0x000040000000, // XorMask
    0x000020000000, // ShadowBase
    0x000700000000, // OriginBase
};

// x86_64 FreeBSD
static const MemoryMapParams FreeBSD_X86_64_MemoryMapParams = {
    0xc00000000000, // AndMask
    0x200000000000, // XorMask
    0x100000000000, // ShadowBase
    0x380000000000, // OriginBase
};

// x86_64 NetBSD
static const MemoryMapParams NetBSD_X86_64_MemoryMapParams = {
    0,              // AndMask
    0x500000000000, // XorMask
    0,              // ShadowBase
    0x100000000000, // OriginBase
};

static const PlatformMemoryMapParams Linux_X86_MemoryMapParams = {
    &Linux_I386_MemoryMapParams,
    &Linux_X86_64_MemoryMapParams,
};

static const PlatformMemoryMapParams Linux_MIPS_MemoryMapParams = {
    nullptr,
    &Linux_MIPS64_MemoryMapParams,
};

static const PlatformMemoryMapParams Linux_PowerPC_MemoryMapParams = {
    nullptr,
    &Linux_PowerPC64_MemoryMapParams,
};

static const PlatformMemoryMapParams Linux_S390_MemoryMapParams = {
    nullptr,
    &Linux_S390X_MemoryMapParams,
};

static const PlatformMemoryMapParams Linux_ARM_MemoryMapParams = {
    nullptr,
    &Linux_AArch64_MemoryMapParams,
};

static const PlatformMemoryMapParams Linux_LoongArch_MemoryMapParams = {
    nullptr,
    &Linux_LoongArch64_MemoryMapParams,
};

static const PlatformMemoryMapParams FreeBSD_ARM_MemoryMapParams = {
    nullptr,
    &FreeBSD_AArch64_MemoryMapParams,
};

static const PlatformMemoryMapParams FreeBSD_X86_MemoryMapParams = {
    &FreeBSD_I386_MemoryMapParams,
    &FreeBSD_X86_64_MemoryMapParams,
};

static const PlatformMemoryMapParams NetBSD_X86_MemoryMapParams = {
    nullptr,
    &NetBSD_X86_64_MemoryMapParams,
};

namespace {

/// Instrument functions of a module to detect uninitialized reads.
///
/// Instantiating MemorySanitizer inserts the msan runtime library API function
/// declarations into the module if they don't exist already. It also ensures
/// the __msan_init function is in the list of global constructors for
/// the module.
class MemorySanitizer {
public:
  MemorySanitizer(Module &M, MemorySanitizerOptions Options)
      : CompileKernel(Options.Kernel), TrackOrigins(Options.TrackOrigins),
        Recover(Options.Recover), EagerChecks(Options.EagerChecks) {
    initializeModule(M);
  }

  // MSan cannot be moved or copied because of MapParams.
  MemorySanitizer(MemorySanitizer &&) = delete;
  MemorySanitizer &operator=(MemorySanitizer &&) = delete;
  MemorySanitizer(const MemorySanitizer &) = delete;
  MemorySanitizer &operator=(const MemorySanitizer &) = delete;

  bool sanitizeFunction(Function &F, TargetLibraryInfo &TLI);

private:
  friend struct MemorySanitizerVisitor;
  friend struct VarArgHelperBase;
  friend struct VarArgAMD64Helper;
  friend struct VarArgAArch64Helper;
  friend struct VarArgPowerPC64Helper;
  friend struct VarArgPowerPC32Helper;
  friend struct VarArgSystemZHelper;
  friend struct VarArgI386Helper;
  friend struct VarArgGenericHelper;

  void initializeModule(Module &M);
  void initializeCallbacks(Module &M, const TargetLibraryInfo &TLI);
  void createKernelApi(Module &M, const TargetLibraryInfo &TLI);
  void createUserspaceApi(Module &M, const TargetLibraryInfo &TLI);

  template <typename... ArgsTy>
  FunctionCallee getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
                                                 ArgsTy... Args);

  /// True if we're compiling the Linux kernel.
  bool CompileKernel;
  /// Track origins (allocation points) of uninitialized values.
  int TrackOrigins;
  bool Recover;
  bool EagerChecks;

  Triple TargetTriple;
  LLVMContext *C;
  Type *IntptrTy; ///< Integer type with the size of a ptr in the default AS.
  Type *OriginTy;
  PointerType *PtrTy; ///< Pointer type in the default address space.

  // XxxTLS variables represent the per-thread state in MSan and per-task state
  // in KMSAN.
  // For the userspace these point to thread-local globals. In the kernel land
  // they point to the members of a per-task struct obtained via a call to
  // __msan_get_context_state().

  /// Thread-local shadow storage for function parameters.
  Value *ParamTLS;

  /// Thread-local origin storage for function parameters.
  Value *ParamOriginTLS;

  /// Thread-local shadow storage for the function return value.
  Value *RetvalTLS;

  /// Thread-local origin storage for the function return value.
  Value *RetvalOriginTLS;

  /// Thread-local shadow storage for in-register va_arg arguments.
  Value *VAArgTLS;

  /// Thread-local origin storage for in-register va_arg arguments.
  Value *VAArgOriginTLS;

  /// Thread-local storage for the size of the va_arg overflow area.
  Value *VAArgOverflowSizeTLS;

  /// Are the instrumentation callbacks set up?
  bool CallbacksInitialized = false;

  /// The run-time callback to print a warning.
  FunctionCallee WarningFn;

  // These arrays are indexed by log2(AccessSize).
  FunctionCallee MaybeWarningFn[kNumberOfAccessSizes];
  FunctionCallee MaybeWarningVarSizeFn;
  FunctionCallee MaybeStoreOriginFn[kNumberOfAccessSizes];

  /// Run-time helper that generates a new origin value for a stack
  /// allocation.
  FunctionCallee MsanSetAllocaOriginWithDescriptionFn;
  // No-description version.
  FunctionCallee MsanSetAllocaOriginNoDescriptionFn;

  /// Run-time helper that poisons stack on function entry.
  FunctionCallee MsanPoisonStackFn;

  /// Run-time helper that records a store (or any event) of an
  /// uninitialized value and returns an updated origin id encoding this info.
  FunctionCallee MsanChainOriginFn;

  /// Run-time helper that paints an origin over a region.
  FunctionCallee MsanSetOriginFn;

  /// MSan runtime replacements for memmove, memcpy and memset.
  FunctionCallee MemmoveFn, MemcpyFn, MemsetFn;

  /// KMSAN callback for task-local function argument shadow.
  StructType *MsanContextStateTy;
  FunctionCallee MsanGetContextStateFn;

  /// Functions for poisoning/unpoisoning local variables.
  FunctionCallee MsanPoisonAllocaFn, MsanUnpoisonAllocaFn;

  /// Pair of shadow/origin pointers.
  Type *MsanMetadata;

  /// Each of the MsanMetadataPtrXxx functions returns a MsanMetadata.
  FunctionCallee MsanMetadataPtrForLoadN, MsanMetadataPtrForStoreN;
  FunctionCallee MsanMetadataPtrForLoad_1_8[4];
  FunctionCallee MsanMetadataPtrForStore_1_8[4];
  FunctionCallee MsanInstrumentAsmStoreFn;

  /// Storage for return values of the MsanMetadataPtrXxx functions.
  Value *MsanMetadataAlloca;

  /// Helper to choose between different MsanMetadataPtrXxx().
  FunctionCallee getKmsanShadowOriginAccessFn(bool isStore, int size);

  /// Memory map parameters used in application-to-shadow calculation.
  const MemoryMapParams *MapParams;

  /// Custom memory map parameters used when -msan-shadow-base or
  /// -msan-origin-base is provided.
  MemoryMapParams CustomMapParams;

  MDNode *ColdCallWeights;

  /// Branch weights for origin store.
  MDNode *OriginStoreWeights;
};

void insertModuleCtor(Module &M) {
  getOrCreateSanitizerCtorAndInitFunctions(
      M, kMsanModuleCtorName, kMsanInitName,
      /*InitArgTypes=*/{},
      /*InitArgs=*/{},
      // This callback is invoked when the functions are created the first
      // time. Hook them into the global ctors list in that case:
      [&](Function *Ctor, FunctionCallee) {
        if (!ClWithComdat) {
          appendToGlobalCtors(M, Ctor, 0);
          return;
        }
        Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName);
        Ctor->setComdat(MsanCtorComdat);
        appendToGlobalCtors(M, Ctor, 0, Ctor);
      });
}

template <class T> T getOptOrDefault(const cl::opt<T> &Opt, T Default) {
  return (Opt.getNumOccurrences() > 0) ? Opt : Default;
}

} // end anonymous namespace

MemorySanitizerOptions::MemorySanitizerOptions(int TO, bool R, bool K,
                                               bool EagerChecks)
    : Kernel(getOptOrDefault(ClEnableKmsan, K)),
      TrackOrigins(getOptOrDefault(ClTrackOrigins, Kernel ? 2 : TO)),
      Recover(getOptOrDefault(ClKeepGoing, Kernel || R)),
      EagerChecks(getOptOrDefault(ClEagerChecks, EagerChecks)) {}
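
// For example, constructing MemorySanitizerOptions(/*TO=*/0, /*R=*/false,
// /*K=*/true, /*EagerChecks=*/false) without any -msan-* flags on the command
// line yields Kernel=true, TrackOrigins=2 and Recover=true, matching the KMSAN
// behavior described in the file header; explicit command-line occurrences
// always take precedence via getOptOrDefault().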

PreservedAnalyses MemorySanitizerPass::run(Module &M,
                                           ModuleAnalysisManager &AM) {
  // Return early if the nosanitize_memory module flag is present for the
  // module.
  if (checkIfAlreadyInstrumented(M, "nosanitize_memory"))
    return PreservedAnalyses::all();
  bool Modified = false;
  if (!Options.Kernel) {
    insertModuleCtor(M);
    Modified = true;
  }

  auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
  for (Function &F : M) {
    if (F.empty())
      continue;
    MemorySanitizer Msan(*F.getParent(), Options);
    Modified |=
        Msan.sanitizeFunction(F, FAM.getResult<TargetLibraryAnalysis>(F));
  }

  if (!Modified)
    return PreservedAnalyses::all();

  PreservedAnalyses PA = PreservedAnalyses::none();
  // GlobalsAA is considered stateless and does not get invalidated unless
  // explicitly invalidated; PreservedAnalyses::none() is not enough. Sanitizers
  // make changes that require GlobalsAA to be invalidated.
  PA.abandon<GlobalsAA>();
  return PA;
}

void MemorySanitizerPass::printPipeline(
    raw_ostream &OS, function_ref<StringRef(StringRef)> MapClassName2PassName) {
  static_cast<PassInfoMixin<MemorySanitizerPass> *>(this)->printPipeline(
      OS, MapClassName2PassName);
  OS << '<';
  if (Options.Recover)
    OS << "recover;";
  if (Options.Kernel)
    OS << "kernel;";
  if (Options.EagerChecks)
    OS << "eager-checks;";
  OS << "track-origins=" << Options.TrackOrigins;
  OS << '>';
}

/// Create a non-const global initialized with the given string.
///
/// Creates a writable global for Str so that we can pass it to the
/// run-time lib. The runtime uses the first 4 bytes of the string to store the
/// frame ID, so the string needs to be mutable.
static GlobalVariable *createPrivateConstGlobalForString(Module &M,
                                                         StringRef Str) {
  Constant *StrConst = ConstantDataArray::getString(M.getContext(), Str);
  return new GlobalVariable(M, StrConst->getType(), /*isConstant=*/true,
                            GlobalValue::PrivateLinkage, StrConst, "");
}

template <typename... ArgsTy>
FunctionCallee
MemorySanitizer::getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
                                                 ArgsTy... Args) {
  if (TargetTriple.getArch() == Triple::systemz) {
    // SystemZ ABI: shadow/origin pair is returned via a hidden parameter.
    return M.getOrInsertFunction(Name, Type::getVoidTy(*C), PtrTy,
                                 std::forward<ArgsTy>(Args)...);
  }

  return M.getOrInsertFunction(Name, MsanMetadata,
                               std::forward<ArgsTy>(Args)...);
}

/// Create KMSAN API callbacks.
void MemorySanitizer::createKernelApi(Module &M, const TargetLibraryInfo &TLI) {
  IRBuilder<> IRB(*C);

  // These will be initialized in insertKmsanPrologue().
  RetvalTLS = nullptr;
  RetvalOriginTLS = nullptr;
  ParamTLS = nullptr;
  ParamOriginTLS = nullptr;
  VAArgTLS = nullptr;
  VAArgOriginTLS = nullptr;
  VAArgOverflowSizeTLS = nullptr;

  WarningFn = M.getOrInsertFunction("__msan_warning",
                                    TLI.getAttrList(C, {0}, /*Signed=*/false),
                                    IRB.getVoidTy(), IRB.getInt32Ty());

  // Requests the per-task context state (kmsan_context_state*) from the
  // runtime library.
  MsanContextStateTy = StructType::get(
      ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
      ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8),
      ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
      ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8), /* va_arg_origin */
      IRB.getInt64Ty(), ArrayType::get(OriginTy, kParamTLSSize / 4), OriginTy,
      OriginTy);
  MsanGetContextStateFn =
      M.getOrInsertFunction("__msan_get_context_state", PtrTy);

  MsanMetadata = StructType::get(PtrTy, PtrTy);

  for (int ind = 0, size = 1; ind < 4; ind++, size <<= 1) {
    std::string name_load =
        "__msan_metadata_ptr_for_load_" + std::to_string(size);
    std::string name_store =
        "__msan_metadata_ptr_for_store_" + std::to_string(size);
    MsanMetadataPtrForLoad_1_8[ind] =
        getOrInsertMsanMetadataFunction(M, name_load, PtrTy);
    MsanMetadataPtrForStore_1_8[ind] =
        getOrInsertMsanMetadataFunction(M, name_store, PtrTy);
  }

  MsanMetadataPtrForLoadN = getOrInsertMsanMetadataFunction(
      M, "__msan_metadata_ptr_for_load_n", PtrTy, IntptrTy);
  MsanMetadataPtrForStoreN = getOrInsertMsanMetadataFunction(
      M, "__msan_metadata_ptr_for_store_n", PtrTy, IntptrTy);

  // Functions for poisoning and unpoisoning memory.
  MsanPoisonAllocaFn = M.getOrInsertFunction(
      "__msan_poison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
  MsanUnpoisonAllocaFn = M.getOrInsertFunction(
      "__msan_unpoison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy);
}

static Constant *getOrInsertGlobal(Module &M, StringRef Name, Type *Ty) {
  return M.getOrInsertGlobal(Name, Ty, [&] {
    return new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
                              nullptr, Name, nullptr,
                              GlobalVariable::InitialExecTLSModel);
  });
}

/// Insert declarations for userspace-specific functions and globals.
void MemorySanitizer::createUserspaceApi(Module &M,
                                         const TargetLibraryInfo &TLI) {
  IRBuilder<> IRB(*C);

  // Create the callback.
  // FIXME: this function should have "Cold" calling conv,
  // which is not yet implemented.
  if (TrackOrigins) {
    StringRef WarningFnName = Recover ? "__msan_warning_with_origin"
                                      : "__msan_warning_with_origin_noreturn";
    WarningFn = M.getOrInsertFunction(WarningFnName,
                                      TLI.getAttrList(C, {0}, /*Signed=*/false),
                                      IRB.getVoidTy(), IRB.getInt32Ty());
  } else {
    StringRef WarningFnName =
        Recover ? "__msan_warning" : "__msan_warning_noreturn";
    WarningFn = M.getOrInsertFunction(WarningFnName, IRB.getVoidTy());
  }

  // Create the global TLS variables.
  RetvalTLS =
      getOrInsertGlobal(M, "__msan_retval_tls",
                        ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8));

  RetvalOriginTLS = getOrInsertGlobal(M, "__msan_retval_origin_tls", OriginTy);

  ParamTLS =
      getOrInsertGlobal(M, "__msan_param_tls",
                        ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));

  ParamOriginTLS =
      getOrInsertGlobal(M, "__msan_param_origin_tls",
                        ArrayType::get(OriginTy, kParamTLSSize / 4));

  VAArgTLS =
      getOrInsertGlobal(M, "__msan_va_arg_tls",
                        ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));

  VAArgOriginTLS =
      getOrInsertGlobal(M, "__msan_va_arg_origin_tls",
                        ArrayType::get(OriginTy, kParamTLSSize / 4));

  VAArgOverflowSizeTLS = getOrInsertGlobal(M, "__msan_va_arg_overflow_size_tls",
                                           IRB.getIntPtrTy(M.getDataLayout()));

  for (size_t AccessSizeIndex = 0; AccessSizeIndex < kNumberOfAccessSizes;
       AccessSizeIndex++) {
    unsigned AccessSize = 1 << AccessSizeIndex;
    std::string FunctionName = "__msan_maybe_warning_" + itostr(AccessSize);
    MaybeWarningFn[AccessSizeIndex] = M.getOrInsertFunction(
        FunctionName, TLI.getAttrList(C, {0, 1}, /*Signed=*/false),
        IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), IRB.getInt32Ty());
    MaybeWarningVarSizeFn = M.getOrInsertFunction(
        "__msan_maybe_warning_N", TLI.getAttrList(C, {}, /*Signed=*/false),
        IRB.getVoidTy(), PtrTy, IRB.getInt64Ty(), IRB.getInt32Ty());
    FunctionName = "__msan_maybe_store_origin_" + itostr(AccessSize);
    MaybeStoreOriginFn[AccessSizeIndex] = M.getOrInsertFunction(
        FunctionName, TLI.getAttrList(C, {0, 2}, /*Signed=*/false),
        IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), PtrTy,
        IRB.getInt32Ty());
  }

  MsanSetAllocaOriginWithDescriptionFn =
      M.getOrInsertFunction("__msan_set_alloca_origin_with_descr",
                            IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy, PtrTy);
  MsanSetAllocaOriginNoDescriptionFn =
      M.getOrInsertFunction("__msan_set_alloca_origin_no_descr",
                            IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
  MsanPoisonStackFn = M.getOrInsertFunction("__msan_poison_stack",
                                            IRB.getVoidTy(), PtrTy, IntptrTy);
}

/// Insert extern declarations of runtime-provided functions and globals.
void MemorySanitizer::initializeCallbacks(Module &M,
                                          const TargetLibraryInfo &TLI) {
  // Only do this once.
  if (CallbacksInitialized)
    return;

  IRBuilder<> IRB(*C);
  // Initialize callbacks that are common for kernel and userspace
  // instrumentation.
  MsanChainOriginFn = M.getOrInsertFunction(
      "__msan_chain_origin",
      TLI.getAttrList(C, {0}, /*Signed=*/false, /*Ret=*/true), IRB.getInt32Ty(),
      IRB.getInt32Ty());
  MsanSetOriginFn = M.getOrInsertFunction(
      "__msan_set_origin", TLI.getAttrList(C, {2}, /*Signed=*/false),
      IRB.getVoidTy(), PtrTy, IntptrTy, IRB.getInt32Ty());
  MemmoveFn =
      M.getOrInsertFunction("__msan_memmove", PtrTy, PtrTy, PtrTy, IntptrTy);
  MemcpyFn =
      M.getOrInsertFunction("__msan_memcpy", PtrTy, PtrTy, PtrTy, IntptrTy);
  MemsetFn = M.getOrInsertFunction("__msan_memset",
                                   TLI.getAttrList(C, {1}, /*Signed=*/true),
                                   PtrTy, PtrTy, IRB.getInt32Ty(), IntptrTy);

  MsanInstrumentAsmStoreFn = M.getOrInsertFunction(
      "__msan_instrument_asm_store", IRB.getVoidTy(), PtrTy, IntptrTy);

  if (CompileKernel) {
    createKernelApi(M, TLI);
  } else {
    createUserspaceApi(M, TLI);
  }
  CallbacksInitialized = true;
}

FunctionCallee MemorySanitizer::getKmsanShadowOriginAccessFn(bool isStore,
                                                             int size) {
  FunctionCallee *Fns =
      isStore ? MsanMetadataPtrForStore_1_8 : MsanMetadataPtrForLoad_1_8;
  switch (size) {
  case 1:
    return Fns[0];
  case 2:
    return Fns[1];
  case 4:
    return Fns[2];
  case 8:
    return Fns[3];
  default:
    return nullptr;
  }
}
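
// For example, getKmsanShadowOriginAccessFn(/*isStore=*/false, 4) yields the
// callee for __msan_metadata_ptr_for_load_4; for other sizes it returns a null
// callee, and callers are expected to fall back to the *_n variants declared
// in createKernelApi().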

/// Module-level initialization.
///
/// Inserts a call to __msan_init into the module's constructor list.
void MemorySanitizer::initializeModule(Module &M) {
  auto &DL = M.getDataLayout();

  TargetTriple = M.getTargetTriple();

  bool ShadowPassed = ClShadowBase.getNumOccurrences() > 0;
  bool OriginPassed = ClOriginBase.getNumOccurrences() > 0;
  // Check the overrides first.
  if (ShadowPassed || OriginPassed) {
    CustomMapParams.AndMask = ClAndMask;
    CustomMapParams.XorMask = ClXorMask;
    CustomMapParams.ShadowBase = ClShadowBase;
    CustomMapParams.OriginBase = ClOriginBase;
    MapParams = &CustomMapParams;
  } else {
    switch (TargetTriple.getOS()) {
    case Triple::FreeBSD:
      switch (TargetTriple.getArch()) {
      case Triple::aarch64:
        MapParams = FreeBSD_ARM_MemoryMapParams.bits64;
        break;
      case Triple::x86_64:
        MapParams = FreeBSD_X86_MemoryMapParams.bits64;
        break;
      case Triple::x86:
        MapParams = FreeBSD_X86_MemoryMapParams.bits32;
        break;
      default:
        report_fatal_error("unsupported architecture");
      }
      break;
    case Triple::NetBSD:
      switch (TargetTriple.getArch()) {
      case Triple::x86_64:
        MapParams = NetBSD_X86_MemoryMapParams.bits64;
        break;
      default:
        report_fatal_error("unsupported architecture");
      }
      break;
    case Triple::Linux:
      switch (TargetTriple.getArch()) {
      case Triple::x86_64:
        MapParams = Linux_X86_MemoryMapParams.bits64;
        break;
      case Triple::x86:
        MapParams = Linux_X86_MemoryMapParams.bits32;
        break;
      case Triple::mips64:
      case Triple::mips64el:
        MapParams = Linux_MIPS_MemoryMapParams.bits64;
        break;
      case Triple::ppc64:
      case Triple::ppc64le:
        MapParams = Linux_PowerPC_MemoryMapParams.bits64;
        break;
      case Triple::systemz:
        MapParams = Linux_S390_MemoryMapParams.bits64;
        break;
      case Triple::aarch64:
      case Triple::aarch64_be:
        MapParams = Linux_ARM_MemoryMapParams.bits64;
        break;
      case Triple::loongarch64:
        MapParams = Linux_LoongArch_MemoryMapParams.bits64;
        break;
      default:
        report_fatal_error("unsupported architecture");
      }
      break;
    default:
      report_fatal_error("unsupported operating system");
    }
  }

  C = &(M.getContext());
  IRBuilder<> IRB(*C);
  IntptrTy = IRB.getIntPtrTy(DL);
  OriginTy = IRB.getInt32Ty();
  PtrTy = IRB.getPtrTy();

  ColdCallWeights = MDBuilder(*C).createUnlikelyBranchWeights();
  OriginStoreWeights = MDBuilder(*C).createUnlikelyBranchWeights();

  if (!CompileKernel) {
    if (TrackOrigins)
      M.getOrInsertGlobal("__msan_track_origins", IRB.getInt32Ty(), [&] {
        return new GlobalVariable(
            M, IRB.getInt32Ty(), true, GlobalValue::WeakODRLinkage,
            IRB.getInt32(TrackOrigins), "__msan_track_origins");
      });

    if (Recover)
      M.getOrInsertGlobal("__msan_keep_going", IRB.getInt32Ty(), [&] {
        return new GlobalVariable(M, IRB.getInt32Ty(), true,
                                  GlobalValue::WeakODRLinkage,
                                  IRB.getInt32(Recover), "__msan_keep_going");
      });
  }
}

namespace {

/// A helper class that handles instrumentation of VarArg
/// functions on a particular platform.
///
/// Implementations are expected to insert the instrumentation
/// necessary to propagate argument shadow through VarArg function
/// calls. Visit* methods are called during an InstVisitor pass over
/// the function, and should avoid creating new basic blocks. A new
/// instance of this class is created for each instrumented function.
struct VarArgHelper {
  virtual ~VarArgHelper() = default;

  /// Visit a CallBase.
  virtual void visitCallBase(CallBase &CB, IRBuilder<> &IRB) = 0;

  /// Visit a va_start call.
  virtual void visitVAStartInst(VAStartInst &I) = 0;

  /// Visit a va_copy call.
  virtual void visitVACopyInst(VACopyInst &I) = 0;

  /// Finalize function instrumentation.
  ///
  /// This method is called after visiting all interesting (see above)
  /// instructions in a function.
  virtual void finalizeInstrumentation() = 0;
};

struct MemorySanitizerVisitor;

} // end anonymous namespace

static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
                                        MemorySanitizerVisitor &Visitor);

static unsigned TypeSizeToSizeIndex(TypeSize TS) {
  if (TS.isScalable())
    // Scalable types unconditionally take slowpaths.
    return kNumberOfAccessSizes;
  unsigned TypeSizeFixed = TS.getFixedValue();
  if (TypeSizeFixed <= 8)
    return 0;
  return Log2_32_Ceil((TypeSizeFixed + 7) / 8);
}
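
// Examples: i8 and smaller map to index 0, i16 -> 1, i32 -> 2, i64 -> 3;
// wider fixed types (e.g. i128) and scalable types yield an index of at least
// kNumberOfAccessSizes, for which no power-of-two-sized callback exists.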

namespace {

/// Helper class to attach debug information of the given instruction onto new
/// instructions inserted after.
class NextNodeIRBuilder : public IRBuilder<> {
public:
  explicit NextNodeIRBuilder(Instruction *IP) : IRBuilder<>(IP->getNextNode()) {
    SetCurrentDebugLocation(IP->getDebugLoc());
  }
};

/// This class does all the work for a given function. Store and Load
/// instructions store and load corresponding shadow and origin
/// values. Most instructions propagate shadow from arguments to their
/// return values. Certain instructions (most importantly, BranchInst)
/// test their argument shadow and print reports (with a runtime call) if it's
/// non-zero.
struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
  Function &F;
  MemorySanitizer &MS;
  SmallVector<PHINode *, 16> ShadowPHINodes, OriginPHINodes;
  ValueMap<Value *, Value *> ShadowMap, OriginMap;
  std::unique_ptr<VarArgHelper> VAHelper;
  const TargetLibraryInfo *TLI;
  Instruction *FnPrologueEnd;
  SmallVector<Instruction *, 16> Instructions;

  // The following flags disable parts of MSan instrumentation based on
  // exclusion list contents and command-line options.
  bool InsertChecks;
  bool PropagateShadow;
  bool PoisonStack;
  bool PoisonUndef;
  bool PoisonUndefVectors;

  struct ShadowOriginAndInsertPoint {
    Value *Shadow;
    Value *Origin;
    Instruction *OrigIns;

    ShadowOriginAndInsertPoint(Value *S, Value *O, Instruction *I)
        : Shadow(S), Origin(O), OrigIns(I) {}
  };
  SmallVector<ShadowOriginAndInsertPoint, 16> InstrumentationList;
  DenseMap<const DILocation *, int> LazyWarningDebugLocationCount;
  bool InstrumentLifetimeStart = ClHandleLifetimeIntrinsics;
  SmallSetVector<AllocaInst *, 16> AllocaSet;
  SmallVector<std::pair<IntrinsicInst *, AllocaInst *>, 16> LifetimeStartList;
  SmallVector<StoreInst *, 16> StoreList;
  int64_t SplittableBlocksCount = 0;

  MemorySanitizerVisitor(Function &F, MemorySanitizer &MS,
                         const TargetLibraryInfo &TLI)
      : F(F), MS(MS), VAHelper(CreateVarArgHelper(F, MS, *this)), TLI(&TLI) {
    bool SanitizeFunction =
        F.hasFnAttribute(Attribute::SanitizeMemory) && !ClDisableChecks;
    InsertChecks = SanitizeFunction;
    PropagateShadow = SanitizeFunction;
    PoisonStack = SanitizeFunction && ClPoisonStack;
    PoisonUndef = SanitizeFunction && ClPoisonUndef;
    PoisonUndefVectors = SanitizeFunction && ClPoisonUndefVectors;

    // In the presence of unreachable blocks, we may see Phi nodes with
    // incoming nodes from such blocks. Since InstVisitor skips unreachable
    // blocks, such nodes will not have any shadow value associated with them.
    // It's easier to remove unreachable blocks than deal with missing shadow.
    removeUnreachableBlocks(F);

    MS.initializeCallbacks(*F.getParent(), TLI);
    FnPrologueEnd =
        IRBuilder<>(&F.getEntryBlock(), F.getEntryBlock().getFirstNonPHIIt())
            .CreateIntrinsic(Intrinsic::donothing, {});

    if (MS.CompileKernel) {
      IRBuilder<> IRB(FnPrologueEnd);
      insertKmsanPrologue(IRB);
    }

    LLVM_DEBUG(if (!InsertChecks) dbgs()
               << "MemorySanitizer is not inserting checks into '"
               << F.getName() << "'\n");
  }

  bool instrumentWithCalls(Value *V) {
    // Constants will likely be eliminated by follow-up passes.
    if (isa<Constant>(V))
      return false;
    ++SplittableBlocksCount;
    return ClInstrumentationWithCallThreshold >= 0 &&
           SplittableBlocksCount > ClInstrumentationWithCallThreshold;
  }

  bool isInPrologue(Instruction &I) {
    return I.getParent() == FnPrologueEnd->getParent() &&
           (&I == FnPrologueEnd || I.comesBefore(FnPrologueEnd));
  }

  // Creates a new origin and records the stack trace. In general we can call
  // this function for any origin manipulation we like. However, it costs
  // runtime resources, so use it wisely, only where it can provide additional
  // information helpful to a user.
  Value *updateOrigin(Value *V, IRBuilder<> &IRB) {
    if (MS.TrackOrigins <= 1)
      return V;
    return IRB.CreateCall(MS.MsanChainOriginFn, V);
  }

  Value *originToIntptr(IRBuilder<> &IRB, Value *Origin) {
    const DataLayout &DL = F.getDataLayout();
    unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
    if (IntptrSize == kOriginSize)
      return Origin;
    assert(IntptrSize == kOriginSize * 2);
    Origin = IRB.CreateIntCast(Origin, MS.IntptrTy, /* isSigned */ false);
    return IRB.CreateOr(Origin, IRB.CreateShl(Origin, kOriginSize * 8));
  }
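
  // E.g. on a 64-bit target an origin of 0x12345678 becomes
  // 0x1234567812345678, so a single intptr-sized store in paintOrigin() can
  // paint two adjacent 4-byte origin slots at once.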

  /// Fill the memory range with the given origin value.
  void paintOrigin(IRBuilder<> &IRB, Value *Origin, Value *OriginPtr,
                   TypeSize TS, Align Alignment) {
    const DataLayout &DL = F.getDataLayout();
    const Align IntptrAlignment = DL.getABITypeAlign(MS.IntptrTy);
    unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
    assert(IntptrAlignment >= kMinOriginAlignment);
    assert(IntptrSize >= kOriginSize);

    // Note: The loop-based form works for fixed-length vectors too; however,
    // we prefer to unroll and specialize the alignment below.
    if (TS.isScalable()) {
      Value *Size = IRB.CreateTypeSize(MS.IntptrTy, TS);
      Value *RoundUp =
          IRB.CreateAdd(Size, ConstantInt::get(MS.IntptrTy, kOriginSize - 1));
      Value *End =
          IRB.CreateUDiv(RoundUp, ConstantInt::get(MS.IntptrTy, kOriginSize));
      auto [InsertPt, Index] =
          SplitBlockAndInsertSimpleForLoop(End, IRB.GetInsertPoint());
      IRB.SetInsertPoint(InsertPt);

      Value *GEP = IRB.CreateGEP(MS.OriginTy, OriginPtr, Index);
      IRB.CreateAlignedStore(Origin, GEP, kMinOriginAlignment);
      return;
    }

    unsigned Size = TS.getFixedValue();

    unsigned Ofs = 0;
    Align CurrentAlignment = Alignment;
    if (Alignment >= IntptrAlignment && IntptrSize > kOriginSize) {
      Value *IntptrOrigin = originToIntptr(IRB, Origin);
      Value *IntptrOriginPtr = IRB.CreatePointerCast(OriginPtr, MS.PtrTy);
      for (unsigned i = 0; i < Size / IntptrSize; ++i) {
        Value *Ptr = i ? IRB.CreateConstGEP1_32(MS.IntptrTy, IntptrOriginPtr, i)
                       : IntptrOriginPtr;
        IRB.CreateAlignedStore(IntptrOrigin, Ptr, CurrentAlignment);
        Ofs += IntptrSize / kOriginSize;
        CurrentAlignment = IntptrAlignment;
      }
    }

    for (unsigned i = Ofs; i < (Size + kOriginSize - 1) / kOriginSize; ++i) {
      Value *GEP =
          i ? IRB.CreateConstGEP1_32(MS.OriginTy, OriginPtr, i) : OriginPtr;
      IRB.CreateAlignedStore(Origin, GEP, CurrentAlignment);
      CurrentAlignment = kMinOriginAlignment;
    }
  }
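
  // For instance, painting origins for a 12-byte store with Alignment >=
  // IntptrAlignment on a 64-bit target emits one 8-byte store of the doubled
  // origin (covering slots 0-1) followed by one 4-byte store for slot 2.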
1339
1340 void storeOrigin(IRBuilder<> &IRB, Value *Addr, Value *Shadow, Value *Origin,
1341 Value *OriginPtr, Align Alignment) {
1342 const DataLayout &DL = F.getDataLayout();
1343 const Align OriginAlignment = std::max(a: kMinOriginAlignment, b: Alignment);
1344 TypeSize StoreSize = DL.getTypeStoreSize(Ty: Shadow->getType());
1345 // ZExt cannot convert between vector and scalar
1346 Value *ConvertedShadow = convertShadowToScalar(V: Shadow, IRB);
1347 if (auto *ConstantShadow = dyn_cast<Constant>(Val: ConvertedShadow)) {
1348 if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1349 // Origin is not needed: value is initialized or const shadow is
1350 // ignored.
1351 return;
1352 }
1353 if (llvm::isKnownNonZero(V: ConvertedShadow, Q: DL)) {
1354 // Copy origin as the value is definitely uninitialized.
1355 paintOrigin(IRB, Origin: updateOrigin(V: Origin, IRB), OriginPtr, TS: StoreSize,
1356 Alignment: OriginAlignment);
1357 return;
1358 }
1359 // Fallback to runtime check, which still can be optimized out later.
1360 }
1361
1362 TypeSize TypeSizeInBits = DL.getTypeSizeInBits(Ty: ConvertedShadow->getType());
1363 unsigned SizeIndex = TypeSizeToSizeIndex(TS: TypeSizeInBits);
1364 if (instrumentWithCalls(V: ConvertedShadow) &&
1365 SizeIndex < kNumberOfAccessSizes && !MS.CompileKernel) {
1366 FunctionCallee Fn = MS.MaybeStoreOriginFn[SizeIndex];
1367 Value *ConvertedShadow2 =
1368 IRB.CreateZExt(V: ConvertedShadow, DestTy: IRB.getIntNTy(N: 8 * (1 << SizeIndex)));
1369 CallBase *CB = IRB.CreateCall(Callee: Fn, Args: {ConvertedShadow2, Addr, Origin});
1370 CB->addParamAttr(ArgNo: 0, Kind: Attribute::ZExt);
1371 CB->addParamAttr(ArgNo: 2, Kind: Attribute::ZExt);
1372 } else {
1373 Value *Cmp = convertToBool(V: ConvertedShadow, IRB, name: "_mscmp");
1374 Instruction *CheckTerm = SplitBlockAndInsertIfThen(
1375 Cond: Cmp, SplitBefore: &*IRB.GetInsertPoint(), Unreachable: false, BranchWeights: MS.OriginStoreWeights);
1376 IRBuilder<> IRBNew(CheckTerm);
1377 paintOrigin(IRB&: IRBNew, Origin: updateOrigin(V: Origin, IRB&: IRBNew), OriginPtr, TS: StoreSize,
1378 Alignment: OriginAlignment);
1379 }
1380 }
1381
1382 void materializeStores() {
1383 for (StoreInst *SI : StoreList) {
1384 IRBuilder<> IRB(SI);
1385 Value *Val = SI->getValueOperand();
1386 Value *Addr = SI->getPointerOperand();
1387 Value *Shadow = SI->isAtomic() ? getCleanShadow(V: Val) : getShadow(V: Val);
1388 Value *ShadowPtr, *OriginPtr;
1389 Type *ShadowTy = Shadow->getType();
1390 const Align Alignment = SI->getAlign();
1391 const Align OriginAlignment = std::max(a: kMinOriginAlignment, b: Alignment);
1392 std::tie(args&: ShadowPtr, args&: OriginPtr) =
1393 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ true);
1394
1395 [[maybe_unused]] StoreInst *NewSI =
1396 IRB.CreateAlignedStore(Val: Shadow, Ptr: ShadowPtr, Align: Alignment);
1397 LLVM_DEBUG(dbgs() << " STORE: " << *NewSI << "\n");
1398
1399 if (SI->isAtomic())
1400 SI->setOrdering(addReleaseOrdering(a: SI->getOrdering()));
1401
1402 if (MS.TrackOrigins && !SI->isAtomic())
1403 storeOrigin(IRB, Addr, Shadow, Origin: getOrigin(V: Val), OriginPtr,
1404 Alignment: OriginAlignment);
1405 }
1406 }
1407
1408 // Returns true if Debug Location corresponds to multiple warnings.
1409 bool shouldDisambiguateWarningLocation(const DebugLoc &DebugLoc) {
1410 if (MS.TrackOrigins < 2)
1411 return false;
1412
1413 if (LazyWarningDebugLocationCount.empty())
1414 for (const auto &I : InstrumentationList)
1415 ++LazyWarningDebugLocationCount[I.OrigIns->getDebugLoc()];
1416
1417 return LazyWarningDebugLocationCount[DebugLoc] >= ClDisambiguateWarning;
1418 }
1419
1420 /// Helper function to insert a warning at IRB's current insert point.
1421 void insertWarningFn(IRBuilder<> &IRB, Value *Origin) {
1422 if (!Origin)
1423 Origin = (Value *)IRB.getInt32(C: 0);
1424 assert(Origin->getType()->isIntegerTy());
1425
1426 if (shouldDisambiguateWarningLocation(DebugLoc: IRB.getCurrentDebugLocation())) {
1427      // Try to create an additional origin with the debug info of the last
1428      // origin instruction; it may provide additional information to the user.
1429 if (Instruction *OI = dyn_cast_or_null<Instruction>(Val: Origin)) {
1430 assert(MS.TrackOrigins);
1431 auto NewDebugLoc = OI->getDebugLoc();
1432        // An origin update with a missing or identical debug location provides
1433        // no additional value.
1434 if (NewDebugLoc && NewDebugLoc != IRB.getCurrentDebugLocation()) {
1435          // Insert the update just before the check, so we call into the runtime
1436          // only right before the report.
1437 IRBuilder<> IRBOrigin(&*IRB.GetInsertPoint());
1438 IRBOrigin.SetCurrentDebugLocation(NewDebugLoc);
1439 Origin = updateOrigin(V: Origin, IRB&: IRBOrigin);
1440 }
1441 }
1442 }
1443
1444 if (MS.CompileKernel || MS.TrackOrigins)
1445 IRB.CreateCall(Callee: MS.WarningFn, Args: Origin)->setCannotMerge();
1446 else
1447 IRB.CreateCall(Callee: MS.WarningFn)->setCannotMerge();
1448 // FIXME: Insert UnreachableInst if !MS.Recover?
1449 // This may invalidate some of the following checks and needs to be done
1450 // at the very end.
1451 }
1452
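  // Materialize a single shadow check. Unless it is lowered to one of the
  // MS.MaybeWarningFn callbacks, the check conceptually expands to
  //   %_mscmp = icmp ne <shadow>, 0
  //   br i1 %_mscmp, label %warn, label %cont   ; cold branch
  // with insertWarningFn() emitting the report in the %warn block.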
1453 void materializeOneCheck(IRBuilder<> &IRB, Value *ConvertedShadow,
1454 Value *Origin) {
1455 const DataLayout &DL = F.getDataLayout();
1456 TypeSize TypeSizeInBits = DL.getTypeSizeInBits(Ty: ConvertedShadow->getType());
1457 unsigned SizeIndex = TypeSizeToSizeIndex(TS: TypeSizeInBits);
1458 if (instrumentWithCalls(V: ConvertedShadow) && !MS.CompileKernel) {
1459 // ZExt cannot convert between vector and scalar
1460 ConvertedShadow = convertShadowToScalar(V: ConvertedShadow, IRB);
1461 Value *ConvertedShadow2 =
1462 IRB.CreateZExt(V: ConvertedShadow, DestTy: IRB.getIntNTy(N: 8 * (1 << SizeIndex)));
1463
1464 if (SizeIndex < kNumberOfAccessSizes) {
1465 FunctionCallee Fn = MS.MaybeWarningFn[SizeIndex];
1466 CallBase *CB = IRB.CreateCall(
1467 Callee: Fn,
1468 Args: {ConvertedShadow2,
1469 MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(C: 0)});
1470 CB->addParamAttr(ArgNo: 0, Kind: Attribute::ZExt);
1471 CB->addParamAttr(ArgNo: 1, Kind: Attribute::ZExt);
1472 } else {
1473 FunctionCallee Fn = MS.MaybeWarningVarSizeFn;
1474 Value *ShadowAlloca = IRB.CreateAlloca(Ty: ConvertedShadow2->getType(), AddrSpace: 0u);
1475 IRB.CreateStore(Val: ConvertedShadow2, Ptr: ShadowAlloca);
1476 unsigned ShadowSize = DL.getTypeAllocSize(Ty: ConvertedShadow2->getType());
1477 CallBase *CB = IRB.CreateCall(
1478 Callee: Fn,
1479 Args: {ShadowAlloca, ConstantInt::get(Ty: IRB.getInt64Ty(), V: ShadowSize),
1480 MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(C: 0)});
1481 CB->addParamAttr(ArgNo: 1, Kind: Attribute::ZExt);
1482 CB->addParamAttr(ArgNo: 2, Kind: Attribute::ZExt);
1483 }
1484 } else {
1485 Value *Cmp = convertToBool(V: ConvertedShadow, IRB, name: "_mscmp");
1486 Instruction *CheckTerm = SplitBlockAndInsertIfThen(
1487 Cond: Cmp, SplitBefore: &*IRB.GetInsertPoint(),
1488 /* Unreachable */ !MS.Recover, BranchWeights: MS.ColdCallWeights);
1489
1490 IRB.SetInsertPoint(CheckTerm);
1491 insertWarningFn(IRB, Origin);
1492 LLVM_DEBUG(dbgs() << " CHECK: " << *Cmp << "\n");
1493 }
1494 }
1495
1496 void materializeInstructionChecks(
1497 ArrayRef<ShadowOriginAndInsertPoint> InstructionChecks) {
1498 const DataLayout &DL = F.getDataLayout();
1499    // Disable combining in some cases. TrackOrigins checks each shadow to pick
1500    // the correct origin.
1501 bool Combine = !MS.TrackOrigins;
1502 Instruction *Instruction = InstructionChecks.front().OrigIns;
1503 Value *Shadow = nullptr;
1504 for (const auto &ShadowData : InstructionChecks) {
1505 assert(ShadowData.OrigIns == Instruction);
1506 IRBuilder<> IRB(Instruction);
1507
1508 Value *ConvertedShadow = ShadowData.Shadow;
1509
1510 if (auto *ConstantShadow = dyn_cast<Constant>(Val: ConvertedShadow)) {
1511 if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1512 // Skip, value is initialized or const shadow is ignored.
1513 continue;
1514 }
1515 if (llvm::isKnownNonZero(V: ConvertedShadow, Q: DL)) {
1516 // Report as the value is definitely uninitialized.
1517 insertWarningFn(IRB, Origin: ShadowData.Origin);
1518 if (!MS.Recover)
1519            return; // Always fail and stop here; no need to check the rest.
1520          // Skip this check; the warning has already been emitted.
1521 continue;
1522 }
1523        // Fall back to a runtime check, which can still be optimized out later.
1524 }
1525
1526 if (!Combine) {
1527 materializeOneCheck(IRB, ConvertedShadow, Origin: ShadowData.Origin);
1528 continue;
1529 }
1530
1531 if (!Shadow) {
1532 Shadow = ConvertedShadow;
1533 continue;
1534 }
1535
1536 Shadow = convertToBool(V: Shadow, IRB, name: "_mscmp");
1537 ConvertedShadow = convertToBool(V: ConvertedShadow, IRB, name: "_mscmp");
1538 Shadow = IRB.CreateOr(LHS: Shadow, RHS: ConvertedShadow, Name: "_msor");
1539 }
1540
1541 if (Shadow) {
1542 assert(Combine);
1543 IRBuilder<> IRB(Instruction);
1544 materializeOneCheck(IRB, ConvertedShadow: Shadow, Origin: nullptr);
1545 }
1546 }
1547
1548 void materializeChecks() {
1549#ifndef NDEBUG
1550 // For assert below.
1551 SmallPtrSet<Instruction *, 16> Done;
1552#endif
1553
1554 for (auto I = InstrumentationList.begin();
1555 I != InstrumentationList.end();) {
1556 auto OrigIns = I->OrigIns;
1557      // Checks are grouped by the original instruction, so all checks queued by
1558      // `insertShadowCheck` for that instruction are materialized at once.
1559 assert(Done.insert(OrigIns).second);
1560 auto J = std::find_if(first: I + 1, last: InstrumentationList.end(),
1561 pred: [OrigIns](const ShadowOriginAndInsertPoint &R) {
1562 return OrigIns != R.OrigIns;
1563 });
1564 // Process all checks of instruction at once.
1565 materializeInstructionChecks(InstructionChecks: ArrayRef<ShadowOriginAndInsertPoint>(I, J));
1566 I = J;
1567 }
1568
1569 LLVM_DEBUG(dbgs() << "DONE:\n" << F);
1570 }
1571
1572  // Set up the KMSAN prologue: fetch the per-task context state and cache its TLS slots.
1573 void insertKmsanPrologue(IRBuilder<> &IRB) {
1574 Value *ContextState = IRB.CreateCall(Callee: MS.MsanGetContextStateFn, Args: {});
1575 Constant *Zero = IRB.getInt32(C: 0);
1576 MS.ParamTLS = IRB.CreateGEP(Ty: MS.MsanContextStateTy, Ptr: ContextState,
1577 IdxList: {Zero, IRB.getInt32(C: 0)}, Name: "param_shadow");
1578 MS.RetvalTLS = IRB.CreateGEP(Ty: MS.MsanContextStateTy, Ptr: ContextState,
1579 IdxList: {Zero, IRB.getInt32(C: 1)}, Name: "retval_shadow");
1580 MS.VAArgTLS = IRB.CreateGEP(Ty: MS.MsanContextStateTy, Ptr: ContextState,
1581 IdxList: {Zero, IRB.getInt32(C: 2)}, Name: "va_arg_shadow");
1582 MS.VAArgOriginTLS = IRB.CreateGEP(Ty: MS.MsanContextStateTy, Ptr: ContextState,
1583 IdxList: {Zero, IRB.getInt32(C: 3)}, Name: "va_arg_origin");
1584 MS.VAArgOverflowSizeTLS =
1585 IRB.CreateGEP(Ty: MS.MsanContextStateTy, Ptr: ContextState,
1586 IdxList: {Zero, IRB.getInt32(C: 4)}, Name: "va_arg_overflow_size");
1587 MS.ParamOriginTLS = IRB.CreateGEP(Ty: MS.MsanContextStateTy, Ptr: ContextState,
1588 IdxList: {Zero, IRB.getInt32(C: 5)}, Name: "param_origin");
1589 MS.RetvalOriginTLS =
1590 IRB.CreateGEP(Ty: MS.MsanContextStateTy, Ptr: ContextState,
1591 IdxList: {Zero, IRB.getInt32(C: 6)}, Name: "retval_origin");
1592 if (MS.TargetTriple.getArch() == Triple::systemz)
1593 MS.MsanMetadataAlloca = IRB.CreateAlloca(Ty: MS.MsanMetadata, AddrSpace: 0u);
1594 }
1595
1596 /// Add MemorySanitizer instrumentation to a function.
1597 bool runOnFunction() {
1598 // Iterate all BBs in depth-first order and create shadow instructions
1599 // for all instructions (where applicable).
1600 // For PHI nodes we create dummy shadow PHIs which will be finalized later.
1601 for (BasicBlock *BB : depth_first(G: FnPrologueEnd->getParent()))
1602 visit(BB&: *BB);
1603
1604    // `visit` above only collects instructions. Process them after iterating over
1605    // the CFG so that instrumentation-induced CFG changes do not affect traversal.
1606 for (Instruction *I : Instructions)
1607 InstVisitor<MemorySanitizerVisitor>::visit(I&: *I);
1608
1609 // Finalize PHI nodes.
1610 for (PHINode *PN : ShadowPHINodes) {
1611 PHINode *PNS = cast<PHINode>(Val: getShadow(V: PN));
1612 PHINode *PNO = MS.TrackOrigins ? cast<PHINode>(Val: getOrigin(V: PN)) : nullptr;
1613 size_t NumValues = PN->getNumIncomingValues();
1614 for (size_t v = 0; v < NumValues; v++) {
1615 PNS->addIncoming(V: getShadow(I: PN, i: v), BB: PN->getIncomingBlock(i: v));
1616 if (PNO)
1617 PNO->addIncoming(V: getOrigin(I: PN, i: v), BB: PN->getIncomingBlock(i: v));
1618 }
1619 }
1620
1621 VAHelper->finalizeInstrumentation();
1622
1623    // Poison allocas at their llvm.lifetime.start calls, unless we have fallen
1624    // back to instrumenting only the allocas themselves.
1625 if (InstrumentLifetimeStart) {
1626 for (auto Item : LifetimeStartList) {
1627 instrumentAlloca(I&: *Item.second, InsPoint: Item.first);
1628 AllocaSet.remove(X: Item.second);
1629 }
1630 }
1631 // Poison the allocas for which we didn't instrument the corresponding
1632 // lifetime intrinsics.
1633 for (AllocaInst *AI : AllocaSet)
1634 instrumentAlloca(I&: *AI);
1635
1636 // Insert shadow value checks.
1637 materializeChecks();
1638
1639    // Delayed instrumentation of StoreInst.
1640    // This must not add new address checks: they have already been materialized.
1641 materializeStores();
1642
1643 return true;
1644 }
1645
1646 /// Compute the shadow type that corresponds to a given Value.
1647 Type *getShadowTy(Value *V) { return getShadowTy(OrigTy: V->getType()); }
1648
1649 /// Compute the shadow type that corresponds to a given Type.
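  ///
  /// For example (assuming 64-bit pointers): i32 maps to i32, <4 x float> maps
  /// to <4 x i32>, [8 x i8] maps to [8 x i8], and a pointer maps to i64.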
1650 Type *getShadowTy(Type *OrigTy) {
1651 if (!OrigTy->isSized()) {
1652 return nullptr;
1653 }
1654 // For integer type, shadow is the same as the original type.
1655 // This may return weird-sized types like i1.
1656 if (IntegerType *IT = dyn_cast<IntegerType>(Val: OrigTy))
1657 return IT;
1658 const DataLayout &DL = F.getDataLayout();
1659 if (VectorType *VT = dyn_cast<VectorType>(Val: OrigTy)) {
1660 uint32_t EltSize = DL.getTypeSizeInBits(Ty: VT->getElementType());
1661 return VectorType::get(ElementType: IntegerType::get(C&: *MS.C, NumBits: EltSize),
1662 EC: VT->getElementCount());
1663 }
1664 if (ArrayType *AT = dyn_cast<ArrayType>(Val: OrigTy)) {
1665 return ArrayType::get(ElementType: getShadowTy(OrigTy: AT->getElementType()),
1666 NumElements: AT->getNumElements());
1667 }
1668 if (StructType *ST = dyn_cast<StructType>(Val: OrigTy)) {
1669 SmallVector<Type *, 4> Elements;
1670 for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
1671 Elements.push_back(Elt: getShadowTy(OrigTy: ST->getElementType(N: i)));
1672 StructType *Res = StructType::get(Context&: *MS.C, Elements, isPacked: ST->isPacked());
1673 LLVM_DEBUG(dbgs() << "getShadowTy: " << *ST << " ===> " << *Res << "\n");
1674 return Res;
1675 }
1676 uint32_t TypeSize = DL.getTypeSizeInBits(Ty: OrigTy);
1677 return IntegerType::get(C&: *MS.C, NumBits: TypeSize);
1678 }
1679
1680 /// Extract combined shadow of struct elements as a bool
1681 Value *collapseStructShadow(StructType *Struct, Value *Shadow,
1682 IRBuilder<> &IRB) {
1683 Value *FalseVal = IRB.getIntN(/* width */ N: 1, /* value */ C: 0);
1684 Value *Aggregator = FalseVal;
1685
1686 for (unsigned Idx = 0; Idx < Struct->getNumElements(); Idx++) {
1687 // Combine by ORing together each element's bool shadow
1688 Value *ShadowItem = IRB.CreateExtractValue(Agg: Shadow, Idxs: Idx);
1689 Value *ShadowBool = convertToBool(V: ShadowItem, IRB);
1690
1691 if (Aggregator != FalseVal)
1692 Aggregator = IRB.CreateOr(LHS: Aggregator, RHS: ShadowBool);
1693 else
1694 Aggregator = ShadowBool;
1695 }
1696
1697 return Aggregator;
1698 }
1699
1700 // Extract combined shadow of array elements
1701 Value *collapseArrayShadow(ArrayType *Array, Value *Shadow,
1702 IRBuilder<> &IRB) {
1703 if (!Array->getNumElements())
1704 return IRB.getIntN(/* width */ N: 1, /* value */ C: 0);
1705
1706 Value *FirstItem = IRB.CreateExtractValue(Agg: Shadow, Idxs: 0);
1707 Value *Aggregator = convertShadowToScalar(V: FirstItem, IRB);
1708
1709 for (unsigned Idx = 1; Idx < Array->getNumElements(); Idx++) {
1710 Value *ShadowItem = IRB.CreateExtractValue(Agg: Shadow, Idxs: Idx);
1711 Value *ShadowInner = convertShadowToScalar(V: ShadowItem, IRB);
1712 Aggregator = IRB.CreateOr(LHS: Aggregator, RHS: ShadowInner);
1713 }
1714 return Aggregator;
1715 }
1716
1717  /// Convert a shadow value to its flattened variant. The resulting
1718 /// shadow may not necessarily have the same bit width as the input
1719 /// value, but it will always be comparable to zero.
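  ///
  /// For example, an i32 shadow is returned unchanged, a <4 x i32> shadow is
  /// bitcast to i128, and a struct shadow collapses to an i1 that ORs the
  /// converted shadows of its elements.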
1720 Value *convertShadowToScalar(Value *V, IRBuilder<> &IRB) {
1721 if (StructType *Struct = dyn_cast<StructType>(Val: V->getType()))
1722 return collapseStructShadow(Struct, Shadow: V, IRB);
1723 if (ArrayType *Array = dyn_cast<ArrayType>(Val: V->getType()))
1724 return collapseArrayShadow(Array, Shadow: V, IRB);
1725 if (isa<VectorType>(Val: V->getType())) {
1726 if (isa<ScalableVectorType>(Val: V->getType()))
1727 return convertShadowToScalar(V: IRB.CreateOrReduce(Src: V), IRB);
1728 unsigned BitWidth =
1729 V->getType()->getPrimitiveSizeInBits().getFixedValue();
1730 return IRB.CreateBitCast(V, DestTy: IntegerType::get(C&: *MS.C, NumBits: BitWidth));
1731 }
1732 return V;
1733 }
1734
1735 // Convert a scalar value to an i1 by comparing with 0
1736 Value *convertToBool(Value *V, IRBuilder<> &IRB, const Twine &name = "") {
1737 Type *VTy = V->getType();
1738 if (!VTy->isIntegerTy())
1739 return convertToBool(V: convertShadowToScalar(V, IRB), IRB, name);
1740 if (VTy->getIntegerBitWidth() == 1)
1741 // Just converting a bool to a bool, so do nothing.
1742 return V;
1743 return IRB.CreateICmpNE(LHS: V, RHS: ConstantInt::get(Ty: VTy, V: 0), Name: name);
1744 }
1745
1746 Type *ptrToIntPtrType(Type *PtrTy) const {
1747 if (VectorType *VectTy = dyn_cast<VectorType>(Val: PtrTy)) {
1748 return VectorType::get(ElementType: ptrToIntPtrType(PtrTy: VectTy->getElementType()),
1749 EC: VectTy->getElementCount());
1750 }
1751 assert(PtrTy->isIntOrPtrTy());
1752 return MS.IntptrTy;
1753 }
1754
1755 Type *getPtrToShadowPtrType(Type *IntPtrTy, Type *ShadowTy) const {
1756 if (VectorType *VectTy = dyn_cast<VectorType>(Val: IntPtrTy)) {
1757 return VectorType::get(
1758 ElementType: getPtrToShadowPtrType(IntPtrTy: VectTy->getElementType(), ShadowTy),
1759 EC: VectTy->getElementCount());
1760 }
1761 assert(IntPtrTy == MS.IntptrTy);
1762 return MS.PtrTy;
1763 }
1764
1765 Constant *constToIntPtr(Type *IntPtrTy, uint64_t C) const {
1766 if (VectorType *VectTy = dyn_cast<VectorType>(Val: IntPtrTy)) {
1767 return ConstantVector::getSplat(
1768 EC: VectTy->getElementCount(),
1769 Elt: constToIntPtr(IntPtrTy: VectTy->getElementType(), C));
1770 }
1771 assert(IntPtrTy == MS.IntptrTy);
1772 return ConstantInt::get(Ty: MS.IntptrTy, V: C);
1773 }
1774
1775 /// Returns the integer shadow offset that corresponds to a given
1776 /// application address, whereby:
1777 ///
1778 /// Offset = (Addr & ~AndMask) ^ XorMask
1779 /// Shadow = ShadowBase + Offset
1780 /// Origin = (OriginBase + Offset) & ~Alignment
1781 ///
1782  /// Note: for efficiency, many shadow mappings only require the XorMask and
1783  /// OriginBase; the AndMask and ShadowBase are often zero.
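  ///
  /// For example, with a mapping where AndMask and ShadowBase are zero (a
  /// common configuration), this reduces to:
  ///   Shadow(Addr) = Addr ^ XorMask
  ///   Origin(Addr) = ((Addr ^ XorMask) + OriginBase) & ~3ULL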
1784 Value *getShadowPtrOffset(Value *Addr, IRBuilder<> &IRB) {
1785 Type *IntptrTy = ptrToIntPtrType(PtrTy: Addr->getType());
1786 Value *OffsetLong = IRB.CreatePointerCast(V: Addr, DestTy: IntptrTy);
1787
1788 if (uint64_t AndMask = MS.MapParams->AndMask)
1789 OffsetLong = IRB.CreateAnd(LHS: OffsetLong, RHS: constToIntPtr(IntPtrTy: IntptrTy, C: ~AndMask));
1790
1791 if (uint64_t XorMask = MS.MapParams->XorMask)
1792 OffsetLong = IRB.CreateXor(LHS: OffsetLong, RHS: constToIntPtr(IntPtrTy: IntptrTy, C: XorMask));
1793 return OffsetLong;
1794 }
1795
1796 /// Compute the shadow and origin addresses corresponding to a given
1797 /// application address.
1798 ///
1799 /// Shadow = ShadowBase + Offset
1800 /// Origin = (OriginBase + Offset) & ~3ULL
1801  /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow type
1802  /// of a single pointee.
1803 /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1804 std::pair<Value *, Value *>
1805 getShadowOriginPtrUserspace(Value *Addr, IRBuilder<> &IRB, Type *ShadowTy,
1806 MaybeAlign Alignment) {
1807 VectorType *VectTy = dyn_cast<VectorType>(Val: Addr->getType());
1808 if (!VectTy) {
1809 assert(Addr->getType()->isPointerTy());
1810 } else {
1811 assert(VectTy->getElementType()->isPointerTy());
1812 }
1813 Type *IntptrTy = ptrToIntPtrType(PtrTy: Addr->getType());
1814 Value *ShadowOffset = getShadowPtrOffset(Addr, IRB);
1815 Value *ShadowLong = ShadowOffset;
1816 if (uint64_t ShadowBase = MS.MapParams->ShadowBase) {
1817 ShadowLong =
1818 IRB.CreateAdd(LHS: ShadowLong, RHS: constToIntPtr(IntPtrTy: IntptrTy, C: ShadowBase));
1819 }
1820 Value *ShadowPtr = IRB.CreateIntToPtr(
1821 V: ShadowLong, DestTy: getPtrToShadowPtrType(IntPtrTy: IntptrTy, ShadowTy));
1822
1823 Value *OriginPtr = nullptr;
1824 if (MS.TrackOrigins) {
1825 Value *OriginLong = ShadowOffset;
1826 uint64_t OriginBase = MS.MapParams->OriginBase;
1827 if (OriginBase != 0)
1828 OriginLong =
1829 IRB.CreateAdd(LHS: OriginLong, RHS: constToIntPtr(IntPtrTy: IntptrTy, C: OriginBase));
1830 if (!Alignment || *Alignment < kMinOriginAlignment) {
1831 uint64_t Mask = kMinOriginAlignment.value() - 1;
1832 OriginLong = IRB.CreateAnd(LHS: OriginLong, RHS: constToIntPtr(IntPtrTy: IntptrTy, C: ~Mask));
1833 }
1834 OriginPtr = IRB.CreateIntToPtr(
1835 V: OriginLong, DestTy: getPtrToShadowPtrType(IntPtrTy: IntptrTy, ShadowTy: MS.OriginTy));
1836 }
1837 return std::make_pair(x&: ShadowPtr, y&: OriginPtr);
1838 }
1839
1840 template <typename... ArgsTy>
1841 Value *createMetadataCall(IRBuilder<> &IRB, FunctionCallee Callee,
1842 ArgsTy... Args) {
1843 if (MS.TargetTriple.getArch() == Triple::systemz) {
1844 IRB.CreateCall(Callee,
1845 {MS.MsanMetadataAlloca, std::forward<ArgsTy>(Args)...});
1846 return IRB.CreateLoad(Ty: MS.MsanMetadata, Ptr: MS.MsanMetadataAlloca);
1847 }
1848
1849 return IRB.CreateCall(Callee, {std::forward<ArgsTy>(Args)...});
1850 }
1851
1852 std::pair<Value *, Value *> getShadowOriginPtrKernelNoVec(Value *Addr,
1853 IRBuilder<> &IRB,
1854 Type *ShadowTy,
1855 bool isStore) {
1856 Value *ShadowOriginPtrs;
1857 const DataLayout &DL = F.getDataLayout();
1858 TypeSize Size = DL.getTypeStoreSize(Ty: ShadowTy);
1859
1860 FunctionCallee Getter = MS.getKmsanShadowOriginAccessFn(isStore, size: Size);
1861 Value *AddrCast = IRB.CreatePointerCast(V: Addr, DestTy: MS.PtrTy);
1862 if (Getter) {
1863 ShadowOriginPtrs = createMetadataCall(IRB, Callee: Getter, Args: AddrCast);
1864 } else {
1865 Value *SizeVal = ConstantInt::get(Ty: MS.IntptrTy, V: Size);
1866 ShadowOriginPtrs = createMetadataCall(
1867 IRB,
1868 Callee: isStore ? MS.MsanMetadataPtrForStoreN : MS.MsanMetadataPtrForLoadN,
1869 Args: AddrCast, Args: SizeVal);
1870 }
1871 Value *ShadowPtr = IRB.CreateExtractValue(Agg: ShadowOriginPtrs, Idxs: 0);
1872 ShadowPtr = IRB.CreatePointerCast(V: ShadowPtr, DestTy: MS.PtrTy);
1873 Value *OriginPtr = IRB.CreateExtractValue(Agg: ShadowOriginPtrs, Idxs: 1);
1874
1875 return std::make_pair(x&: ShadowPtr, y&: OriginPtr);
1876 }
1877
1878  /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow type
1879  /// of a single pointee.
1880 /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1881 std::pair<Value *, Value *> getShadowOriginPtrKernel(Value *Addr,
1882 IRBuilder<> &IRB,
1883 Type *ShadowTy,
1884 bool isStore) {
1885 VectorType *VectTy = dyn_cast<VectorType>(Val: Addr->getType());
1886 if (!VectTy) {
1887 assert(Addr->getType()->isPointerTy());
1888 return getShadowOriginPtrKernelNoVec(Addr, IRB, ShadowTy, isStore);
1889 }
1890
1891    // TODO: Support callbacks with vectors of addresses.
1892 unsigned NumElements = cast<FixedVectorType>(Val: VectTy)->getNumElements();
1893 Value *ShadowPtrs = ConstantInt::getNullValue(
1894 Ty: FixedVectorType::get(ElementType: IRB.getPtrTy(), NumElts: NumElements));
1895 Value *OriginPtrs = nullptr;
1896 if (MS.TrackOrigins)
1897 OriginPtrs = ConstantInt::getNullValue(
1898 Ty: FixedVectorType::get(ElementType: IRB.getPtrTy(), NumElts: NumElements));
1899 for (unsigned i = 0; i < NumElements; ++i) {
1900 Value *OneAddr =
1901 IRB.CreateExtractElement(Vec: Addr, Idx: ConstantInt::get(Ty: IRB.getInt32Ty(), V: i));
1902 auto [ShadowPtr, OriginPtr] =
1903 getShadowOriginPtrKernelNoVec(Addr: OneAddr, IRB, ShadowTy, isStore);
1904
1905 ShadowPtrs = IRB.CreateInsertElement(
1906 Vec: ShadowPtrs, NewElt: ShadowPtr, Idx: ConstantInt::get(Ty: IRB.getInt32Ty(), V: i));
1907 if (MS.TrackOrigins)
1908 OriginPtrs = IRB.CreateInsertElement(
1909 Vec: OriginPtrs, NewElt: OriginPtr, Idx: ConstantInt::get(Ty: IRB.getInt32Ty(), V: i));
1910 }
1911 return {ShadowPtrs, OriginPtrs};
1912 }
1913
1914 std::pair<Value *, Value *> getShadowOriginPtr(Value *Addr, IRBuilder<> &IRB,
1915 Type *ShadowTy,
1916 MaybeAlign Alignment,
1917 bool isStore) {
1918 if (MS.CompileKernel)
1919 return getShadowOriginPtrKernel(Addr, IRB, ShadowTy, isStore);
1920 return getShadowOriginPtrUserspace(Addr, IRB, ShadowTy, Alignment);
1921 }
1922
1923 /// Compute the shadow address for a given function argument.
1924 ///
1925 /// Shadow = ParamTLS+ArgOffset.
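  ///
  /// For example, for f(i32 %a, i32 %b) the shadow of %a lives at
  /// __msan_param_tls + 0 and the shadow of %b at __msan_param_tls + 8, since
  /// argument offsets are rounded up to kShadowTLSAlignment.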
1926 Value *getShadowPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
1927 Value *Base = IRB.CreatePointerCast(V: MS.ParamTLS, DestTy: MS.IntptrTy);
1928 if (ArgOffset)
1929 Base = IRB.CreateAdd(LHS: Base, RHS: ConstantInt::get(Ty: MS.IntptrTy, V: ArgOffset));
1930 return IRB.CreateIntToPtr(V: Base, DestTy: IRB.getPtrTy(AddrSpace: 0), Name: "_msarg");
1931 }
1932
1933 /// Compute the origin address for a given function argument.
1934 Value *getOriginPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
1935 if (!MS.TrackOrigins)
1936 return nullptr;
1937 Value *Base = IRB.CreatePointerCast(V: MS.ParamOriginTLS, DestTy: MS.IntptrTy);
1938 if (ArgOffset)
1939 Base = IRB.CreateAdd(LHS: Base, RHS: ConstantInt::get(Ty: MS.IntptrTy, V: ArgOffset));
1940 return IRB.CreateIntToPtr(V: Base, DestTy: IRB.getPtrTy(AddrSpace: 0), Name: "_msarg_o");
1941 }
1942
1943 /// Compute the shadow address for a retval.
1944 Value *getShadowPtrForRetval(IRBuilder<> &IRB) {
1945 return IRB.CreatePointerCast(V: MS.RetvalTLS, DestTy: IRB.getPtrTy(AddrSpace: 0), Name: "_msret");
1946 }
1947
1948 /// Compute the origin address for a retval.
1949 Value *getOriginPtrForRetval() {
1950 // We keep a single origin for the entire retval. Might be too optimistic.
1951 return MS.RetvalOriginTLS;
1952 }
1953
1954 /// Set SV to be the shadow value for V.
1955 void setShadow(Value *V, Value *SV) {
1956 assert(!ShadowMap.count(V) && "Values may only have one shadow");
1957 ShadowMap[V] = PropagateShadow ? SV : getCleanShadow(V);
1958 }
1959
1960 /// Set Origin to be the origin value for V.
1961 void setOrigin(Value *V, Value *Origin) {
1962 if (!MS.TrackOrigins)
1963 return;
1964 assert(!OriginMap.count(V) && "Values may only have one origin");
1965 LLVM_DEBUG(dbgs() << "ORIGIN: " << *V << " ==> " << *Origin << "\n");
1966 OriginMap[V] = Origin;
1967 }
1968
1969 Constant *getCleanShadow(Type *OrigTy) {
1970 Type *ShadowTy = getShadowTy(OrigTy);
1971 if (!ShadowTy)
1972 return nullptr;
1973 return Constant::getNullValue(Ty: ShadowTy);
1974 }
1975
1976 /// Create a clean shadow value for a given value.
1977 ///
1978 /// Clean shadow (all zeroes) means all bits of the value are defined
1979 /// (initialized).
1980 Constant *getCleanShadow(Value *V) { return getCleanShadow(OrigTy: V->getType()); }
1981
1982 /// Create a dirty shadow of a given shadow type.
1983 Constant *getPoisonedShadow(Type *ShadowTy) {
1984 assert(ShadowTy);
1985 if (isa<IntegerType>(Val: ShadowTy) || isa<VectorType>(Val: ShadowTy))
1986 return Constant::getAllOnesValue(Ty: ShadowTy);
1987 if (ArrayType *AT = dyn_cast<ArrayType>(Val: ShadowTy)) {
1988 SmallVector<Constant *, 4> Vals(AT->getNumElements(),
1989 getPoisonedShadow(ShadowTy: AT->getElementType()));
1990 return ConstantArray::get(T: AT, V: Vals);
1991 }
1992 if (StructType *ST = dyn_cast<StructType>(Val: ShadowTy)) {
1993 SmallVector<Constant *, 4> Vals;
1994 for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
1995 Vals.push_back(Elt: getPoisonedShadow(ShadowTy: ST->getElementType(N: i)));
1996 return ConstantStruct::get(T: ST, V: Vals);
1997 }
1998 llvm_unreachable("Unexpected shadow type");
1999 }
2000
2001 /// Create a dirty shadow for a given value.
2002 Constant *getPoisonedShadow(Value *V) {
2003 Type *ShadowTy = getShadowTy(V);
2004 if (!ShadowTy)
2005 return nullptr;
2006 return getPoisonedShadow(ShadowTy);
2007 }
2008
2009 /// Create a clean (zero) origin.
2010 Value *getCleanOrigin() { return Constant::getNullValue(Ty: MS.OriginTy); }
2011
2012 /// Get the shadow value for a given Value.
2013 ///
2014 /// This function either returns the value set earlier with setShadow,
2015  /// or extracts it from ParamTLS (for function arguments).
2016 Value *getShadow(Value *V) {
2017 if (Instruction *I = dyn_cast<Instruction>(Val: V)) {
2018 if (!PropagateShadow || I->getMetadata(KindID: LLVMContext::MD_nosanitize))
2019 return getCleanShadow(V);
2020 // For instructions the shadow is already stored in the map.
2021 Value *Shadow = ShadowMap[V];
2022 if (!Shadow) {
2023 LLVM_DEBUG(dbgs() << "No shadow: " << *V << "\n" << *(I->getParent()));
2024 assert(Shadow && "No shadow for a value");
2025 }
2026 return Shadow;
2027 }
2028 // Handle fully undefined values
2029 // (partially undefined constant vectors are handled later)
2030 if ([[maybe_unused]] UndefValue *U = dyn_cast<UndefValue>(Val: V)) {
2031 Value *AllOnes = (PropagateShadow && PoisonUndef) ? getPoisonedShadow(V)
2032 : getCleanShadow(V);
2033 LLVM_DEBUG(dbgs() << "Undef: " << *U << " ==> " << *AllOnes << "\n");
2034 return AllOnes;
2035 }
2036 if (Argument *A = dyn_cast<Argument>(Val: V)) {
2037 // For arguments we compute the shadow on demand and store it in the map.
2038 Value *&ShadowPtr = ShadowMap[V];
2039 if (ShadowPtr)
2040 return ShadowPtr;
2041 Function *F = A->getParent();
2042 IRBuilder<> EntryIRB(FnPrologueEnd);
2043 unsigned ArgOffset = 0;
2044 const DataLayout &DL = F->getDataLayout();
2045 for (auto &FArg : F->args()) {
2046 if (!FArg.getType()->isSized() || FArg.getType()->isScalableTy()) {
2047 LLVM_DEBUG(dbgs() << (FArg.getType()->isScalableTy()
2048 ? "vscale not fully supported\n"
2049 : "Arg is not sized\n"));
2050 if (A == &FArg) {
2051 ShadowPtr = getCleanShadow(V);
2052 setOrigin(V: A, Origin: getCleanOrigin());
2053 break;
2054 }
2055 continue;
2056 }
2057
2058 unsigned Size = FArg.hasByValAttr()
2059 ? DL.getTypeAllocSize(Ty: FArg.getParamByValType())
2060 : DL.getTypeAllocSize(Ty: FArg.getType());
2061
2062 if (A == &FArg) {
2063 bool Overflow = ArgOffset + Size > kParamTLSSize;
2064 if (FArg.hasByValAttr()) {
2065            // The ByVal pointer itself has a clean shadow. We copy the shadow of the
2066            // actual argument to the underlying memory.
2067            // Figure out the maximal valid memcpy alignment.
2068 const Align ArgAlign = DL.getValueOrABITypeAlignment(
2069 Alignment: FArg.getParamAlign(), Ty: FArg.getParamByValType());
2070 Value *CpShadowPtr, *CpOriginPtr;
2071 std::tie(args&: CpShadowPtr, args&: CpOriginPtr) =
2072 getShadowOriginPtr(Addr: V, IRB&: EntryIRB, ShadowTy: EntryIRB.getInt8Ty(), Alignment: ArgAlign,
2073 /*isStore*/ true);
2074 if (!PropagateShadow || Overflow) {
2075 // ParamTLS overflow.
2076 EntryIRB.CreateMemSet(
2077 Ptr: CpShadowPtr, Val: Constant::getNullValue(Ty: EntryIRB.getInt8Ty()),
2078 Size, Align: ArgAlign);
2079 } else {
2080 Value *Base = getShadowPtrForArgument(IRB&: EntryIRB, ArgOffset);
2081 const Align CopyAlign = std::min(a: ArgAlign, b: kShadowTLSAlignment);
2082 [[maybe_unused]] Value *Cpy = EntryIRB.CreateMemCpy(
2083 Dst: CpShadowPtr, DstAlign: CopyAlign, Src: Base, SrcAlign: CopyAlign, Size);
2084 LLVM_DEBUG(dbgs() << " ByValCpy: " << *Cpy << "\n");
2085
2086 if (MS.TrackOrigins) {
2087 Value *OriginPtr = getOriginPtrForArgument(IRB&: EntryIRB, ArgOffset);
2088 // FIXME: OriginSize should be:
2089 // alignTo(V % kMinOriginAlignment + Size, kMinOriginAlignment)
2090 unsigned OriginSize = alignTo(Size, A: kMinOriginAlignment);
2091 EntryIRB.CreateMemCpy(
2092 Dst: CpOriginPtr,
2093 /* by getShadowOriginPtr */ DstAlign: kMinOriginAlignment, Src: OriginPtr,
2094 /* by origin_tls[ArgOffset] */ SrcAlign: kMinOriginAlignment,
2095 Size: OriginSize);
2096 }
2097 }
2098 }
2099
2100 if (!PropagateShadow || Overflow || FArg.hasByValAttr() ||
2101 (MS.EagerChecks && FArg.hasAttribute(Kind: Attribute::NoUndef))) {
2102 ShadowPtr = getCleanShadow(V);
2103 setOrigin(V: A, Origin: getCleanOrigin());
2104 } else {
2105 // Shadow over TLS
2106 Value *Base = getShadowPtrForArgument(IRB&: EntryIRB, ArgOffset);
2107 ShadowPtr = EntryIRB.CreateAlignedLoad(Ty: getShadowTy(V: &FArg), Ptr: Base,
2108 Align: kShadowTLSAlignment);
2109 if (MS.TrackOrigins) {
2110 Value *OriginPtr = getOriginPtrForArgument(IRB&: EntryIRB, ArgOffset);
2111 setOrigin(V: A, Origin: EntryIRB.CreateLoad(Ty: MS.OriginTy, Ptr: OriginPtr));
2112 }
2113 }
2114 LLVM_DEBUG(dbgs()
2115 << " ARG: " << FArg << " ==> " << *ShadowPtr << "\n");
2116 break;
2117 }
2118
2119 ArgOffset += alignTo(Size, A: kShadowTLSAlignment);
2120 }
2121 assert(ShadowPtr && "Could not find shadow for an argument");
2122 return ShadowPtr;
2123 }
2124
2125 // Check for partially-undefined constant vectors
2126 // TODO: scalable vectors (this is hard because we do not have IRBuilder)
2127 if (isa<FixedVectorType>(Val: V->getType()) && isa<Constant>(Val: V) &&
2128 cast<Constant>(Val: V)->containsUndefOrPoisonElement() && PropagateShadow &&
2129 PoisonUndefVectors) {
2130 unsigned NumElems = cast<FixedVectorType>(Val: V->getType())->getNumElements();
2131 SmallVector<Constant *, 32> ShadowVector(NumElems);
2132 for (unsigned i = 0; i != NumElems; ++i) {
2133 Constant *Elem = cast<Constant>(Val: V)->getAggregateElement(Elt: i);
2134 ShadowVector[i] = isa<UndefValue>(Val: Elem) ? getPoisonedShadow(V: Elem)
2135 : getCleanShadow(V: Elem);
2136 }
2137
2138 Value *ShadowConstant = ConstantVector::get(V: ShadowVector);
2139 LLVM_DEBUG(dbgs() << "Partial undef constant vector: " << *V << " ==> "
2140 << *ShadowConstant << "\n");
2141
2142 return ShadowConstant;
2143 }
2144
2145 // TODO: partially-undefined constant arrays, structures, and nested types
2146
2147 // For everything else the shadow is zero.
2148 return getCleanShadow(V);
2149 }
2150
2151 /// Get the shadow for i-th argument of the instruction I.
2152 Value *getShadow(Instruction *I, int i) {
2153 return getShadow(V: I->getOperand(i));
2154 }
2155
2156 /// Get the origin for a value.
2157 Value *getOrigin(Value *V) {
2158 if (!MS.TrackOrigins)
2159 return nullptr;
2160 if (!PropagateShadow || isa<Constant>(Val: V) || isa<InlineAsm>(Val: V))
2161 return getCleanOrigin();
2162 assert((isa<Instruction>(V) || isa<Argument>(V)) &&
2163 "Unexpected value type in getOrigin()");
2164 if (Instruction *I = dyn_cast<Instruction>(Val: V)) {
2165 if (I->getMetadata(KindID: LLVMContext::MD_nosanitize))
2166 return getCleanOrigin();
2167 }
2168 Value *Origin = OriginMap[V];
2169 assert(Origin && "Missing origin");
2170 return Origin;
2171 }
2172
2173 /// Get the origin for i-th argument of the instruction I.
2174 Value *getOrigin(Instruction *I, int i) {
2175 return getOrigin(V: I->getOperand(i));
2176 }
2177
2178 /// Remember the place where a shadow check should be inserted.
2179 ///
2180 /// This location will be later instrumented with a check that will print a
2181 /// UMR warning in runtime if the shadow value is not 0.
2182 void insertShadowCheck(Value *Shadow, Value *Origin, Instruction *OrigIns) {
2183 assert(Shadow);
2184 if (!InsertChecks)
2185 return;
2186
2187 if (!DebugCounter::shouldExecute(CounterName: DebugInsertCheck)) {
2188 LLVM_DEBUG(dbgs() << "Skipping check of " << *Shadow << " before "
2189 << *OrigIns << "\n");
2190 return;
2191 }
2192#ifndef NDEBUG
2193 Type *ShadowTy = Shadow->getType();
2194 assert((isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy) ||
2195 isa<StructType>(ShadowTy) || isa<ArrayType>(ShadowTy)) &&
2196 "Can only insert checks for integer, vector, and aggregate shadow "
2197 "types");
2198#endif
2199 InstrumentationList.push_back(
2200 Elt: ShadowOriginAndInsertPoint(Shadow, Origin, OrigIns));
2201 }
2202
2203 /// Remember the place where a shadow check should be inserted.
2204 ///
2205 /// This location will be later instrumented with a check that will print a
2206 /// UMR warning in runtime if the value is not fully defined.
2207 void insertShadowCheck(Value *Val, Instruction *OrigIns) {
2208 assert(Val);
2209 Value *Shadow, *Origin;
2210 if (ClCheckConstantShadow) {
2211 Shadow = getShadow(V: Val);
2212 if (!Shadow)
2213 return;
2214 Origin = getOrigin(V: Val);
2215 } else {
2216 Shadow = dyn_cast_or_null<Instruction>(Val: getShadow(V: Val));
2217 if (!Shadow)
2218 return;
2219 Origin = dyn_cast_or_null<Instruction>(Val: getOrigin(V: Val));
2220 }
2221 insertShadowCheck(Shadow, Origin, OrigIns);
2222 }
2223
2224 AtomicOrdering addReleaseOrdering(AtomicOrdering a) {
2225 switch (a) {
2226 case AtomicOrdering::NotAtomic:
2227 return AtomicOrdering::NotAtomic;
2228 case AtomicOrdering::Unordered:
2229 case AtomicOrdering::Monotonic:
2230 case AtomicOrdering::Release:
2231 return AtomicOrdering::Release;
2232 case AtomicOrdering::Acquire:
2233 case AtomicOrdering::AcquireRelease:
2234 return AtomicOrdering::AcquireRelease;
2235 case AtomicOrdering::SequentiallyConsistent:
2236 return AtomicOrdering::SequentiallyConsistent;
2237 }
2238 llvm_unreachable("Unknown ordering");
2239 }
2240
2241 Value *makeAddReleaseOrderingTable(IRBuilder<> &IRB) {
2242 constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2243 uint32_t OrderingTable[NumOrderings] = {};
2244
2245 OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2246 OrderingTable[(int)AtomicOrderingCABI::release] =
2247 (int)AtomicOrderingCABI::release;
2248 OrderingTable[(int)AtomicOrderingCABI::consume] =
2249 OrderingTable[(int)AtomicOrderingCABI::acquire] =
2250 OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2251 (int)AtomicOrderingCABI::acq_rel;
2252 OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2253 (int)AtomicOrderingCABI::seq_cst;
2254
2255 return ConstantDataVector::get(Context&: IRB.getContext(), Elts: OrderingTable);
2256 }
2257
2258 AtomicOrdering addAcquireOrdering(AtomicOrdering a) {
2259 switch (a) {
2260 case AtomicOrdering::NotAtomic:
2261 return AtomicOrdering::NotAtomic;
2262 case AtomicOrdering::Unordered:
2263 case AtomicOrdering::Monotonic:
2264 case AtomicOrdering::Acquire:
2265 return AtomicOrdering::Acquire;
2266 case AtomicOrdering::Release:
2267 case AtomicOrdering::AcquireRelease:
2268 return AtomicOrdering::AcquireRelease;
2269 case AtomicOrdering::SequentiallyConsistent:
2270 return AtomicOrdering::SequentiallyConsistent;
2271 }
2272 llvm_unreachable("Unknown ordering");
2273 }
2274
2275 Value *makeAddAcquireOrderingTable(IRBuilder<> &IRB) {
2276 constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2277 uint32_t OrderingTable[NumOrderings] = {};
2278
2279 OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2280 OrderingTable[(int)AtomicOrderingCABI::acquire] =
2281 OrderingTable[(int)AtomicOrderingCABI::consume] =
2282 (int)AtomicOrderingCABI::acquire;
2283 OrderingTable[(int)AtomicOrderingCABI::release] =
2284 OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2285 (int)AtomicOrderingCABI::acq_rel;
2286 OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2287 (int)AtomicOrderingCABI::seq_cst;
2288
2289 return ConstantDataVector::get(Context&: IRB.getContext(), Elts: OrderingTable);
2290 }
2291
2292 // ------------------- Visitors.
2293 using InstVisitor<MemorySanitizerVisitor>::visit;
2294 void visit(Instruction &I) {
2295 if (I.getMetadata(KindID: LLVMContext::MD_nosanitize))
2296 return;
2297 // Don't want to visit if we're in the prologue
2298 if (isInPrologue(I))
2299 return;
2300 if (!DebugCounter::shouldExecute(CounterName: DebugInstrumentInstruction)) {
2301 LLVM_DEBUG(dbgs() << "Skipping instruction: " << I << "\n");
2302 // We still need to set the shadow and origin to clean values.
2303 setShadow(V: &I, SV: getCleanShadow(V: &I));
2304 setOrigin(V: &I, Origin: getCleanOrigin());
2305 return;
2306 }
2307
2308 Instructions.push_back(Elt: &I);
2309 }
2310
2311 /// Instrument LoadInst
2312 ///
2313 /// Loads the corresponding shadow and (optionally) origin.
2314 /// Optionally, checks that the load address is fully defined.
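  ///
  /// Conceptually, for %v = load i32, ptr %p the shadow of %v is loaded from
  /// the shadow memory of %p (the "_msld" load below) and, with origin
  /// tracking, its origin is loaded alongside.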
2315 void visitLoadInst(LoadInst &I) {
2316 assert(I.getType()->isSized() && "Load type must have size");
2317 assert(!I.getMetadata(LLVMContext::MD_nosanitize));
2318 NextNodeIRBuilder IRB(&I);
2319 Type *ShadowTy = getShadowTy(V: &I);
2320 Value *Addr = I.getPointerOperand();
2321 Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
2322 const Align Alignment = I.getAlign();
2323 if (PropagateShadow) {
2324 std::tie(args&: ShadowPtr, args&: OriginPtr) =
2325 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
2326 setShadow(V: &I,
2327 SV: IRB.CreateAlignedLoad(Ty: ShadowTy, Ptr: ShadowPtr, Align: Alignment, Name: "_msld"));
2328 } else {
2329 setShadow(V: &I, SV: getCleanShadow(V: &I));
2330 }
2331
2332 if (ClCheckAccessAddress)
2333 insertShadowCheck(Val: I.getPointerOperand(), OrigIns: &I);
2334
2335 if (I.isAtomic())
2336 I.setOrdering(addAcquireOrdering(a: I.getOrdering()));
2337
2338 if (MS.TrackOrigins) {
2339 if (PropagateShadow) {
2340 const Align OriginAlignment = std::max(a: kMinOriginAlignment, b: Alignment);
2341 setOrigin(
2342 V: &I, Origin: IRB.CreateAlignedLoad(Ty: MS.OriginTy, Ptr: OriginPtr, Align: OriginAlignment));
2343 } else {
2344 setOrigin(V: &I, Origin: getCleanOrigin());
2345 }
2346 }
2347 }
2348
2349 /// Instrument StoreInst
2350 ///
2351 /// Stores the corresponding shadow and (optionally) origin.
2352 /// Optionally, checks that the store address is fully defined.
2353 void visitStoreInst(StoreInst &I) {
2354 StoreList.push_back(Elt: &I);
2355 if (ClCheckAccessAddress)
2356 insertShadowCheck(Val: I.getPointerOperand(), OrigIns: &I);
2357 }
2358
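  // Common handling for atomic RMW and cmpxchg: optionally check the address
  // (and, for cmpxchg, the compare operand), store a clean shadow for the
  // accessed memory, and treat the result as fully initialized.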
2359 void handleCASOrRMW(Instruction &I) {
2360 assert(isa<AtomicRMWInst>(I) || isa<AtomicCmpXchgInst>(I));
2361
2362 IRBuilder<> IRB(&I);
2363 Value *Addr = I.getOperand(i: 0);
2364 Value *Val = I.getOperand(i: 1);
2365 Value *ShadowPtr = getShadowOriginPtr(Addr, IRB, ShadowTy: getShadowTy(V: Val), Alignment: Align(1),
2366 /*isStore*/ true)
2367 .first;
2368
2369 if (ClCheckAccessAddress)
2370 insertShadowCheck(Val: Addr, OrigIns: &I);
2371
2372 // Only test the conditional argument of cmpxchg instruction.
2373    // The other argument can potentially be uninitialized, but we cannot
2374    // detect this situation reliably without risking false positives.
2375 if (isa<AtomicCmpXchgInst>(Val: I))
2376 insertShadowCheck(Val, OrigIns: &I);
2377
2378 IRB.CreateStore(Val: getCleanShadow(V: Val), Ptr: ShadowPtr);
2379
2380 setShadow(V: &I, SV: getCleanShadow(V: &I));
2381 setOrigin(V: &I, Origin: getCleanOrigin());
2382 }
2383
2384 void visitAtomicRMWInst(AtomicRMWInst &I) {
2385 handleCASOrRMW(I);
2386 I.setOrdering(addReleaseOrdering(a: I.getOrdering()));
2387 }
2388
2389 void visitAtomicCmpXchgInst(AtomicCmpXchgInst &I) {
2390 handleCASOrRMW(I);
2391 I.setSuccessOrdering(addReleaseOrdering(a: I.getSuccessOrdering()));
2392 }
2393
2394 // Vector manipulation.
2395 void visitExtractElementInst(ExtractElementInst &I) {
2396 insertShadowCheck(Val: I.getOperand(i_nocapture: 1), OrigIns: &I);
2397 IRBuilder<> IRB(&I);
2398 setShadow(V: &I, SV: IRB.CreateExtractElement(Vec: getShadow(I: &I, i: 0), Idx: I.getOperand(i_nocapture: 1),
2399 Name: "_msprop"));
2400 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
2401 }
2402
2403 void visitInsertElementInst(InsertElementInst &I) {
2404 insertShadowCheck(Val: I.getOperand(i_nocapture: 2), OrigIns: &I);
2405 IRBuilder<> IRB(&I);
2406 auto *Shadow0 = getShadow(I: &I, i: 0);
2407 auto *Shadow1 = getShadow(I: &I, i: 1);
2408 setShadow(V: &I, SV: IRB.CreateInsertElement(Vec: Shadow0, NewElt: Shadow1, Idx: I.getOperand(i_nocapture: 2),
2409 Name: "_msprop"));
2410 setOriginForNaryOp(I);
2411 }
2412
2413 void visitShuffleVectorInst(ShuffleVectorInst &I) {
2414 IRBuilder<> IRB(&I);
2415 auto *Shadow0 = getShadow(I: &I, i: 0);
2416 auto *Shadow1 = getShadow(I: &I, i: 1);
2417 setShadow(V: &I, SV: IRB.CreateShuffleVector(V1: Shadow0, V2: Shadow1, Mask: I.getShuffleMask(),
2418 Name: "_msprop"));
2419 setOriginForNaryOp(I);
2420 }
2421
2422 // Casts.
2423 void visitSExtInst(SExtInst &I) {
2424 IRBuilder<> IRB(&I);
2425 setShadow(V: &I, SV: IRB.CreateSExt(V: getShadow(I: &I, i: 0), DestTy: I.getType(), Name: "_msprop"));
2426 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
2427 }
2428
2429 void visitZExtInst(ZExtInst &I) {
2430 IRBuilder<> IRB(&I);
2431 setShadow(V: &I, SV: IRB.CreateZExt(V: getShadow(I: &I, i: 0), DestTy: I.getType(), Name: "_msprop"));
2432 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
2433 }
2434
2435 void visitTruncInst(TruncInst &I) {
2436 IRBuilder<> IRB(&I);
2437 setShadow(V: &I, SV: IRB.CreateTrunc(V: getShadow(I: &I, i: 0), DestTy: I.getType(), Name: "_msprop"));
2438 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
2439 }
2440
2441 void visitBitCastInst(BitCastInst &I) {
2442 // Special case: if this is the bitcast (there is exactly 1 allowed) between
2443 // a musttail call and a ret, don't instrument. New instructions are not
2444 // allowed after a musttail call.
2445 if (auto *CI = dyn_cast<CallInst>(Val: I.getOperand(i_nocapture: 0)))
2446 if (CI->isMustTailCall())
2447 return;
2448 IRBuilder<> IRB(&I);
2449 setShadow(V: &I, SV: IRB.CreateBitCast(V: getShadow(I: &I, i: 0), DestTy: getShadowTy(V: &I)));
2450 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
2451 }
2452
2453 void visitPtrToIntInst(PtrToIntInst &I) {
2454 IRBuilder<> IRB(&I);
2455 setShadow(V: &I, SV: IRB.CreateIntCast(V: getShadow(I: &I, i: 0), DestTy: getShadowTy(V: &I), isSigned: false,
2456 Name: "_msprop_ptrtoint"));
2457 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
2458 }
2459
2460 void visitIntToPtrInst(IntToPtrInst &I) {
2461 IRBuilder<> IRB(&I);
2462 setShadow(V: &I, SV: IRB.CreateIntCast(V: getShadow(I: &I, i: 0), DestTy: getShadowTy(V: &I), isSigned: false,
2463 Name: "_msprop_inttoptr"));
2464 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
2465 }
2466
2467 void visitFPToSIInst(CastInst &I) { handleShadowOr(I); }
2468 void visitFPToUIInst(CastInst &I) { handleShadowOr(I); }
2469 void visitSIToFPInst(CastInst &I) { handleShadowOr(I); }
2470 void visitUIToFPInst(CastInst &I) { handleShadowOr(I); }
2471 void visitFPExtInst(CastInst &I) { handleShadowOr(I); }
2472 void visitFPTruncInst(CastInst &I) { handleShadowOr(I); }
2473
2474 /// Propagate shadow for bitwise AND.
2475 ///
2476  /// This code is exact, i.e. if, for example, a bit in the left argument
2477  /// is defined and 0, then neither the value nor the definedness of the
2478  /// corresponding bit in the right argument affects the resulting shadow.
2479 void visitAnd(BinaryOperator &I) {
2480 IRBuilder<> IRB(&I);
2481    // "And" of 0 and a poisoned value results in an unpoisoned value.
2482 // 1&1 => 1; 0&1 => 0; p&1 => p;
2483 // 1&0 => 0; 0&0 => 0; p&0 => 0;
2484 // 1&p => p; 0&p => 0; p&p => p;
2485 // S = (S1 & S2) | (V1 & S2) | (S1 & V2)
2486 Value *S1 = getShadow(I: &I, i: 0);
2487 Value *S2 = getShadow(I: &I, i: 1);
2488 Value *V1 = I.getOperand(i_nocapture: 0);
2489 Value *V2 = I.getOperand(i_nocapture: 1);
2490 if (V1->getType() != S1->getType()) {
2491 V1 = IRB.CreateIntCast(V: V1, DestTy: S1->getType(), isSigned: false);
2492 V2 = IRB.CreateIntCast(V: V2, DestTy: S2->getType(), isSigned: false);
2493 }
2494 Value *S1S2 = IRB.CreateAnd(LHS: S1, RHS: S2);
2495 Value *V1S2 = IRB.CreateAnd(LHS: V1, RHS: S2);
2496 Value *S1V2 = IRB.CreateAnd(LHS: S1, RHS: V2);
2497 setShadow(V: &I, SV: IRB.CreateOr(Ops: {S1S2, V1S2, S1V2}));
2498 setOriginForNaryOp(I);
2499 }
2500
2501 void visitOr(BinaryOperator &I) {
2502 IRBuilder<> IRB(&I);
2503    // "Or" of 1 and a poisoned value results in an unpoisoned value:
2504 // 1|1 => 1; 0|1 => 1; p|1 => 1;
2505 // 1|0 => 1; 0|0 => 0; p|0 => p;
2506 // 1|p => 1; 0|p => p; p|p => p;
2507 //
2508 // S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
2509 //
2510 // Addendum if the "Or" is "disjoint":
2511 // 1|1 => p;
2512 // S = S | (V1 & V2)
2513 Value *S1 = getShadow(I: &I, i: 0);
2514 Value *S2 = getShadow(I: &I, i: 1);
2515 Value *V1 = I.getOperand(i_nocapture: 0);
2516 Value *V2 = I.getOperand(i_nocapture: 1);
2517 if (V1->getType() != S1->getType()) {
2518 V1 = IRB.CreateIntCast(V: V1, DestTy: S1->getType(), isSigned: false);
2519 V2 = IRB.CreateIntCast(V: V2, DestTy: S2->getType(), isSigned: false);
2520 }
2521
2522 Value *NotV1 = IRB.CreateNot(V: V1);
2523 Value *NotV2 = IRB.CreateNot(V: V2);
2524
2525 Value *S1S2 = IRB.CreateAnd(LHS: S1, RHS: S2);
2526 Value *S2NotV1 = IRB.CreateAnd(LHS: NotV1, RHS: S2);
2527 Value *S1NotV2 = IRB.CreateAnd(LHS: S1, RHS: NotV2);
2528
2529 Value *S = IRB.CreateOr(Ops: {S1S2, S2NotV1, S1NotV2});
2530
2531 if (ClPreciseDisjointOr && cast<PossiblyDisjointInst>(Val: &I)->isDisjoint()) {
2532 Value *V1V2 = IRB.CreateAnd(LHS: V1, RHS: V2);
2533 S = IRB.CreateOr(LHS: S, RHS: V1V2, Name: "_ms_disjoint");
2534 }
2535
2536 setShadow(V: &I, SV: S);
2537 setOriginForNaryOp(I);
2538 }
2539
2540 /// Default propagation of shadow and/or origin.
2541 ///
2542 /// This class implements the general case of shadow propagation, used in all
2543 /// cases where we don't know and/or don't care about what the operation
2544 /// actually does. It converts all input shadow values to a common type
2545 /// (extending or truncating as necessary), and bitwise OR's them.
2546 ///
2547 /// This is much cheaper than inserting checks (i.e. requiring inputs to be
2548 /// fully initialized), and less prone to false positives.
2549 ///
2550 /// This class also implements the general case of origin propagation. For a
2551 /// Nary operation, result origin is set to the origin of an argument that is
2552 /// not entirely initialized. If there is more than one such arguments, the
2553  /// not entirely initialized. If there is more than one such argument, the
2554 /// arguments are initialized.
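  ///
  /// For example, for c = a + b the combined shadow is conceptually
  ///   Sc = Sa | Sb   (with the operands cast to a common shadow type)
  /// and the combined origin is origin(b) if Sb may be nonzero, and origin(a)
  /// otherwise.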
2555 template <bool CombineShadow> class Combiner {
2556 Value *Shadow = nullptr;
2557 Value *Origin = nullptr;
2558 IRBuilder<> &IRB;
2559 MemorySanitizerVisitor *MSV;
2560
2561 public:
2562 Combiner(MemorySanitizerVisitor *MSV, IRBuilder<> &IRB)
2563 : IRB(IRB), MSV(MSV) {}
2564
2565 /// Add a pair of shadow and origin values to the mix.
2566 Combiner &Add(Value *OpShadow, Value *OpOrigin) {
2567 if (CombineShadow) {
2568 assert(OpShadow);
2569 if (!Shadow)
2570 Shadow = OpShadow;
2571 else {
2572 OpShadow = MSV->CreateShadowCast(IRB, V: OpShadow, dstTy: Shadow->getType());
2573 Shadow = IRB.CreateOr(LHS: Shadow, RHS: OpShadow, Name: "_msprop");
2574 }
2575 }
2576
2577 if (MSV->MS.TrackOrigins) {
2578 assert(OpOrigin);
2579 if (!Origin) {
2580 Origin = OpOrigin;
2581 } else {
2582 Constant *ConstOrigin = dyn_cast<Constant>(Val: OpOrigin);
2583 // No point in adding something that might result in 0 origin value.
2584 if (!ConstOrigin || !ConstOrigin->isNullValue()) {
2585 Value *Cond = MSV->convertToBool(V: OpShadow, IRB);
2586 Origin = IRB.CreateSelect(C: Cond, True: OpOrigin, False: Origin);
2587 }
2588 }
2589 }
2590 return *this;
2591 }
2592
2593 /// Add an application value to the mix.
2594 Combiner &Add(Value *V) {
2595 Value *OpShadow = MSV->getShadow(V);
2596 Value *OpOrigin = MSV->MS.TrackOrigins ? MSV->getOrigin(V) : nullptr;
2597 return Add(OpShadow, OpOrigin);
2598 }
2599
2600 /// Set the current combined values as the given instruction's shadow
2601 /// and origin.
2602 void Done(Instruction *I) {
2603 if (CombineShadow) {
2604 assert(Shadow);
2605 Shadow = MSV->CreateShadowCast(IRB, V: Shadow, dstTy: MSV->getShadowTy(V: I));
2606 MSV->setShadow(V: I, SV: Shadow);
2607 }
2608 if (MSV->MS.TrackOrigins) {
2609 assert(Origin);
2610 MSV->setOrigin(V: I, Origin);
2611 }
2612 }
2613
2614 /// Store the current combined value at the specified origin
2615 /// location.
2616 void DoneAndStoreOrigin(TypeSize TS, Value *OriginPtr) {
2617 if (MSV->MS.TrackOrigins) {
2618 assert(Origin);
2619 MSV->paintOrigin(IRB, Origin, OriginPtr, TS, Alignment: kMinOriginAlignment);
2620 }
2621 }
2622 };
2623
2624 using ShadowAndOriginCombiner = Combiner<true>;
2625 using OriginCombiner = Combiner<false>;
2626
2627 /// Propagate origin for arbitrary operation.
2628 void setOriginForNaryOp(Instruction &I) {
2629 if (!MS.TrackOrigins)
2630 return;
2631 IRBuilder<> IRB(&I);
2632 OriginCombiner OC(this, IRB);
2633 for (Use &Op : I.operands())
2634 OC.Add(V: Op.get());
2635 OC.Done(I: &I);
2636 }
2637
2638 size_t VectorOrPrimitiveTypeSizeInBits(Type *Ty) {
2639 assert(!(Ty->isVectorTy() && Ty->getScalarType()->isPointerTy()) &&
2640 "Vector of pointers is not a valid shadow type");
2641 return Ty->isVectorTy() ? cast<FixedVectorType>(Val: Ty)->getNumElements() *
2642 Ty->getScalarSizeInBits()
2643 : Ty->getPrimitiveSizeInBits();
2644 }
2645
2646 /// Cast between two shadow types, extending or truncating as
2647 /// necessary.
2648 Value *CreateShadowCast(IRBuilder<> &IRB, Value *V, Type *dstTy,
2649 bool Signed = false) {
2650 Type *srcTy = V->getType();
2651 if (srcTy == dstTy)
2652 return V;
2653 size_t srcSizeInBits = VectorOrPrimitiveTypeSizeInBits(Ty: srcTy);
2654 size_t dstSizeInBits = VectorOrPrimitiveTypeSizeInBits(Ty: dstTy);
2655 if (srcSizeInBits > 1 && dstSizeInBits == 1)
2656 return IRB.CreateICmpNE(LHS: V, RHS: getCleanShadow(V));
2657
2658 if (dstTy->isIntegerTy() && srcTy->isIntegerTy())
2659 return IRB.CreateIntCast(V, DestTy: dstTy, isSigned: Signed);
2660 if (dstTy->isVectorTy() && srcTy->isVectorTy() &&
2661 cast<VectorType>(Val: dstTy)->getElementCount() ==
2662 cast<VectorType>(Val: srcTy)->getElementCount())
2663 return IRB.CreateIntCast(V, DestTy: dstTy, isSigned: Signed);
2664 Value *V1 = IRB.CreateBitCast(V, DestTy: Type::getIntNTy(C&: *MS.C, N: srcSizeInBits));
2665 Value *V2 =
2666 IRB.CreateIntCast(V: V1, DestTy: Type::getIntNTy(C&: *MS.C, N: dstSizeInBits), isSigned: Signed);
2667 return IRB.CreateBitCast(V: V2, DestTy: dstTy);
2668 // TODO: handle struct types.
2669 }
2670
2671 /// Cast an application value to the type of its own shadow.
2672 Value *CreateAppToShadowCast(IRBuilder<> &IRB, Value *V) {
2673 Type *ShadowTy = getShadowTy(V);
2674 if (V->getType() == ShadowTy)
2675 return V;
2676 if (V->getType()->isPtrOrPtrVectorTy())
2677 return IRB.CreatePtrToInt(V, DestTy: ShadowTy);
2678 else
2679 return IRB.CreateBitCast(V, DestTy: ShadowTy);
2680 }
2681
2682 /// Propagate shadow for arbitrary operation.
2683 void handleShadowOr(Instruction &I) {
2684 IRBuilder<> IRB(&I);
2685 ShadowAndOriginCombiner SC(this, IRB);
2686 for (Use &Op : I.operands())
2687 SC.Add(V: Op.get());
2688 SC.Done(I: &I);
2689 }
2690
2691 /// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
2692 /// fields.
2693 ///
2694 /// e.g., <2 x i32> @llvm.aarch64.neon.saddlp.v2i32.v4i16(<4 x i16>)
2695 /// <16 x i8> @llvm.aarch64.neon.addp.v16i8(<16 x i8>, <16 x i8>)
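  ///
  /// The shadow of each output element is the OR of the shadows of the two
  /// adjacent input elements it combines, e.g. shadow(result[0]) =
  /// shadow(in[0]) | shadow(in[1]).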
2696 void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I) {
2697 assert(I.arg_size() == 1 || I.arg_size() == 2);
2698
2699 assert(I.getType()->isVectorTy());
2700 assert(I.getArgOperand(0)->getType()->isVectorTy());
2701
2702 FixedVectorType *ParamType =
2703 cast<FixedVectorType>(Val: I.getArgOperand(i: 0)->getType());
2704 assert((I.arg_size() != 2) ||
2705 (ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
2706 [[maybe_unused]] FixedVectorType *ReturnType =
2707 cast<FixedVectorType>(Val: I.getType());
2708 assert(ParamType->getNumElements() * I.arg_size() ==
2709 2 * ReturnType->getNumElements());
2710
2711 IRBuilder<> IRB(&I);
2712 unsigned Width = ParamType->getNumElements() * I.arg_size();
2713
2714 // Horizontal OR of shadow
2715 SmallVector<int, 8> EvenMask;
2716 SmallVector<int, 8> OddMask;
2717 for (unsigned X = 0; X < Width; X += 2) {
2718 EvenMask.push_back(Elt: X);
2719 OddMask.push_back(Elt: X + 1);
2720 }
2721
2722 Value *FirstArgShadow = getShadow(I: &I, i: 0);
2723 Value *EvenShadow;
2724 Value *OddShadow;
2725 if (I.arg_size() == 2) {
2726 Value *SecondArgShadow = getShadow(I: &I, i: 1);
2727 EvenShadow =
2728 IRB.CreateShuffleVector(V1: FirstArgShadow, V2: SecondArgShadow, Mask: EvenMask);
2729 OddShadow =
2730 IRB.CreateShuffleVector(V1: FirstArgShadow, V2: SecondArgShadow, Mask: OddMask);
2731 } else {
2732 EvenShadow = IRB.CreateShuffleVector(V: FirstArgShadow, Mask: EvenMask);
2733 OddShadow = IRB.CreateShuffleVector(V: FirstArgShadow, Mask: OddMask);
2734 }
2735
2736 Value *OrShadow = IRB.CreateOr(LHS: EvenShadow, RHS: OddShadow);
2737 OrShadow = CreateShadowCast(IRB, V: OrShadow, dstTy: getShadowTy(V: &I));
2738
2739 setShadow(V: &I, SV: OrShadow);
2740 setOriginForNaryOp(I);
2741 }
2742
2743 /// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
2744 /// fields, with the parameters reinterpreted to have elements of a specified
2745 /// width. For example:
2746 /// @llvm.x86.ssse3.phadd.w(<1 x i64> [[VAR1]], <1 x i64> [[VAR2]])
2747 /// conceptually operates on
2748 /// (<4 x i16> [[VAR1]], <4 x i16> [[VAR2]])
2749 /// and can be handled with ReinterpretElemWidth == 16.
2750 void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I,
2751 int ReinterpretElemWidth) {
2752 assert(I.arg_size() == 1 || I.arg_size() == 2);
2753
2754 assert(I.getType()->isVectorTy());
2755 assert(I.getArgOperand(0)->getType()->isVectorTy());
2756
2757 FixedVectorType *ParamType =
2758 cast<FixedVectorType>(Val: I.getArgOperand(i: 0)->getType());
2759 assert((I.arg_size() != 2) ||
2760 (ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
2761
2762 [[maybe_unused]] FixedVectorType *ReturnType =
2763 cast<FixedVectorType>(Val: I.getType());
2764 assert(ParamType->getNumElements() * I.arg_size() ==
2765 2 * ReturnType->getNumElements());
2766
2767 IRBuilder<> IRB(&I);
2768
2769 unsigned TotalNumElems = ParamType->getNumElements() * I.arg_size();
2770 FixedVectorType *ReinterpretShadowTy = nullptr;
2771 assert(isAligned(Align(ReinterpretElemWidth),
2772 ParamType->getPrimitiveSizeInBits()));
2773 ReinterpretShadowTy = FixedVectorType::get(
2774 ElementType: IRB.getIntNTy(N: ReinterpretElemWidth),
2775 NumElts: ParamType->getPrimitiveSizeInBits() / ReinterpretElemWidth);
2776 TotalNumElems = ReinterpretShadowTy->getNumElements() * I.arg_size();
2777
2778 // Horizontal OR of shadow
2779 SmallVector<int, 8> EvenMask;
2780 SmallVector<int, 8> OddMask;
2781 for (unsigned X = 0; X < TotalNumElems - 1; X += 2) {
2782 EvenMask.push_back(Elt: X);
2783 OddMask.push_back(Elt: X + 1);
2784 }
2785
2786 Value *FirstArgShadow = getShadow(I: &I, i: 0);
2787 FirstArgShadow = IRB.CreateBitCast(V: FirstArgShadow, DestTy: ReinterpretShadowTy);
2788
2789 // If we had two parameters each with an odd number of elements, the total
2790 // number of elements is even, but we have never seen this in extant
2791 // instruction sets, so we enforce that each parameter must have an even
2792 // number of elements.
2793 assert(isAligned(
2794 Align(2),
2795 cast<FixedVectorType>(FirstArgShadow->getType())->getNumElements()));
2796
2797 Value *EvenShadow;
2798 Value *OddShadow;
2799 if (I.arg_size() == 2) {
2800 Value *SecondArgShadow = getShadow(I: &I, i: 1);
2801 SecondArgShadow = IRB.CreateBitCast(V: SecondArgShadow, DestTy: ReinterpretShadowTy);
2802
2803 EvenShadow =
2804 IRB.CreateShuffleVector(V1: FirstArgShadow, V2: SecondArgShadow, Mask: EvenMask);
2805 OddShadow =
2806 IRB.CreateShuffleVector(V1: FirstArgShadow, V2: SecondArgShadow, Mask: OddMask);
2807 } else {
2808 EvenShadow = IRB.CreateShuffleVector(V: FirstArgShadow, Mask: EvenMask);
2809 OddShadow = IRB.CreateShuffleVector(V: FirstArgShadow, Mask: OddMask);
2810 }
2811
2812 Value *OrShadow = IRB.CreateOr(LHS: EvenShadow, RHS: OddShadow);
2813 OrShadow = CreateShadowCast(IRB, V: OrShadow, dstTy: getShadowTy(V: &I));
2814
2815 setShadow(V: &I, SV: OrShadow);
2816 setOriginForNaryOp(I);
2817 }
2818
2819 void visitFNeg(UnaryOperator &I) { handleShadowOr(I); }
2820
2821 // Handle multiplication by constant.
2822 //
2823 // Handle a special case of multiplication by constant that may have one or
// more zeros in the lower bits. This makes the corresponding number of lower
// bits of the result zero as well. We model this by shifting the other operand's
2826 // shadow left by the required number of bits. Effectively, we transform
2827 // (X * (A * 2**B)) to ((X << B) * A) and instrument (X << B) as (Sx << B).
2828 // We use multiplication by 2**N instead of shift to cover the case of
2829 // multiplication by 0, which may occur in some elements of a vector operand.
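//
// For illustration (a sketch, not taken from any particular test): if an
// element is multiplied by the constant 24 = 3 * 2**3, the other operand's
// shadow is multiplied by 2**3 = 8, i.e. shifted left by 3. A constant
// element equal to 0 has bitwidth trailing zeros, so the shadow multiplier
// (1 << countr_zero) wraps to 0 and the corresponding result element becomes
// fully clean, which is correct because X * 0 is always initialized.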
2830 void handleMulByConstant(BinaryOperator &I, Constant *ConstArg,
2831 Value *OtherArg) {
2832 Constant *ShadowMul;
2833 Type *Ty = ConstArg->getType();
2834 if (auto *VTy = dyn_cast<VectorType>(Val: Ty)) {
2835 unsigned NumElements = cast<FixedVectorType>(Val: VTy)->getNumElements();
2836 Type *EltTy = VTy->getElementType();
2837 SmallVector<Constant *, 16> Elements;
2838 for (unsigned Idx = 0; Idx < NumElements; ++Idx) {
2839 if (ConstantInt *Elt =
2840 dyn_cast<ConstantInt>(Val: ConstArg->getAggregateElement(Elt: Idx))) {
2841 const APInt &V = Elt->getValue();
2842 APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
2843 Elements.push_back(Elt: ConstantInt::get(Ty: EltTy, V: V2));
2844 } else {
2845 Elements.push_back(Elt: ConstantInt::get(Ty: EltTy, V: 1));
2846 }
2847 }
2848 ShadowMul = ConstantVector::get(V: Elements);
2849 } else {
2850 if (ConstantInt *Elt = dyn_cast<ConstantInt>(Val: ConstArg)) {
2851 const APInt &V = Elt->getValue();
2852 APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
2853 ShadowMul = ConstantInt::get(Ty, V: V2);
2854 } else {
2855 ShadowMul = ConstantInt::get(Ty, V: 1);
2856 }
2857 }
2858
2859 IRBuilder<> IRB(&I);
2860 setShadow(V: &I,
2861 SV: IRB.CreateMul(LHS: getShadow(V: OtherArg), RHS: ShadowMul, Name: "msprop_mul_cst"));
2862 setOrigin(V: &I, Origin: getOrigin(V: OtherArg));
2863 }
2864
2865 void visitMul(BinaryOperator &I) {
2866 Constant *constOp0 = dyn_cast<Constant>(Val: I.getOperand(i_nocapture: 0));
2867 Constant *constOp1 = dyn_cast<Constant>(Val: I.getOperand(i_nocapture: 1));
2868 if (constOp0 && !constOp1)
2869 handleMulByConstant(I, ConstArg: constOp0, OtherArg: I.getOperand(i_nocapture: 1));
2870 else if (constOp1 && !constOp0)
2871 handleMulByConstant(I, ConstArg: constOp1, OtherArg: I.getOperand(i_nocapture: 0));
2872 else
2873 handleShadowOr(I);
2874 }
2875
2876 void visitFAdd(BinaryOperator &I) { handleShadowOr(I); }
2877 void visitFSub(BinaryOperator &I) { handleShadowOr(I); }
2878 void visitFMul(BinaryOperator &I) { handleShadowOr(I); }
2879 void visitAdd(BinaryOperator &I) { handleShadowOr(I); }
2880 void visitSub(BinaryOperator &I) { handleShadowOr(I); }
2881 void visitXor(BinaryOperator &I) { handleShadowOr(I); }
2882
2883 void handleIntegerDiv(Instruction &I) {
2884 IRBuilder<> IRB(&I);
2885 // Strict on the second argument.
2886 insertShadowCheck(Val: I.getOperand(i: 1), OrigIns: &I);
2887 setShadow(V: &I, SV: getShadow(I: &I, i: 0));
2888 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
2889 }
2890
2891 void visitUDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2892 void visitSDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2893 void visitURem(BinaryOperator &I) { handleIntegerDiv(I); }
2894 void visitSRem(BinaryOperator &I) { handleIntegerDiv(I); }
2895
// Floating point division is side-effect free, so we cannot require the
// divisor to be fully initialized; instead we propagate shadow. See PR37523.
2898 void visitFDiv(BinaryOperator &I) { handleShadowOr(I); }
2899 void visitFRem(BinaryOperator &I) { handleShadowOr(I); }
2900
2901 /// Instrument == and != comparisons.
2902 ///
2903 /// Sometimes the comparison result is known even if some of the bits of the
2904 /// arguments are not.
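///
/// For example (a sketch in bit notation, with ? denoting a poisoned bit):
/// comparing A = 0b10?? against B = 0b01?? for equality is fully defined,
/// because the top two bits already differ regardless of the poisoned bits;
/// in the formula below, C = A ^ B has a defined 1 bit. By contrast,
/// A = 0b00?? == B = 0b0000 is undefined, since the answer depends on the
/// poisoned bits.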
2905 void handleEqualityComparison(ICmpInst &I) {
2906 IRBuilder<> IRB(&I);
2907 Value *A = I.getOperand(i_nocapture: 0);
2908 Value *B = I.getOperand(i_nocapture: 1);
2909 Value *Sa = getShadow(V: A);
2910 Value *Sb = getShadow(V: B);
2911
2912 // Get rid of pointers and vectors of pointers.
2913 // For ints (and vectors of ints), types of A and Sa match,
2914 // and this is a no-op.
2915 A = IRB.CreatePointerCast(V: A, DestTy: Sa->getType());
2916 B = IRB.CreatePointerCast(V: B, DestTy: Sb->getType());
2917
2918 // A == B <==> (C = A^B) == 0
2919 // A != B <==> (C = A^B) != 0
2920 // Sc = Sa | Sb
2921 Value *C = IRB.CreateXor(LHS: A, RHS: B);
2922 Value *Sc = IRB.CreateOr(LHS: Sa, RHS: Sb);
2923 // Now dealing with i = (C == 0) comparison (or C != 0, does not matter now)
2924 // Result is defined if one of the following is true
2925 // * there is a defined 1 bit in C
2926 // * C is fully defined
2927 // Si = !(C & ~Sc) && Sc
2928 Value *Zero = Constant::getNullValue(Ty: Sc->getType());
2929 Value *MinusOne = Constant::getAllOnesValue(Ty: Sc->getType());
2930 Value *LHS = IRB.CreateICmpNE(LHS: Sc, RHS: Zero);
2931 Value *RHS =
2932 IRB.CreateICmpEQ(LHS: IRB.CreateAnd(LHS: IRB.CreateXor(LHS: Sc, RHS: MinusOne), RHS: C), RHS: Zero);
2933 Value *Si = IRB.CreateAnd(LHS, RHS);
2934 Si->setName("_msprop_icmp");
2935 setShadow(V: &I, SV: Si);
2936 setOriginForNaryOp(I);
2937 }
2938
2939 /// Instrument relational comparisons.
2940 ///
2941 /// This function does exact shadow propagation for all relational
2942 /// comparisons of integers, pointers and vectors of those.
2943 /// FIXME: output seems suboptimal when one of the operands is a constant
2944 void handleRelationalComparisonExact(ICmpInst &I) {
2945 IRBuilder<> IRB(&I);
2946 Value *A = I.getOperand(i_nocapture: 0);
2947 Value *B = I.getOperand(i_nocapture: 1);
2948 Value *Sa = getShadow(V: A);
2949 Value *Sb = getShadow(V: B);
2950
2951 // Get rid of pointers and vectors of pointers.
2952 // For ints (and vectors of ints), types of A and Sa match,
2953 // and this is a no-op.
2954 A = IRB.CreatePointerCast(V: A, DestTy: Sa->getType());
2955 B = IRB.CreatePointerCast(V: B, DestTy: Sb->getType());
2956
2957 // Let [a0, a1] be the interval of possible values of A, taking into account
2958 // its undefined bits. Let [b0, b1] be the interval of possible values of B.
2959 // Then (A cmp B) is defined iff (a0 cmp b1) == (a1 cmp b0).
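// For example (a sketch with unsigned 4-bit values, ? = poisoned bit): for
// A = 0b1??? the interval is [a0, a1] = [8, 15], and for the fully defined
// B = 0b0101 it is [5, 5]. Since 8 > 5 and 15 > 5, (A > B) compares the same
// way at both interval endpoints, so the result is defined; the two icmps
// below then agree and their xor (the result shadow) is 0.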
2960 bool IsSigned = I.isSigned();
2961
2962 auto GetMinMaxUnsigned = [&](Value *V, Value *S) {
2963 if (IsSigned) {
// Sign-flip to map from the signed range to the unsigned range. The
// relation between A and B is preserved if checked with
// `getUnsignedPredicate()`.
// The relationship between Amin, Amax, Bmin, Bmax is also unaffected, as
// they are obtained by effectively adding to / subtracting from A (or B) a
// shadow-derived value, with no overflow, either before or after the sign
// flip.
2970 APInt MinVal =
2971 APInt::getSignedMinValue(numBits: V->getType()->getScalarSizeInBits());
2972 V = IRB.CreateXor(LHS: V, RHS: ConstantInt::get(Ty: V->getType(), V: MinVal));
2973 }
2974 // Minimize undefined bits.
2975 Value *Min = IRB.CreateAnd(LHS: V, RHS: IRB.CreateNot(V: S));
2976 Value *Max = IRB.CreateOr(LHS: V, RHS: S);
2977 return std::make_pair(x&: Min, y&: Max);
2978 };
2979
2980 auto [Amin, Amax] = GetMinMaxUnsigned(A, Sa);
2981 auto [Bmin, Bmax] = GetMinMaxUnsigned(B, Sb);
2982 Value *S1 = IRB.CreateICmp(P: I.getUnsignedPredicate(), LHS: Amin, RHS: Bmax);
2983 Value *S2 = IRB.CreateICmp(P: I.getUnsignedPredicate(), LHS: Amax, RHS: Bmin);
2984
2985 Value *Si = IRB.CreateXor(LHS: S1, RHS: S2);
2986 setShadow(V: &I, SV: Si);
2987 setOriginForNaryOp(I);
2988 }
2989
2990 /// Instrument signed relational comparisons.
2991 ///
2992 /// Handle sign bit tests: x<0, x>=0, x<=-1, x>-1 by propagating the highest
2993 /// bit of the shadow. Everything else is delegated to handleShadowOr().
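///
/// For example (a sketch): for `x < 0`, the result depends only on the sign
/// bit of x, so the result shadow is just the sign bit of x's shadow; the
/// `icmp slt` on the shadow below extracts exactly that bit.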
2994 void handleSignedRelationalComparison(ICmpInst &I) {
2995 Constant *constOp;
2996 Value *op = nullptr;
2997 CmpInst::Predicate pre;
2998 if ((constOp = dyn_cast<Constant>(Val: I.getOperand(i_nocapture: 1)))) {
2999 op = I.getOperand(i_nocapture: 0);
3000 pre = I.getPredicate();
3001 } else if ((constOp = dyn_cast<Constant>(Val: I.getOperand(i_nocapture: 0)))) {
3002 op = I.getOperand(i_nocapture: 1);
3003 pre = I.getSwappedPredicate();
3004 } else {
3005 handleShadowOr(I);
3006 return;
3007 }
3008
3009 if ((constOp->isNullValue() &&
3010 (pre == CmpInst::ICMP_SLT || pre == CmpInst::ICMP_SGE)) ||
3011 (constOp->isAllOnesValue() &&
3012 (pre == CmpInst::ICMP_SGT || pre == CmpInst::ICMP_SLE))) {
3013 IRBuilder<> IRB(&I);
3014 Value *Shadow = IRB.CreateICmpSLT(LHS: getShadow(V: op), RHS: getCleanShadow(V: op),
3015 Name: "_msprop_icmp_s");
3016 setShadow(V: &I, SV: Shadow);
3017 setOrigin(V: &I, Origin: getOrigin(V: op));
3018 } else {
3019 handleShadowOr(I);
3020 }
3021 }
3022
3023 void visitICmpInst(ICmpInst &I) {
3024 if (!ClHandleICmp) {
3025 handleShadowOr(I);
3026 return;
3027 }
3028 if (I.isEquality()) {
3029 handleEqualityComparison(I);
3030 return;
3031 }
3032
3033 assert(I.isRelational());
3034 if (ClHandleICmpExact) {
3035 handleRelationalComparisonExact(I);
3036 return;
3037 }
3038 if (I.isSigned()) {
3039 handleSignedRelationalComparison(I);
3040 return;
3041 }
3042
3043 assert(I.isUnsigned());
3044 if ((isa<Constant>(Val: I.getOperand(i_nocapture: 0)) || isa<Constant>(Val: I.getOperand(i_nocapture: 1)))) {
3045 handleRelationalComparisonExact(I);
3046 return;
3047 }
3048
3049 handleShadowOr(I);
3050 }
3051
3052 void visitFCmpInst(FCmpInst &I) { handleShadowOr(I); }
3053
3054 void handleShift(BinaryOperator &I) {
3055 IRBuilder<> IRB(&I);
3056 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3057 // Otherwise perform the same shift on S1.
3058 Value *S1 = getShadow(I: &I, i: 0);
3059 Value *S2 = getShadow(I: &I, i: 1);
3060 Value *S2Conv =
3061 IRB.CreateSExt(V: IRB.CreateICmpNE(LHS: S2, RHS: getCleanShadow(V: S2)), DestTy: S2->getType());
3062 Value *V2 = I.getOperand(i_nocapture: 1);
3063 Value *Shift = IRB.CreateBinOp(Opc: I.getOpcode(), LHS: S1, RHS: V2);
3064 setShadow(V: &I, SV: IRB.CreateOr(LHS: Shift, RHS: S2Conv));
3065 setOriginForNaryOp(I);
3066 }
3067
3068 void visitShl(BinaryOperator &I) { handleShift(I); }
3069 void visitAShr(BinaryOperator &I) { handleShift(I); }
3070 void visitLShr(BinaryOperator &I) { handleShift(I); }
3071
3072 void handleFunnelShift(IntrinsicInst &I) {
3073 IRBuilder<> IRB(&I);
3074 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3075 // Otherwise perform the same shift on S0 and S1.
3076 Value *S0 = getShadow(I: &I, i: 0);
3077 Value *S1 = getShadow(I: &I, i: 1);
3078 Value *S2 = getShadow(I: &I, i: 2);
3079 Value *S2Conv =
3080 IRB.CreateSExt(V: IRB.CreateICmpNE(LHS: S2, RHS: getCleanShadow(V: S2)), DestTy: S2->getType());
3081 Value *V2 = I.getOperand(i_nocapture: 2);
3082 Value *Shift = IRB.CreateIntrinsic(ID: I.getIntrinsicID(), Types: S2Conv->getType(),
3083 Args: {S0, S1, V2});
3084 setShadow(V: &I, SV: IRB.CreateOr(LHS: Shift, RHS: S2Conv));
3085 setOriginForNaryOp(I);
3086 }
3087
3088 /// Instrument llvm.memmove
3089 ///
3090 /// At this point we don't know if llvm.memmove will be inlined or not.
3091 /// If we don't instrument it and it gets inlined,
3092 /// our interceptor will not kick in and we will lose the memmove.
3093 /// If we instrument the call here, but it does not get inlined,
/// we would memmove the shadow twice, which is bad in the case of
/// overlapping regions. So we simply lower the intrinsic to a call.
3096 ///
3097 /// Similar situation exists for memcpy and memset.
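///
/// For example, a call to `llvm.memmove.p0.p0.i64(ptr %d, ptr %s, i64 %n,
/// i1 false)` is (roughly) replaced with a call to the runtime's
/// `__msan_memmove(%d, %s, %n)`, which moves the application bytes and the
/// corresponding shadow (and origin) itself.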
3098 void visitMemMoveInst(MemMoveInst &I) {
3099 getShadow(V: I.getArgOperand(i: 1)); // Ensure shadow initialized
3100 IRBuilder<> IRB(&I);
3101 IRB.CreateCall(Callee: MS.MemmoveFn,
3102 Args: {I.getArgOperand(i: 0), I.getArgOperand(i: 1),
3103 IRB.CreateIntCast(V: I.getArgOperand(i: 2), DestTy: MS.IntptrTy, isSigned: false)});
3104 I.eraseFromParent();
3105 }
3106
3107 /// Instrument memcpy
3108 ///
3109 /// Similar to memmove: avoid copying shadow twice. This is somewhat
/// unfortunate as it may slow down small constant memcpys.
3111 /// FIXME: consider doing manual inline for small constant sizes and proper
3112 /// alignment.
3113 ///
3114 /// Note: This also handles memcpy.inline, which promises no calls to external
3115 /// functions as an optimization. However, with instrumentation enabled this
3116 /// is difficult to promise; additionally, we know that the MSan runtime
3117 /// exists and provides __msan_memcpy(). Therefore, we assume that with
3118 /// instrumentation it's safe to turn memcpy.inline into a call to
3119 /// __msan_memcpy(). Should this be wrong, such as when implementing memcpy()
3120 /// itself, instrumentation should be disabled with the no_sanitize attribute.
3121 void visitMemCpyInst(MemCpyInst &I) {
3122 getShadow(V: I.getArgOperand(i: 1)); // Ensure shadow initialized
3123 IRBuilder<> IRB(&I);
3124 IRB.CreateCall(Callee: MS.MemcpyFn,
3125 Args: {I.getArgOperand(i: 0), I.getArgOperand(i: 1),
3126 IRB.CreateIntCast(V: I.getArgOperand(i: 2), DestTy: MS.IntptrTy, isSigned: false)});
3127 I.eraseFromParent();
3128 }
3129
3130 // Same as memcpy.
3131 void visitMemSetInst(MemSetInst &I) {
3132 IRBuilder<> IRB(&I);
3133 IRB.CreateCall(
3134 Callee: MS.MemsetFn,
3135 Args: {I.getArgOperand(i: 0),
3136 IRB.CreateIntCast(V: I.getArgOperand(i: 1), DestTy: IRB.getInt32Ty(), isSigned: false),
3137 IRB.CreateIntCast(V: I.getArgOperand(i: 2), DestTy: MS.IntptrTy, isSigned: false)});
3138 I.eraseFromParent();
3139 }
3140
3141 void visitVAStartInst(VAStartInst &I) { VAHelper->visitVAStartInst(I); }
3142
3143 void visitVACopyInst(VACopyInst &I) { VAHelper->visitVACopyInst(I); }
3144
3145 /// Handle vector store-like intrinsics.
3146 ///
3147 /// Instrument intrinsics that look like a simple SIMD store: writes memory,
3148 /// has 1 pointer argument and 1 vector argument, returns void.
3149 bool handleVectorStoreIntrinsic(IntrinsicInst &I) {
3150 assert(I.arg_size() == 2);
3151
3152 IRBuilder<> IRB(&I);
3153 Value *Addr = I.getArgOperand(i: 0);
3154 Value *Shadow = getShadow(I: &I, i: 1);
3155 Value *ShadowPtr, *OriginPtr;
3156
3157 // We don't know the pointer alignment (could be unaligned SSE store!).
// We have to assume the worst case.
3159 std::tie(args&: ShadowPtr, args&: OriginPtr) = getShadowOriginPtr(
3160 Addr, IRB, ShadowTy: Shadow->getType(), Alignment: Align(1), /*isStore*/ true);
3161 IRB.CreateAlignedStore(Val: Shadow, Ptr: ShadowPtr, Align: Align(1));
3162
3163 if (ClCheckAccessAddress)
3164 insertShadowCheck(Val: Addr, OrigIns: &I);
3165
3166 // FIXME: factor out common code from materializeStores
3167 if (MS.TrackOrigins)
3168 IRB.CreateStore(Val: getOrigin(I: &I, i: 1), Ptr: OriginPtr);
3169 return true;
3170 }
3171
3172 /// Handle vector load-like intrinsics.
3173 ///
3174 /// Instrument intrinsics that look like a simple SIMD load: reads memory,
3175 /// has 1 pointer argument, returns a vector.
3176 bool handleVectorLoadIntrinsic(IntrinsicInst &I) {
3177 assert(I.arg_size() == 1);
3178
3179 IRBuilder<> IRB(&I);
3180 Value *Addr = I.getArgOperand(i: 0);
3181
3182 Type *ShadowTy = getShadowTy(V: &I);
3183 Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
3184 if (PropagateShadow) {
3185 // We don't know the pointer alignment (could be unaligned SSE load!).
// We have to assume the worst case.
3187 const Align Alignment = Align(1);
3188 std::tie(args&: ShadowPtr, args&: OriginPtr) =
3189 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
3190 setShadow(V: &I,
3191 SV: IRB.CreateAlignedLoad(Ty: ShadowTy, Ptr: ShadowPtr, Align: Alignment, Name: "_msld"));
3192 } else {
3193 setShadow(V: &I, SV: getCleanShadow(V: &I));
3194 }
3195
3196 if (ClCheckAccessAddress)
3197 insertShadowCheck(Val: Addr, OrigIns: &I);
3198
3199 if (MS.TrackOrigins) {
3200 if (PropagateShadow)
3201 setOrigin(V: &I, Origin: IRB.CreateLoad(Ty: MS.OriginTy, Ptr: OriginPtr));
3202 else
3203 setOrigin(V: &I, Origin: getCleanOrigin());
3204 }
3205 return true;
3206 }
3207
3208 /// Handle (SIMD arithmetic)-like intrinsics.
3209 ///
3210 /// Instrument intrinsics with any number of arguments of the same type [*],
3211 /// equal to the return type, plus a specified number of trailing flags of
3212 /// any type.
3213 ///
3214 /// [*] The type should be simple (no aggregates or pointers; vectors are
3215 /// fine).
3216 ///
3217 /// Caller guarantees that this intrinsic does not access memory.
3218 ///
/// TODO: "horizontal"/"pairwise" intrinsics are often incorrectly matched
/// by this handler.
3221 [[maybe_unused]] bool
3222 maybeHandleSimpleNomemIntrinsic(IntrinsicInst &I,
3223 unsigned int trailingFlags) {
3224 Type *RetTy = I.getType();
3225 if (!(RetTy->isIntOrIntVectorTy() || RetTy->isFPOrFPVectorTy()))
3226 return false;
3227
3228 unsigned NumArgOperands = I.arg_size();
3229 assert(NumArgOperands >= trailingFlags);
3230 for (unsigned i = 0; i < NumArgOperands - trailingFlags; ++i) {
3231 Type *Ty = I.getArgOperand(i)->getType();
3232 if (Ty != RetTy)
3233 return false;
3234 }
3235
3236 IRBuilder<> IRB(&I);
3237 ShadowAndOriginCombiner SC(this, IRB);
3238 for (unsigned i = 0; i < NumArgOperands; ++i)
3239 SC.Add(V: I.getArgOperand(i));
3240 SC.Done(I: &I);
3241
3242 return true;
3243 }
3244
3245 /// Heuristically instrument unknown intrinsics.
3246 ///
3247 /// The main purpose of this code is to do something reasonable with all
3248 /// random intrinsics we might encounter, most importantly - SIMD intrinsics.
3249 /// We recognize several classes of intrinsics by their argument types and
/// ModRef behavior, and apply special instrumentation when we are reasonably
3251 /// sure that we know what the intrinsic does.
3252 ///
3253 /// We special-case intrinsics where this approach fails. See llvm.bswap
3254 /// handling as an example of that.
3255 bool handleUnknownIntrinsicUnlogged(IntrinsicInst &I) {
3256 unsigned NumArgOperands = I.arg_size();
3257 if (NumArgOperands == 0)
3258 return false;
3259
3260 if (NumArgOperands == 2 && I.getArgOperand(i: 0)->getType()->isPointerTy() &&
3261 I.getArgOperand(i: 1)->getType()->isVectorTy() &&
3262 I.getType()->isVoidTy() && !I.onlyReadsMemory()) {
3263 // This looks like a vector store.
3264 return handleVectorStoreIntrinsic(I);
3265 }
3266
3267 if (NumArgOperands == 1 && I.getArgOperand(i: 0)->getType()->isPointerTy() &&
3268 I.getType()->isVectorTy() && I.onlyReadsMemory()) {
3269 // This looks like a vector load.
3270 return handleVectorLoadIntrinsic(I);
3271 }
3272
3273 if (I.doesNotAccessMemory())
3274 if (maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/0))
3275 return true;
3276
3277 // FIXME: detect and handle SSE maskstore/maskload?
3278 // Some cases are now handled in handleAVXMasked{Load,Store}.
3279 return false;
3280 }
3281
3282 bool handleUnknownIntrinsic(IntrinsicInst &I) {
3283 if (handleUnknownIntrinsicUnlogged(I)) {
3284 if (ClDumpHeuristicInstructions)
3285 dumpInst(I);
3286
3287 LLVM_DEBUG(dbgs() << "UNKNOWN INSTRUCTION HANDLED HEURISTICALLY: " << I
3288 << "\n");
3289 return true;
3290 } else
3291 return false;
3292 }
3293
3294 void handleInvariantGroup(IntrinsicInst &I) {
3295 setShadow(V: &I, SV: getShadow(I: &I, i: 0));
3296 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
3297 }
3298
3299 void handleLifetimeStart(IntrinsicInst &I) {
3300 if (!PoisonStack)
3301 return;
3302 AllocaInst *AI = llvm::findAllocaForValue(V: I.getArgOperand(i: 1));
3303 if (!AI)
3304 InstrumentLifetimeStart = false;
3305 LifetimeStartList.push_back(Elt: std::make_pair(x: &I, y&: AI));
3306 }
3307
3308 void handleBswap(IntrinsicInst &I) {
3309 IRBuilder<> IRB(&I);
3310 Value *Op = I.getArgOperand(i: 0);
3311 Type *OpType = Op->getType();
3312 setShadow(V: &I, SV: IRB.CreateIntrinsic(ID: Intrinsic::bswap, Types: ArrayRef(&OpType, 1),
3313 Args: getShadow(V: Op)));
3314 setOrigin(V: &I, Origin: getOrigin(V: Op));
3315 }
3316
3317 // Uninitialized bits are ok if they appear after the leading/trailing 0's
// and a 1. If the input is all zero, the output is fully initialized iff
// !is_zero_poison.
3320 //
// e.g., for ctlz, writing bits most-significant first, if 0/1 are
// initialized bits with concrete value 0/1, and ? is an uninitialized bit:
3323 // - 0001 0??? is fully initialized
3324 // - 000? ???? is fully uninitialized (*)
3325 // - ???? ???? is fully uninitialized
3326 // - 0000 0000 is fully uninitialized if is_zero_poison,
3327 // fully initialized otherwise
3328 //
3329 // (*) TODO: arguably, since the number of zeros is in the range [3, 8], we
3330 // only need to poison 4 bits.
3331 //
3332 // OutputShadow =
3333 // ((ConcreteZerosCount >= ShadowZerosCount) && !AllZeroShadow)
3334 // || (is_zero_poison && AllZeroSrc)
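//
// As a worked sketch of the first example above: for Src = 0001 0??? the
// shadow is 0000 0111, so ShadowZerosCount = ctlz(shadow) = 5 while
// ConcreteZerosCount = 3; since 3 < 5 the first term is false and the output
// is clean. For 000? ???? the shadow is 0001 1111, ShadowZerosCount = 3, and
// ConcreteZerosCount is at least 3, so the output is poisoned.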
3335 void handleCountLeadingTrailingZeros(IntrinsicInst &I) {
3336 IRBuilder<> IRB(&I);
3337 Value *Src = I.getArgOperand(i: 0);
3338 Value *SrcShadow = getShadow(V: Src);
3339
3340 Value *False = IRB.getInt1(V: false);
3341 Value *ConcreteZerosCount = IRB.CreateIntrinsic(
3342 RetTy: I.getType(), ID: I.getIntrinsicID(), Args: {Src, /*is_zero_poison=*/False});
3343 Value *ShadowZerosCount = IRB.CreateIntrinsic(
3344 RetTy: I.getType(), ID: I.getIntrinsicID(), Args: {SrcShadow, /*is_zero_poison=*/False});
3345
3346 Value *CompareConcreteZeros = IRB.CreateICmpUGE(
3347 LHS: ConcreteZerosCount, RHS: ShadowZerosCount, Name: "_mscz_cmp_zeros");
3348
3349 Value *NotAllZeroShadow =
3350 IRB.CreateIsNotNull(Arg: SrcShadow, Name: "_mscz_shadow_not_null");
3351 Value *OutputShadow =
3352 IRB.CreateAnd(LHS: CompareConcreteZeros, RHS: NotAllZeroShadow, Name: "_mscz_main");
3353
3354 // If zero poison is requested, mix in with the shadow
3355 Constant *IsZeroPoison = cast<Constant>(Val: I.getOperand(i_nocapture: 1));
3356 if (!IsZeroPoison->isZeroValue()) {
3357 Value *BoolZeroPoison = IRB.CreateIsNull(Arg: Src, Name: "_mscz_bzp");
3358 OutputShadow = IRB.CreateOr(LHS: OutputShadow, RHS: BoolZeroPoison, Name: "_mscz_bs");
3359 }
3360
3361 OutputShadow = IRB.CreateSExt(V: OutputShadow, DestTy: getShadowTy(V: Src), Name: "_mscz_os");
3362
3363 setShadow(V: &I, SV: OutputShadow);
3364 setOriginForNaryOp(I);
3365 }
3366
3367 /// Handle Arm NEON vector convert intrinsics.
3368 ///
3369 /// e.g., <4 x i32> @llvm.aarch64.neon.fcvtpu.v4i32.v4f32(<4 x float>)
3370 /// i32 @llvm.aarch64.neon.fcvtms.i32.f64(double)
3371 ///
3372 /// For x86 SSE vector convert intrinsics, see
3373 /// handleSSEVectorConvertIntrinsic().
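///
/// For example (a sketch): for <4 x i32> @llvm.aarch64.neon.fcvtpu.v4i32.v4f32,
/// if any bit of input element 2 is poisoned, then all 32 bits of output
/// element 2 are poisoned; fully initialized input elements produce fully
/// initialized output elements. This is what the per-element
/// `sext(icmp ne shadow, 0)` below computes.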
3374 void handleNEONVectorConvertIntrinsic(IntrinsicInst &I) {
3375 assert(I.arg_size() == 1);
3376
3377 IRBuilder<> IRB(&I);
3378 Value *S0 = getShadow(I: &I, i: 0);
3379
3380 /// For scalars:
3381 /// Since they are converting from floating-point to integer, the output is
3382 /// - fully uninitialized if *any* bit of the input is uninitialized
/// - fully initialized if all bits of the input are initialized
3384 /// We apply the same principle on a per-field basis for vectors.
3385 Value *OutShadow = IRB.CreateSExt(V: IRB.CreateICmpNE(LHS: S0, RHS: getCleanShadow(V: S0)),
3386 DestTy: getShadowTy(V: &I));
3387 setShadow(V: &I, SV: OutShadow);
3388 setOriginForNaryOp(I);
3389 }
3390
3391 /// Handle x86 SSE vector conversion.
3392 ///
3393 /// e.g., single-precision to half-precision conversion:
3394 /// <8 x i16> @llvm.x86.vcvtps2ph.256(<8 x float> %a0, i32 0)
3395 /// <8 x i16> @llvm.x86.vcvtps2ph.128(<4 x float> %a0, i32 0)
3396 ///
3397 /// floating-point to integer:
3398 /// <4 x i32> @llvm.x86.sse2.cvtps2dq(<4 x float>)
3399 /// <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double>)
3400 ///
3401 /// Note: if the output has more elements, they are zero-initialized (and
3402 /// therefore the shadow will also be initialized).
3403 ///
3404 /// This differs from handleSSEVectorConvertIntrinsic() because it
3405 /// propagates uninitialized shadow (instead of checking the shadow).
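///
/// For example (a sketch): for <8 x i16> @llvm.x86.vcvtps2ph.128(<4 x float>,
/// i32), the shadow is first computed per element as
/// `sext(icmp ne shadow, 0)` over a <4 x i16> shadow, then widened back to
/// <8 x i16> by shuffling in zero (initialized) shadow for the upper four
/// elements, matching the zero-initialized upper half of the result.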
3406 void handleSSEVectorConvertIntrinsicByProp(IntrinsicInst &I,
3407 bool HasRoundingMode) {
3408 if (HasRoundingMode) {
3409 assert(I.arg_size() == 2);
3410 [[maybe_unused]] Value *RoundingMode = I.getArgOperand(i: 1);
3411 assert(RoundingMode->getType()->isIntegerTy());
3412 } else {
3413 assert(I.arg_size() == 1);
3414 }
3415
3416 Value *Src = I.getArgOperand(i: 0);
3417 assert(Src->getType()->isVectorTy());
3418
3419 // The return type might have more elements than the input.
3420 // Temporarily shrink the return type's number of elements.
3421 VectorType *ShadowType = cast<VectorType>(Val: getShadowTy(V: &I));
3422 if (ShadowType->getElementCount() ==
3423 cast<VectorType>(Val: Src->getType())->getElementCount() * 2)
3424 ShadowType = VectorType::getHalfElementsVectorType(VTy: ShadowType);
3425
3426 assert(ShadowType->getElementCount() ==
3427 cast<VectorType>(Src->getType())->getElementCount());
3428
3429 IRBuilder<> IRB(&I);
3430 Value *S0 = getShadow(I: &I, i: 0);
3431
3432 /// For scalars:
3433 /// Since they are converting to and/or from floating-point, the output is:
3434 /// - fully uninitialized if *any* bit of the input is uninitialized
/// - fully initialized if all bits of the input are initialized
3436 /// We apply the same principle on a per-field basis for vectors.
3437 Value *Shadow =
3438 IRB.CreateSExt(V: IRB.CreateICmpNE(LHS: S0, RHS: getCleanShadow(V: S0)), DestTy: ShadowType);
3439
3440 // The return type might have more elements than the input.
3441 // Extend the return type back to its original width if necessary.
3442 Value *FullShadow = getCleanShadow(V: &I);
3443
3444 if (Shadow->getType() == FullShadow->getType()) {
3445 FullShadow = Shadow;
3446 } else {
3447 SmallVector<int, 8> ShadowMask(
3448 cast<FixedVectorType>(Val: FullShadow->getType())->getNumElements());
3449 std::iota(first: ShadowMask.begin(), last: ShadowMask.end(), value: 0);
3450
3451 // Append zeros
3452 FullShadow =
3453 IRB.CreateShuffleVector(V1: Shadow, V2: getCleanShadow(V: Shadow), Mask: ShadowMask);
3454 }
3455
3456 setShadow(V: &I, SV: FullShadow);
3457 setOriginForNaryOp(I);
3458 }
3459
3460 // Instrument x86 SSE vector convert intrinsic.
3461 //
3462 // This function instruments intrinsics like cvtsi2ss:
3463 // %Out = int_xxx_cvtyyy(%ConvertOp)
3464 // or
3465 // %Out = int_xxx_cvtyyy(%CopyOp, %ConvertOp)
// The intrinsic converts \p NumUsedElements elements of \p ConvertOp to the
// same number of \p Out elements, and (if it has 2 arguments) copies the rest
// of the elements from \p CopyOp.
// In most cases the conversion involves a floating-point value, which may
// trigger a hardware exception when not fully initialized. For this reason we
// require \p ConvertOp[0:NumUsedElements] to be fully initialized and trap
// otherwise.
3472 // We copy the shadow of \p CopyOp[NumUsedElements:] to \p
3473 // Out[NumUsedElements:]. This means that intrinsics without \p CopyOp always
3474 // return a fully initialized value.
3475 //
3476 // For Arm NEON vector convert intrinsics, see
3477 // handleNEONVectorConvertIntrinsic().
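//
// For example (a sketch): for <4 x float> @llvm.x86.sse2.cvtsd2ss(<4 x float>
// %CopyOp, <2 x double> %ConvertOp) with NumUsedElements == 1, we check that
// ConvertOp[0] is fully initialized (reporting otherwise), and the result
// shadow is CopyOp's shadow with element 0 zeroed, since result element 0 is
// freshly produced by a checked conversion.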
3478 void handleSSEVectorConvertIntrinsic(IntrinsicInst &I, int NumUsedElements,
3479 bool HasRoundingMode = false) {
3480 IRBuilder<> IRB(&I);
3481 Value *CopyOp, *ConvertOp;
3482
3483 assert((!HasRoundingMode ||
3484 isa<ConstantInt>(I.getArgOperand(I.arg_size() - 1))) &&
3485 "Invalid rounding mode");
3486
3487 switch (I.arg_size() - HasRoundingMode) {
3488 case 2:
3489 CopyOp = I.getArgOperand(i: 0);
3490 ConvertOp = I.getArgOperand(i: 1);
3491 break;
3492 case 1:
3493 ConvertOp = I.getArgOperand(i: 0);
3494 CopyOp = nullptr;
3495 break;
3496 default:
3497 llvm_unreachable("Cvt intrinsic with unsupported number of arguments.");
3498 }
3499
3500 // The first *NumUsedElements* elements of ConvertOp are converted to the
3501 // same number of output elements. The rest of the output is copied from
3502 // CopyOp, or (if not available) filled with zeroes.
3503 // Combine shadow for elements of ConvertOp that are used in this operation,
3504 // and insert a check.
3505 // FIXME: consider propagating shadow of ConvertOp, at least in the case of
3506 // int->any conversion.
3507 Value *ConvertShadow = getShadow(V: ConvertOp);
3508 Value *AggShadow = nullptr;
3509 if (ConvertOp->getType()->isVectorTy()) {
3510 AggShadow = IRB.CreateExtractElement(
3511 Vec: ConvertShadow, Idx: ConstantInt::get(Ty: IRB.getInt32Ty(), V: 0));
3512 for (int i = 1; i < NumUsedElements; ++i) {
3513 Value *MoreShadow = IRB.CreateExtractElement(
3514 Vec: ConvertShadow, Idx: ConstantInt::get(Ty: IRB.getInt32Ty(), V: i));
3515 AggShadow = IRB.CreateOr(LHS: AggShadow, RHS: MoreShadow);
3516 }
3517 } else {
3518 AggShadow = ConvertShadow;
3519 }
3520 assert(AggShadow->getType()->isIntegerTy());
3521 insertShadowCheck(Shadow: AggShadow, Origin: getOrigin(V: ConvertOp), OrigIns: &I);
3522
3523 // Build result shadow by zero-filling parts of CopyOp shadow that come from
3524 // ConvertOp.
3525 if (CopyOp) {
3526 assert(CopyOp->getType() == I.getType());
3527 assert(CopyOp->getType()->isVectorTy());
3528 Value *ResultShadow = getShadow(V: CopyOp);
3529 Type *EltTy = cast<VectorType>(Val: ResultShadow->getType())->getElementType();
3530 for (int i = 0; i < NumUsedElements; ++i) {
3531 ResultShadow = IRB.CreateInsertElement(
3532 Vec: ResultShadow, NewElt: ConstantInt::getNullValue(Ty: EltTy),
3533 Idx: ConstantInt::get(Ty: IRB.getInt32Ty(), V: i));
3534 }
3535 setShadow(V: &I, SV: ResultShadow);
3536 setOrigin(V: &I, Origin: getOrigin(V: CopyOp));
3537 } else {
3538 setShadow(V: &I, SV: getCleanShadow(V: &I));
3539 setOrigin(V: &I, Origin: getCleanOrigin());
3540 }
3541 }
3542
3543 // Given a scalar or vector, extract lower 64 bits (or less), and return all
3544 // zeroes if it is zero, and all ones otherwise.
3545 Value *Lower64ShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3546 if (S->getType()->isVectorTy())
3547 S = CreateShadowCast(IRB, V: S, dstTy: IRB.getInt64Ty(), /* Signed */ true);
3548 assert(S->getType()->getPrimitiveSizeInBits() <= 64);
3549 Value *S2 = IRB.CreateICmpNE(LHS: S, RHS: getCleanShadow(V: S));
3550 return CreateShadowCast(IRB, V: S2, dstTy: T, /* Signed */ true);
3551 }
3552
3553 // Given a vector, extract its first element, and return all
3554 // zeroes if it is zero, and all ones otherwise.
3555 Value *LowerElementShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3556 Value *S1 = IRB.CreateExtractElement(Vec: S, Idx: (uint64_t)0);
3557 Value *S2 = IRB.CreateICmpNE(LHS: S1, RHS: getCleanShadow(V: S1));
3558 return CreateShadowCast(IRB, V: S2, dstTy: T, /* Signed */ true);
3559 }
3560
3561 Value *VariableShadowExtend(IRBuilder<> &IRB, Value *S) {
3562 Type *T = S->getType();
3563 assert(T->isVectorTy());
3564 Value *S2 = IRB.CreateICmpNE(LHS: S, RHS: getCleanShadow(V: S));
3565 return IRB.CreateSExt(V: S2, DestTy: T);
3566 }
3567
3568 // Instrument vector shift intrinsic.
3569 //
3570 // This function instruments intrinsics like int_x86_avx2_psll_w.
3571 // Intrinsic shifts %In by %ShiftSize bits.
3572 // %ShiftSize may be a vector. In that case the lower 64 bits determine shift
3573 // size, and the rest is ignored. Behavior is defined even if shift size is
3574 // greater than register (or field) width.
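//
// For example (a sketch): for @llvm.x86.sse2.psll.w(<8 x i16> %x, <8 x i16>
// %count), if any bit of %count's lower 64 bits is poisoned the whole result
// is poisoned; otherwise the result shadow is %x's shadow shifted by the same
// (runtime) amount, computed below by re-invoking the original intrinsic on
// the shadow.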
3575 void handleVectorShiftIntrinsic(IntrinsicInst &I, bool Variable) {
3576 assert(I.arg_size() == 2);
3577 IRBuilder<> IRB(&I);
3578 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3579 // Otherwise perform the same shift on S1.
3580 Value *S1 = getShadow(I: &I, i: 0);
3581 Value *S2 = getShadow(I: &I, i: 1);
3582 Value *S2Conv = Variable ? VariableShadowExtend(IRB, S: S2)
3583 : Lower64ShadowExtend(IRB, S: S2, T: getShadowTy(V: &I));
3584 Value *V1 = I.getOperand(i_nocapture: 0);
3585 Value *V2 = I.getOperand(i_nocapture: 1);
3586 Value *Shift = IRB.CreateCall(FTy: I.getFunctionType(), Callee: I.getCalledOperand(),
3587 Args: {IRB.CreateBitCast(V: S1, DestTy: V1->getType()), V2});
3588 Shift = IRB.CreateBitCast(V: Shift, DestTy: getShadowTy(V: &I));
3589 setShadow(V: &I, SV: IRB.CreateOr(LHS: Shift, RHS: S2Conv));
3590 setOriginForNaryOp(I);
3591 }
3592
3593 // Get an MMX-sized vector type.
3594 Type *getMMXVectorTy(unsigned EltSizeInBits) {
3595 const unsigned X86_MMXSizeInBits = 64;
3596 assert(EltSizeInBits != 0 && (X86_MMXSizeInBits % EltSizeInBits) == 0 &&
3597 "Illegal MMX vector element size");
3598 return FixedVectorType::get(ElementType: IntegerType::get(C&: *MS.C, NumBits: EltSizeInBits),
3599 NumElts: X86_MMXSizeInBits / EltSizeInBits);
3600 }
3601
3602 // Returns a signed counterpart for an (un)signed-saturate-and-pack
3603 // intrinsic.
3604 Intrinsic::ID getSignedPackIntrinsic(Intrinsic::ID id) {
3605 switch (id) {
3606 case Intrinsic::x86_sse2_packsswb_128:
3607 case Intrinsic::x86_sse2_packuswb_128:
3608 return Intrinsic::x86_sse2_packsswb_128;
3609
3610 case Intrinsic::x86_sse2_packssdw_128:
3611 case Intrinsic::x86_sse41_packusdw:
3612 return Intrinsic::x86_sse2_packssdw_128;
3613
3614 case Intrinsic::x86_avx2_packsswb:
3615 case Intrinsic::x86_avx2_packuswb:
3616 return Intrinsic::x86_avx2_packsswb;
3617
3618 case Intrinsic::x86_avx2_packssdw:
3619 case Intrinsic::x86_avx2_packusdw:
3620 return Intrinsic::x86_avx2_packssdw;
3621
3622 case Intrinsic::x86_mmx_packsswb:
3623 case Intrinsic::x86_mmx_packuswb:
3624 return Intrinsic::x86_mmx_packsswb;
3625
3626 case Intrinsic::x86_mmx_packssdw:
3627 return Intrinsic::x86_mmx_packssdw;
3628 default:
3629 llvm_unreachable("unexpected intrinsic id");
3630 }
3631 }
3632
3633 // Instrument vector pack intrinsic.
3634 //
// This function instruments intrinsics like x86_mmx_packsswb, which pack
// elements of 2 input vectors into half as many bits each, with saturation.
3637 // Shadow is propagated with the signed variant of the same intrinsic applied
3638 // to sext(Sa != zeroinitializer), sext(Sb != zeroinitializer).
3639 // MMXEltSizeInBits is used only for x86mmx arguments.
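//
// For example (a sketch): for packuswb, each source element's shadow is first
// collapsed to all-zeros or all-ones (sext of icmp ne 0) and then packed with
// the *signed* saturating variant, packsswb. Signed saturation maps -1
// (all-ones) to -1 and 0 to 0, so a poisoned source element yields a fully
// poisoned packed byte; the unsigned variant would clamp -1 to 0 and lose the
// poison.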
3640 void handleVectorPackIntrinsic(IntrinsicInst &I,
3641 unsigned MMXEltSizeInBits = 0) {
3642 assert(I.arg_size() == 2);
3643 IRBuilder<> IRB(&I);
3644 Value *S1 = getShadow(I: &I, i: 0);
3645 Value *S2 = getShadow(I: &I, i: 1);
3646 assert(S1->getType()->isVectorTy());
3647
3648 // SExt and ICmpNE below must apply to individual elements of input vectors.
3649 // In case of x86mmx arguments, cast them to appropriate vector types and
3650 // back.
3651 Type *T =
3652 MMXEltSizeInBits ? getMMXVectorTy(EltSizeInBits: MMXEltSizeInBits) : S1->getType();
3653 if (MMXEltSizeInBits) {
3654 S1 = IRB.CreateBitCast(V: S1, DestTy: T);
3655 S2 = IRB.CreateBitCast(V: S2, DestTy: T);
3656 }
3657 Value *S1_ext =
3658 IRB.CreateSExt(V: IRB.CreateICmpNE(LHS: S1, RHS: Constant::getNullValue(Ty: T)), DestTy: T);
3659 Value *S2_ext =
3660 IRB.CreateSExt(V: IRB.CreateICmpNE(LHS: S2, RHS: Constant::getNullValue(Ty: T)), DestTy: T);
3661 if (MMXEltSizeInBits) {
3662 S1_ext = IRB.CreateBitCast(V: S1_ext, DestTy: getMMXVectorTy(EltSizeInBits: 64));
3663 S2_ext = IRB.CreateBitCast(V: S2_ext, DestTy: getMMXVectorTy(EltSizeInBits: 64));
3664 }
3665
3666 Value *S = IRB.CreateIntrinsic(ID: getSignedPackIntrinsic(id: I.getIntrinsicID()),
3667 Args: {S1_ext, S2_ext}, /*FMFSource=*/nullptr,
3668 Name: "_msprop_vector_pack");
3669 if (MMXEltSizeInBits)
3670 S = IRB.CreateBitCast(V: S, DestTy: getShadowTy(V: &I));
3671 setShadow(V: &I, SV: S);
3672 setOriginForNaryOp(I);
3673 }
3674
3675 // Convert `Mask` into `<n x i1>`.
3676 Constant *createDppMask(unsigned Width, unsigned Mask) {
3677 SmallVector<Constant *, 4> R(Width);
3678 for (auto &M : R) {
3679 M = ConstantInt::getBool(Context&: F.getContext(), V: Mask & 1);
3680 Mask >>= 1;
3681 }
3682 return ConstantVector::get(V: R);
3683 }
3684
3685 // Calculate output shadow as array of booleans `<n x i1>`, assuming if any
3686 // arg is poisoned, entire dot product is poisoned.
3687 Value *findDppPoisonedOutput(IRBuilder<> &IRB, Value *S, unsigned SrcMask,
3688 unsigned DstMask) {
3689 const unsigned Width =
3690 cast<FixedVectorType>(Val: S->getType())->getNumElements();
3691
3692 S = IRB.CreateSelect(C: createDppMask(Width, Mask: SrcMask), True: S,
3693 False: Constant::getNullValue(Ty: S->getType()));
3694 Value *SElem = IRB.CreateOrReduce(Src: S);
3695 Value *IsClean = IRB.CreateIsNull(Arg: SElem, Name: "_msdpp");
3696 Value *DstMaskV = createDppMask(Width, Mask: DstMask);
3697
3698 return IRB.CreateSelect(
3699 C: IsClean, True: Constant::getNullValue(Ty: DstMaskV->getType()), False: DstMaskV);
3700 }
3701
3702 // See `Intel Intrinsics Guide` for `_dp_p*` instructions.
3703 //
// The 2- and 4-element versions produce a single dot-product scalar and then
// put it into the elements of the output vector selected by the 4 lowest bits
// of the mask. The top 4 bits of the mask control which input elements are
// used for the dot product.
//
// The 8-element version's mask still has only 4 bits for the input and 4 bits
// for the output. According to the spec, it simply behaves as the 4-element
// version applied to the first 4 elements of the inputs and output, and then
// to the last 4 elements of the inputs and output.
3713 void handleDppIntrinsic(IntrinsicInst &I) {
3714 IRBuilder<> IRB(&I);
3715
3716 Value *S0 = getShadow(I: &I, i: 0);
3717 Value *S1 = getShadow(I: &I, i: 1);
3718 Value *S = IRB.CreateOr(LHS: S0, RHS: S1);
3719
3720 const unsigned Width =
3721 cast<FixedVectorType>(Val: S->getType())->getNumElements();
3722 assert(Width == 2 || Width == 4 || Width == 8);
3723
3724 const unsigned Mask = cast<ConstantInt>(Val: I.getArgOperand(i: 2))->getZExtValue();
3725 const unsigned SrcMask = Mask >> 4;
3726 const unsigned DstMask = Mask & 0xf;
3727
3728 // Calculate shadow as `<n x i1>`.
3729 Value *SI1 = findDppPoisonedOutput(IRB, S, SrcMask, DstMask);
3730 if (Width == 8) {
// The first 4 elements of the shadow are already calculated.
// `findDppPoisonedOutput` operates on 32-bit masks, so we can just shift the
// masks and repeat.
3733 SI1 = IRB.CreateOr(
3734 LHS: SI1, RHS: findDppPoisonedOutput(IRB, S, SrcMask: SrcMask << 4, DstMask: DstMask << 4));
3735 }
3736 // Extend to real size of shadow, poisoning either all or none bits of an
3737 // element.
3738 S = IRB.CreateSExt(V: SI1, DestTy: S->getType(), Name: "_msdpp");
3739
3740 setShadow(V: &I, SV: S);
3741 setOriginForNaryOp(I);
3742 }
3743
3744 Value *convertBlendvToSelectMask(IRBuilder<> &IRB, Value *C) {
3745 C = CreateAppToShadowCast(IRB, V: C);
3746 FixedVectorType *FVT = cast<FixedVectorType>(Val: C->getType());
3747 unsigned ElSize = FVT->getElementType()->getPrimitiveSizeInBits();
3748 C = IRB.CreateAShr(LHS: C, RHS: ElSize - 1);
3749 FVT = FixedVectorType::get(ElementType: IRB.getInt1Ty(), NumElts: FVT->getNumElements());
3750 return IRB.CreateTrunc(V: C, DestTy: FVT);
3751 }
3752
3753 // `blendv(f, t, c)` is effectively `select(c[top_bit], t, f)`.
3754 void handleBlendvIntrinsic(IntrinsicInst &I) {
3755 Value *C = I.getOperand(i_nocapture: 2);
3756 Value *T = I.getOperand(i_nocapture: 1);
3757 Value *F = I.getOperand(i_nocapture: 0);
3758
3759 Value *Sc = getShadow(I: &I, i: 2);
3760 Value *Oc = MS.TrackOrigins ? getOrigin(V: C) : nullptr;
3761
3762 {
3763 IRBuilder<> IRB(&I);
3764 // Extract top bit from condition and its shadow.
3765 C = convertBlendvToSelectMask(IRB, C);
3766 Sc = convertBlendvToSelectMask(IRB, C: Sc);
3767
3768 setShadow(V: C, SV: Sc);
3769 setOrigin(V: C, Origin: Oc);
3770 }
3771
3772 handleSelectLikeInst(I, B: C, C: T, D: F);
3773 }
3774
3775 // Instrument sum-of-absolute-differences intrinsic.
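//
// A sketch of the approach: psadbw sums absolute differences of 8 byte pairs
// into the low 16 bits of each 64-bit result element, with the upper 48 bits
// zeroed. We therefore OR the operand shadows, collapse each 64-bit group to
// all-ones if any bit is poisoned, and then shift right so that only the low
// 16 "significant" bits of each result element can be marked poisoned.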
3776 void handleVectorSadIntrinsic(IntrinsicInst &I, bool IsMMX = false) {
3777 const unsigned SignificantBitsPerResultElement = 16;
3778 Type *ResTy = IsMMX ? IntegerType::get(C&: *MS.C, NumBits: 64) : I.getType();
3779 unsigned ZeroBitsPerResultElement =
3780 ResTy->getScalarSizeInBits() - SignificantBitsPerResultElement;
3781
3782 IRBuilder<> IRB(&I);
3783 auto *Shadow0 = getShadow(I: &I, i: 0);
3784 auto *Shadow1 = getShadow(I: &I, i: 1);
3785 Value *S = IRB.CreateOr(LHS: Shadow0, RHS: Shadow1);
3786 S = IRB.CreateBitCast(V: S, DestTy: ResTy);
3787 S = IRB.CreateSExt(V: IRB.CreateICmpNE(LHS: S, RHS: Constant::getNullValue(Ty: ResTy)),
3788 DestTy: ResTy);
3789 S = IRB.CreateLShr(LHS: S, RHS: ZeroBitsPerResultElement);
3790 S = IRB.CreateBitCast(V: S, DestTy: getShadowTy(V: &I));
3791 setShadow(V: &I, SV: S);
3792 setOriginForNaryOp(I);
3793 }
3794
3795 // Instrument multiply-add intrinsic.
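//
// A sketch of the approach: for pmadd.wd, each output i32 is the sum of two
// adjacent i16 products, so we OR the operand shadows, view the result as
// i32 elements, and mark an output element fully poisoned if any bit of its
// two source pairs (in either operand) is poisoned, via a per-element
// sext(icmp ne 0).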
3796 void handleVectorPmaddIntrinsic(IntrinsicInst &I,
3797 unsigned MMXEltSizeInBits = 0) {
3798 Type *ResTy =
3799 MMXEltSizeInBits ? getMMXVectorTy(EltSizeInBits: MMXEltSizeInBits * 2) : I.getType();
3800 IRBuilder<> IRB(&I);
3801 auto *Shadow0 = getShadow(I: &I, i: 0);
3802 auto *Shadow1 = getShadow(I: &I, i: 1);
3803 Value *S = IRB.CreateOr(LHS: Shadow0, RHS: Shadow1);
3804 S = IRB.CreateBitCast(V: S, DestTy: ResTy);
3805 S = IRB.CreateSExt(V: IRB.CreateICmpNE(LHS: S, RHS: Constant::getNullValue(Ty: ResTy)),
3806 DestTy: ResTy);
3807 S = IRB.CreateBitCast(V: S, DestTy: getShadowTy(V: &I));
3808 setShadow(V: &I, SV: S);
3809 setOriginForNaryOp(I);
3810 }
3811
3812 // Instrument compare-packed intrinsic.
3813 // Basically, an or followed by sext(icmp ne 0) to end up with all-zeros or
3814 // all-ones shadow.
3815 void handleVectorComparePackedIntrinsic(IntrinsicInst &I) {
3816 IRBuilder<> IRB(&I);
3817 Type *ResTy = getShadowTy(V: &I);
3818 auto *Shadow0 = getShadow(I: &I, i: 0);
3819 auto *Shadow1 = getShadow(I: &I, i: 1);
3820 Value *S0 = IRB.CreateOr(LHS: Shadow0, RHS: Shadow1);
3821 Value *S = IRB.CreateSExt(
3822 V: IRB.CreateICmpNE(LHS: S0, RHS: Constant::getNullValue(Ty: ResTy)), DestTy: ResTy);
3823 setShadow(V: &I, SV: S);
3824 setOriginForNaryOp(I);
3825 }
3826
3827 // Instrument compare-scalar intrinsic.
3828 // This handles both cmp* intrinsics which return the result in the first
3829 // element of a vector, and comi* which return the result as i32.
3830 void handleVectorCompareScalarIntrinsic(IntrinsicInst &I) {
3831 IRBuilder<> IRB(&I);
3832 auto *Shadow0 = getShadow(I: &I, i: 0);
3833 auto *Shadow1 = getShadow(I: &I, i: 1);
3834 Value *S0 = IRB.CreateOr(LHS: Shadow0, RHS: Shadow1);
3835 Value *S = LowerElementShadowExtend(IRB, S: S0, T: getShadowTy(V: &I));
3836 setShadow(V: &I, SV: S);
3837 setOriginForNaryOp(I);
3838 }
3839
3840 // Instrument generic vector reduction intrinsics
3841 // by ORing together all their fields.
3842 //
3843 // If AllowShadowCast is true, the return type does not need to be the same
3844 // type as the fields
3845 // e.g., declare i32 @llvm.aarch64.neon.uaddv.i32.v16i8(<16 x i8>)
3846 void handleVectorReduceIntrinsic(IntrinsicInst &I, bool AllowShadowCast) {
3847 assert(I.arg_size() == 1);
3848
3849 IRBuilder<> IRB(&I);
3850 Value *S = IRB.CreateOrReduce(Src: getShadow(I: &I, i: 0));
3851 if (AllowShadowCast)
3852 S = CreateShadowCast(IRB, V: S, dstTy: getShadowTy(V: &I));
3853 else
3854 assert(S->getType() == getShadowTy(&I));
3855 setShadow(V: &I, SV: S);
3856 setOriginForNaryOp(I);
3857 }
3858
3859 // Similar to handleVectorReduceIntrinsic but with an initial starting value.
3860 // e.g., call float @llvm.vector.reduce.fadd.f32.v2f32(float %a0, <2 x float>
3861 // %a1)
3862 // shadow = shadow[a0] | shadow[a1.0] | shadow[a1.1]
3863 //
3864 // The type of the return value, initial starting value, and elements of the
3865 // vector must be identical.
3866 void handleVectorReduceWithStarterIntrinsic(IntrinsicInst &I) {
3867 assert(I.arg_size() == 2);
3868
3869 IRBuilder<> IRB(&I);
3870 Value *Shadow0 = getShadow(I: &I, i: 0);
3871 Value *Shadow1 = IRB.CreateOrReduce(Src: getShadow(I: &I, i: 1));
3872 assert(Shadow0->getType() == Shadow1->getType());
3873 Value *S = IRB.CreateOr(LHS: Shadow0, RHS: Shadow1);
3874 assert(S->getType() == getShadowTy(&I));
3875 setShadow(V: &I, SV: S);
3876 setOriginForNaryOp(I);
3877 }
3878
3879 // Instrument vector.reduce.or intrinsic.
3880 // Valid (non-poisoned) set bits in the operand pull low the
3881 // corresponding shadow bits.
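//
// A worked sketch with <2 x i8>: if element 0 has value 0xFF and is fully
// initialized, then ~elem0 | shadow0 == 0, so the and-reduction
// (OutShadowMask) is 0 and the result is clean regardless of element 1's
// shadow: the known-set bits already determine the OR. The or-reduction term
// keeps a result bit poisoned only if some element's corresponding bit is
// poisoned.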
3882 void handleVectorReduceOrIntrinsic(IntrinsicInst &I) {
3883 assert(I.arg_size() == 1);
3884
3885 IRBuilder<> IRB(&I);
3886 Value *OperandShadow = getShadow(I: &I, i: 0);
3887 Value *OperandUnsetBits = IRB.CreateNot(V: I.getOperand(i_nocapture: 0));
3888 Value *OperandUnsetOrPoison = IRB.CreateOr(LHS: OperandUnsetBits, RHS: OperandShadow);
// Bit N is clean if any field's bit N is 1 and unpoisoned
3890 Value *OutShadowMask = IRB.CreateAndReduce(Src: OperandUnsetOrPoison);
// Otherwise, it is clean if every field's bit N is unpoisoned
3892 Value *OrShadow = IRB.CreateOrReduce(Src: OperandShadow);
3893 Value *S = IRB.CreateAnd(LHS: OutShadowMask, RHS: OrShadow);
3894
3895 setShadow(V: &I, SV: S);
3896 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
3897 }
3898
3899 // Instrument vector.reduce.and intrinsic.
3900 // Valid (non-poisoned) unset bits in the operand pull down the
3901 // corresponding shadow bits.
3902 void handleVectorReduceAndIntrinsic(IntrinsicInst &I) {
3903 assert(I.arg_size() == 1);
3904
3905 IRBuilder<> IRB(&I);
3906 Value *OperandShadow = getShadow(I: &I, i: 0);
3907 Value *OperandSetOrPoison = IRB.CreateOr(LHS: I.getOperand(i_nocapture: 0), RHS: OperandShadow);
// Bit N is clean if any field's bit N is 0 and unpoisoned
3909 Value *OutShadowMask = IRB.CreateAndReduce(Src: OperandSetOrPoison);
// Otherwise, it is clean if every field's bit N is unpoisoned
3911 Value *OrShadow = IRB.CreateOrReduce(Src: OperandShadow);
3912 Value *S = IRB.CreateAnd(LHS: OutShadowMask, RHS: OrShadow);
3913
3914 setShadow(V: &I, SV: S);
3915 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
3916 }
3917
3918 void handleStmxcsr(IntrinsicInst &I) {
3919 IRBuilder<> IRB(&I);
3920 Value *Addr = I.getArgOperand(i: 0);
3921 Type *Ty = IRB.getInt32Ty();
3922 Value *ShadowPtr =
3923 getShadowOriginPtr(Addr, IRB, ShadowTy: Ty, Alignment: Align(1), /*isStore*/ true).first;
3924
3925 IRB.CreateStore(Val: getCleanShadow(OrigTy: Ty), Ptr: ShadowPtr);
3926
3927 if (ClCheckAccessAddress)
3928 insertShadowCheck(Val: Addr, OrigIns: &I);
3929 }
3930
3931 void handleLdmxcsr(IntrinsicInst &I) {
3932 if (!InsertChecks)
3933 return;
3934
3935 IRBuilder<> IRB(&I);
3936 Value *Addr = I.getArgOperand(i: 0);
3937 Type *Ty = IRB.getInt32Ty();
3938 const Align Alignment = Align(1);
3939 Value *ShadowPtr, *OriginPtr;
3940 std::tie(args&: ShadowPtr, args&: OriginPtr) =
3941 getShadowOriginPtr(Addr, IRB, ShadowTy: Ty, Alignment, /*isStore*/ false);
3942
3943 if (ClCheckAccessAddress)
3944 insertShadowCheck(Val: Addr, OrigIns: &I);
3945
3946 Value *Shadow = IRB.CreateAlignedLoad(Ty, Ptr: ShadowPtr, Align: Alignment, Name: "_ldmxcsr");
3947 Value *Origin = MS.TrackOrigins ? IRB.CreateLoad(Ty: MS.OriginTy, Ptr: OriginPtr)
3948 : getCleanOrigin();
3949 insertShadowCheck(Shadow, Origin, OrigIns: &I);
3950 }
3951
3952 void handleMaskedExpandLoad(IntrinsicInst &I) {
3953 IRBuilder<> IRB(&I);
3954 Value *Ptr = I.getArgOperand(i: 0);
3955 MaybeAlign Align = I.getParamAlign(ArgNo: 0);
3956 Value *Mask = I.getArgOperand(i: 1);
3957 Value *PassThru = I.getArgOperand(i: 2);
3958
3959 if (ClCheckAccessAddress) {
3960 insertShadowCheck(Val: Ptr, OrigIns: &I);
3961 insertShadowCheck(Val: Mask, OrigIns: &I);
3962 }
3963
3964 if (!PropagateShadow) {
3965 setShadow(V: &I, SV: getCleanShadow(V: &I));
3966 setOrigin(V: &I, Origin: getCleanOrigin());
3967 return;
3968 }
3969
3970 Type *ShadowTy = getShadowTy(V: &I);
3971 Type *ElementShadowTy = cast<VectorType>(Val: ShadowTy)->getElementType();
3972 auto [ShadowPtr, OriginPtr] =
3973 getShadowOriginPtr(Addr: Ptr, IRB, ShadowTy: ElementShadowTy, Alignment: Align, /*isStore*/ false);
3974
3975 Value *Shadow =
3976 IRB.CreateMaskedExpandLoad(Ty: ShadowTy, Ptr: ShadowPtr, Align, Mask,
3977 PassThru: getShadow(V: PassThru), Name: "_msmaskedexpload");
3978
3979 setShadow(V: &I, SV: Shadow);
3980
3981 // TODO: Store origins.
3982 setOrigin(V: &I, Origin: getCleanOrigin());
3983 }
3984
3985 void handleMaskedCompressStore(IntrinsicInst &I) {
3986 IRBuilder<> IRB(&I);
3987 Value *Values = I.getArgOperand(i: 0);
3988 Value *Ptr = I.getArgOperand(i: 1);
3989 MaybeAlign Align = I.getParamAlign(ArgNo: 1);
3990 Value *Mask = I.getArgOperand(i: 2);
3991
3992 if (ClCheckAccessAddress) {
3993 insertShadowCheck(Val: Ptr, OrigIns: &I);
3994 insertShadowCheck(Val: Mask, OrigIns: &I);
3995 }
3996
3997 Value *Shadow = getShadow(V: Values);
3998 Type *ElementShadowTy =
3999 getShadowTy(OrigTy: cast<VectorType>(Val: Values->getType())->getElementType());
4000 auto [ShadowPtr, OriginPtrs] =
4001 getShadowOriginPtr(Addr: Ptr, IRB, ShadowTy: ElementShadowTy, Alignment: Align, /*isStore*/ true);
4002
4003 IRB.CreateMaskedCompressStore(Val: Shadow, Ptr: ShadowPtr, Align, Mask);
4004
4005 // TODO: Store origins.
4006 }
4007
4008 void handleMaskedGather(IntrinsicInst &I) {
4009 IRBuilder<> IRB(&I);
4010 Value *Ptrs = I.getArgOperand(i: 0);
4011 const Align Alignment(
4012 cast<ConstantInt>(Val: I.getArgOperand(i: 1))->getZExtValue());
4013 Value *Mask = I.getArgOperand(i: 2);
4014 Value *PassThru = I.getArgOperand(i: 3);
4015
4016 Type *PtrsShadowTy = getShadowTy(V: Ptrs);
4017 if (ClCheckAccessAddress) {
4018 insertShadowCheck(Val: Mask, OrigIns: &I);
4019 Value *MaskedPtrShadow = IRB.CreateSelect(
4020 C: Mask, True: getShadow(V: Ptrs), False: Constant::getNullValue(Ty: (PtrsShadowTy)),
4021 Name: "_msmaskedptrs");
4022 insertShadowCheck(Shadow: MaskedPtrShadow, Origin: getOrigin(V: Ptrs), OrigIns: &I);
4023 }
4024
4025 if (!PropagateShadow) {
4026 setShadow(V: &I, SV: getCleanShadow(V: &I));
4027 setOrigin(V: &I, Origin: getCleanOrigin());
4028 return;
4029 }
4030
4031 Type *ShadowTy = getShadowTy(V: &I);
4032 Type *ElementShadowTy = cast<VectorType>(Val: ShadowTy)->getElementType();
4033 auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
4034 Addr: Ptrs, IRB, ShadowTy: ElementShadowTy, Alignment, /*isStore*/ false);
4035
4036 Value *Shadow =
4037 IRB.CreateMaskedGather(Ty: ShadowTy, Ptrs: ShadowPtrs, Alignment, Mask,
4038 PassThru: getShadow(V: PassThru), Name: "_msmaskedgather");
4039
4040 setShadow(V: &I, SV: Shadow);
4041
4042 // TODO: Store origins.
4043 setOrigin(V: &I, Origin: getCleanOrigin());
4044 }
4045
4046 void handleMaskedScatter(IntrinsicInst &I) {
4047 IRBuilder<> IRB(&I);
4048 Value *Values = I.getArgOperand(i: 0);
4049 Value *Ptrs = I.getArgOperand(i: 1);
4050 const Align Alignment(
4051 cast<ConstantInt>(Val: I.getArgOperand(i: 2))->getZExtValue());
4052 Value *Mask = I.getArgOperand(i: 3);
4053
4054 Type *PtrsShadowTy = getShadowTy(V: Ptrs);
4055 if (ClCheckAccessAddress) {
4056 insertShadowCheck(Val: Mask, OrigIns: &I);
4057 Value *MaskedPtrShadow = IRB.CreateSelect(
4058 C: Mask, True: getShadow(V: Ptrs), False: Constant::getNullValue(Ty: (PtrsShadowTy)),
4059 Name: "_msmaskedptrs");
4060 insertShadowCheck(Shadow: MaskedPtrShadow, Origin: getOrigin(V: Ptrs), OrigIns: &I);
4061 }
4062
4063 Value *Shadow = getShadow(V: Values);
4064 Type *ElementShadowTy =
4065 getShadowTy(OrigTy: cast<VectorType>(Val: Values->getType())->getElementType());
4066 auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
4067 Addr: Ptrs, IRB, ShadowTy: ElementShadowTy, Alignment, /*isStore*/ true);
4068
4069 IRB.CreateMaskedScatter(Val: Shadow, Ptrs: ShadowPtrs, Alignment, Mask);
4070
4071 // TODO: Store origin.
4072 }
4073
4074 // Intrinsic::masked_store
4075 //
4076 // Note: handleAVXMaskedStore handles AVX/AVX2 variants, though AVX512 masked
4077 // stores are lowered to Intrinsic::masked_store.
4078 void handleMaskedStore(IntrinsicInst &I) {
4079 IRBuilder<> IRB(&I);
4080 Value *V = I.getArgOperand(i: 0);
4081 Value *Ptr = I.getArgOperand(i: 1);
4082 const Align Alignment(
4083 cast<ConstantInt>(Val: I.getArgOperand(i: 2))->getZExtValue());
4084 Value *Mask = I.getArgOperand(i: 3);
4085 Value *Shadow = getShadow(V);
4086
4087 if (ClCheckAccessAddress) {
4088 insertShadowCheck(Val: Ptr, OrigIns: &I);
4089 insertShadowCheck(Val: Mask, OrigIns: &I);
4090 }
4091
4092 Value *ShadowPtr;
4093 Value *OriginPtr;
4094 std::tie(args&: ShadowPtr, args&: OriginPtr) = getShadowOriginPtr(
4095 Addr: Ptr, IRB, ShadowTy: Shadow->getType(), Alignment, /*isStore*/ true);
4096
4097 IRB.CreateMaskedStore(Val: Shadow, Ptr: ShadowPtr, Alignment, Mask);
4098
4099 if (!MS.TrackOrigins)
4100 return;
4101
4102 auto &DL = F.getDataLayout();
4103 paintOrigin(IRB, Origin: getOrigin(V), OriginPtr,
4104 TS: DL.getTypeStoreSize(Ty: Shadow->getType()),
4105 Alignment: std::max(a: Alignment, b: kMinOriginAlignment));
4106 }
4107
4108 // Intrinsic::masked_load
4109 //
4110 // Note: handleAVXMaskedLoad handles AVX/AVX2 variants, though AVX512 masked
4111 // loads are lowered to Intrinsic::masked_load.
4112 void handleMaskedLoad(IntrinsicInst &I) {
4113 IRBuilder<> IRB(&I);
4114 Value *Ptr = I.getArgOperand(i: 0);
4115 const Align Alignment(
4116 cast<ConstantInt>(Val: I.getArgOperand(i: 1))->getZExtValue());
4117 Value *Mask = I.getArgOperand(i: 2);
4118 Value *PassThru = I.getArgOperand(i: 3);
4119
4120 if (ClCheckAccessAddress) {
4121 insertShadowCheck(Val: Ptr, OrigIns: &I);
4122 insertShadowCheck(Val: Mask, OrigIns: &I);
4123 }
4124
4125 if (!PropagateShadow) {
4126 setShadow(V: &I, SV: getCleanShadow(V: &I));
4127 setOrigin(V: &I, Origin: getCleanOrigin());
4128 return;
4129 }
4130
4131 Type *ShadowTy = getShadowTy(V: &I);
4132 Value *ShadowPtr, *OriginPtr;
4133 std::tie(args&: ShadowPtr, args&: OriginPtr) =
4134 getShadowOriginPtr(Addr: Ptr, IRB, ShadowTy, Alignment, /*isStore*/ false);
4135 setShadow(V: &I, SV: IRB.CreateMaskedLoad(Ty: ShadowTy, Ptr: ShadowPtr, Alignment, Mask,
4136 PassThru: getShadow(V: PassThru), Name: "_msmaskedld"));
4137
4138 if (!MS.TrackOrigins)
4139 return;
4140
4141 // Choose between PassThru's and the loaded value's origins.
4142 Value *MaskedPassThruShadow = IRB.CreateAnd(
4143 LHS: getShadow(V: PassThru), RHS: IRB.CreateSExt(V: IRB.CreateNeg(V: Mask), DestTy: ShadowTy));
4144
4145 Value *NotNull = convertToBool(V: MaskedPassThruShadow, IRB, name: "_mscmp");
4146
4147 Value *PtrOrigin = IRB.CreateLoad(Ty: MS.OriginTy, Ptr: OriginPtr);
4148 Value *Origin = IRB.CreateSelect(C: NotNull, True: getOrigin(V: PassThru), False: PtrOrigin);
4149
4150 setOrigin(V: &I, Origin);
4151 }
4152
4153 // e.g., void @llvm.x86.avx.maskstore.ps.256(ptr, <8 x i32>, <8 x float>)
4154 // dst mask src
4155 //
// AVX512 masked stores are lowered to Intrinsic::masked_store and are
// handled by handleMaskedStore.
4158 //
4159 // This function handles AVX and AVX2 masked stores; these use the MSBs of a
4160 // vector of integers, unlike the LLVM masked intrinsics, which require a
4161 // vector of booleans. X86InstCombineIntrinsic.cpp::simplifyX86MaskedLoad
4162 // mentions that the x86 backend does not know how to efficiently convert
4163 // from a vector of booleans back into the AVX mask format; therefore, they
4164 // (and we) do not reduce AVX/AVX2 masked intrinsics into LLVM masked
4165 // intrinsics.
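//
// A sketch of the instrumentation: we reuse the same maskstore intrinsic to
// store the shadow, with the destination replaced by its shadow pointer, the
// same mask, and the source shadow bit-cast to the intrinsic's source element
// type (e.g. floating-point); when origins are tracked, they are painted over
// the whole destination as an approximation.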
4166 void handleAVXMaskedStore(IntrinsicInst &I) {
4167 assert(I.arg_size() == 3);
4168
4169 IRBuilder<> IRB(&I);
4170
4171 Value *Dst = I.getArgOperand(i: 0);
4172 assert(Dst->getType()->isPointerTy() && "Destination is not a pointer!");
4173
4174 Value *Mask = I.getArgOperand(i: 1);
4175 assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
4176
4177 Value *Src = I.getArgOperand(i: 2);
4178 assert(isa<VectorType>(Src->getType()) && "Source is not a vector!");
4179
4180 const Align Alignment = Align(1);
4181
4182 Value *SrcShadow = getShadow(V: Src);
4183
4184 if (ClCheckAccessAddress) {
4185 insertShadowCheck(Val: Dst, OrigIns: &I);
4186 insertShadowCheck(Val: Mask, OrigIns: &I);
4187 }
4188
4189 Value *DstShadowPtr;
4190 Value *DstOriginPtr;
4191 std::tie(args&: DstShadowPtr, args&: DstOriginPtr) = getShadowOriginPtr(
4192 Addr: Dst, IRB, ShadowTy: SrcShadow->getType(), Alignment, /*isStore*/ true);
4193
4194 SmallVector<Value *, 2> ShadowArgs;
4195 ShadowArgs.append(NumInputs: 1, Elt: DstShadowPtr);
4196 ShadowArgs.append(NumInputs: 1, Elt: Mask);
4197 // The intrinsic may require floating-point but shadows can be arbitrary
4198 // bit patterns, of which some would be interpreted as "invalid"
4199 // floating-point values (NaN etc.); we assume the intrinsic will happily
4200 // copy them.
4201 ShadowArgs.append(NumInputs: 1, Elt: IRB.CreateBitCast(V: SrcShadow, DestTy: Src->getType()));
4202
4203 CallInst *CI =
4204 IRB.CreateIntrinsic(RetTy: IRB.getVoidTy(), ID: I.getIntrinsicID(), Args: ShadowArgs);
4205 setShadow(V: &I, SV: CI);
4206
4207 if (!MS.TrackOrigins)
4208 return;
4209
4210 // Approximation only
4211 auto &DL = F.getDataLayout();
4212 paintOrigin(IRB, Origin: getOrigin(V: Src), OriginPtr: DstOriginPtr,
4213 TS: DL.getTypeStoreSize(Ty: SrcShadow->getType()),
4214 Alignment: std::max(a: Alignment, b: kMinOriginAlignment));
4215 }
4216
4217 // e.g., <8 x float> @llvm.x86.avx.maskload.ps.256(ptr, <8 x i32>)
4218 // return src mask
4219 //
4220 // Masked-off values are replaced with 0, which conveniently also represents
4221 // initialized memory.
4222 //
4223 // AVX512 masked loads are lowered to Intrinsic::masked_load and are handled
4224 // by handleMaskedLoad.
4225 //
4226 // We do not combine this with handleMaskedLoad; see comment in
4227 // handleAVXMaskedStore for the rationale.
4228 //
4229 // This is subtly different from handleIntrinsicByApplyingToShadow(I, 1)
4230 // because we need to apply getShadowOriginPtr, not getShadow, to the first
4231 // parameter.
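 //
 // Illustrative sketch (hypothetical value names, not emitted verbatim): for
 //   %v = call <8 x float> @llvm.x86.avx.maskload.ps.256(ptr %p, <8 x i32> %mask)
 // the handler below loads the shadow with the same intrinsic and mask:
 //   %sv = call <8 x float> @llvm.x86.avx.maskload.ps.256(ptr %sp, <8 x i32> %mask)
 // where %sp is the shadow address of %p; %sv is then bitcast to the integer
 // shadow type of %v.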
4232 void handleAVXMaskedLoad(IntrinsicInst &I) {
4233 assert(I.arg_size() == 2);
4234
4235 IRBuilder<> IRB(&I);
4236
4237 Value *Src = I.getArgOperand(i: 0);
4238 assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
4239
4240 Value *Mask = I.getArgOperand(i: 1);
4241 assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
4242
4243 const Align Alignment = Align(1);
4244
4245 if (ClCheckAccessAddress) {
4246 insertShadowCheck(Val: Mask, OrigIns: &I);
4247 }
4248
4249 Type *SrcShadowTy = getShadowTy(V: Src);
4250 Value *SrcShadowPtr, *SrcOriginPtr;
4251 std::tie(args&: SrcShadowPtr, args&: SrcOriginPtr) =
4252 getShadowOriginPtr(Addr: Src, IRB, ShadowTy: SrcShadowTy, Alignment, /*isStore*/ false);
4253
4254 SmallVector<Value *, 2> ShadowArgs;
4255 ShadowArgs.append(NumInputs: 1, Elt: SrcShadowPtr);
4256 ShadowArgs.append(NumInputs: 1, Elt: Mask);
4257
4258 CallInst *CI =
4259 IRB.CreateIntrinsic(RetTy: I.getType(), ID: I.getIntrinsicID(), Args: ShadowArgs);
4260 // The AVX masked load intrinsics do not have integer variants. We use the
4261 // floating-point variants, which will happily copy the shadows even if
4262 // they are interpreted as "invalid" floating-point values (NaN etc.).
4263 setShadow(V: &I, SV: IRB.CreateBitCast(V: CI, DestTy: getShadowTy(V: &I)));
4264
4265 if (!MS.TrackOrigins)
4266 return;
4267
4268 // The "pass-through" value is always zero (initialized). To the extent
4269 // that this results in initialized, aligned 4-byte chunks, the origin value
4270 // is ignored. It is therefore correct to simply copy the origin from src.
4271 Value *PtrSrcOrigin = IRB.CreateLoad(Ty: MS.OriginTy, Ptr: SrcOriginPtr);
4272 setOrigin(V: &I, Origin: PtrSrcOrigin);
4273 }
4274
4275 // Instrument AVX permutation intrinsic.
4276 // We apply the same permutation (argument index 1) to the shadow.
4277 void handleAVXVpermilvar(IntrinsicInst &I) {
4278 IRBuilder<> IRB(&I);
4279 Value *Shadow = getShadow(I: &I, i: 0);
4280 insertShadowCheck(Val: I.getArgOperand(i: 1), OrigIns: &I);
4281
4282 // Shadows are integer-ish types but some intrinsics require a
4283 // different (e.g., floating-point) type.
4284 Shadow = IRB.CreateBitCast(V: Shadow, DestTy: I.getArgOperand(i: 0)->getType());
4285 CallInst *CI = IRB.CreateIntrinsic(RetTy: I.getType(), ID: I.getIntrinsicID(),
4286 Args: {Shadow, I.getArgOperand(i: 1)});
4287
4288 setShadow(V: &I, SV: IRB.CreateBitCast(V: CI, DestTy: getShadowTy(V: &I)));
4289 setOriginForNaryOp(I);
4290 }
4291
4292 // Instrument BMI / BMI2 intrinsics.
4293 // All of these intrinsics are Z = I(X, Y)
4294 // where the types of all operands and the result match, and are either i32 or
4295 // i64. The following instrumentation happens to work for all of them:
4296 // Sz = I(Sx, Y) | (sext (Sy != 0))
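 //
 // For example (illustrative), for Z = bzhi(X, Y) the shadow is
 //   Sz = bzhi(Sx, Y) | (sext (Sy != 0))
 // i.e., the intrinsic applied to X's shadow, OR'd with all-ones if any bit
 // of Y's shadow is poisoned.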
4297 void handleBmiIntrinsic(IntrinsicInst &I) {
4298 IRBuilder<> IRB(&I);
4299 Type *ShadowTy = getShadowTy(V: &I);
4300
4301 // If any bit of the mask operand is poisoned, then the whole thing is.
4302 Value *SMask = getShadow(I: &I, i: 1);
4303 SMask = IRB.CreateSExt(V: IRB.CreateICmpNE(LHS: SMask, RHS: getCleanShadow(OrigTy: ShadowTy)),
4304 DestTy: ShadowTy);
4305 // Apply the same intrinsic to the shadow of the first operand.
4306 Value *S = IRB.CreateCall(Callee: I.getCalledFunction(),
4307 Args: {getShadow(I: &I, i: 0), I.getOperand(i_nocapture: 1)});
4308 S = IRB.CreateOr(LHS: SMask, RHS: S);
4309 setShadow(V: &I, SV: S);
4310 setOriginForNaryOp(I);
4311 }
4312
4313 static SmallVector<int, 8> getPclmulMask(unsigned Width, bool OddElements) {
4314 SmallVector<int, 8> Mask;
4315 for (unsigned X = OddElements ? 1 : 0; X < Width; X += 2) {
4316 Mask.append(NumInputs: 2, Elt: X);
4317 }
4318 return Mask;
4319 }
4320
4321 // Instrument pclmul intrinsics.
4322 // These intrinsics operate either on odd or on even elements of the input
4323 // vectors, depending on the constant in the 3rd argument, ignoring the rest.
4324 // Replace the unused elements with copies of the used ones, ex:
4325 // (0, 1, 2, 3) -> (0, 0, 2, 2) (even case)
4326 // or
4327 // (0, 1, 2, 3) -> (1, 1, 3, 3) (odd case)
4328 // and then apply the usual shadow combining logic.
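 //
 // Concretely, getPclmulMask(4, /*OddElements=*/false) is (0, 0, 2, 2) and
 // getPclmulMask(4, /*OddElements=*/true) is (1, 1, 3, 3); these masks are
 // applied to the argument shadows via shufflevector before the usual
 // shadow-OR combine.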
4329 void handlePclmulIntrinsic(IntrinsicInst &I) {
4330 IRBuilder<> IRB(&I);
4331 unsigned Width =
4332 cast<FixedVectorType>(Val: I.getArgOperand(i: 0)->getType())->getNumElements();
4333 assert(isa<ConstantInt>(I.getArgOperand(2)) &&
4334 "pclmul 3rd operand must be a constant");
4335 unsigned Imm = cast<ConstantInt>(Val: I.getArgOperand(i: 2))->getZExtValue();
4336 Value *Shuf0 = IRB.CreateShuffleVector(V: getShadow(I: &I, i: 0),
4337 Mask: getPclmulMask(Width, OddElements: Imm & 0x01));
4338 Value *Shuf1 = IRB.CreateShuffleVector(V: getShadow(I: &I, i: 1),
4339 Mask: getPclmulMask(Width, OddElements: Imm & 0x10));
4340 ShadowAndOriginCombiner SOC(this, IRB);
4341 SOC.Add(OpShadow: Shuf0, OpOrigin: getOrigin(I: &I, i: 0));
4342 SOC.Add(OpShadow: Shuf1, OpOrigin: getOrigin(I: &I, i: 1));
4343 SOC.Done(I: &I);
4344 }
4345
4346 // Instrument _mm_*_sd|ss intrinsics
4347 void handleUnarySdSsIntrinsic(IntrinsicInst &I) {
4348 IRBuilder<> IRB(&I);
4349 unsigned Width =
4350 cast<FixedVectorType>(Val: I.getArgOperand(i: 0)->getType())->getNumElements();
4351 Value *First = getShadow(I: &I, i: 0);
4352 Value *Second = getShadow(I: &I, i: 1);
4353 // First element of second operand, remaining elements of first operand
4354 SmallVector<int, 16> Mask;
4355 Mask.push_back(Elt: Width);
4356 for (unsigned i = 1; i < Width; i++)
4357 Mask.push_back(Elt: i);
4358 Value *Shadow = IRB.CreateShuffleVector(V1: First, V2: Second, Mask);
4359
4360 setShadow(V: &I, SV: Shadow);
4361 setOriginForNaryOp(I);
4362 }
4363
4364 void handleVtestIntrinsic(IntrinsicInst &I) {
4365 IRBuilder<> IRB(&I);
4366 Value *Shadow0 = getShadow(I: &I, i: 0);
4367 Value *Shadow1 = getShadow(I: &I, i: 1);
4368 Value *Or = IRB.CreateOr(LHS: Shadow0, RHS: Shadow1);
4369 Value *NZ = IRB.CreateICmpNE(LHS: Or, RHS: Constant::getNullValue(Ty: Or->getType()));
4370 Value *Scalar = convertShadowToScalar(V: NZ, IRB);
4371 Value *Shadow = IRB.CreateZExt(V: Scalar, DestTy: getShadowTy(V: &I));
4372
4373 setShadow(V: &I, SV: Shadow);
4374 setOriginForNaryOp(I);
4375 }
4376
4377 void handleBinarySdSsIntrinsic(IntrinsicInst &I) {
4378 IRBuilder<> IRB(&I);
4379 unsigned Width =
4380 cast<FixedVectorType>(Val: I.getArgOperand(i: 0)->getType())->getNumElements();
4381 Value *First = getShadow(I: &I, i: 0);
4382 Value *Second = getShadow(I: &I, i: 1);
4383 Value *OrShadow = IRB.CreateOr(LHS: First, RHS: Second);
4384 // First element of both OR'd together, remaining elements of first operand
4385 SmallVector<int, 16> Mask;
4386 Mask.push_back(Elt: Width);
4387 for (unsigned i = 1; i < Width; i++)
4388 Mask.push_back(Elt: i);
4389 Value *Shadow = IRB.CreateShuffleVector(V1: First, V2: OrShadow, Mask);
4390
4391 setShadow(V: &I, SV: Shadow);
4392 setOriginForNaryOp(I);
4393 }
4394
4395 // _mm_round_pd / _mm_round_ps.
4396 // Similar to maybeHandleSimpleNomemIntrinsic, except that
4397 // the second argument is guaranteed to be a constant integer.
4398 void handleRoundPdPsIntrinsic(IntrinsicInst &I) {
4399 assert(I.getArgOperand(0)->getType() == I.getType());
4400 assert(I.arg_size() == 2);
4401 assert(isa<ConstantInt>(I.getArgOperand(1)));
4402
4403 IRBuilder<> IRB(&I);
4404 ShadowAndOriginCombiner SC(this, IRB);
4405 SC.Add(V: I.getArgOperand(i: 0));
4406 SC.Done(I: &I);
4407 }
4408
4409 // Instrument abs intrinsic.
4410 // handleUnknownIntrinsic can't handle it because its last argument,
4411 // is_int_min_poison, does not match the result type.
4412 void handleAbsIntrinsic(IntrinsicInst &I) {
4413 assert(I.getType()->isIntOrIntVectorTy());
4414 assert(I.getArgOperand(0)->getType() == I.getType());
4415
4416 // FIXME: Handle is_int_min_poison.
4417 IRBuilder<> IRB(&I);
4418 setShadow(V: &I, SV: getShadow(I: &I, i: 0));
4419 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
4420 }
4421
4422 void handleIsFpClass(IntrinsicInst &I) {
4423 IRBuilder<> IRB(&I);
4424 Value *Shadow = getShadow(I: &I, i: 0);
4425 setShadow(V: &I, SV: IRB.CreateICmpNE(LHS: Shadow, RHS: getCleanShadow(V: Shadow)));
4426 setOrigin(V: &I, Origin: getOrigin(I: &I, i: 0));
4427 }
4428
4429 void handleArithmeticWithOverflow(IntrinsicInst &I) {
4430 IRBuilder<> IRB(&I);
4431 Value *Shadow0 = getShadow(I: &I, i: 0);
4432 Value *Shadow1 = getShadow(I: &I, i: 1);
4433 Value *ShadowElt0 = IRB.CreateOr(LHS: Shadow0, RHS: Shadow1);
4434 Value *ShadowElt1 =
4435 IRB.CreateICmpNE(LHS: ShadowElt0, RHS: getCleanShadow(V: ShadowElt0));
4436
4437 Value *Shadow = PoisonValue::get(T: getShadowTy(V: &I));
4438 Shadow = IRB.CreateInsertValue(Agg: Shadow, Val: ShadowElt0, Idxs: 0);
4439 Shadow = IRB.CreateInsertValue(Agg: Shadow, Val: ShadowElt1, Idxs: 1);
4440
4441 setShadow(V: &I, SV: Shadow);
4442 setOriginForNaryOp(I);
4443 }
4444
4445 Value *extractLowerShadow(IRBuilder<> &IRB, Value *V) {
4446 assert(isa<FixedVectorType>(V->getType()));
4447 assert(cast<FixedVectorType>(V->getType())->getNumElements() > 0);
4448 Value *Shadow = getShadow(V);
4449 return IRB.CreateExtractElement(Vec: Shadow,
4450 Idx: ConstantInt::get(Ty: IRB.getInt32Ty(), V: 0));
4451 }
4452
4453 // For sh.* compiler intrinsics:
4454 // llvm.x86.avx512fp16.mask.{add/sub/mul/div/max/min}.sh.round
4455 // (<8 x half>, <8 x half>, <8 x half>, i8, i32)
4456 // A B WriteThru Mask RoundingMode
4457 //
4458 // DstShadow[0] = Mask[0] ? (AShadow[0] | BShadow[0]) : WriteThruShadow[0]
4459 // DstShadow[1..7] = AShadow[1..7]
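 //
 // Illustrative example: if bit 0 of Mask is 1, DstShadow[0] is
 // AShadow[0] | BShadow[0]; if it is 0, DstShadow[0] is WriteThruShadow[0].
 // Either way, the shadows of the upper lanes are taken from A unchanged.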
4460 void visitGenericScalarHalfwordInst(IntrinsicInst &I) {
4461 IRBuilder<> IRB(&I);
4462
4463 assert(I.arg_size() == 5);
4464 Value *A = I.getOperand(i_nocapture: 0);
4465 Value *B = I.getOperand(i_nocapture: 1);
4466 Value *WriteThrough = I.getOperand(i_nocapture: 2);
4467 Value *Mask = I.getOperand(i_nocapture: 3);
4468 Value *RoundingMode = I.getOperand(i_nocapture: 4);
4469
4470 // Technically, we could probably just check whether the LSB is
4471 // initialized, but intuitively it feels like a partly uninitialized mask
4472 // is unintended, and we should warn the user immediately.
4473 insertShadowCheck(Val: Mask, OrigIns: &I);
4474 insertShadowCheck(Val: RoundingMode, OrigIns: &I);
4475
4476 assert(isa<FixedVectorType>(A->getType()));
4477 unsigned NumElements =
4478 cast<FixedVectorType>(Val: A->getType())->getNumElements();
4479 assert(NumElements == 8);
4480 assert(A->getType() == B->getType());
4481 assert(B->getType() == WriteThrough->getType());
4482 assert(Mask->getType()->getPrimitiveSizeInBits() == NumElements);
4483 assert(RoundingMode->getType()->isIntegerTy());
4484
4485 Value *ALowerShadow = extractLowerShadow(IRB, V: A);
4486 Value *BLowerShadow = extractLowerShadow(IRB, V: B);
4487
4488 Value *ABLowerShadow = IRB.CreateOr(LHS: ALowerShadow, RHS: BLowerShadow);
4489
4490 Value *WriteThroughLowerShadow = extractLowerShadow(IRB, V: WriteThrough);
4491
4492 Mask = IRB.CreateBitCast(
4493 V: Mask, DestTy: FixedVectorType::get(ElementType: IRB.getInt1Ty(), NumElts: NumElements));
4494 Value *MaskLower =
4495 IRB.CreateExtractElement(Vec: Mask, Idx: ConstantInt::get(Ty: IRB.getInt32Ty(), V: 0));
4496
4497 Value *AShadow = getShadow(V: A);
4498 Value *DstLowerShadow =
4499 IRB.CreateSelect(C: MaskLower, True: ABLowerShadow, False: WriteThroughLowerShadow);
4500 Value *DstShadow = IRB.CreateInsertElement(
4501 Vec: AShadow, NewElt: DstLowerShadow, Idx: ConstantInt::get(Ty: IRB.getInt32Ty(), V: 0),
4502 Name: "_msprop");
4503
4504 setShadow(V: &I, SV: DstShadow);
4505 setOriginForNaryOp(I);
4506 }
4507
4508 // Handle Arm NEON vector load intrinsics (vld*).
4509 //
4510 // The WithLane instructions (ld[234]lane) are similar to:
4511 // call {<4 x i32>, <4 x i32>, <4 x i32>}
4512 // @llvm.aarch64.neon.ld3lane.v4i32.p0
4513 // (<4 x i32> %L1, <4 x i32> %L2, <4 x i32> %L3, i64 %lane, ptr
4514 // %A)
4515 //
4516 // The non-WithLane instructions (ld[234], ld1x[234], ld[234]r) are similar
4517 // to:
4518 // call {<8 x i8>, <8 x i8>} @llvm.aarch64.neon.ld2.v8i8.p0(ptr %A)
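 //
 // Shadow propagation sketch (hypothetical names, not emitted verbatim): for
 // the ld2 example above, the handler below issues
 //   call {<8 x i8>, <8 x i8>} @llvm.aarch64.neon.ld2.v8i8.p0(ptr %sA)
 // where %sA is the shadow address of %A, so the loaded shadow mirrors the
 // loaded data lane-for-lane.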
4519 void handleNEONVectorLoad(IntrinsicInst &I, bool WithLane) {
4520 unsigned int numArgs = I.arg_size();
4521
4522 // Return type is a struct of vectors of integers or floating-point
4523 assert(I.getType()->isStructTy());
4524 [[maybe_unused]] StructType *RetTy = cast<StructType>(Val: I.getType());
4525 assert(RetTy->getNumElements() > 0);
4526 assert(RetTy->getElementType(0)->isIntOrIntVectorTy() ||
4527 RetTy->getElementType(0)->isFPOrFPVectorTy());
4528 for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
4529 assert(RetTy->getElementType(i) == RetTy->getElementType(0));
4530
4531 if (WithLane) {
4532 // 2, 3 or 4 vectors, plus lane number, plus input pointer
4533 assert(4 <= numArgs && numArgs <= 6);
4534
4535 // Return type is a struct of the input vectors
4536 assert(RetTy->getNumElements() + 2 == numArgs);
4537 for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
4538 assert(I.getArgOperand(i)->getType() == RetTy->getElementType(0));
4539 } else {
4540 assert(numArgs == 1);
4541 }
4542
4543 IRBuilder<> IRB(&I);
4544
4545 SmallVector<Value *, 6> ShadowArgs;
4546 if (WithLane) {
4547 for (unsigned int i = 0; i < numArgs - 2; i++)
4548 ShadowArgs.push_back(Elt: getShadow(V: I.getArgOperand(i)));
4549
4550 // Lane number, passed verbatim
4551 Value *LaneNumber = I.getArgOperand(i: numArgs - 2);
4552 ShadowArgs.push_back(Elt: LaneNumber);
4553
4554 // TODO: blend shadow of lane number into output shadow?
4555 insertShadowCheck(Val: LaneNumber, OrigIns: &I);
4556 }
4557
4558 Value *Src = I.getArgOperand(i: numArgs - 1);
4559 assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
4560
4561 Type *SrcShadowTy = getShadowTy(V: Src);
4562 auto [SrcShadowPtr, SrcOriginPtr] =
4563 getShadowOriginPtr(Addr: Src, IRB, ShadowTy: SrcShadowTy, Alignment: Align(1), /*isStore*/ false);
4564 ShadowArgs.push_back(Elt: SrcShadowPtr);
4565
4566 // The NEON vector load instructions handled by this function all have
4567 // integer variants. It is easier to use those rather than trying to cast
4568 // a struct of vectors of floats into a struct of vectors of integers.
4569 CallInst *CI =
4570 IRB.CreateIntrinsic(RetTy: getShadowTy(V: &I), ID: I.getIntrinsicID(), Args: ShadowArgs);
4571 setShadow(V: &I, SV: CI);
4572
4573 if (!MS.TrackOrigins)
4574 return;
4575
4576 Value *PtrSrcOrigin = IRB.CreateLoad(Ty: MS.OriginTy, Ptr: SrcOriginPtr);
4577 setOrigin(V: &I, Origin: PtrSrcOrigin);
4578 }
4579
4580 /// Handle Arm NEON vector store intrinsics (vst{2,3,4}, vst1x_{2,3,4},
4581 /// and vst{2,3,4}lane).
4582 ///
4583 /// Arm NEON vector store intrinsics have the output address (pointer) as the
4584 /// last argument, with the initial arguments being the inputs (and lane
4585 /// number for vst{2,3,4}lane). They return void.
4586 ///
4587 /// - st4 interleaves the output e.g., st4 (inA, inB, inC, inD, outP) writes
4588 /// abcdabcdabcdabcd... into *outP
4589 /// - st1_x4 is non-interleaved e.g., st1_x4 (inA, inB, inC, inD, outP)
4590 /// writes aaaa...bbbb...cccc...dddd... into *outP
4591 /// - st4lane has arguments of (inA, inB, inC, inD, lane, outP)
4592 /// These instructions can all be instrumented with essentially the same
4593 /// MSan logic, simply by applying the corresponding intrinsic to the shadow.
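 ///
 /// Illustrative sketch (hypothetical names, not emitted verbatim): for
 ///   call void @llvm.aarch64.neon.st2.v16i8.p0(<16 x i8> %A, <16 x i8> %B, ptr %P)
 /// the handler below emits
 ///   call void @llvm.aarch64.neon.st2.v16i8.p0(<16 x i8> %SA, <16 x i8> %SB, ptr %SP)
 /// where %SA/%SB are the shadows of %A/%B and %SP is the shadow address of %P.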
4594 void handleNEONVectorStoreIntrinsic(IntrinsicInst &I, bool useLane) {
4595 IRBuilder<> IRB(&I);
4596
4597 // Don't use getNumOperands() because it includes the callee
4598 int numArgOperands = I.arg_size();
4599
4600 // The last arg operand is the output (pointer)
4601 assert(numArgOperands >= 1);
4602 Value *Addr = I.getArgOperand(i: numArgOperands - 1);
4603 assert(Addr->getType()->isPointerTy());
4604 int skipTrailingOperands = 1;
4605
4606 if (ClCheckAccessAddress)
4607 insertShadowCheck(Val: Addr, OrigIns: &I);
4608
4609 // Second-last operand is the lane number (for vst{2,3,4}lane)
4610 if (useLane) {
4611 skipTrailingOperands++;
4612 assert(numArgOperands >= static_cast<int>(skipTrailingOperands));
4613 assert(isa<IntegerType>(
4614 I.getArgOperand(numArgOperands - skipTrailingOperands)->getType()));
4615 }
4616
4617 SmallVector<Value *, 8> ShadowArgs;
4618 // All the initial operands are the inputs
4619 for (int i = 0; i < numArgOperands - skipTrailingOperands; i++) {
4620 assert(isa<FixedVectorType>(I.getArgOperand(i)->getType()));
4621 Value *Shadow = getShadow(I: &I, i);
4622 ShadowArgs.append(NumInputs: 1, Elt: Shadow);
4623 }
4624
4625 // MSan's GetShadowTy assumes the LHS is the type we want the shadow for
4626 // e.g., for:
4627 // [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to i128
4628 // we know the type of the output (and its shadow) is <16 x i8>.
4629 //
4630 // Arm NEON VST is unusual because the last argument is the output address:
4631 // define void @st2_16b(<16 x i8> %A, <16 x i8> %B, ptr %P) {
4632 // call void @llvm.aarch64.neon.st2.v16i8.p0
4633 // (<16 x i8> [[A]], <16 x i8> [[B]], ptr [[P]])
4634 // and we have no type information about P's operand. We must manually
4635 // compute the type (<16 x i8> x 2).
4636 FixedVectorType *OutputVectorTy = FixedVectorType::get(
4637 ElementType: cast<FixedVectorType>(Val: I.getArgOperand(i: 0)->getType())->getElementType(),
4638 NumElts: cast<FixedVectorType>(Val: I.getArgOperand(i: 0)->getType())->getNumElements() *
4639 (numArgOperands - skipTrailingOperands));
4640 Type *OutputShadowTy = getShadowTy(OrigTy: OutputVectorTy);
4641
4642 if (useLane)
4643 ShadowArgs.append(NumInputs: 1,
4644 Elt: I.getArgOperand(i: numArgOperands - skipTrailingOperands));
4645
4646 Value *OutputShadowPtr, *OutputOriginPtr;
4647 // AArch64 NEON does not need alignment (unless OS requires it)
4648 std::tie(args&: OutputShadowPtr, args&: OutputOriginPtr) = getShadowOriginPtr(
4649 Addr, IRB, ShadowTy: OutputShadowTy, Alignment: Align(1), /*isStore*/ true);
4650 ShadowArgs.append(NumInputs: 1, Elt: OutputShadowPtr);
4651
4652 CallInst *CI =
4653 IRB.CreateIntrinsic(RetTy: IRB.getVoidTy(), ID: I.getIntrinsicID(), Args: ShadowArgs);
4654 setShadow(V: &I, SV: CI);
4655
4656 if (MS.TrackOrigins) {
4657 // TODO: if we modelled the vst* instruction more precisely, we could
4658 // more accurately track the origins (e.g., if both inputs are
4659 // uninitialized for vst2, we currently blame the second input, even
4660 // though part of the output depends only on the first input).
4661 //
4662 // This is particularly imprecise for vst{2,3,4}lane, since only one
4663 // lane of each input is actually copied to the output.
4664 OriginCombiner OC(this, IRB);
4665 for (int i = 0; i < numArgOperands - skipTrailingOperands; i++)
4666 OC.Add(V: I.getArgOperand(i));
4667
4668 const DataLayout &DL = F.getDataLayout();
4669 OC.DoneAndStoreOrigin(TS: DL.getTypeStoreSize(Ty: OutputVectorTy),
4670 OriginPtr: OutputOriginPtr);
4671 }
4672 }
4673
4674 /// Handle intrinsics by applying the intrinsic to the shadows.
4675 ///
4676 /// The trailing arguments are passed verbatim to the intrinsic, though any
4677 /// uninitialized trailing arguments can also taint the shadow e.g., for an
4678 /// intrinsic with one trailing verbatim argument:
4679 /// out = intrinsic(var1, var2, opType)
4680 /// we compute:
4681 /// shadow[out] =
4682 /// intrinsic(shadow[var1], shadow[var2], opType) | shadow[opType]
4683 ///
4684 /// Typically, shadowIntrinsicID will be specified by the caller to be
4685 /// I.getIntrinsicID(), but the caller can choose to replace it with another
4686 /// intrinsic of the same type.
4687 ///
4688 /// CAUTION: this assumes that the intrinsic will handle arbitrary
4689 /// bit-patterns (for example, if the intrinsic accepts floats for
4690 /// var1, we require that it doesn't care if inputs are NaNs).
4691 ///
4692 /// For example, this can be applied to the Arm NEON vector table intrinsics
4693 /// (tbl{1,2,3,4}).
4694 ///
4695 /// The origin is approximated using setOriginForNaryOp.
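 ///
 /// Illustrative sketch (hypothetical names): for
 ///   %out = call <16 x i8> @llvm.aarch64.neon.tbl1.v16i8(<16 x i8> %t, <16 x i8> %idx)
 /// with trailingVerbatimArgs == 1, the shadow is computed as
 ///   tbl1(shadow[%t], %idx) | shadow[%idx]
 /// with bitcasts as needed between the integer shadow types and the
 /// intrinsic's operand/result types.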
4696 void handleIntrinsicByApplyingToShadow(IntrinsicInst &I,
4697 Intrinsic::ID shadowIntrinsicID,
4698 unsigned int trailingVerbatimArgs) {
4699 IRBuilder<> IRB(&I);
4700
4701 assert(trailingVerbatimArgs < I.arg_size());
4702
4703 SmallVector<Value *, 8> ShadowArgs;
4704 // Don't use getNumOperands() because it includes the callee
4705 for (unsigned int i = 0; i < I.arg_size() - trailingVerbatimArgs; i++) {
4706 Value *Shadow = getShadow(I: &I, i);
4707
4708 // Shadows are integer-ish types but some intrinsics require a
4709 // different (e.g., floating-point) type.
4710 ShadowArgs.push_back(
4711 Elt: IRB.CreateBitCast(V: Shadow, DestTy: I.getArgOperand(i)->getType()));
4712 }
4713
4714 for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
4715 i++) {
4716 Value *Arg = I.getArgOperand(i);
4717 ShadowArgs.push_back(Elt: Arg);
4718 }
4719
4720 CallInst *CI =
4721 IRB.CreateIntrinsic(RetTy: I.getType(), ID: shadowIntrinsicID, Args: ShadowArgs);
4722 Value *CombinedShadow = CI;
4723
4724 // Combine the computed shadow with the shadow of trailing args
4725 for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
4726 i++) {
4727 Value *Shadow =
4728 CreateShadowCast(IRB, V: getShadow(I: &I, i), dstTy: CombinedShadow->getType());
4729 CombinedShadow = IRB.CreateOr(LHS: Shadow, RHS: CombinedShadow, Name: "_msprop");
4730 }
4731
4732 setShadow(V: &I, SV: IRB.CreateBitCast(V: CombinedShadow, DestTy: getShadowTy(V: &I)));
4733
4734 setOriginForNaryOp(I);
4735 }
4736
4737 // Approximation only
4738 //
4739 // e.g., <16 x i8> @llvm.aarch64.neon.pmull64(i64, i64)
4740 void handleNEONVectorMultiplyIntrinsic(IntrinsicInst &I) {
4741 assert(I.arg_size() == 2);
4742
4743 handleShadowOr(I);
4744 }
4745
4746 void visitIntrinsicInst(IntrinsicInst &I) {
4747 switch (I.getIntrinsicID()) {
4748 case Intrinsic::uadd_with_overflow:
4749 case Intrinsic::sadd_with_overflow:
4750 case Intrinsic::usub_with_overflow:
4751 case Intrinsic::ssub_with_overflow:
4752 case Intrinsic::umul_with_overflow:
4753 case Intrinsic::smul_with_overflow:
4754 handleArithmeticWithOverflow(I);
4755 break;
4756 case Intrinsic::abs:
4757 handleAbsIntrinsic(I);
4758 break;
4759 case Intrinsic::bitreverse:
4760 handleIntrinsicByApplyingToShadow(I, shadowIntrinsicID: I.getIntrinsicID(),
4761 /*trailingVerbatimArgs*/ 0);
4762 break;
4763 case Intrinsic::is_fpclass:
4764 handleIsFpClass(I);
4765 break;
4766 case Intrinsic::lifetime_start:
4767 handleLifetimeStart(I);
4768 break;
4769 case Intrinsic::launder_invariant_group:
4770 case Intrinsic::strip_invariant_group:
4771 handleInvariantGroup(I);
4772 break;
4773 case Intrinsic::bswap:
4774 handleBswap(I);
4775 break;
4776 case Intrinsic::ctlz:
4777 case Intrinsic::cttz:
4778 handleCountLeadingTrailingZeros(I);
4779 break;
4780 case Intrinsic::masked_compressstore:
4781 handleMaskedCompressStore(I);
4782 break;
4783 case Intrinsic::masked_expandload:
4784 handleMaskedExpandLoad(I);
4785 break;
4786 case Intrinsic::masked_gather:
4787 handleMaskedGather(I);
4788 break;
4789 case Intrinsic::masked_scatter:
4790 handleMaskedScatter(I);
4791 break;
4792 case Intrinsic::masked_store:
4793 handleMaskedStore(I);
4794 break;
4795 case Intrinsic::masked_load:
4796 handleMaskedLoad(I);
4797 break;
4798 case Intrinsic::vector_reduce_and:
4799 handleVectorReduceAndIntrinsic(I);
4800 break;
4801 case Intrinsic::vector_reduce_or:
4802 handleVectorReduceOrIntrinsic(I);
4803 break;
4804
4805 case Intrinsic::vector_reduce_add:
4806 case Intrinsic::vector_reduce_xor:
4807 case Intrinsic::vector_reduce_mul:
4808 // Signed/Unsigned Min/Max
4809 // TODO: handling similarly to AND/OR may be more precise.
4810 case Intrinsic::vector_reduce_smax:
4811 case Intrinsic::vector_reduce_smin:
4812 case Intrinsic::vector_reduce_umax:
4813 case Intrinsic::vector_reduce_umin:
4814 // TODO: this has no false positives, but arguably we should check that all
4815 // the bits are initialized.
4816 case Intrinsic::vector_reduce_fmax:
4817 case Intrinsic::vector_reduce_fmin:
4818 handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/false);
4819 break;
4820
4821 case Intrinsic::vector_reduce_fadd:
4822 case Intrinsic::vector_reduce_fmul:
4823 handleVectorReduceWithStarterIntrinsic(I);
4824 break;
4825
4826 case Intrinsic::x86_sse_stmxcsr:
4827 handleStmxcsr(I);
4828 break;
4829 case Intrinsic::x86_sse_ldmxcsr:
4830 handleLdmxcsr(I);
4831 break;
4832 case Intrinsic::x86_avx512_vcvtsd2usi64:
4833 case Intrinsic::x86_avx512_vcvtsd2usi32:
4834 case Intrinsic::x86_avx512_vcvtss2usi64:
4835 case Intrinsic::x86_avx512_vcvtss2usi32:
4836 case Intrinsic::x86_avx512_cvttss2usi64:
4837 case Intrinsic::x86_avx512_cvttss2usi:
4838 case Intrinsic::x86_avx512_cvttsd2usi64:
4839 case Intrinsic::x86_avx512_cvttsd2usi:
4840 case Intrinsic::x86_avx512_cvtusi2ss:
4841 case Intrinsic::x86_avx512_cvtusi642sd:
4842 case Intrinsic::x86_avx512_cvtusi642ss:
4843 handleSSEVectorConvertIntrinsic(I, NumUsedElements: 1, HasRoundingMode: true);
4844 break;
4845 case Intrinsic::x86_sse2_cvtsd2si64:
4846 case Intrinsic::x86_sse2_cvtsd2si:
4847 case Intrinsic::x86_sse2_cvtsd2ss:
4848 case Intrinsic::x86_sse2_cvttsd2si64:
4849 case Intrinsic::x86_sse2_cvttsd2si:
4850 case Intrinsic::x86_sse_cvtss2si64:
4851 case Intrinsic::x86_sse_cvtss2si:
4852 case Intrinsic::x86_sse_cvttss2si64:
4853 case Intrinsic::x86_sse_cvttss2si:
4854 handleSSEVectorConvertIntrinsic(I, NumUsedElements: 1);
4855 break;
4856 case Intrinsic::x86_sse_cvtps2pi:
4857 case Intrinsic::x86_sse_cvttps2pi:
4858 handleSSEVectorConvertIntrinsic(I, NumUsedElements: 2);
4859 break;
4860
4861 // TODO:
4862 // <1 x i64> @llvm.x86.sse.cvtpd2pi(<2 x double>)
4863 // <2 x double> @llvm.x86.sse.cvtpi2pd(<1 x i64>)
4864 // <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float>, <1 x i64>)
4865
4866 case Intrinsic::x86_vcvtps2ph_128:
4867 case Intrinsic::x86_vcvtps2ph_256: {
4868 handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/true);
4869 break;
4870 }
4871
4872 case Intrinsic::x86_sse2_cvtpd2ps:
4873 case Intrinsic::x86_sse2_cvtps2dq:
4874 case Intrinsic::x86_sse2_cvtpd2dq:
4875 case Intrinsic::x86_sse2_cvttps2dq:
4876 case Intrinsic::x86_sse2_cvttpd2dq:
4877 case Intrinsic::x86_avx_cvt_pd2_ps_256:
4878 case Intrinsic::x86_avx_cvt_ps2dq_256:
4879 case Intrinsic::x86_avx_cvt_pd2dq_256:
4880 case Intrinsic::x86_avx_cvtt_ps2dq_256:
4881 case Intrinsic::x86_avx_cvtt_pd2dq_256: {
4882 handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/false);
4883 break;
4884 }
4885
4886 case Intrinsic::x86_avx512_psll_w_512:
4887 case Intrinsic::x86_avx512_psll_d_512:
4888 case Intrinsic::x86_avx512_psll_q_512:
4889 case Intrinsic::x86_avx512_pslli_w_512:
4890 case Intrinsic::x86_avx512_pslli_d_512:
4891 case Intrinsic::x86_avx512_pslli_q_512:
4892 case Intrinsic::x86_avx512_psrl_w_512:
4893 case Intrinsic::x86_avx512_psrl_d_512:
4894 case Intrinsic::x86_avx512_psrl_q_512:
4895 case Intrinsic::x86_avx512_psra_w_512:
4896 case Intrinsic::x86_avx512_psra_d_512:
4897 case Intrinsic::x86_avx512_psra_q_512:
4898 case Intrinsic::x86_avx512_psrli_w_512:
4899 case Intrinsic::x86_avx512_psrli_d_512:
4900 case Intrinsic::x86_avx512_psrli_q_512:
4901 case Intrinsic::x86_avx512_psrai_w_512:
4902 case Intrinsic::x86_avx512_psrai_d_512:
4903 case Intrinsic::x86_avx512_psrai_q_512:
4904 case Intrinsic::x86_avx512_psra_q_256:
4905 case Intrinsic::x86_avx512_psra_q_128:
4906 case Intrinsic::x86_avx512_psrai_q_256:
4907 case Intrinsic::x86_avx512_psrai_q_128:
4908 case Intrinsic::x86_avx2_psll_w:
4909 case Intrinsic::x86_avx2_psll_d:
4910 case Intrinsic::x86_avx2_psll_q:
4911 case Intrinsic::x86_avx2_pslli_w:
4912 case Intrinsic::x86_avx2_pslli_d:
4913 case Intrinsic::x86_avx2_pslli_q:
4914 case Intrinsic::x86_avx2_psrl_w:
4915 case Intrinsic::x86_avx2_psrl_d:
4916 case Intrinsic::x86_avx2_psrl_q:
4917 case Intrinsic::x86_avx2_psra_w:
4918 case Intrinsic::x86_avx2_psra_d:
4919 case Intrinsic::x86_avx2_psrli_w:
4920 case Intrinsic::x86_avx2_psrli_d:
4921 case Intrinsic::x86_avx2_psrli_q:
4922 case Intrinsic::x86_avx2_psrai_w:
4923 case Intrinsic::x86_avx2_psrai_d:
4924 case Intrinsic::x86_sse2_psll_w:
4925 case Intrinsic::x86_sse2_psll_d:
4926 case Intrinsic::x86_sse2_psll_q:
4927 case Intrinsic::x86_sse2_pslli_w:
4928 case Intrinsic::x86_sse2_pslli_d:
4929 case Intrinsic::x86_sse2_pslli_q:
4930 case Intrinsic::x86_sse2_psrl_w:
4931 case Intrinsic::x86_sse2_psrl_d:
4932 case Intrinsic::x86_sse2_psrl_q:
4933 case Intrinsic::x86_sse2_psra_w:
4934 case Intrinsic::x86_sse2_psra_d:
4935 case Intrinsic::x86_sse2_psrli_w:
4936 case Intrinsic::x86_sse2_psrli_d:
4937 case Intrinsic::x86_sse2_psrli_q:
4938 case Intrinsic::x86_sse2_psrai_w:
4939 case Intrinsic::x86_sse2_psrai_d:
4940 case Intrinsic::x86_mmx_psll_w:
4941 case Intrinsic::x86_mmx_psll_d:
4942 case Intrinsic::x86_mmx_psll_q:
4943 case Intrinsic::x86_mmx_pslli_w:
4944 case Intrinsic::x86_mmx_pslli_d:
4945 case Intrinsic::x86_mmx_pslli_q:
4946 case Intrinsic::x86_mmx_psrl_w:
4947 case Intrinsic::x86_mmx_psrl_d:
4948 case Intrinsic::x86_mmx_psrl_q:
4949 case Intrinsic::x86_mmx_psra_w:
4950 case Intrinsic::x86_mmx_psra_d:
4951 case Intrinsic::x86_mmx_psrli_w:
4952 case Intrinsic::x86_mmx_psrli_d:
4953 case Intrinsic::x86_mmx_psrli_q:
4954 case Intrinsic::x86_mmx_psrai_w:
4955 case Intrinsic::x86_mmx_psrai_d:
4956 case Intrinsic::aarch64_neon_rshrn:
4957 case Intrinsic::aarch64_neon_sqrshl:
4958 case Intrinsic::aarch64_neon_sqrshrn:
4959 case Intrinsic::aarch64_neon_sqrshrun:
4960 case Intrinsic::aarch64_neon_sqshl:
4961 case Intrinsic::aarch64_neon_sqshlu:
4962 case Intrinsic::aarch64_neon_sqshrn:
4963 case Intrinsic::aarch64_neon_sqshrun:
4964 case Intrinsic::aarch64_neon_srshl:
4965 case Intrinsic::aarch64_neon_sshl:
4966 case Intrinsic::aarch64_neon_uqrshl:
4967 case Intrinsic::aarch64_neon_uqrshrn:
4968 case Intrinsic::aarch64_neon_uqshl:
4969 case Intrinsic::aarch64_neon_uqshrn:
4970 case Intrinsic::aarch64_neon_urshl:
4971 case Intrinsic::aarch64_neon_ushl:
4972 // Not handled here: aarch64_neon_vsli (vector shift left and insert)
4973 handleVectorShiftIntrinsic(I, /* Variable */ false);
4974 break;
4975 case Intrinsic::x86_avx2_psllv_d:
4976 case Intrinsic::x86_avx2_psllv_d_256:
4977 case Intrinsic::x86_avx512_psllv_d_512:
4978 case Intrinsic::x86_avx2_psllv_q:
4979 case Intrinsic::x86_avx2_psllv_q_256:
4980 case Intrinsic::x86_avx512_psllv_q_512:
4981 case Intrinsic::x86_avx2_psrlv_d:
4982 case Intrinsic::x86_avx2_psrlv_d_256:
4983 case Intrinsic::x86_avx512_psrlv_d_512:
4984 case Intrinsic::x86_avx2_psrlv_q:
4985 case Intrinsic::x86_avx2_psrlv_q_256:
4986 case Intrinsic::x86_avx512_psrlv_q_512:
4987 case Intrinsic::x86_avx2_psrav_d:
4988 case Intrinsic::x86_avx2_psrav_d_256:
4989 case Intrinsic::x86_avx512_psrav_d_512:
4990 case Intrinsic::x86_avx512_psrav_q_128:
4991 case Intrinsic::x86_avx512_psrav_q_256:
4992 case Intrinsic::x86_avx512_psrav_q_512:
4993 handleVectorShiftIntrinsic(I, /* Variable */ true);
4994 break;
4995
4996 case Intrinsic::x86_sse2_packsswb_128:
4997 case Intrinsic::x86_sse2_packssdw_128:
4998 case Intrinsic::x86_sse2_packuswb_128:
4999 case Intrinsic::x86_sse41_packusdw:
5000 case Intrinsic::x86_avx2_packsswb:
5001 case Intrinsic::x86_avx2_packssdw:
5002 case Intrinsic::x86_avx2_packuswb:
5003 case Intrinsic::x86_avx2_packusdw:
5004 handleVectorPackIntrinsic(I);
5005 break;
5006
5007 case Intrinsic::x86_sse41_pblendvb:
5008 case Intrinsic::x86_sse41_blendvpd:
5009 case Intrinsic::x86_sse41_blendvps:
5010 case Intrinsic::x86_avx_blendv_pd_256:
5011 case Intrinsic::x86_avx_blendv_ps_256:
5012 case Intrinsic::x86_avx2_pblendvb:
5013 handleBlendvIntrinsic(I);
5014 break;
5015
5016 case Intrinsic::x86_avx_dp_ps_256:
5017 case Intrinsic::x86_sse41_dppd:
5018 case Intrinsic::x86_sse41_dpps:
5019 handleDppIntrinsic(I);
5020 break;
5021
5022 case Intrinsic::x86_mmx_packsswb:
5023 case Intrinsic::x86_mmx_packuswb:
5024 handleVectorPackIntrinsic(I, MMXEltSizeInBits: 16);
5025 break;
5026
5027 case Intrinsic::x86_mmx_packssdw:
5028 handleVectorPackIntrinsic(I, MMXEltSizeInBits: 32);
5029 break;
5030
5031 case Intrinsic::x86_mmx_psad_bw:
5032 handleVectorSadIntrinsic(I, IsMMX: true);
5033 break;
5034 case Intrinsic::x86_sse2_psad_bw:
5035 case Intrinsic::x86_avx2_psad_bw:
5036 handleVectorSadIntrinsic(I);
5037 break;
5038
5039 case Intrinsic::x86_sse2_pmadd_wd:
5040 case Intrinsic::x86_avx2_pmadd_wd:
5041 case Intrinsic::x86_ssse3_pmadd_ub_sw_128:
5042 case Intrinsic::x86_avx2_pmadd_ub_sw:
5043 handleVectorPmaddIntrinsic(I);
5044 break;
5045
5046 case Intrinsic::x86_ssse3_pmadd_ub_sw:
5047 handleVectorPmaddIntrinsic(I, MMXEltSizeInBits: 8);
5048 break;
5049
5050 case Intrinsic::x86_mmx_pmadd_wd:
5051 handleVectorPmaddIntrinsic(I, MMXEltSizeInBits: 16);
5052 break;
5053
5054 case Intrinsic::x86_sse_cmp_ss:
5055 case Intrinsic::x86_sse2_cmp_sd:
5056 case Intrinsic::x86_sse_comieq_ss:
5057 case Intrinsic::x86_sse_comilt_ss:
5058 case Intrinsic::x86_sse_comile_ss:
5059 case Intrinsic::x86_sse_comigt_ss:
5060 case Intrinsic::x86_sse_comige_ss:
5061 case Intrinsic::x86_sse_comineq_ss:
5062 case Intrinsic::x86_sse_ucomieq_ss:
5063 case Intrinsic::x86_sse_ucomilt_ss:
5064 case Intrinsic::x86_sse_ucomile_ss:
5065 case Intrinsic::x86_sse_ucomigt_ss:
5066 case Intrinsic::x86_sse_ucomige_ss:
5067 case Intrinsic::x86_sse_ucomineq_ss:
5068 case Intrinsic::x86_sse2_comieq_sd:
5069 case Intrinsic::x86_sse2_comilt_sd:
5070 case Intrinsic::x86_sse2_comile_sd:
5071 case Intrinsic::x86_sse2_comigt_sd:
5072 case Intrinsic::x86_sse2_comige_sd:
5073 case Intrinsic::x86_sse2_comineq_sd:
5074 case Intrinsic::x86_sse2_ucomieq_sd:
5075 case Intrinsic::x86_sse2_ucomilt_sd:
5076 case Intrinsic::x86_sse2_ucomile_sd:
5077 case Intrinsic::x86_sse2_ucomigt_sd:
5078 case Intrinsic::x86_sse2_ucomige_sd:
5079 case Intrinsic::x86_sse2_ucomineq_sd:
5080 handleVectorCompareScalarIntrinsic(I);
5081 break;
5082
5083 case Intrinsic::x86_avx_cmp_pd_256:
5084 case Intrinsic::x86_avx_cmp_ps_256:
5085 case Intrinsic::x86_sse2_cmp_pd:
5086 case Intrinsic::x86_sse_cmp_ps:
5087 handleVectorComparePackedIntrinsic(I);
5088 break;
5089
5090 case Intrinsic::x86_bmi_bextr_32:
5091 case Intrinsic::x86_bmi_bextr_64:
5092 case Intrinsic::x86_bmi_bzhi_32:
5093 case Intrinsic::x86_bmi_bzhi_64:
5094 case Intrinsic::x86_bmi_pdep_32:
5095 case Intrinsic::x86_bmi_pdep_64:
5096 case Intrinsic::x86_bmi_pext_32:
5097 case Intrinsic::x86_bmi_pext_64:
5098 handleBmiIntrinsic(I);
5099 break;
5100
5101 case Intrinsic::x86_pclmulqdq:
5102 case Intrinsic::x86_pclmulqdq_256:
5103 case Intrinsic::x86_pclmulqdq_512:
5104 handlePclmulIntrinsic(I);
5105 break;
5106
5107 case Intrinsic::x86_avx_round_pd_256:
5108 case Intrinsic::x86_avx_round_ps_256:
5109 case Intrinsic::x86_sse41_round_pd:
5110 case Intrinsic::x86_sse41_round_ps:
5111 handleRoundPdPsIntrinsic(I);
5112 break;
5113
5114 case Intrinsic::x86_sse41_round_sd:
5115 case Intrinsic::x86_sse41_round_ss:
5116 handleUnarySdSsIntrinsic(I);
5117 break;
5118
5119 case Intrinsic::x86_sse2_max_sd:
5120 case Intrinsic::x86_sse_max_ss:
5121 case Intrinsic::x86_sse2_min_sd:
5122 case Intrinsic::x86_sse_min_ss:
5123 handleBinarySdSsIntrinsic(I);
5124 break;
5125
5126 case Intrinsic::x86_avx_vtestc_pd:
5127 case Intrinsic::x86_avx_vtestc_pd_256:
5128 case Intrinsic::x86_avx_vtestc_ps:
5129 case Intrinsic::x86_avx_vtestc_ps_256:
5130 case Intrinsic::x86_avx_vtestnzc_pd:
5131 case Intrinsic::x86_avx_vtestnzc_pd_256:
5132 case Intrinsic::x86_avx_vtestnzc_ps:
5133 case Intrinsic::x86_avx_vtestnzc_ps_256:
5134 case Intrinsic::x86_avx_vtestz_pd:
5135 case Intrinsic::x86_avx_vtestz_pd_256:
5136 case Intrinsic::x86_avx_vtestz_ps:
5137 case Intrinsic::x86_avx_vtestz_ps_256:
5138 case Intrinsic::x86_avx_ptestc_256:
5139 case Intrinsic::x86_avx_ptestnzc_256:
5140 case Intrinsic::x86_avx_ptestz_256:
5141 case Intrinsic::x86_sse41_ptestc:
5142 case Intrinsic::x86_sse41_ptestnzc:
5143 case Intrinsic::x86_sse41_ptestz:
5144 handleVtestIntrinsic(I);
5145 break;
5146
5147 // Packed Horizontal Add/Subtract
5148 case Intrinsic::x86_ssse3_phadd_w:
5149 case Intrinsic::x86_ssse3_phadd_w_128:
5150 case Intrinsic::x86_avx2_phadd_w:
5151 case Intrinsic::x86_ssse3_phsub_w:
5152 case Intrinsic::x86_ssse3_phsub_w_128:
5153 case Intrinsic::x86_avx2_phsub_w: {
5154 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/16);
5155 break;
5156 }
5157
5158 // Packed Horizontal Add/Subtract
5159 case Intrinsic::x86_ssse3_phadd_d:
5160 case Intrinsic::x86_ssse3_phadd_d_128:
5161 case Intrinsic::x86_avx2_phadd_d:
5162 case Intrinsic::x86_ssse3_phsub_d:
5163 case Intrinsic::x86_ssse3_phsub_d_128:
5164 case Intrinsic::x86_avx2_phsub_d: {
5165 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/32);
5166 break;
5167 }
5168
5169 // Packed Horizontal Add/Subtract and Saturate
5170 case Intrinsic::x86_ssse3_phadd_sw:
5171 case Intrinsic::x86_ssse3_phadd_sw_128:
5172 case Intrinsic::x86_avx2_phadd_sw:
5173 case Intrinsic::x86_ssse3_phsub_sw:
5174 case Intrinsic::x86_ssse3_phsub_sw_128:
5175 case Intrinsic::x86_avx2_phsub_sw: {
5176 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/16);
5177 break;
5178 }
5179
5180 // Packed Single/Double Precision Floating-Point Horizontal Add
5181 case Intrinsic::x86_sse3_hadd_ps:
5182 case Intrinsic::x86_sse3_hadd_pd:
5183 case Intrinsic::x86_avx_hadd_pd_256:
5184 case Intrinsic::x86_avx_hadd_ps_256:
5185 case Intrinsic::x86_sse3_hsub_ps:
5186 case Intrinsic::x86_sse3_hsub_pd:
5187 case Intrinsic::x86_avx_hsub_pd_256:
5188 case Intrinsic::x86_avx_hsub_ps_256: {
5189 handlePairwiseShadowOrIntrinsic(I);
5190 break;
5191 }
5192
5193 case Intrinsic::x86_avx_maskstore_ps:
5194 case Intrinsic::x86_avx_maskstore_pd:
5195 case Intrinsic::x86_avx_maskstore_ps_256:
5196 case Intrinsic::x86_avx_maskstore_pd_256:
5197 case Intrinsic::x86_avx2_maskstore_d:
5198 case Intrinsic::x86_avx2_maskstore_q:
5199 case Intrinsic::x86_avx2_maskstore_d_256:
5200 case Intrinsic::x86_avx2_maskstore_q_256: {
5201 handleAVXMaskedStore(I);
5202 break;
5203 }
5204
5205 case Intrinsic::x86_avx_maskload_ps:
5206 case Intrinsic::x86_avx_maskload_pd:
5207 case Intrinsic::x86_avx_maskload_ps_256:
5208 case Intrinsic::x86_avx_maskload_pd_256:
5209 case Intrinsic::x86_avx2_maskload_d:
5210 case Intrinsic::x86_avx2_maskload_q:
5211 case Intrinsic::x86_avx2_maskload_d_256:
5212 case Intrinsic::x86_avx2_maskload_q_256: {
5213 handleAVXMaskedLoad(I);
5214 break;
5215 }
5216
5217 // Packed
5218 case Intrinsic::x86_avx512fp16_add_ph_512:
5219 case Intrinsic::x86_avx512fp16_sub_ph_512:
5220 case Intrinsic::x86_avx512fp16_mul_ph_512:
5221 case Intrinsic::x86_avx512fp16_div_ph_512:
5222 case Intrinsic::x86_avx512fp16_max_ph_512:
5223 case Intrinsic::x86_avx512fp16_min_ph_512:
5224 case Intrinsic::x86_avx512_min_ps_512:
5225 case Intrinsic::x86_avx512_min_pd_512:
5226 case Intrinsic::x86_avx512_max_ps_512:
5227 case Intrinsic::x86_avx512_max_pd_512: {
5228 // These AVX512 variants contain the rounding mode as a trailing flag.
5229 // Earlier variants do not have a trailing flag and are already handled
5230 // by maybeHandleSimpleNomemIntrinsic(I, 0) via handleUnknownIntrinsic.
5231 [[maybe_unused]] bool Success =
5232 maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/1);
5233 assert(Success);
5234 break;
5235 }
5236
5237 case Intrinsic::x86_avx_vpermilvar_pd:
5238 case Intrinsic::x86_avx_vpermilvar_pd_256:
5239 case Intrinsic::x86_avx512_vpermilvar_pd_512:
5240 case Intrinsic::x86_avx_vpermilvar_ps:
5241 case Intrinsic::x86_avx_vpermilvar_ps_256:
5242 case Intrinsic::x86_avx512_vpermilvar_ps_512: {
5243 handleAVXVpermilvar(I);
5244 break;
5245 }
5246
5247 case Intrinsic::x86_avx512fp16_mask_add_sh_round:
5248 case Intrinsic::x86_avx512fp16_mask_sub_sh_round:
5249 case Intrinsic::x86_avx512fp16_mask_mul_sh_round:
5250 case Intrinsic::x86_avx512fp16_mask_div_sh_round:
5251 case Intrinsic::x86_avx512fp16_mask_max_sh_round:
5252 case Intrinsic::x86_avx512fp16_mask_min_sh_round: {
5253 visitGenericScalarHalfwordInst(I);
5254 break;
5255 }
5256
5257 case Intrinsic::fshl:
5258 case Intrinsic::fshr:
5259 handleFunnelShift(I);
5260 break;
5261
5262 case Intrinsic::is_constant:
5263 // The result of llvm.is.constant() is always defined.
5264 setShadow(V: &I, SV: getCleanShadow(V: &I));
5265 setOrigin(V: &I, Origin: getCleanOrigin());
5266 break;
5267
5268 // TODO: handling max/min similarly to AND/OR may be more precise
5269 // Floating-Point Maximum/Minimum Pairwise
5270 case Intrinsic::aarch64_neon_fmaxp:
5271 case Intrinsic::aarch64_neon_fminp:
5272 // Floating-Point Maximum/Minimum Number Pairwise
5273 case Intrinsic::aarch64_neon_fmaxnmp:
5274 case Intrinsic::aarch64_neon_fminnmp:
5275 // Signed/Unsigned Maximum/Minimum Pairwise
5276 case Intrinsic::aarch64_neon_smaxp:
5277 case Intrinsic::aarch64_neon_sminp:
5278 case Intrinsic::aarch64_neon_umaxp:
5279 case Intrinsic::aarch64_neon_uminp:
5280 // Add Pairwise
5281 case Intrinsic::aarch64_neon_addp:
5282 // Floating-point Add Pairwise
5283 case Intrinsic::aarch64_neon_faddp:
5284 // Add Long Pairwise
5285 case Intrinsic::aarch64_neon_saddlp:
5286 case Intrinsic::aarch64_neon_uaddlp: {
5287 handlePairwiseShadowOrIntrinsic(I);
5288 break;
5289 }
5290
5291 // Floating-point Convert to integer, rounding to nearest with ties to Away
5292 case Intrinsic::aarch64_neon_fcvtas:
5293 case Intrinsic::aarch64_neon_fcvtau:
5294 // Floating-point convert to integer, rounding toward minus infinity
5295 case Intrinsic::aarch64_neon_fcvtms:
5296 case Intrinsic::aarch64_neon_fcvtmu:
5297 // Floating-point convert to integer, rounding to nearest with ties to even
5298 case Intrinsic::aarch64_neon_fcvtns:
5299 case Intrinsic::aarch64_neon_fcvtnu:
5300 // Floating-point convert to integer, rounding toward plus infinity
5301 case Intrinsic::aarch64_neon_fcvtps:
5302 case Intrinsic::aarch64_neon_fcvtpu:
5303 // Floating-point Convert to integer, rounding toward Zero
5304 case Intrinsic::aarch64_neon_fcvtzs:
5305 case Intrinsic::aarch64_neon_fcvtzu:
5306 // Floating-point convert to lower precision narrow, rounding to odd
5307 case Intrinsic::aarch64_neon_fcvtxn: {
5308 handleNEONVectorConvertIntrinsic(I);
5309 break;
5310 }
5311
5312 // Add reduction to scalar
5313 case Intrinsic::aarch64_neon_faddv:
5314 case Intrinsic::aarch64_neon_saddv:
5315 case Intrinsic::aarch64_neon_uaddv:
5316 // Signed/Unsigned min/max (Vector)
5317 // TODO: handling similarly to AND/OR may be more precise.
5318 case Intrinsic::aarch64_neon_smaxv:
5319 case Intrinsic::aarch64_neon_sminv:
5320 case Intrinsic::aarch64_neon_umaxv:
5321 case Intrinsic::aarch64_neon_uminv:
5322 // Floating-point min/max (vector)
5323 // The f{min,max}"nm"v variants handle NaN differently than f{min,max}v,
5324 // but our shadow propagation is the same.
5325 case Intrinsic::aarch64_neon_fmaxv:
5326 case Intrinsic::aarch64_neon_fminv:
5327 case Intrinsic::aarch64_neon_fmaxnmv:
5328 case Intrinsic::aarch64_neon_fminnmv:
5329 // Sum long across vector
5330 case Intrinsic::aarch64_neon_saddlv:
5331 case Intrinsic::aarch64_neon_uaddlv:
5332 handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/true);
5333 break;
5334
5335 case Intrinsic::aarch64_neon_ld1x2:
5336 case Intrinsic::aarch64_neon_ld1x3:
5337 case Intrinsic::aarch64_neon_ld1x4:
5338 case Intrinsic::aarch64_neon_ld2:
5339 case Intrinsic::aarch64_neon_ld3:
5340 case Intrinsic::aarch64_neon_ld4:
5341 case Intrinsic::aarch64_neon_ld2r:
5342 case Intrinsic::aarch64_neon_ld3r:
5343 case Intrinsic::aarch64_neon_ld4r: {
5344 handleNEONVectorLoad(I, /*WithLane=*/false);
5345 break;
5346 }
5347
5348 case Intrinsic::aarch64_neon_ld2lane:
5349 case Intrinsic::aarch64_neon_ld3lane:
5350 case Intrinsic::aarch64_neon_ld4lane: {
5351 handleNEONVectorLoad(I, /*WithLane=*/true);
5352 break;
5353 }
5354
5355 // Saturating extract narrow
5356 case Intrinsic::aarch64_neon_sqxtn:
5357 case Intrinsic::aarch64_neon_sqxtun:
5358 case Intrinsic::aarch64_neon_uqxtn:
5359 // These only have one argument, but we (ab)use handleShadowOr because it
5360 // does work on single argument intrinsics and will typecast the shadow
5361 // (and update the origin).
5362 handleShadowOr(I);
5363 break;
5364
5365 case Intrinsic::aarch64_neon_st1x2:
5366 case Intrinsic::aarch64_neon_st1x3:
5367 case Intrinsic::aarch64_neon_st1x4:
5368 case Intrinsic::aarch64_neon_st2:
5369 case Intrinsic::aarch64_neon_st3:
5370 case Intrinsic::aarch64_neon_st4: {
5371 handleNEONVectorStoreIntrinsic(I, useLane: false);
5372 break;
5373 }
5374
5375 case Intrinsic::aarch64_neon_st2lane:
5376 case Intrinsic::aarch64_neon_st3lane:
5377 case Intrinsic::aarch64_neon_st4lane: {
5378 handleNEONVectorStoreIntrinsic(I, useLane: true);
5379 break;
5380 }
5381
5382 // Arm NEON vector table intrinsics have the source/table register(s) as
5383 // arguments, followed by the index register. They return the output.
5384 //
5385 // 'TBL writes a zero if an index is out-of-range, while TBX leaves the
5386 // original value unchanged in the destination register.'
5387 // Conveniently, zero denotes a clean shadow, which means out-of-range
5388 // indices for TBL will initialize the user data with zero and also clean
5389 // the shadow. (For TBX, neither the user data nor the shadow will be
5390 // updated, which is also correct.)
5391 case Intrinsic::aarch64_neon_tbl1:
5392 case Intrinsic::aarch64_neon_tbl2:
5393 case Intrinsic::aarch64_neon_tbl3:
5394 case Intrinsic::aarch64_neon_tbl4:
5395 case Intrinsic::aarch64_neon_tbx1:
5396 case Intrinsic::aarch64_neon_tbx2:
5397 case Intrinsic::aarch64_neon_tbx3:
5398 case Intrinsic::aarch64_neon_tbx4: {
5399 // The last trailing argument (index register) should be handled verbatim
5400 handleIntrinsicByApplyingToShadow(
5401 I, /*shadowIntrinsicID=*/I.getIntrinsicID(),
5402 /*trailingVerbatimArgs*/ 1);
5403 break;
5404 }
5405
5406 case Intrinsic::aarch64_neon_fmulx:
5407 case Intrinsic::aarch64_neon_pmul:
5408 case Intrinsic::aarch64_neon_pmull:
5409 case Intrinsic::aarch64_neon_smull:
5410 case Intrinsic::aarch64_neon_pmull64:
5411 case Intrinsic::aarch64_neon_umull: {
5412 handleNEONVectorMultiplyIntrinsic(I);
5413 break;
5414 }
5415
5416 case Intrinsic::scmp:
5417 case Intrinsic::ucmp: {
5418 handleShadowOr(I);
5419 break;
5420 }
5421
5422 default:
5423 if (!handleUnknownIntrinsic(I))
5424 visitInstruction(I);
5425 break;
5426 }
5427 }
5428
5429 void visitLibAtomicLoad(CallBase &CB) {
5430 // Since we use getNextNode here, we can't have CB terminate the BB.
5431 assert(isa<CallInst>(CB));
5432
5433 IRBuilder<> IRB(&CB);
5434 Value *Size = CB.getArgOperand(i: 0);
5435 Value *SrcPtr = CB.getArgOperand(i: 1);
5436 Value *DstPtr = CB.getArgOperand(i: 2);
5437 Value *Ordering = CB.getArgOperand(i: 3);
5438 // Convert the call to have at least Acquire ordering to make sure
5439 // the shadow operations aren't reordered before it.
5440 Value *NewOrdering =
5441 IRB.CreateExtractElement(Vec: makeAddAcquireOrderingTable(IRB), Idx: Ordering);
5442 CB.setArgOperand(i: 3, v: NewOrdering);
5443
5444 NextNodeIRBuilder NextIRB(&CB);
5445 Value *SrcShadowPtr, *SrcOriginPtr;
5446 std::tie(args&: SrcShadowPtr, args&: SrcOriginPtr) =
5447 getShadowOriginPtr(Addr: SrcPtr, IRB&: NextIRB, ShadowTy: NextIRB.getInt8Ty(), Alignment: Align(1),
5448 /*isStore*/ false);
5449 Value *DstShadowPtr =
5450 getShadowOriginPtr(Addr: DstPtr, IRB&: NextIRB, ShadowTy: NextIRB.getInt8Ty(), Alignment: Align(1),
5451 /*isStore*/ true)
5452 .first;
5453
5454 NextIRB.CreateMemCpy(Dst: DstShadowPtr, DstAlign: Align(1), Src: SrcShadowPtr, SrcAlign: Align(1), Size);
5455 if (MS.TrackOrigins) {
5456 Value *SrcOrigin = NextIRB.CreateAlignedLoad(Ty: MS.OriginTy, Ptr: SrcOriginPtr,
5457 Align: kMinOriginAlignment);
5458 Value *NewOrigin = updateOrigin(V: SrcOrigin, IRB&: NextIRB);
5459 NextIRB.CreateCall(Callee: MS.MsanSetOriginFn, Args: {DstPtr, Size, NewOrigin});
5460 }
5461 }
5462
5463 void visitLibAtomicStore(CallBase &CB) {
5464 IRBuilder<> IRB(&CB);
5465 Value *Size = CB.getArgOperand(i: 0);
5466 Value *DstPtr = CB.getArgOperand(i: 2);
5467 Value *Ordering = CB.getArgOperand(i: 3);
5468 // Convert the call to have at least Release ordering to make sure
5469 // the shadow operations aren't reordered after it.
5470 Value *NewOrdering =
5471 IRB.CreateExtractElement(Vec: makeAddReleaseOrderingTable(IRB), Idx: Ordering);
5472 CB.setArgOperand(i: 3, v: NewOrdering);
5473
5474 Value *DstShadowPtr =
5475 getShadowOriginPtr(Addr: DstPtr, IRB, ShadowTy: IRB.getInt8Ty(), Alignment: Align(1),
5476 /*isStore*/ true)
5477 .first;
5478
5479 // Atomic store always paints clean shadow/origin. See file header.
5480 IRB.CreateMemSet(Ptr: DstShadowPtr, Val: getCleanShadow(OrigTy: IRB.getInt8Ty()), Size,
5481 Align: Align(1));
5482 }
5483
5484 void visitCallBase(CallBase &CB) {
5485 assert(!CB.getMetadata(LLVMContext::MD_nosanitize));
5486 if (CB.isInlineAsm()) {
5487 // For inline asm (either a call to asm function, or callbr instruction),
5488 // do the usual thing: check argument shadow and mark all outputs as
5489 // clean. Note that any side effects of the inline asm that are not
5490 // immediately visible in its constraints are not handled.
5491 if (ClHandleAsmConservative)
5492 visitAsmInstruction(I&: CB);
5493 else
5494 visitInstruction(I&: CB);
5495 return;
5496 }
5497 LibFunc LF;
5498 if (TLI->getLibFunc(CB, F&: LF)) {
5499 // libatomic.a functions need to have special handling because there isn't
5500 // a good way to intercept them or compile the library with
5501 // instrumentation.
5502 switch (LF) {
5503 case LibFunc_atomic_load:
5504 if (!isa<CallInst>(Val: CB)) {
5505 llvm::errs() << "MSAN -- cannot instrument invoke of libatomic load. "
5506 "Ignoring!\n";
5507 break;
5508 }
5509 visitLibAtomicLoad(CB);
5510 return;
5511 case LibFunc_atomic_store:
5512 visitLibAtomicStore(CB);
5513 return;
5514 default:
5515 break;
5516 }
5517 }
5518
5519 if (auto *Call = dyn_cast<CallInst>(Val: &CB)) {
5520 assert(!isa<IntrinsicInst>(Call) && "intrinsics are handled elsewhere");
5521
5522 // We are going to insert code that relies on the fact that the callee
5523 // will become a non-readonly function after it is instrumented by us. To
5524 // prevent this code from being optimized out, mark that function
5525 // non-readonly in advance.
5526 // TODO: We can likely do better than dropping memory() completely here.
5527 AttributeMask B;
5528 B.addAttribute(Val: Attribute::Memory).addAttribute(Val: Attribute::Speculatable);
5529
5530 Call->removeFnAttrs(AttrsToRemove: B);
5531 if (Function *Func = Call->getCalledFunction()) {
5532 Func->removeFnAttrs(Attrs: B);
5533 }
5534
5535 maybeMarkSanitizerLibraryCallNoBuiltin(CI: Call, TLI);
5536 }
5537 IRBuilder<> IRB(&CB);
5538 bool MayCheckCall = MS.EagerChecks;
5539 if (Function *Func = CB.getCalledFunction()) {
5540 // __sanitizer_unaligned_{load,store} functions may be called by users
5541 // and always expect shadows in the TLS. So don't check them.
5542 MayCheckCall &= !Func->getName().starts_with(Prefix: "__sanitizer_unaligned_");
5543 }
5544
5545 unsigned ArgOffset = 0;
5546 LLVM_DEBUG(dbgs() << " CallSite: " << CB << "\n");
5547 for (const auto &[i, A] : llvm::enumerate(First: CB.args())) {
5548 if (!A->getType()->isSized()) {
5549 LLVM_DEBUG(dbgs() << "Arg " << i << " is not sized: " << CB << "\n");
5550 continue;
5551 }
5552
5553 if (A->getType()->isScalableTy()) {
5554 LLVM_DEBUG(dbgs() << "Arg " << i << " is vscale: " << CB << "\n");
5555 // Handle as noundef, but don't reserve TLS slots.
5556 insertShadowCheck(Val: A, OrigIns: &CB);
5557 continue;
5558 }
5559
5560 unsigned Size = 0;
5561 const DataLayout &DL = F.getDataLayout();
5562
5563 bool ByVal = CB.paramHasAttr(ArgNo: i, Kind: Attribute::ByVal);
5564 bool NoUndef = CB.paramHasAttr(ArgNo: i, Kind: Attribute::NoUndef);
5565 bool EagerCheck = MayCheckCall && !ByVal && NoUndef;
5566
5567 if (EagerCheck) {
5568 insertShadowCheck(Val: A, OrigIns: &CB);
5569 Size = DL.getTypeAllocSize(Ty: A->getType());
5570 } else {
5571 [[maybe_unused]] Value *Store = nullptr;
5572 // Compute the Shadow for arg even if it is ByVal, because
5573 // in that case getShadow() will copy the actual arg shadow to
5574 // __msan_param_tls.
5575 Value *ArgShadow = getShadow(V: A);
5576 Value *ArgShadowBase = getShadowPtrForArgument(IRB, ArgOffset);
5577 LLVM_DEBUG(dbgs() << " Arg#" << i << ": " << *A
5578 << " Shadow: " << *ArgShadow << "\n");
5579 if (ByVal) {
5580 // ByVal requires some special handling as it's too big for a single
5581 // load
5582 assert(A->getType()->isPointerTy() &&
5583 "ByVal argument is not a pointer!");
5584 Size = DL.getTypeAllocSize(Ty: CB.getParamByValType(ArgNo: i));
5585 if (ArgOffset + Size > kParamTLSSize)
5586 break;
5587 const MaybeAlign ParamAlignment(CB.getParamAlign(ArgNo: i));
5588 MaybeAlign Alignment = std::nullopt;
5589 if (ParamAlignment)
5590 Alignment = std::min(a: *ParamAlignment, b: kShadowTLSAlignment);
5591 Value *AShadowPtr, *AOriginPtr;
5592 std::tie(args&: AShadowPtr, args&: AOriginPtr) =
5593 getShadowOriginPtr(Addr: A, IRB, ShadowTy: IRB.getInt8Ty(), Alignment,
5594 /*isStore*/ false);
5595 if (!PropagateShadow) {
5596 Store = IRB.CreateMemSet(Ptr: ArgShadowBase,
5597 Val: Constant::getNullValue(Ty: IRB.getInt8Ty()),
5598 Size, Align: Alignment);
5599 } else {
5600 Store = IRB.CreateMemCpy(Dst: ArgShadowBase, DstAlign: Alignment, Src: AShadowPtr,
5601 SrcAlign: Alignment, Size);
5602 if (MS.TrackOrigins) {
5603 Value *ArgOriginBase = getOriginPtrForArgument(IRB, ArgOffset);
5604 // FIXME: OriginSize should be:
5605 // alignTo(A % kMinOriginAlignment + Size, kMinOriginAlignment)
5606 unsigned OriginSize = alignTo(Size, A: kMinOriginAlignment);
5607 IRB.CreateMemCpy(
5608 Dst: ArgOriginBase,
5609 /* by origin_tls[ArgOffset] */ DstAlign: kMinOriginAlignment,
5610 Src: AOriginPtr,
5611 /* by getShadowOriginPtr */ SrcAlign: kMinOriginAlignment, Size: OriginSize);
5612 }
5613 }
5614 } else {
5615 // Any other parameters mean we need bit-grained tracking of uninit
5616 // data
5617 Size = DL.getTypeAllocSize(Ty: A->getType());
5618 if (ArgOffset + Size > kParamTLSSize)
5619 break;
5620 Store = IRB.CreateAlignedStore(Val: ArgShadow, Ptr: ArgShadowBase,
5621 Align: kShadowTLSAlignment);
5622 Constant *Cst = dyn_cast<Constant>(Val: ArgShadow);
5623 if (MS.TrackOrigins && !(Cst && Cst->isNullValue())) {
5624 IRB.CreateStore(Val: getOrigin(V: A),
5625 Ptr: getOriginPtrForArgument(IRB, ArgOffset));
5626 }
5627 }
5628 assert(Store != nullptr);
5629 LLVM_DEBUG(dbgs() << " Param:" << *Store << "\n");
5630 }
5631 assert(Size != 0);
5632 ArgOffset += alignTo(Size, A: kShadowTLSAlignment);
5633 }
5634 LLVM_DEBUG(dbgs() << " done with call args\n");
5635
5636 FunctionType *FT = CB.getFunctionType();
5637 if (FT->isVarArg()) {
5638 VAHelper->visitCallBase(CB, IRB);
5639 }
5640
5641 // Now, get the shadow for the RetVal.
5642 if (!CB.getType()->isSized())
5643 return;
5644 // Don't emit the epilogue for musttail call returns.
5645 if (isa<CallInst>(Val: CB) && cast<CallInst>(Val&: CB).isMustTailCall())
5646 return;
5647
5648 if (MayCheckCall && CB.hasRetAttr(Kind: Attribute::NoUndef)) {
5649 setShadow(V: &CB, SV: getCleanShadow(V: &CB));
5650 setOrigin(V: &CB, Origin: getCleanOrigin());
5651 return;
5652 }
5653
5654 IRBuilder<> IRBBefore(&CB);
5655 // Until we have full dynamic coverage, make sure the retval shadow is 0.
5656 Value *Base = getShadowPtrForRetval(IRB&: IRBBefore);
5657 IRBBefore.CreateAlignedStore(Val: getCleanShadow(V: &CB), Ptr: Base,
5658 Align: kShadowTLSAlignment);
5659 BasicBlock::iterator NextInsn;
5660 if (isa<CallInst>(Val: CB)) {
5661 NextInsn = ++CB.getIterator();
5662 assert(NextInsn != CB.getParent()->end());
5663 } else {
5664 BasicBlock *NormalDest = cast<InvokeInst>(Val&: CB).getNormalDest();
5665 if (!NormalDest->getSinglePredecessor()) {
5666 // FIXME: this case is tricky, so we are just conservative here.
5667 // Perhaps we need to split the edge between this BB and NormalDest,
5668 // but a naive attempt to use SplitEdge leads to a crash.
5669 setShadow(V: &CB, SV: getCleanShadow(V: &CB));
5670 setOrigin(V: &CB, Origin: getCleanOrigin());
5671 return;
5672 }
5673 // FIXME: NextInsn is likely in a basic block that has not been visited
5674 // yet. Anything inserted there will be instrumented by MSan later!
5675 NextInsn = NormalDest->getFirstInsertionPt();
5676 assert(NextInsn != NormalDest->end() &&
5677 "Could not find insertion point for retval shadow load");
5678 }
5679 IRBuilder<> IRBAfter(&*NextInsn);
5680 Value *RetvalShadow = IRBAfter.CreateAlignedLoad(
5681 Ty: getShadowTy(V: &CB), Ptr: getShadowPtrForRetval(IRB&: IRBAfter), Align: kShadowTLSAlignment,
5682 Name: "_msret");
5683 setShadow(V: &CB, SV: RetvalShadow);
5684 if (MS.TrackOrigins)
5685 setOrigin(V: &CB, Origin: IRBAfter.CreateLoad(Ty: MS.OriginTy, Ptr: getOriginPtrForRetval()));
5686 }
5687
5688 bool isAMustTailRetVal(Value *RetVal) {
5689 if (auto *I = dyn_cast<BitCastInst>(Val: RetVal)) {
5690 RetVal = I->getOperand(i_nocapture: 0);
5691 }
5692 if (auto *I = dyn_cast<CallInst>(Val: RetVal)) {
5693 return I->isMustTailCall();
5694 }
5695 return false;
5696 }
5697
5698 void visitReturnInst(ReturnInst &I) {
5699 IRBuilder<> IRB(&I);
5700 Value *RetVal = I.getReturnValue();
5701 if (!RetVal)
5702 return;
5703 // Don't emit the epilogue for musttail call returns.
5704 if (isAMustTailRetVal(RetVal))
5705 return;
5706 Value *ShadowPtr = getShadowPtrForRetval(IRB);
5707 bool HasNoUndef = F.hasRetAttribute(Kind: Attribute::NoUndef);
5708 bool StoreShadow = !(MS.EagerChecks && HasNoUndef);
5709 // FIXME: Consider using SpecialCaseList to specify a list of functions that
5710 // must always return fully initialized values. For now, we hardcode "main".
5711 bool EagerCheck = (MS.EagerChecks && HasNoUndef) || (F.getName() == "main");
5712
5713 Value *Shadow = getShadow(V: RetVal);
5714 bool StoreOrigin = true;
5715 if (EagerCheck) {
5716 insertShadowCheck(Val: RetVal, OrigIns: &I);
5717 Shadow = getCleanShadow(V: RetVal);
5718 StoreOrigin = false;
5719 }
5720
// The caller may still expect shadow to be passed over TLS even if we
// pass our check.
5723 if (StoreShadow) {
5724 IRB.CreateAlignedStore(Val: Shadow, Ptr: ShadowPtr, Align: kShadowTLSAlignment);
5725 if (MS.TrackOrigins && StoreOrigin)
5726 IRB.CreateStore(Val: getOrigin(V: RetVal), Ptr: getOriginPtrForRetval());
5727 }
5728 }
5729
5730 void visitPHINode(PHINode &I) {
5731 IRBuilder<> IRB(&I);
5732 if (!PropagateShadow) {
5733 setShadow(V: &I, SV: getCleanShadow(V: &I));
5734 setOrigin(V: &I, Origin: getCleanOrigin());
5735 return;
5736 }
5737
5738 ShadowPHINodes.push_back(Elt: &I);
5739 setShadow(V: &I, SV: IRB.CreatePHI(Ty: getShadowTy(V: &I), NumReservedValues: I.getNumIncomingValues(),
5740 Name: "_msphi_s"));
5741 if (MS.TrackOrigins)
5742 setOrigin(
5743 V: &I, Origin: IRB.CreatePHI(Ty: MS.OriginTy, NumReservedValues: I.getNumIncomingValues(), Name: "_msphi_o"));
5744 }
5745
5746 Value *getLocalVarIdptr(AllocaInst &I) {
5747 ConstantInt *IntConst =
5748 ConstantInt::get(Ty: Type::getInt32Ty(C&: (*F.getParent()).getContext()), V: 0);
5749 return new GlobalVariable(*F.getParent(), IntConst->getType(),
5750 /*isConstant=*/false, GlobalValue::PrivateLinkage,
5751 IntConst);
5752 }
5753
5754 Value *getLocalVarDescription(AllocaInst &I) {
5755 return createPrivateConstGlobalForString(M&: *F.getParent(), Str: I.getName());
5756 }
5757
5758 void poisonAllocaUserspace(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
5759 if (PoisonStack && ClPoisonStackWithCall) {
5760 IRB.CreateCall(Callee: MS.MsanPoisonStackFn, Args: {&I, Len});
5761 } else {
5762 Value *ShadowBase, *OriginBase;
5763 std::tie(args&: ShadowBase, args&: OriginBase) = getShadowOriginPtr(
5764 Addr: &I, IRB, ShadowTy: IRB.getInt8Ty(), Alignment: Align(1), /*isStore*/ true);
5765
5766 Value *PoisonValue = IRB.getInt8(C: PoisonStack ? ClPoisonStackPattern : 0);
5767 IRB.CreateMemSet(Ptr: ShadowBase, Val: PoisonValue, Size: Len, Align: I.getAlign());
5768 }
5769
5770 if (PoisonStack && MS.TrackOrigins) {
5771 Value *Idptr = getLocalVarIdptr(I);
5772 if (ClPrintStackNames) {
5773 Value *Descr = getLocalVarDescription(I);
5774 IRB.CreateCall(Callee: MS.MsanSetAllocaOriginWithDescriptionFn,
5775 Args: {&I, Len, Idptr, Descr});
5776 } else {
5777 IRB.CreateCall(Callee: MS.MsanSetAllocaOriginNoDescriptionFn, Args: {&I, Len, Idptr});
5778 }
5779 }
5780 }
5781
5782 void poisonAllocaKmsan(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
5783 Value *Descr = getLocalVarDescription(I);
5784 if (PoisonStack) {
5785 IRB.CreateCall(Callee: MS.MsanPoisonAllocaFn, Args: {&I, Len, Descr});
5786 } else {
5787 IRB.CreateCall(Callee: MS.MsanUnpoisonAllocaFn, Args: {&I, Len});
5788 }
5789 }
5790
5791 void instrumentAlloca(AllocaInst &I, Instruction *InsPoint = nullptr) {
5792 if (!InsPoint)
5793 InsPoint = &I;
5794 NextNodeIRBuilder IRB(InsPoint);
5795 const DataLayout &DL = F.getDataLayout();
5796 TypeSize TS = DL.getTypeAllocSize(Ty: I.getAllocatedType());
5797 Value *Len = IRB.CreateTypeSize(Ty: MS.IntptrTy, Size: TS);
5798 if (I.isArrayAllocation())
5799 Len = IRB.CreateMul(LHS: Len,
5800 RHS: IRB.CreateZExtOrTrunc(V: I.getArraySize(), DestTy: MS.IntptrTy));
5801
5802 if (MS.CompileKernel)
5803 poisonAllocaKmsan(I, IRB, Len);
5804 else
5805 poisonAllocaUserspace(I, IRB, Len);
5806 }
5807
5808 void visitAllocaInst(AllocaInst &I) {
5809 setShadow(V: &I, SV: getCleanShadow(V: &I));
5810 setOrigin(V: &I, Origin: getCleanOrigin());
5811 // We'll get to this alloca later unless it's poisoned at the corresponding
5812 // llvm.lifetime.start.
5813 AllocaSet.insert(X: &I);
5814 }
5815
5816 void visitSelectInst(SelectInst &I) {
5817 // a = select b, c, d
5818 Value *B = I.getCondition();
5819 Value *C = I.getTrueValue();
5820 Value *D = I.getFalseValue();
5821
5822 handleSelectLikeInst(I, B, C, D);
5823 }
5824
5825 void handleSelectLikeInst(Instruction &I, Value *B, Value *C, Value *D) {
5826 IRBuilder<> IRB(&I);
5827
5828 Value *Sb = getShadow(V: B);
5829 Value *Sc = getShadow(V: C);
5830 Value *Sd = getShadow(V: D);
5831
5832 Value *Ob = MS.TrackOrigins ? getOrigin(V: B) : nullptr;
5833 Value *Oc = MS.TrackOrigins ? getOrigin(V: C) : nullptr;
5834 Value *Od = MS.TrackOrigins ? getOrigin(V: D) : nullptr;
5835
5836 // Result shadow if condition shadow is 0.
5837 Value *Sa0 = IRB.CreateSelect(C: B, True: Sc, False: Sd);
5838 Value *Sa1;
5839 if (I.getType()->isAggregateType()) {
5840 // To avoid "sign extending" i1 to an arbitrary aggregate type, we just do
5841 // an extra "select". This results in much more compact IR.
5842 // Sa = select Sb, poisoned, (select b, Sc, Sd)
5843 Sa1 = getPoisonedShadow(ShadowTy: getShadowTy(OrigTy: I.getType()));
5844 } else {
5845 // Sa = select Sb, [ (c^d) | Sc | Sd ], [ b ? Sc : Sd ]
5846 // If Sb (condition is poisoned), look for bits in c and d that are equal
5847 // and both unpoisoned.
5848 // If !Sb (condition is unpoisoned), simply pick one of Sc and Sd.
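// Illustrative sketch (conceptual, not verbatim emitted IR): for
//   %a = select i1 %b, i32 %c, i32 %d
// the code below produces
//   %sa1 = %sc | %sd | (%c ^ %d)   ; result shadow if the condition is poisoned
//   %sa0 = %b ? %sc : %sd          ; result shadow if the condition is clean
//   %sa  = %sb ? %sa1 : %sa0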
5849
5850 // Cast arguments to shadow-compatible type.
5851 C = CreateAppToShadowCast(IRB, V: C);
5852 D = CreateAppToShadowCast(IRB, V: D);
5853
5854 // Result shadow if condition shadow is 1.
5855 Sa1 = IRB.CreateOr(Ops: {IRB.CreateXor(LHS: C, RHS: D), Sc, Sd});
5856 }
5857 Value *Sa = IRB.CreateSelect(C: Sb, True: Sa1, False: Sa0, Name: "_msprop_select");
5858 setShadow(V: &I, SV: Sa);
5859 if (MS.TrackOrigins) {
5860 // Origins are always i32, so any vector conditions must be flattened.
5861 // FIXME: consider tracking vector origins for app vectors?
5862 if (B->getType()->isVectorTy()) {
5863 B = convertToBool(V: B, IRB);
5864 Sb = convertToBool(V: Sb, IRB);
5865 }
5866 // a = select b, c, d
5867 // Oa = Sb ? Ob : (b ? Oc : Od)
5868 setOrigin(V: &I, Origin: IRB.CreateSelect(C: Sb, True: Ob, False: IRB.CreateSelect(C: B, True: Oc, False: Od)));
5869 }
5870 }
5871
5872 void visitLandingPadInst(LandingPadInst &I) {
5873 // Do nothing.
5874 // See https://github.com/google/sanitizers/issues/504
5875 setShadow(V: &I, SV: getCleanShadow(V: &I));
5876 setOrigin(V: &I, Origin: getCleanOrigin());
5877 }
5878
5879 void visitCatchSwitchInst(CatchSwitchInst &I) {
5880 setShadow(V: &I, SV: getCleanShadow(V: &I));
5881 setOrigin(V: &I, Origin: getCleanOrigin());
5882 }
5883
5884 void visitFuncletPadInst(FuncletPadInst &I) {
5885 setShadow(V: &I, SV: getCleanShadow(V: &I));
5886 setOrigin(V: &I, Origin: getCleanOrigin());
5887 }
5888
5889 void visitGetElementPtrInst(GetElementPtrInst &I) { handleShadowOr(I); }
5890
5891 void visitExtractValueInst(ExtractValueInst &I) {
5892 IRBuilder<> IRB(&I);
5893 Value *Agg = I.getAggregateOperand();
5894 LLVM_DEBUG(dbgs() << "ExtractValue: " << I << "\n");
5895 Value *AggShadow = getShadow(V: Agg);
5896 LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
5897 Value *ResShadow = IRB.CreateExtractValue(Agg: AggShadow, Idxs: I.getIndices());
5898 LLVM_DEBUG(dbgs() << " ResShadow: " << *ResShadow << "\n");
5899 setShadow(V: &I, SV: ResShadow);
5900 setOriginForNaryOp(I);
5901 }
5902
5903 void visitInsertValueInst(InsertValueInst &I) {
5904 IRBuilder<> IRB(&I);
5905 LLVM_DEBUG(dbgs() << "InsertValue: " << I << "\n");
5906 Value *AggShadow = getShadow(V: I.getAggregateOperand());
5907 Value *InsShadow = getShadow(V: I.getInsertedValueOperand());
5908 LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
5909 LLVM_DEBUG(dbgs() << " InsShadow: " << *InsShadow << "\n");
5910 Value *Res = IRB.CreateInsertValue(Agg: AggShadow, Val: InsShadow, Idxs: I.getIndices());
5911 LLVM_DEBUG(dbgs() << " Res: " << *Res << "\n");
5912 setShadow(V: &I, SV: Res);
5913 setOriginForNaryOp(I);
5914 }
5915
5916 void dumpInst(Instruction &I) {
5917 if (CallInst *CI = dyn_cast<CallInst>(Val: &I)) {
5918 errs() << "ZZZ call " << CI->getCalledFunction()->getName() << "\n";
5919 } else {
5920 errs() << "ZZZ " << I.getOpcodeName() << "\n";
5921 }
5922 errs() << "QQQ " << I << "\n";
5923 }
5924
5925 void visitResumeInst(ResumeInst &I) {
5926 LLVM_DEBUG(dbgs() << "Resume: " << I << "\n");
5927 // Nothing to do here.
5928 }
5929
5930 void visitCleanupReturnInst(CleanupReturnInst &CRI) {
5931 LLVM_DEBUG(dbgs() << "CleanupReturn: " << CRI << "\n");
5932 // Nothing to do here.
5933 }
5934
5935 void visitCatchReturnInst(CatchReturnInst &CRI) {
5936 LLVM_DEBUG(dbgs() << "CatchReturn: " << CRI << "\n");
5937 // Nothing to do here.
5938 }
5939
5940 void instrumentAsmArgument(Value *Operand, Type *ElemTy, Instruction &I,
5941 IRBuilder<> &IRB, const DataLayout &DL,
5942 bool isOutput) {
5943 // For each assembly argument, we check its value for being initialized.
5944 // If the argument is a pointer, we assume it points to a single element
// of the corresponding type (or to an 8-byte word, if the type is unsized).
5946 // Each such pointer is instrumented with a call to the runtime library.
5947 Type *OpType = Operand->getType();
5948 // Check the operand value itself.
5949 insertShadowCheck(Val: Operand, OrigIns: &I);
5950 if (!OpType->isPointerTy() || !isOutput) {
5951 assert(!isOutput);
5952 return;
5953 }
5954 if (!ElemTy->isSized())
5955 return;
5956 auto Size = DL.getTypeStoreSize(Ty: ElemTy);
5957 Value *SizeVal = IRB.CreateTypeSize(Ty: MS.IntptrTy, Size);
5958 if (MS.CompileKernel) {
5959 IRB.CreateCall(Callee: MS.MsanInstrumentAsmStoreFn, Args: {Operand, SizeVal});
5960 } else {
5961 // ElemTy, derived from elementtype(), does not encode the alignment of
5962 // the pointer. Conservatively assume that the shadow memory is unaligned.
5963 // When Size is large, avoid StoreInst as it would expand to many
5964 // instructions.
5965 auto [ShadowPtr, _] =
5966 getShadowOriginPtrUserspace(Addr: Operand, IRB, ShadowTy: IRB.getInt8Ty(), Alignment: Align(1));
5967 if (Size <= 32)
5968 IRB.CreateAlignedStore(Val: getCleanShadow(OrigTy: ElemTy), Ptr: ShadowPtr, Align: Align(1));
5969 else
5970 IRB.CreateMemSet(Ptr: ShadowPtr, Val: ConstantInt::getNullValue(Ty: IRB.getInt8Ty()),
5971 Size: SizeVal, Align: Align(1));
5972 }
5973 }
5974
5975 /// Get the number of output arguments returned by pointers.
5976 int getNumOutputArgs(InlineAsm *IA, CallBase *CB) {
5977 int NumRetOutputs = 0;
5978 int NumOutputs = 0;
5979 Type *RetTy = cast<Value>(Val: CB)->getType();
5980 if (!RetTy->isVoidTy()) {
5981 // Register outputs are returned via the CallInst return value.
5982 auto *ST = dyn_cast<StructType>(Val: RetTy);
5983 if (ST)
5984 NumRetOutputs = ST->getNumElements();
5985 else
5986 NumRetOutputs = 1;
5987 }
5988 InlineAsm::ConstraintInfoVector Constraints = IA->ParseConstraints();
5989 for (const InlineAsm::ConstraintInfo &Info : Constraints) {
5990 switch (Info.Type) {
5991 case InlineAsm::isOutput:
5992 NumOutputs++;
5993 break;
5994 default:
5995 break;
5996 }
5997 }
5998 return NumOutputs - NumRetOutputs;
5999 }
6000
6001 void visitAsmInstruction(Instruction &I) {
6002 // Conservative inline assembly handling: check for poisoned shadow of
6003 // asm() arguments, then unpoison the result and all the memory locations
6004 // pointed to by those arguments.
6005 // An inline asm() statement in C++ contains lists of input and output
6006 // arguments used by the assembly code. These are mapped to operands of the
6007 // CallInst as follows:
// - nR register outputs ("=r") are returned by value in a single structure
6009 // (SSA value of the CallInst);
6010 // - nO other outputs ("=m" and others) are returned by pointer as first
6011 // nO operands of the CallInst;
6012 // - nI inputs ("r", "m" and others) are passed to CallInst as the
6013 // remaining nI operands.
6014 // The total number of asm() arguments in the source is nR+nO+nI, and the
6015 // corresponding CallInst has nO+nI+1 operands (the last operand is the
6016 // function to be called).
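// Illustrative example (a hypothetical statement, not taken from any test):
//   asm("..." : "=r"(ret), "=m"(mem) : "r"(in0), "m"(in1));
// gives nR = 1, nO = 1, nI = 2: the "=r" output is the SSA value of the
// CallInst, the pointer for "=m" is operand 0, the two inputs are operands
// 1 and 2, and the inline asm callee is the last operand.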
6017 const DataLayout &DL = F.getDataLayout();
6018 CallBase *CB = cast<CallBase>(Val: &I);
6019 IRBuilder<> IRB(&I);
6020 InlineAsm *IA = cast<InlineAsm>(Val: CB->getCalledOperand());
6021 int OutputArgs = getNumOutputArgs(IA, CB);
6022 // The last operand of a CallInst is the function itself.
6023 int NumOperands = CB->getNumOperands() - 1;
6024
// Check input arguments. We do this before unpoisoning the output
// arguments, so that we don't overwrite uninitialized values before
// checking them.
6027 for (int i = OutputArgs; i < NumOperands; i++) {
6028 Value *Operand = CB->getOperand(i_nocapture: i);
6029 instrumentAsmArgument(Operand, ElemTy: CB->getParamElementType(ArgNo: i), I, IRB, DL,
6030 /*isOutput*/ false);
6031 }
6032 // Unpoison output arguments. This must happen before the actual InlineAsm
6033 // call, so that the shadow for memory published in the asm() statement
6034 // remains valid.
6035 for (int i = 0; i < OutputArgs; i++) {
6036 Value *Operand = CB->getOperand(i_nocapture: i);
6037 instrumentAsmArgument(Operand, ElemTy: CB->getParamElementType(ArgNo: i), I, IRB, DL,
6038 /*isOutput*/ true);
6039 }
6040
6041 setShadow(V: &I, SV: getCleanShadow(V: &I));
6042 setOrigin(V: &I, Origin: getCleanOrigin());
6043 }
6044
6045 void visitFreezeInst(FreezeInst &I) {
6046 // Freeze always returns a fully defined value.
6047 setShadow(V: &I, SV: getCleanShadow(V: &I));
6048 setOrigin(V: &I, Origin: getCleanOrigin());
6049 }
6050
6051 void visitInstruction(Instruction &I) {
6052 // Everything else: stop propagating and check for poisoned shadow.
6053 if (ClDumpStrictInstructions)
6054 dumpInst(I);
6055 LLVM_DEBUG(dbgs() << "DEFAULT: " << I << "\n");
6056 for (size_t i = 0, n = I.getNumOperands(); i < n; i++) {
6057 Value *Operand = I.getOperand(i);
6058 if (Operand->getType()->isSized())
6059 insertShadowCheck(Val: Operand, OrigIns: &I);
6060 }
6061 setShadow(V: &I, SV: getCleanShadow(V: &I));
6062 setOrigin(V: &I, Origin: getCleanOrigin());
6063 }
6064};
6065
6066struct VarArgHelperBase : public VarArgHelper {
6067 Function &F;
6068 MemorySanitizer &MS;
6069 MemorySanitizerVisitor &MSV;
6070 SmallVector<CallInst *, 16> VAStartInstrumentationList;
6071 const unsigned VAListTagSize;
6072
6073 VarArgHelperBase(Function &F, MemorySanitizer &MS,
6074 MemorySanitizerVisitor &MSV, unsigned VAListTagSize)
6075 : F(F), MS(MS), MSV(MSV), VAListTagSize(VAListTagSize) {}
6076
6077 Value *getShadowAddrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
6078 Value *Base = IRB.CreatePointerCast(V: MS.VAArgTLS, DestTy: MS.IntptrTy);
6079 return IRB.CreateAdd(LHS: Base, RHS: ConstantInt::get(Ty: MS.IntptrTy, V: ArgOffset));
6080 }
6081
6082 /// Compute the shadow address for a given va_arg.
6083 Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
6084 Value *Base = IRB.CreatePointerCast(V: MS.VAArgTLS, DestTy: MS.IntptrTy);
6085 Base = IRB.CreateAdd(LHS: Base, RHS: ConstantInt::get(Ty: MS.IntptrTy, V: ArgOffset));
6086 return IRB.CreateIntToPtr(V: Base, DestTy: MS.PtrTy, Name: "_msarg_va_s");
6087 }
6088
6089 /// Compute the shadow address for a given va_arg.
6090 Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset,
6091 unsigned ArgSize) {
6092 // Make sure we don't overflow __msan_va_arg_tls.
6093 if (ArgOffset + ArgSize > kParamTLSSize)
6094 return nullptr;
6095 return getShadowPtrForVAArgument(IRB, ArgOffset);
6096 }
6097
6098 /// Compute the origin address for a given va_arg.
6099 Value *getOriginPtrForVAArgument(IRBuilder<> &IRB, int ArgOffset) {
6100 Value *Base = IRB.CreatePointerCast(V: MS.VAArgOriginTLS, DestTy: MS.IntptrTy);
6101 // getOriginPtrForVAArgument() is always called after
6102 // getShadowPtrForVAArgument(), so __msan_va_arg_origin_tls can never
6103 // overflow.
6104 Base = IRB.CreateAdd(LHS: Base, RHS: ConstantInt::get(Ty: MS.IntptrTy, V: ArgOffset));
6105 return IRB.CreateIntToPtr(V: Base, DestTy: MS.PtrTy, Name: "_msarg_va_o");
6106 }
6107
6108 void CleanUnusedTLS(IRBuilder<> &IRB, Value *ShadowBase,
6109 unsigned BaseOffset) {
// The tail of __msan_va_arg_tls is not large enough to fit the full
// value shadow, but it will be copied to the backup anyway. Make it
// clean.
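// Illustrative example: if the next argument needs 24 bytes of shadow but
// only 8 bytes remain before kParamTLSSize, those 8 bytes are zeroed here
// and the remaining 16 bytes are simply never stored.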
6113 if (BaseOffset >= kParamTLSSize)
6114 return;
6115 Value *TailSize =
6116 ConstantInt::getSigned(Ty: IRB.getInt32Ty(), V: kParamTLSSize - BaseOffset);
6117 IRB.CreateMemSet(Ptr: ShadowBase, Val: ConstantInt::getNullValue(Ty: IRB.getInt8Ty()),
6118 Size: TailSize, Align: Align(8));
6119 }
6120
6121 void unpoisonVAListTagForInst(IntrinsicInst &I) {
6122 IRBuilder<> IRB(&I);
6123 Value *VAListTag = I.getArgOperand(i: 0);
6124 const Align Alignment = Align(8);
6125 auto [ShadowPtr, OriginPtr] = MSV.getShadowOriginPtr(
6126 Addr: VAListTag, IRB, ShadowTy: IRB.getInt8Ty(), Alignment, /*isStore*/ true);
6127 // Unpoison the whole __va_list_tag.
6128 IRB.CreateMemSet(Ptr: ShadowPtr, Val: Constant::getNullValue(Ty: IRB.getInt8Ty()),
6129 Size: VAListTagSize, Align: Alignment, isVolatile: false);
6130 }
6131
6132 void visitVAStartInst(VAStartInst &I) override {
6133 if (F.getCallingConv() == CallingConv::Win64)
6134 return;
6135 VAStartInstrumentationList.push_back(Elt: &I);
6136 unpoisonVAListTagForInst(I);
6137 }
6138
6139 void visitVACopyInst(VACopyInst &I) override {
6140 if (F.getCallingConv() == CallingConv::Win64)
6141 return;
6142 unpoisonVAListTagForInst(I);
6143 }
6144};
6145
6146/// AMD64-specific implementation of VarArgHelper.
6147struct VarArgAMD64Helper : public VarArgHelperBase {
6148 // An unfortunate workaround for asymmetric lowering of va_arg stuff.
6149 // See a comment in visitCallBase for more details.
6150 static const unsigned AMD64GpEndOffset = 48; // AMD64 ABI Draft 0.99.6 p3.5.7
6151 static const unsigned AMD64FpEndOffsetSSE = 176;
6152 // If SSE is disabled, fp_offset in va_list is zero.
6153 static const unsigned AMD64FpEndOffsetNoSSE = AMD64GpEndOffset;
6154
6155 unsigned AMD64FpEndOffset;
6156 AllocaInst *VAArgTLSCopy = nullptr;
6157 AllocaInst *VAArgTLSOriginCopy = nullptr;
6158 Value *VAArgOverflowSize = nullptr;
6159
6160 enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
6161
6162 VarArgAMD64Helper(Function &F, MemorySanitizer &MS,
6163 MemorySanitizerVisitor &MSV)
6164 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/24) {
6165 AMD64FpEndOffset = AMD64FpEndOffsetSSE;
6166 for (const auto &Attr : F.getAttributes().getFnAttrs()) {
6167 if (Attr.isStringAttribute() &&
6168 (Attr.getKindAsString() == "target-features")) {
6169 if (Attr.getValueAsString().contains(Other: "-sse"))
6170 AMD64FpEndOffset = AMD64FpEndOffsetNoSSE;
6171 break;
6172 }
6173 }
6174 }
6175
6176 ArgKind classifyArgument(Value *arg) {
6177 // A very rough approximation of X86_64 argument classification rules.
6178 Type *T = arg->getType();
6179 if (T->isX86_FP80Ty())
6180 return AK_Memory;
6181 if (T->isFPOrFPVectorTy())
6182 return AK_FloatingPoint;
6183 if (T->isIntegerTy() && T->getPrimitiveSizeInBits() <= 64)
6184 return AK_GeneralPurpose;
6185 if (T->isPointerTy())
6186 return AK_GeneralPurpose;
6187 return AK_Memory;
6188 }
6189
// For VarArg functions, store the argument shadow in an ABI-specific format
// that corresponds to the va_list layout.
// We do this because Clang lowers va_arg in the frontend, and this pass
// only sees the low-level code that deals with va_list internals.
// A much easier alternative (provided that Clang emitted va_arg
// instructions) would have been to associate each live instance of va_list
// with a copy of MSanParamTLS, and extract shadow at each va_arg() call, in
// argument-list order.
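// For reference (roughly, per the SysV x86_64 ABI), the va_list tag is:
//   struct __va_list_tag {
//     unsigned gp_offset;       // bytes into reg_save_area for the next GPR
//     unsigned fp_offset;       // bytes into reg_save_area for the next XMM
//     void *overflow_arg_area;  // offset 8: stack-passed arguments
//     void *reg_save_area;      // offset 16: 48 bytes of GPRs + 128 of XMMs
//   };
// The va_arg TLS mirrors reg_save_area (GP area, then FP area) followed by
// the overflow area, which is what finalizeInstrumentation() copies below.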
6198 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
6199 unsigned GpOffset = 0;
6200 unsigned FpOffset = AMD64GpEndOffset;
6201 unsigned OverflowOffset = AMD64FpEndOffset;
6202 const DataLayout &DL = F.getDataLayout();
6203
6204 for (const auto &[ArgNo, A] : llvm::enumerate(First: CB.args())) {
6205 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
6206 bool IsByVal = CB.paramHasAttr(ArgNo, Kind: Attribute::ByVal);
6207 if (IsByVal) {
6208 // ByVal arguments always go to the overflow area.
6209 // Fixed arguments passed through the overflow area will be stepped
6210 // over by va_start, so don't count them towards the offset.
6211 if (IsFixed)
6212 continue;
6213 assert(A->getType()->isPointerTy());
6214 Type *RealTy = CB.getParamByValType(ArgNo);
6215 uint64_t ArgSize = DL.getTypeAllocSize(Ty: RealTy);
6216 uint64_t AlignedSize = alignTo(Value: ArgSize, Align: 8);
6217 unsigned BaseOffset = OverflowOffset;
6218 Value *ShadowBase = getShadowPtrForVAArgument(IRB, ArgOffset: OverflowOffset);
6219 Value *OriginBase = nullptr;
6220 if (MS.TrackOrigins)
6221 OriginBase = getOriginPtrForVAArgument(IRB, ArgOffset: OverflowOffset);
6222 OverflowOffset += AlignedSize;
6223
6224 if (OverflowOffset > kParamTLSSize) {
6225 CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
6226 continue; // We have no space to copy shadow there.
6227 }
6228
6229 Value *ShadowPtr, *OriginPtr;
6230 std::tie(args&: ShadowPtr, args&: OriginPtr) =
6231 MSV.getShadowOriginPtr(Addr: A, IRB, ShadowTy: IRB.getInt8Ty(), Alignment: kShadowTLSAlignment,
6232 /*isStore*/ false);
6233 IRB.CreateMemCpy(Dst: ShadowBase, DstAlign: kShadowTLSAlignment, Src: ShadowPtr,
6234 SrcAlign: kShadowTLSAlignment, Size: ArgSize);
6235 if (MS.TrackOrigins)
6236 IRB.CreateMemCpy(Dst: OriginBase, DstAlign: kShadowTLSAlignment, Src: OriginPtr,
6237 SrcAlign: kShadowTLSAlignment, Size: ArgSize);
6238 } else {
6239 ArgKind AK = classifyArgument(arg: A);
6240 if (AK == AK_GeneralPurpose && GpOffset >= AMD64GpEndOffset)
6241 AK = AK_Memory;
6242 if (AK == AK_FloatingPoint && FpOffset >= AMD64FpEndOffset)
6243 AK = AK_Memory;
6244 Value *ShadowBase, *OriginBase = nullptr;
6245 switch (AK) {
6246 case AK_GeneralPurpose:
6247 ShadowBase = getShadowPtrForVAArgument(IRB, ArgOffset: GpOffset);
6248 if (MS.TrackOrigins)
6249 OriginBase = getOriginPtrForVAArgument(IRB, ArgOffset: GpOffset);
6250 GpOffset += 8;
6251 assert(GpOffset <= kParamTLSSize);
6252 break;
6253 case AK_FloatingPoint:
6254 ShadowBase = getShadowPtrForVAArgument(IRB, ArgOffset: FpOffset);
6255 if (MS.TrackOrigins)
6256 OriginBase = getOriginPtrForVAArgument(IRB, ArgOffset: FpOffset);
6257 FpOffset += 16;
6258 assert(FpOffset <= kParamTLSSize);
6259 break;
6260 case AK_Memory:
6261 if (IsFixed)
6262 continue;
6263 uint64_t ArgSize = DL.getTypeAllocSize(Ty: A->getType());
6264 uint64_t AlignedSize = alignTo(Value: ArgSize, Align: 8);
6265 unsigned BaseOffset = OverflowOffset;
6266 ShadowBase = getShadowPtrForVAArgument(IRB, ArgOffset: OverflowOffset);
6267 if (MS.TrackOrigins) {
6268 OriginBase = getOriginPtrForVAArgument(IRB, ArgOffset: OverflowOffset);
6269 }
6270 OverflowOffset += AlignedSize;
6271 if (OverflowOffset > kParamTLSSize) {
6272 // We have no space to copy shadow there.
6273 CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
6274 continue;
6275 }
6276 }
6277 // Take fixed arguments into account for GpOffset and FpOffset,
6278 // but don't actually store shadows for them.
6279 // TODO(glider): don't call get*PtrForVAArgument() for them.
6280 if (IsFixed)
6281 continue;
6282 Value *Shadow = MSV.getShadow(V: A);
6283 IRB.CreateAlignedStore(Val: Shadow, Ptr: ShadowBase, Align: kShadowTLSAlignment);
6284 if (MS.TrackOrigins) {
6285 Value *Origin = MSV.getOrigin(V: A);
6286 TypeSize StoreSize = DL.getTypeStoreSize(Ty: Shadow->getType());
6287 MSV.paintOrigin(IRB, Origin, OriginPtr: OriginBase, TS: StoreSize,
6288 Alignment: std::max(a: kShadowTLSAlignment, b: kMinOriginAlignment));
6289 }
6290 }
6291 }
6292 Constant *OverflowSize =
6293 ConstantInt::get(Ty: IRB.getInt64Ty(), V: OverflowOffset - AMD64FpEndOffset);
6294 IRB.CreateStore(Val: OverflowSize, Ptr: MS.VAArgOverflowSizeTLS);
6295 }
6296
6297 void finalizeInstrumentation() override {
6298 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
6299 "finalizeInstrumentation called twice");
6300 if (!VAStartInstrumentationList.empty()) {
6301 // If there is a va_start in this function, make a backup copy of
6302 // va_arg_tls somewhere in the function entry block.
6303 IRBuilder<> IRB(MSV.FnPrologueEnd);
6304 VAArgOverflowSize =
6305 IRB.CreateLoad(Ty: IRB.getInt64Ty(), Ptr: MS.VAArgOverflowSizeTLS);
6306 Value *CopySize = IRB.CreateAdd(
6307 LHS: ConstantInt::get(Ty: MS.IntptrTy, V: AMD64FpEndOffset), RHS: VAArgOverflowSize);
6308 VAArgTLSCopy = IRB.CreateAlloca(Ty: Type::getInt8Ty(C&: *MS.C), ArraySize: CopySize);
6309 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
6310 IRB.CreateMemSet(Ptr: VAArgTLSCopy, Val: Constant::getNullValue(Ty: IRB.getInt8Ty()),
6311 Size: CopySize, Align: kShadowTLSAlignment, isVolatile: false);
6312
6313 Value *SrcSize = IRB.CreateBinaryIntrinsic(
6314 ID: Intrinsic::umin, LHS: CopySize,
6315 RHS: ConstantInt::get(Ty: MS.IntptrTy, V: kParamTLSSize));
6316 IRB.CreateMemCpy(Dst: VAArgTLSCopy, DstAlign: kShadowTLSAlignment, Src: MS.VAArgTLS,
6317 SrcAlign: kShadowTLSAlignment, Size: SrcSize);
6318 if (MS.TrackOrigins) {
6319 VAArgTLSOriginCopy = IRB.CreateAlloca(Ty: Type::getInt8Ty(C&: *MS.C), ArraySize: CopySize);
6320 VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
6321 IRB.CreateMemCpy(Dst: VAArgTLSOriginCopy, DstAlign: kShadowTLSAlignment,
6322 Src: MS.VAArgOriginTLS, SrcAlign: kShadowTLSAlignment, Size: SrcSize);
6323 }
6324 }
6325
6326 // Instrument va_start.
6327 // Copy va_list shadow from the backup copy of the TLS contents.
6328 for (CallInst *OrigInst : VAStartInstrumentationList) {
6329 NextNodeIRBuilder IRB(OrigInst);
6330 Value *VAListTag = OrigInst->getArgOperand(i: 0);
6331
6332 Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
6333 V: IRB.CreateAdd(LHS: IRB.CreatePtrToInt(V: VAListTag, DestTy: MS.IntptrTy),
6334 RHS: ConstantInt::get(Ty: MS.IntptrTy, V: 16)),
6335 DestTy: MS.PtrTy);
6336 Value *RegSaveAreaPtr = IRB.CreateLoad(Ty: MS.PtrTy, Ptr: RegSaveAreaPtrPtr);
6337 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
6338 const Align Alignment = Align(16);
6339 std::tie(args&: RegSaveAreaShadowPtr, args&: RegSaveAreaOriginPtr) =
6340 MSV.getShadowOriginPtr(Addr: RegSaveAreaPtr, IRB, ShadowTy: IRB.getInt8Ty(),
6341 Alignment, /*isStore*/ true);
6342 IRB.CreateMemCpy(Dst: RegSaveAreaShadowPtr, DstAlign: Alignment, Src: VAArgTLSCopy, SrcAlign: Alignment,
6343 Size: AMD64FpEndOffset);
6344 if (MS.TrackOrigins)
6345 IRB.CreateMemCpy(Dst: RegSaveAreaOriginPtr, DstAlign: Alignment, Src: VAArgTLSOriginCopy,
6346 SrcAlign: Alignment, Size: AMD64FpEndOffset);
6347 Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
6348 V: IRB.CreateAdd(LHS: IRB.CreatePtrToInt(V: VAListTag, DestTy: MS.IntptrTy),
6349 RHS: ConstantInt::get(Ty: MS.IntptrTy, V: 8)),
6350 DestTy: MS.PtrTy);
6351 Value *OverflowArgAreaPtr =
6352 IRB.CreateLoad(Ty: MS.PtrTy, Ptr: OverflowArgAreaPtrPtr);
6353 Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
6354 std::tie(args&: OverflowArgAreaShadowPtr, args&: OverflowArgAreaOriginPtr) =
6355 MSV.getShadowOriginPtr(Addr: OverflowArgAreaPtr, IRB, ShadowTy: IRB.getInt8Ty(),
6356 Alignment, /*isStore*/ true);
6357 Value *SrcPtr = IRB.CreateConstGEP1_32(Ty: IRB.getInt8Ty(), Ptr: VAArgTLSCopy,
6358 Idx0: AMD64FpEndOffset);
6359 IRB.CreateMemCpy(Dst: OverflowArgAreaShadowPtr, DstAlign: Alignment, Src: SrcPtr, SrcAlign: Alignment,
6360 Size: VAArgOverflowSize);
6361 if (MS.TrackOrigins) {
6362 SrcPtr = IRB.CreateConstGEP1_32(Ty: IRB.getInt8Ty(), Ptr: VAArgTLSOriginCopy,
6363 Idx0: AMD64FpEndOffset);
6364 IRB.CreateMemCpy(Dst: OverflowArgAreaOriginPtr, DstAlign: Alignment, Src: SrcPtr, SrcAlign: Alignment,
6365 Size: VAArgOverflowSize);
6366 }
6367 }
6368 }
6369};
6370
6371/// AArch64-specific implementation of VarArgHelper.
6372struct VarArgAArch64Helper : public VarArgHelperBase {
6373 static const unsigned kAArch64GrArgSize = 64;
6374 static const unsigned kAArch64VrArgSize = 128;
6375
6376 static const unsigned AArch64GrBegOffset = 0;
6377 static const unsigned AArch64GrEndOffset = kAArch64GrArgSize;
6378 // Make VR space aligned to 16 bytes.
6379 static const unsigned AArch64VrBegOffset = AArch64GrEndOffset;
6380 static const unsigned AArch64VrEndOffset =
6381 AArch64VrBegOffset + kAArch64VrArgSize;
6382 static const unsigned AArch64VAEndOffset = AArch64VrEndOffset;
6383
6384 AllocaInst *VAArgTLSCopy = nullptr;
6385 Value *VAArgOverflowSize = nullptr;
6386
6387 enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
6388
6389 VarArgAArch64Helper(Function &F, MemorySanitizer &MS,
6390 MemorySanitizerVisitor &MSV)
6391 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/32) {}
6392
6393 // A very rough approximation of aarch64 argument classification rules.
6394 std::pair<ArgKind, uint64_t> classifyArgument(Type *T) {
6395 if (T->isIntOrPtrTy() && T->getPrimitiveSizeInBits() <= 64)
6396 return {AK_GeneralPurpose, 1};
6397 if (T->isFloatingPointTy() && T->getPrimitiveSizeInBits() <= 128)
6398 return {AK_FloatingPoint, 1};
6399
6400 if (T->isArrayTy()) {
6401 auto R = classifyArgument(T: T->getArrayElementType());
6402 R.second *= T->getScalarType()->getArrayNumElements();
6403 return R;
6404 }
6405
6406 if (const FixedVectorType *FV = dyn_cast<FixedVectorType>(Val: T)) {
6407 auto R = classifyArgument(T: FV->getScalarType());
6408 R.second *= FV->getNumElements();
6409 return R;
6410 }
6411
6412 LLVM_DEBUG(errs() << "Unknown vararg type: " << *T << "\n");
6413 return {AK_Memory, 0};
6414 }
6415
// The instrumentation stores the argument shadow in a non ABI-specific
// format, because it does not know which arguments are named (as in the
// x86_64 case, Clang lowers va_arg in the frontend and this pass only sees
// the low-level code that deals with va_list internals).
// The first 64 bytes of the va_arg TLS array (kAArch64GrArgSize) hold the
// shadow for arguments passed in GR registers, the next 128 bytes
// (kAArch64VrArgSize) hold the shadow for arguments passed in FP/SIMD
// registers, and the remaining arguments follow.
// Using constant offsets within the va_arg TLS array allows a fast copy in
// finalizeInstrumentation().
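// For reference (roughly, per AAPCS64), the va_list is:
//   struct va_list {
//     void *__stack;   // next stack-passed argument
//     void *__gr_top;  // end of the GR register save area
//     void *__vr_top;  // end of the VR register save area
//     int __gr_offs;   // non-positive offset from __gr_top to the next GR arg
//     int __vr_offs;   // non-positive offset from __vr_top to the next VR arg
//   };
// finalizeInstrumentation() below reads these fields at offsets 0, 8, 16, 24
// and 28, matching the getVAField64/getVAField32 calls.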
6425 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
6426 unsigned GrOffset = AArch64GrBegOffset;
6427 unsigned VrOffset = AArch64VrBegOffset;
6428 unsigned OverflowOffset = AArch64VAEndOffset;
6429
6430 const DataLayout &DL = F.getDataLayout();
6431 for (const auto &[ArgNo, A] : llvm::enumerate(First: CB.args())) {
6432 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
6433 auto [AK, RegNum] = classifyArgument(T: A->getType());
6434 if (AK == AK_GeneralPurpose &&
6435 (GrOffset + RegNum * 8) > AArch64GrEndOffset)
6436 AK = AK_Memory;
6437 if (AK == AK_FloatingPoint &&
6438 (VrOffset + RegNum * 16) > AArch64VrEndOffset)
6439 AK = AK_Memory;
6440 Value *Base;
6441 switch (AK) {
6442 case AK_GeneralPurpose:
6443 Base = getShadowPtrForVAArgument(IRB, ArgOffset: GrOffset);
6444 GrOffset += 8 * RegNum;
6445 break;
6446 case AK_FloatingPoint:
6447 Base = getShadowPtrForVAArgument(IRB, ArgOffset: VrOffset);
6448 VrOffset += 16 * RegNum;
6449 break;
6450 case AK_Memory:
6451 // Don't count fixed arguments in the overflow area - va_start will
6452 // skip right over them.
6453 if (IsFixed)
6454 continue;
6455 uint64_t ArgSize = DL.getTypeAllocSize(Ty: A->getType());
6456 uint64_t AlignedSize = alignTo(Value: ArgSize, Align: 8);
6457 unsigned BaseOffset = OverflowOffset;
6458 Base = getShadowPtrForVAArgument(IRB, ArgOffset: BaseOffset);
6459 OverflowOffset += AlignedSize;
6460 if (OverflowOffset > kParamTLSSize) {
6461 // We have no space to copy shadow there.
6462 CleanUnusedTLS(IRB, ShadowBase: Base, BaseOffset);
6463 continue;
6464 }
6465 break;
6466 }
6467 // Count Gp/Vr fixed arguments to their respective offsets, but don't
6468 // bother to actually store a shadow.
6469 if (IsFixed)
6470 continue;
6471 IRB.CreateAlignedStore(Val: MSV.getShadow(V: A), Ptr: Base, Align: kShadowTLSAlignment);
6472 }
6473 Constant *OverflowSize =
6474 ConstantInt::get(Ty: IRB.getInt64Ty(), V: OverflowOffset - AArch64VAEndOffset);
6475 IRB.CreateStore(Val: OverflowSize, Ptr: MS.VAArgOverflowSizeTLS);
6476 }
6477
6478 // Retrieve a va_list field of 'void*' size.
6479 Value *getVAField64(IRBuilder<> &IRB, Value *VAListTag, int offset) {
6480 Value *SaveAreaPtrPtr = IRB.CreateIntToPtr(
6481 V: IRB.CreateAdd(LHS: IRB.CreatePtrToInt(V: VAListTag, DestTy: MS.IntptrTy),
6482 RHS: ConstantInt::get(Ty: MS.IntptrTy, V: offset)),
6483 DestTy: MS.PtrTy);
6484 return IRB.CreateLoad(Ty: Type::getInt64Ty(C&: *MS.C), Ptr: SaveAreaPtrPtr);
6485 }
6486
6487 // Retrieve a va_list field of 'int' size.
6488 Value *getVAField32(IRBuilder<> &IRB, Value *VAListTag, int offset) {
6489 Value *SaveAreaPtr = IRB.CreateIntToPtr(
6490 V: IRB.CreateAdd(LHS: IRB.CreatePtrToInt(V: VAListTag, DestTy: MS.IntptrTy),
6491 RHS: ConstantInt::get(Ty: MS.IntptrTy, V: offset)),
6492 DestTy: MS.PtrTy);
6493 Value *SaveArea32 = IRB.CreateLoad(Ty: IRB.getInt32Ty(), Ptr: SaveAreaPtr);
6494 return IRB.CreateSExt(V: SaveArea32, DestTy: MS.IntptrTy);
6495 }
6496
6497 void finalizeInstrumentation() override {
6498 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
6499 "finalizeInstrumentation called twice");
6500 if (!VAStartInstrumentationList.empty()) {
6501 // If there is a va_start in this function, make a backup copy of
6502 // va_arg_tls somewhere in the function entry block.
6503 IRBuilder<> IRB(MSV.FnPrologueEnd);
6504 VAArgOverflowSize =
6505 IRB.CreateLoad(Ty: IRB.getInt64Ty(), Ptr: MS.VAArgOverflowSizeTLS);
6506 Value *CopySize = IRB.CreateAdd(
6507 LHS: ConstantInt::get(Ty: MS.IntptrTy, V: AArch64VAEndOffset), RHS: VAArgOverflowSize);
6508 VAArgTLSCopy = IRB.CreateAlloca(Ty: Type::getInt8Ty(C&: *MS.C), ArraySize: CopySize);
6509 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
6510 IRB.CreateMemSet(Ptr: VAArgTLSCopy, Val: Constant::getNullValue(Ty: IRB.getInt8Ty()),
6511 Size: CopySize, Align: kShadowTLSAlignment, isVolatile: false);
6512
6513 Value *SrcSize = IRB.CreateBinaryIntrinsic(
6514 ID: Intrinsic::umin, LHS: CopySize,
6515 RHS: ConstantInt::get(Ty: MS.IntptrTy, V: kParamTLSSize));
6516 IRB.CreateMemCpy(Dst: VAArgTLSCopy, DstAlign: kShadowTLSAlignment, Src: MS.VAArgTLS,
6517 SrcAlign: kShadowTLSAlignment, Size: SrcSize);
6518 }
6519
6520 Value *GrArgSize = ConstantInt::get(Ty: MS.IntptrTy, V: kAArch64GrArgSize);
6521 Value *VrArgSize = ConstantInt::get(Ty: MS.IntptrTy, V: kAArch64VrArgSize);
6522
6523 // Instrument va_start, copy va_list shadow from the backup copy of
6524 // the TLS contents.
6525 for (CallInst *OrigInst : VAStartInstrumentationList) {
6526 NextNodeIRBuilder IRB(OrigInst);
6527
6528 Value *VAListTag = OrigInst->getArgOperand(i: 0);
6529
// The variadic AArch64 ABI creates two areas for saving the incoming
// argument registers: one for the 64-bit general-purpose registers
// x0-x7 and another for the 128-bit FP/SIMD registers v0-v7.
// We therefore need to propagate shadow into both regions,
// 'va::__gr_top + va::__gr_offs' and 'va::__vr_top + va::__vr_offs'.
// The remaining arguments get their shadow via 'va::stack'.
// One caveat: only the unnamed (variadic) arguments need to be
// propagated, but the call-site instrumentation saves shadow for *all*
// arguments. So when copying shadow values from the va_arg TLS array,
// we adjust the offsets of both the GR and VR areas by the respective
// __{gr,vr}_offs values (which account for the incoming named
// arguments).
6542 Type *RegSaveAreaPtrTy = IRB.getPtrTy();
6543
6544 // Read the stack pointer from the va_list.
6545 Value *StackSaveAreaPtr =
6546 IRB.CreateIntToPtr(V: getVAField64(IRB, VAListTag, offset: 0), DestTy: RegSaveAreaPtrTy);
6547
6548 // Read both the __gr_top and __gr_off and add them up.
6549 Value *GrTopSaveAreaPtr = getVAField64(IRB, VAListTag, offset: 8);
6550 Value *GrOffSaveArea = getVAField32(IRB, VAListTag, offset: 24);
6551
6552 Value *GrRegSaveAreaPtr = IRB.CreateIntToPtr(
6553 V: IRB.CreateAdd(LHS: GrTopSaveAreaPtr, RHS: GrOffSaveArea), DestTy: RegSaveAreaPtrTy);
6554
6555 // Read both the __vr_top and __vr_off and add them up.
6556 Value *VrTopSaveAreaPtr = getVAField64(IRB, VAListTag, offset: 16);
6557 Value *VrOffSaveArea = getVAField32(IRB, VAListTag, offset: 28);
6558
6559 Value *VrRegSaveAreaPtr = IRB.CreateIntToPtr(
6560 V: IRB.CreateAdd(LHS: VrTopSaveAreaPtr, RHS: VrOffSaveArea), DestTy: RegSaveAreaPtrTy);
6561
// We do not know how many named arguments were used, and at the call
// site shadow was saved for all arguments. Since __gr_offs is defined
// as '0 - ((8 - named_gr) * 8)', the idea is to propagate only the
// variadic arguments by skipping the bytes of shadow that belong to the
// named arguments.
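// Illustrative example: with two named arguments passed in GRs, __gr_offs
// starts at -(8 - 2) * 8 = -48, so GrRegSaveAreaShadowPtrOff = 64 - 48 = 16;
// the first 16 bytes of GR shadow (the named arguments) are skipped and
// GrCopySize = 64 - 16 = 48 bytes are copied.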
6566 Value *GrRegSaveAreaShadowPtrOff =
6567 IRB.CreateAdd(LHS: GrArgSize, RHS: GrOffSaveArea);
6568
6569 Value *GrRegSaveAreaShadowPtr =
6570 MSV.getShadowOriginPtr(Addr: GrRegSaveAreaPtr, IRB, ShadowTy: IRB.getInt8Ty(),
6571 Alignment: Align(8), /*isStore*/ true)
6572 .first;
6573
6574 Value *GrSrcPtr =
6575 IRB.CreateInBoundsPtrAdd(Ptr: VAArgTLSCopy, Offset: GrRegSaveAreaShadowPtrOff);
6576 Value *GrCopySize = IRB.CreateSub(LHS: GrArgSize, RHS: GrRegSaveAreaShadowPtrOff);
6577
6578 IRB.CreateMemCpy(Dst: GrRegSaveAreaShadowPtr, DstAlign: Align(8), Src: GrSrcPtr, SrcAlign: Align(8),
6579 Size: GrCopySize);
6580
6581 // Again, but for FP/SIMD values.
6582 Value *VrRegSaveAreaShadowPtrOff =
6583 IRB.CreateAdd(LHS: VrArgSize, RHS: VrOffSaveArea);
6584
6585 Value *VrRegSaveAreaShadowPtr =
6586 MSV.getShadowOriginPtr(Addr: VrRegSaveAreaPtr, IRB, ShadowTy: IRB.getInt8Ty(),
6587 Alignment: Align(8), /*isStore*/ true)
6588 .first;
6589
6590 Value *VrSrcPtr = IRB.CreateInBoundsPtrAdd(
6591 Ptr: IRB.CreateInBoundsPtrAdd(Ptr: VAArgTLSCopy,
6592 Offset: IRB.getInt32(C: AArch64VrBegOffset)),
6593 Offset: VrRegSaveAreaShadowPtrOff);
6594 Value *VrCopySize = IRB.CreateSub(LHS: VrArgSize, RHS: VrRegSaveAreaShadowPtrOff);
6595
6596 IRB.CreateMemCpy(Dst: VrRegSaveAreaShadowPtr, DstAlign: Align(8), Src: VrSrcPtr, SrcAlign: Align(8),
6597 Size: VrCopySize);
6598
6599 // And finally for remaining arguments.
6600 Value *StackSaveAreaShadowPtr =
6601 MSV.getShadowOriginPtr(Addr: StackSaveAreaPtr, IRB, ShadowTy: IRB.getInt8Ty(),
6602 Alignment: Align(16), /*isStore*/ true)
6603 .first;
6604
6605 Value *StackSrcPtr = IRB.CreateInBoundsPtrAdd(
6606 Ptr: VAArgTLSCopy, Offset: IRB.getInt32(C: AArch64VAEndOffset));
6607
6608 IRB.CreateMemCpy(Dst: StackSaveAreaShadowPtr, DstAlign: Align(16), Src: StackSrcPtr,
6609 SrcAlign: Align(16), Size: VAArgOverflowSize);
6610 }
6611 }
6612};
6613
6614/// PowerPC64-specific implementation of VarArgHelper.
6615struct VarArgPowerPC64Helper : public VarArgHelperBase {
6616 AllocaInst *VAArgTLSCopy = nullptr;
6617 Value *VAArgSize = nullptr;
6618
6619 VarArgPowerPC64Helper(Function &F, MemorySanitizer &MS,
6620 MemorySanitizerVisitor &MSV)
6621 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/8) {}
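  // Note (based on the ELF PPC64 ABI; stated here as background, not derived
  // from this file): va_list is a simple pointer into the parameter save
  // area, which is why finalizeInstrumentation() below loads the save-area
  // pointer directly from the va_list tag address (VAListTagSize == 8).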
6622
6623 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
// For PowerPC, we need to deal with the alignment of stack arguments:
// they are mostly aligned to 8 bytes, but vectors and i128 arrays are
// aligned to 16 bytes, and byvals can be aligned to 8 or 16 bytes.
// For that reason, we compute the current offset from the stack pointer
// (which is always properly aligned) and the offset of the first vararg,
// then subtract them.
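// Illustrative example: with ABIv2 the save area starts at VAArgBase == 32;
// an i32 vararg occupies one 8-byte-aligned slot, whereas a <4 x i32> vararg
// first rounds VAArgOffset up to a 16-byte boundary.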
6630 unsigned VAArgBase;
6631 Triple TargetTriple(F.getParent()->getTargetTriple());
// The parameter save area starts 48 bytes from the frame pointer for
// ABIv1 and 32 bytes from it for ABIv2. This is usually determined by the
// target endianness, but in theory it could be overridden by a function
// attribute.
6635 if (TargetTriple.isPPC64ELFv2ABI())
6636 VAArgBase = 32;
6637 else
6638 VAArgBase = 48;
6639 unsigned VAArgOffset = VAArgBase;
6640 const DataLayout &DL = F.getDataLayout();
6641 for (const auto &[ArgNo, A] : llvm::enumerate(First: CB.args())) {
6642 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
6643 bool IsByVal = CB.paramHasAttr(ArgNo, Kind: Attribute::ByVal);
6644 if (IsByVal) {
6645 assert(A->getType()->isPointerTy());
6646 Type *RealTy = CB.getParamByValType(ArgNo);
6647 uint64_t ArgSize = DL.getTypeAllocSize(Ty: RealTy);
6648 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(u: Align(8));
6649 if (ArgAlign < 8)
6650 ArgAlign = Align(8);
6651 VAArgOffset = alignTo(Size: VAArgOffset, A: ArgAlign);
6652 if (!IsFixed) {
6653 Value *Base =
6654 getShadowPtrForVAArgument(IRB, ArgOffset: VAArgOffset - VAArgBase, ArgSize);
6655 if (Base) {
6656 Value *AShadowPtr, *AOriginPtr;
6657 std::tie(args&: AShadowPtr, args&: AOriginPtr) =
6658 MSV.getShadowOriginPtr(Addr: A, IRB, ShadowTy: IRB.getInt8Ty(),
6659 Alignment: kShadowTLSAlignment, /*isStore*/ false);
6660
6661 IRB.CreateMemCpy(Dst: Base, DstAlign: kShadowTLSAlignment, Src: AShadowPtr,
6662 SrcAlign: kShadowTLSAlignment, Size: ArgSize);
6663 }
6664 }
6665 VAArgOffset += alignTo(Size: ArgSize, A: Align(8));
6666 } else {
6667 Value *Base;
6668 uint64_t ArgSize = DL.getTypeAllocSize(Ty: A->getType());
6669 Align ArgAlign = Align(8);
6670 if (A->getType()->isArrayTy()) {
6671 // Arrays are aligned to element size, except for long double
6672 // arrays, which are aligned to 8 bytes.
6673 Type *ElementTy = A->getType()->getArrayElementType();
6674 if (!ElementTy->isPPC_FP128Ty())
6675 ArgAlign = Align(DL.getTypeAllocSize(Ty: ElementTy));
6676 } else if (A->getType()->isVectorTy()) {
6677 // Vectors are naturally aligned.
6678 ArgAlign = Align(ArgSize);
6679 }
6680 if (ArgAlign < 8)
6681 ArgAlign = Align(8);
6682 VAArgOffset = alignTo(Size: VAArgOffset, A: ArgAlign);
6683 if (DL.isBigEndian()) {
// Adjust the shadow for arguments with size < 8 to match the
// placement of bits on a big-endian system.
6686 if (ArgSize < 8)
6687 VAArgOffset += (8 - ArgSize);
6688 }
6689 if (!IsFixed) {
6690 Base =
6691 getShadowPtrForVAArgument(IRB, ArgOffset: VAArgOffset - VAArgBase, ArgSize);
6692 if (Base)
6693 IRB.CreateAlignedStore(Val: MSV.getShadow(V: A), Ptr: Base, Align: kShadowTLSAlignment);
6694 }
6695 VAArgOffset += ArgSize;
6696 VAArgOffset = alignTo(Size: VAArgOffset, A: Align(8));
6697 }
6698 if (IsFixed)
6699 VAArgBase = VAArgOffset;
6700 }
6701
6702 Constant *TotalVAArgSize =
6703 ConstantInt::get(Ty: MS.IntptrTy, V: VAArgOffset - VAArgBase);
// We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
// class member; here it holds the total size of all varargs.
6706 IRB.CreateStore(Val: TotalVAArgSize, Ptr: MS.VAArgOverflowSizeTLS);
6707 }
6708
6709 void finalizeInstrumentation() override {
6710 assert(!VAArgSize && !VAArgTLSCopy &&
6711 "finalizeInstrumentation called twice");
6712 IRBuilder<> IRB(MSV.FnPrologueEnd);
6713 VAArgSize = IRB.CreateLoad(Ty: IRB.getInt64Ty(), Ptr: MS.VAArgOverflowSizeTLS);
6714 Value *CopySize = VAArgSize;
6715
6716 if (!VAStartInstrumentationList.empty()) {
6717 // If there is a va_start in this function, make a backup copy of
6718 // va_arg_tls somewhere in the function entry block.
6719
6720 VAArgTLSCopy = IRB.CreateAlloca(Ty: Type::getInt8Ty(C&: *MS.C), ArraySize: CopySize);
6721 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
6722 IRB.CreateMemSet(Ptr: VAArgTLSCopy, Val: Constant::getNullValue(Ty: IRB.getInt8Ty()),
6723 Size: CopySize, Align: kShadowTLSAlignment, isVolatile: false);
6724
6725 Value *SrcSize = IRB.CreateBinaryIntrinsic(
6726 ID: Intrinsic::umin, LHS: CopySize,
6727 RHS: ConstantInt::get(Ty: IRB.getInt64Ty(), V: kParamTLSSize));
6728 IRB.CreateMemCpy(Dst: VAArgTLSCopy, DstAlign: kShadowTLSAlignment, Src: MS.VAArgTLS,
6729 SrcAlign: kShadowTLSAlignment, Size: SrcSize);
6730 }
6731
6732 // Instrument va_start.
6733 // Copy va_list shadow from the backup copy of the TLS contents.
6734 for (CallInst *OrigInst : VAStartInstrumentationList) {
6735 NextNodeIRBuilder IRB(OrigInst);
6736 Value *VAListTag = OrigInst->getArgOperand(i: 0);
6737 Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(V: VAListTag, DestTy: MS.IntptrTy);
6738
6739 RegSaveAreaPtrPtr = IRB.CreateIntToPtr(V: RegSaveAreaPtrPtr, DestTy: MS.PtrTy);
6740
6741 Value *RegSaveAreaPtr = IRB.CreateLoad(Ty: MS.PtrTy, Ptr: RegSaveAreaPtrPtr);
6742 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
6743 const DataLayout &DL = F.getDataLayout();
6744 unsigned IntptrSize = DL.getTypeStoreSize(Ty: MS.IntptrTy);
6745 const Align Alignment = Align(IntptrSize);
6746 std::tie(args&: RegSaveAreaShadowPtr, args&: RegSaveAreaOriginPtr) =
6747 MSV.getShadowOriginPtr(Addr: RegSaveAreaPtr, IRB, ShadowTy: IRB.getInt8Ty(),
6748 Alignment, /*isStore*/ true);
6749 IRB.CreateMemCpy(Dst: RegSaveAreaShadowPtr, DstAlign: Alignment, Src: VAArgTLSCopy, SrcAlign: Alignment,
6750 Size: CopySize);
6751 }
6752 }
6753};
6754
6755/// PowerPC32-specific implementation of VarArgHelper.
6756struct VarArgPowerPC32Helper : public VarArgHelperBase {
6757 AllocaInst *VAArgTLSCopy = nullptr;
6758 Value *VAArgSize = nullptr;
6759
6760 VarArgPowerPC32Helper(Function &F, MemorySanitizer &MS,
6761 MemorySanitizerVisitor &MSV)
6762 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/12) {}
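  // For reference (roughly, per the SVR4 PPC32 ABI), the va_list tag is:
  //   struct __va_list_tag {
  //     unsigned char gpr;        // next GPR index
  //     unsigned char fpr;        // next FPR index
  //     unsigned short reserved;
  //     void *overflow_arg_area;  // offset 4: stack-passed arguments
  //     void *reg_save_area;      // offset 8: saved argument registers
  //   };
  // which is why finalizeInstrumentation() reads the pointers at offsets 4
  // and 8 of the va_list tag (VAListTagSize == 12).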
6763
6764 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
6765 unsigned VAArgBase;
// On PPC32, the parameter save area starts 8 bytes from the frame pointer.
6767 VAArgBase = 8;
6768 unsigned VAArgOffset = VAArgBase;
6769 const DataLayout &DL = F.getDataLayout();
6770 unsigned IntptrSize = DL.getTypeStoreSize(Ty: MS.IntptrTy);
6771 for (const auto &[ArgNo, A] : llvm::enumerate(First: CB.args())) {
6772 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
6773 bool IsByVal = CB.paramHasAttr(ArgNo, Kind: Attribute::ByVal);
6774 if (IsByVal) {
6775 assert(A->getType()->isPointerTy());
6776 Type *RealTy = CB.getParamByValType(ArgNo);
6777 uint64_t ArgSize = DL.getTypeAllocSize(Ty: RealTy);
6778 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(u: Align(IntptrSize));
6779 if (ArgAlign < IntptrSize)
6780 ArgAlign = Align(IntptrSize);
6781 VAArgOffset = alignTo(Size: VAArgOffset, A: ArgAlign);
6782 if (!IsFixed) {
6783 Value *Base =
6784 getShadowPtrForVAArgument(IRB, ArgOffset: VAArgOffset - VAArgBase, ArgSize);
6785 if (Base) {
6786 Value *AShadowPtr, *AOriginPtr;
6787 std::tie(args&: AShadowPtr, args&: AOriginPtr) =
6788 MSV.getShadowOriginPtr(Addr: A, IRB, ShadowTy: IRB.getInt8Ty(),
6789 Alignment: kShadowTLSAlignment, /*isStore*/ false);
6790
6791 IRB.CreateMemCpy(Dst: Base, DstAlign: kShadowTLSAlignment, Src: AShadowPtr,
6792 SrcAlign: kShadowTLSAlignment, Size: ArgSize);
6793 }
6794 }
6795 VAArgOffset += alignTo(Size: ArgSize, A: Align(IntptrSize));
6796 } else {
6797 Value *Base;
6798 Type *ArgTy = A->getType();
6799
// On PPC32, floating-point variable arguments are stored in a separate
// area: fp_save_area = reg_save_area + 4*8. We do not copy shadow for
// them, as they will be found when checking call arguments.
6803 if (!ArgTy->isFloatingPointTy()) {
6804 uint64_t ArgSize = DL.getTypeAllocSize(Ty: ArgTy);
6805 Align ArgAlign = Align(IntptrSize);
6806 if (ArgTy->isArrayTy()) {
6807 // Arrays are aligned to element size, except for long double
6808 // arrays, which are aligned to 8 bytes.
6809 Type *ElementTy = ArgTy->getArrayElementType();
6810 if (!ElementTy->isPPC_FP128Ty())
6811 ArgAlign = Align(DL.getTypeAllocSize(Ty: ElementTy));
6812 } else if (ArgTy->isVectorTy()) {
6813 // Vectors are naturally aligned.
6814 ArgAlign = Align(ArgSize);
6815 }
6816 if (ArgAlign < IntptrSize)
6817 ArgAlign = Align(IntptrSize);
6818 VAArgOffset = alignTo(Size: VAArgOffset, A: ArgAlign);
6819 if (DL.isBigEndian()) {
// Adjust the shadow for arguments with size < IntptrSize to match
// the placement of bits on a big-endian system.
6822 if (ArgSize < IntptrSize)
6823 VAArgOffset += (IntptrSize - ArgSize);
6824 }
6825 if (!IsFixed) {
6826 Base = getShadowPtrForVAArgument(IRB, ArgOffset: VAArgOffset - VAArgBase,
6827 ArgSize);
6828 if (Base)
6829 IRB.CreateAlignedStore(Val: MSV.getShadow(V: A), Ptr: Base,
6830 Align: kShadowTLSAlignment);
6831 }
6832 VAArgOffset += ArgSize;
6833 VAArgOffset = alignTo(Size: VAArgOffset, A: Align(IntptrSize));
6834 }
6835 }
6836 }
6837
6838 Constant *TotalVAArgSize =
6839 ConstantInt::get(Ty: MS.IntptrTy, V: VAArgOffset - VAArgBase);
// We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
// class member; here it holds the total size of all varargs.
6842 IRB.CreateStore(Val: TotalVAArgSize, Ptr: MS.VAArgOverflowSizeTLS);
6843 }
6844
6845 void finalizeInstrumentation() override {
6846 assert(!VAArgSize && !VAArgTLSCopy &&
6847 "finalizeInstrumentation called twice");
6848 IRBuilder<> IRB(MSV.FnPrologueEnd);
6849 VAArgSize = IRB.CreateLoad(Ty: MS.IntptrTy, Ptr: MS.VAArgOverflowSizeTLS);
6850 Value *CopySize = VAArgSize;
6851
6852 if (!VAStartInstrumentationList.empty()) {
6853 // If there is a va_start in this function, make a backup copy of
6854 // va_arg_tls somewhere in the function entry block.
6855
6856 VAArgTLSCopy = IRB.CreateAlloca(Ty: Type::getInt8Ty(C&: *MS.C), ArraySize: CopySize);
6857 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
6858 IRB.CreateMemSet(Ptr: VAArgTLSCopy, Val: Constant::getNullValue(Ty: IRB.getInt8Ty()),
6859 Size: CopySize, Align: kShadowTLSAlignment, isVolatile: false);
6860
6861 Value *SrcSize = IRB.CreateBinaryIntrinsic(
6862 ID: Intrinsic::umin, LHS: CopySize,
6863 RHS: ConstantInt::get(Ty: MS.IntptrTy, V: kParamTLSSize));
6864 IRB.CreateMemCpy(Dst: VAArgTLSCopy, DstAlign: kShadowTLSAlignment, Src: MS.VAArgTLS,
6865 SrcAlign: kShadowTLSAlignment, Size: SrcSize);
6866 }
6867
6868 // Instrument va_start.
6869 // Copy va_list shadow from the backup copy of the TLS contents.
6870 for (CallInst *OrigInst : VAStartInstrumentationList) {
6871 NextNodeIRBuilder IRB(OrigInst);
6872 Value *VAListTag = OrigInst->getArgOperand(i: 0);
6873 Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(V: VAListTag, DestTy: MS.IntptrTy);
6874 Value *RegSaveAreaSize = CopySize;
6875
// On PPC32, the va_list tag is a struct; reg_save_area lives at offset 8.
6877 RegSaveAreaPtrPtr =
6878 IRB.CreateAdd(LHS: RegSaveAreaPtrPtr, RHS: ConstantInt::get(Ty: MS.IntptrTy, V: 8));
6879
// On PPC32, the reg_save_area can hold only 32 bytes of data.
6881 RegSaveAreaSize = IRB.CreateBinaryIntrinsic(
6882 ID: Intrinsic::umin, LHS: CopySize, RHS: ConstantInt::get(Ty: MS.IntptrTy, V: 32));
6883
6884 RegSaveAreaPtrPtr = IRB.CreateIntToPtr(V: RegSaveAreaPtrPtr, DestTy: MS.PtrTy);
6885 Value *RegSaveAreaPtr = IRB.CreateLoad(Ty: MS.PtrTy, Ptr: RegSaveAreaPtrPtr);
6886
6887 const DataLayout &DL = F.getDataLayout();
6888 unsigned IntptrSize = DL.getTypeStoreSize(Ty: MS.IntptrTy);
6889 const Align Alignment = Align(IntptrSize);
6890
6891 { // Copy reg save area
6892 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
6893 std::tie(args&: RegSaveAreaShadowPtr, args&: RegSaveAreaOriginPtr) =
6894 MSV.getShadowOriginPtr(Addr: RegSaveAreaPtr, IRB, ShadowTy: IRB.getInt8Ty(),
6895 Alignment, /*isStore*/ true);
6896 IRB.CreateMemCpy(Dst: RegSaveAreaShadowPtr, DstAlign: Alignment, Src: VAArgTLSCopy,
6897 SrcAlign: Alignment, Size: RegSaveAreaSize);
6898
6899 RegSaveAreaShadowPtr =
6900 IRB.CreatePtrToInt(V: RegSaveAreaShadowPtr, DestTy: MS.IntptrTy);
6901 Value *FPSaveArea = IRB.CreateAdd(LHS: RegSaveAreaShadowPtr,
6902 RHS: ConstantInt::get(Ty: MS.IntptrTy, V: 32));
6903 FPSaveArea = IRB.CreateIntToPtr(V: FPSaveArea, DestTy: MS.PtrTy);
// We fill the FP shadow with zeroes, as uninitialized FP arguments
// should already have been caught by the call-base check.
6906 IRB.CreateMemSet(Ptr: FPSaveArea, Val: ConstantInt::getNullValue(Ty: IRB.getInt8Ty()),
6907 Size: ConstantInt::get(Ty: MS.IntptrTy, V: 32), Align: Alignment);
6908 }
6909
6910 { // Copy overflow area
6911 // RegSaveAreaSize is min(CopySize, 32) -> no overflow can occur
6912 Value *OverflowAreaSize = IRB.CreateSub(LHS: CopySize, RHS: RegSaveAreaSize);
6913
6914 Value *OverflowAreaPtrPtr = IRB.CreatePtrToInt(V: VAListTag, DestTy: MS.IntptrTy);
6915 OverflowAreaPtrPtr =
6916 IRB.CreateAdd(LHS: OverflowAreaPtrPtr, RHS: ConstantInt::get(Ty: MS.IntptrTy, V: 4));
6917 OverflowAreaPtrPtr = IRB.CreateIntToPtr(V: OverflowAreaPtrPtr, DestTy: MS.PtrTy);
6918
6919 Value *OverflowAreaPtr = IRB.CreateLoad(Ty: MS.PtrTy, Ptr: OverflowAreaPtrPtr);
6920
6921 Value *OverflowAreaShadowPtr, *OverflowAreaOriginPtr;
6922 std::tie(args&: OverflowAreaShadowPtr, args&: OverflowAreaOriginPtr) =
6923 MSV.getShadowOriginPtr(Addr: OverflowAreaPtr, IRB, ShadowTy: IRB.getInt8Ty(),
6924 Alignment, /*isStore*/ true);
6925
6926 Value *OverflowVAArgTLSCopyPtr =
6927 IRB.CreatePtrToInt(V: VAArgTLSCopy, DestTy: MS.IntptrTy);
6928 OverflowVAArgTLSCopyPtr =
6929 IRB.CreateAdd(LHS: OverflowVAArgTLSCopyPtr, RHS: RegSaveAreaSize);
6930
6931 OverflowVAArgTLSCopyPtr =
6932 IRB.CreateIntToPtr(V: OverflowVAArgTLSCopyPtr, DestTy: MS.PtrTy);
6933 IRB.CreateMemCpy(Dst: OverflowAreaShadowPtr, DstAlign: Alignment,
6934 Src: OverflowVAArgTLSCopyPtr, SrcAlign: Alignment, Size: OverflowAreaSize);
6935 }
6936 }
6937 }
6938};
6939
6940/// SystemZ-specific implementation of VarArgHelper.
6941struct VarArgSystemZHelper : public VarArgHelperBase {
6942 static const unsigned SystemZGpOffset = 16;
6943 static const unsigned SystemZGpEndOffset = 56;
6944 static const unsigned SystemZFpOffset = 128;
6945 static const unsigned SystemZFpEndOffset = 160;
6946 static const unsigned SystemZMaxVrArgs = 8;
6947 static const unsigned SystemZRegSaveAreaSize = 160;
6948 static const unsigned SystemZOverflowOffset = 160;
6949 static const unsigned SystemZVAListTagSize = 32;
6950 static const unsigned SystemZOverflowArgAreaPtrOffset = 16;
6951 static const unsigned SystemZRegSaveAreaPtrOffset = 24;
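  // For reference (roughly, per the s390x ABI), the va_list tag is:
  //   struct __va_list_tag {
  //     long __gpr;                // GPR arguments already consumed
  //     long __fpr;                // FPR arguments already consumed
  //     void *__overflow_arg_area; // offset 16: stack-passed arguments
  //     void *__reg_save_area;     // offset 24: register save area
  //   };
  // which matches SystemZOverflowArgAreaPtrOffset, SystemZRegSaveAreaPtrOffset
  // and SystemZVAListTagSize above.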
6952
6953 bool IsSoftFloatABI;
6954 AllocaInst *VAArgTLSCopy = nullptr;
6955 AllocaInst *VAArgTLSOriginCopy = nullptr;
6956 Value *VAArgOverflowSize = nullptr;
6957
6958 enum class ArgKind {
6959 GeneralPurpose,
6960 FloatingPoint,
6961 Vector,
6962 Memory,
6963 Indirect,
6964 };
6965
6966 enum class ShadowExtension { None, Zero, Sign };
6967
6968 VarArgSystemZHelper(Function &F, MemorySanitizer &MS,
6969 MemorySanitizerVisitor &MSV)
6970 : VarArgHelperBase(F, MS, MSV, SystemZVAListTagSize),
6971 IsSoftFloatABI(F.getFnAttribute(Kind: "use-soft-float").getValueAsBool()) {}
6972
6973 ArgKind classifyArgument(Type *T) {
// T is a SystemZABIInfo::classifyArgumentType() output, and there are
// only a few possibilities of what it can be. In particular, enums,
// single-element structs and large types have already been taken care of.
6977
6978 // Some i128 and fp128 arguments are converted to pointers only in the
6979 // back end.
6980 if (T->isIntegerTy(Bitwidth: 128) || T->isFP128Ty())
6981 return ArgKind::Indirect;
6982 if (T->isFloatingPointTy())
6983 return IsSoftFloatABI ? ArgKind::GeneralPurpose : ArgKind::FloatingPoint;
6984 if (T->isIntegerTy() || T->isPointerTy())
6985 return ArgKind::GeneralPurpose;
6986 if (T->isVectorTy())
6987 return ArgKind::Vector;
6988 return ArgKind::Memory;
6989 }
6990
6991 ShadowExtension getShadowExtension(const CallBase &CB, unsigned ArgNo) {
6992 // ABI says: "One of the simple integer types no more than 64 bits wide.
6993 // ... If such an argument is shorter than 64 bits, replace it by a full
6994 // 64-bit integer representing the same number, using sign or zero
6995 // extension". Shadow for an integer argument has the same type as the
6996 // argument itself, so it can be sign or zero extended as well.
6997 bool ZExt = CB.paramHasAttr(ArgNo, Kind: Attribute::ZExt);
6998 bool SExt = CB.paramHasAttr(ArgNo, Kind: Attribute::SExt);
6999 if (ZExt) {
7000 assert(!SExt);
7001 return ShadowExtension::Zero;
7002 }
7003 if (SExt) {
7004 assert(!ZExt);
7005 return ShadowExtension::Sign;
7006 }
7007 return ShadowExtension::None;
7008 }
7009
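  // Replay the SystemZ argument assignment rules so that each vararg's shadow
  // (and origin) is staged in the va_arg TLS buffer (MS.VAArgTLS) at the same
  // offset the value itself will occupy in the register save area or the
  // overflow (stack) area.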
  void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
    unsigned GpOffset = SystemZGpOffset;
    unsigned FpOffset = SystemZFpOffset;
    unsigned VrIndex = 0;
    unsigned OverflowOffset = SystemZOverflowOffset;
    const DataLayout &DL = F.getDataLayout();
    for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
      bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
      // SystemZABIInfo does not produce ByVal parameters.
      assert(!CB.paramHasAttr(ArgNo, Attribute::ByVal));
      Type *T = A->getType();
      ArgKind AK = classifyArgument(T);
      if (AK == ArgKind::Indirect) {
        T = MS.PtrTy;
        AK = ArgKind::GeneralPurpose;
      }
      if (AK == ArgKind::GeneralPurpose && GpOffset >= SystemZGpEndOffset)
        AK = ArgKind::Memory;
      if (AK == ArgKind::FloatingPoint && FpOffset >= SystemZFpEndOffset)
        AK = ArgKind::Memory;
      if (AK == ArgKind::Vector && (VrIndex >= SystemZMaxVrArgs || !IsFixed))
        AK = ArgKind::Memory;
      Value *ShadowBase = nullptr;
      Value *OriginBase = nullptr;
      ShadowExtension SE = ShadowExtension::None;
      switch (AK) {
      case ArgKind::GeneralPurpose: {
        // Always keep track of GpOffset, but store shadow only for varargs.
        uint64_t ArgSize = 8;
        if (GpOffset + ArgSize <= kParamTLSSize) {
          if (!IsFixed) {
            SE = getShadowExtension(CB, ArgNo);
            uint64_t GapSize = 0;
            if (SE == ShadowExtension::None) {
              uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
              assert(ArgAllocSize <= ArgSize);
              GapSize = ArgSize - ArgAllocSize;
            }
            ShadowBase = getShadowAddrForVAArgument(IRB, GpOffset + GapSize);
            if (MS.TrackOrigins)
              OriginBase = getOriginPtrForVAArgument(IRB, GpOffset + GapSize);
          }
          GpOffset += ArgSize;
        } else {
          GpOffset = kParamTLSSize;
        }
        break;
      }
      case ArgKind::FloatingPoint: {
        // Always keep track of FpOffset, but store shadow only for varargs.
        uint64_t ArgSize = 8;
        if (FpOffset + ArgSize <= kParamTLSSize) {
          if (!IsFixed) {
            // PoP says: "A short floating-point datum requires only the
            // left-most 32 bit positions of a floating-point register".
            // Therefore, in contrast to ArgKind::GeneralPurpose and
            // ArgKind::Memory, don't extend shadow and don't mind the gap.
            ShadowBase = getShadowAddrForVAArgument(IRB, FpOffset);
            if (MS.TrackOrigins)
              OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
          }
          FpOffset += ArgSize;
        } else {
          FpOffset = kParamTLSSize;
        }
        break;
      }
      case ArgKind::Vector: {
        // Keep track of VrIndex. No need to store shadow, since vector varargs
        // go through ArgKind::Memory.
        assert(IsFixed);
        VrIndex++;
        break;
      }
      case ArgKind::Memory: {
        // Keep track of OverflowOffset and store shadow only for varargs.
        // Ignore fixed args, since we need to copy only the vararg portion of
        // the overflow area shadow.
        if (!IsFixed) {
          uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
          uint64_t ArgSize = alignTo(ArgAllocSize, 8);
          if (OverflowOffset + ArgSize <= kParamTLSSize) {
            SE = getShadowExtension(CB, ArgNo);
            uint64_t GapSize =
                SE == ShadowExtension::None ? ArgSize - ArgAllocSize : 0;
            ShadowBase =
                getShadowAddrForVAArgument(IRB, OverflowOffset + GapSize);
            if (MS.TrackOrigins)
              OriginBase =
                  getOriginPtrForVAArgument(IRB, OverflowOffset + GapSize);
            OverflowOffset += ArgSize;
          } else {
            OverflowOffset = kParamTLSSize;
          }
        }
        break;
      }
      case ArgKind::Indirect:
        llvm_unreachable("Indirect must be converted to GeneralPurpose");
      }
      if (ShadowBase == nullptr)
        continue;
      Value *Shadow = MSV.getShadow(A);
      if (SE != ShadowExtension::None)
        Shadow = MSV.CreateShadowCast(IRB, Shadow, IRB.getInt64Ty(),
                                      /*Signed*/ SE == ShadowExtension::Sign);
      ShadowBase = IRB.CreateIntToPtr(ShadowBase, MS.PtrTy, "_msarg_va_s");
      IRB.CreateStore(Shadow, ShadowBase);
      if (MS.TrackOrigins) {
        Value *Origin = MSV.getOrigin(A);
        TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
        MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
                        kMinOriginAlignment);
      }
    }
    Constant *OverflowSize = ConstantInt::get(
        IRB.getInt64Ty(), OverflowOffset - SystemZOverflowOffset);
    IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
  }

  void copyRegSaveArea(IRBuilder<> &IRB, Value *VAListTag) {
    Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
        IRB.CreateAdd(
            IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
            ConstantInt::get(MS.IntptrTy, SystemZRegSaveAreaPtrOffset)),
        MS.PtrTy);
    Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
    Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
    const Align Alignment = Align(8);
    std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
        MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(), Alignment,
                               /*isStore*/ true);
    // TODO(iii): copy only fragments filled by visitCallBase()
    // TODO(iii): support packed-stack && !use-soft-float
    // For use-soft-float functions, it is enough to copy just the GPRs.
    unsigned RegSaveAreaSize =
        IsSoftFloatABI ? SystemZGpEndOffset : SystemZRegSaveAreaSize;
    IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
                     RegSaveAreaSize);
    if (MS.TrackOrigins)
      IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
                       Alignment, RegSaveAreaSize);
  }

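  // Copy the shadow of stack-passed (overflow) varargs. Their shadow was
  // staged in the TLS copy starting at SystemZOverflowOffset, and its size was
  // recorded by visitCallBase() in VAArgOverflowSizeTLS.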
  // FIXME: This implementation limits OverflowOffset to kParamTLSSize, so we
  // don't know real overflow size and can't clear shadow beyond kParamTLSSize.
  void copyOverflowArea(IRBuilder<> &IRB, Value *VAListTag) {
    Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
        IRB.CreateAdd(
            IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
            ConstantInt::get(MS.IntptrTy, SystemZOverflowArgAreaPtrOffset)),
        MS.PtrTy);
    Value *OverflowArgAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
    Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
    const Align Alignment = Align(8);
    std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
        MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
                               Alignment, /*isStore*/ true);
    Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
                                           SystemZOverflowOffset);
    IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
                     VAArgOverflowSize);
    if (MS.TrackOrigins) {
      SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
                                      SystemZOverflowOffset);
      IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
                       VAArgOverflowSize);
    }
  }

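  // Note that the TLS staging buffer holds at most kParamTLSSize bytes, so the
  // backup copy below is clamped with umin(); shadow beyond that limit stays
  // zeroed ("clean") thanks to the preceding memset.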
  void finalizeInstrumentation() override {
    assert(!VAArgOverflowSize && !VAArgTLSCopy &&
           "finalizeInstrumentation called twice");
    if (!VAStartInstrumentationList.empty()) {
      // If there is a va_start in this function, make a backup copy of
      // va_arg_tls somewhere in the function entry block.
      IRBuilder<> IRB(MSV.FnPrologueEnd);
      VAArgOverflowSize =
          IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
      Value *CopySize =
          IRB.CreateAdd(ConstantInt::get(MS.IntptrTy, SystemZOverflowOffset),
                        VAArgOverflowSize);
      VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
      VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
      IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
                       CopySize, kShadowTLSAlignment, false);

      Value *SrcSize = IRB.CreateBinaryIntrinsic(
          Intrinsic::umin, CopySize,
          ConstantInt::get(MS.IntptrTy, kParamTLSSize));
      IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
                       kShadowTLSAlignment, SrcSize);
      if (MS.TrackOrigins) {
        VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
        VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
        IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
                         MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
      }
    }

    // Instrument va_start.
    // Copy va_list shadow from the backup copy of the TLS contents.
    for (CallInst *OrigInst : VAStartInstrumentationList) {
      NextNodeIRBuilder IRB(OrigInst);
      Value *VAListTag = OrigInst->getArgOperand(0);
      copyRegSaveArea(IRB, VAListTag);
      copyOverflowArea(IRB, VAListTag);
    }
  }
};

/// i386-specific implementation of VarArgHelper.
struct VarArgI386Helper : public VarArgHelperBase {
  AllocaInst *VAArgTLSCopy = nullptr;
  Value *VAArgSize = nullptr;

  VarArgI386Helper(Function &F, MemorySanitizer &MS,
                   MemorySanitizerVisitor &MSV)
      : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/4) {}

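  // On i386 all varargs are passed on the stack and va_list is effectively a
  // single 4-byte pointer into the argument area, hence VAListTagSize == 4 and
  // a single contiguous shadow copy at va_start below.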
  void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
    const DataLayout &DL = F.getDataLayout();
    unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
    unsigned VAArgOffset = 0;
    for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
      bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
      bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
      if (IsByVal) {
        assert(A->getType()->isPointerTy());
        Type *RealTy = CB.getParamByValType(ArgNo);
        uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
        Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
        if (ArgAlign < IntptrSize)
          ArgAlign = Align(IntptrSize);
        VAArgOffset = alignTo(VAArgOffset, ArgAlign);
        if (!IsFixed) {
          Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
          if (Base) {
            Value *AShadowPtr, *AOriginPtr;
            std::tie(AShadowPtr, AOriginPtr) =
                MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
                                       kShadowTLSAlignment, /*isStore*/ false);

            IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
                             kShadowTLSAlignment, ArgSize);
          }
          VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
        }
      } else {
        Value *Base;
        uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
        Align ArgAlign = Align(IntptrSize);
        VAArgOffset = alignTo(VAArgOffset, ArgAlign);
        if (DL.isBigEndian()) {
          // Adjust the shadow for arguments smaller than IntptrSize to match
          // the placement of bits on a big-endian system.
          if (ArgSize < IntptrSize)
            VAArgOffset += (IntptrSize - ArgSize);
        }
        if (!IsFixed) {
          Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
          if (Base)
            IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
          VAArgOffset += ArgSize;
          VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
        }
      }
    }

    Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
    // We reuse VAArgOverflowSizeTLS to hold the total vararg size here so that
    // we don't need to introduce a separate VAArgSizeTLS member.
    IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
  }

  void finalizeInstrumentation() override {
    assert(!VAArgSize && !VAArgTLSCopy &&
           "finalizeInstrumentation called twice");
    IRBuilder<> IRB(MSV.FnPrologueEnd);
    VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
    Value *CopySize = VAArgSize;

    if (!VAStartInstrumentationList.empty()) {
      // If there is a va_start in this function, make a backup copy of
      // va_arg_tls somewhere in the function entry block.
      VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
      VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
      IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
                       CopySize, kShadowTLSAlignment, false);

      Value *SrcSize = IRB.CreateBinaryIntrinsic(
          Intrinsic::umin, CopySize,
          ConstantInt::get(MS.IntptrTy, kParamTLSSize));
      IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
                       kShadowTLSAlignment, SrcSize);
    }

    // Instrument va_start.
    // Copy va_list shadow from the backup copy of the TLS contents.
    for (CallInst *OrigInst : VAStartInstrumentationList) {
      NextNodeIRBuilder IRB(OrigInst);
      Value *VAListTag = OrigInst->getArgOperand(0);
      Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
      Value *RegSaveAreaPtrPtr =
          IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
                             PointerType::get(*MS.C, 0));
      Value *RegSaveAreaPtr =
          IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
      Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
      const DataLayout &DL = F.getDataLayout();
      unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
      const Align Alignment = Align(IntptrSize);
      std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
          MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
                                 Alignment, /*isStore*/ true);
      IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
                       CopySize);
    }
  }
};

/// Implementation of VarArgHelper that is used for ARM32, MIPS, RISCV,
/// LoongArch64.
struct VarArgGenericHelper : public VarArgHelperBase {
  AllocaInst *VAArgTLSCopy = nullptr;
  Value *VAArgSize = nullptr;

  VarArgGenericHelper(Function &F, MemorySanitizer &MS,
                      MemorySanitizerVisitor &MSV, const unsigned VAListTagSize)
      : VarArgHelperBase(F, MS, MSV, VAListTagSize) {}

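  // This helper models va_list as a single pointer to a contiguous argument
  // save area: shadow for every vararg is staged back-to-back at
  // IntptrSize-aligned offsets and copied over that area in one memcpy when
  // va_start is instrumented.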
  void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
    unsigned VAArgOffset = 0;
    const DataLayout &DL = F.getDataLayout();
    unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
    for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
      bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
      if (IsFixed)
        continue;
      uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
      if (DL.isBigEndian()) {
        // Adjust the shadow for arguments smaller than IntptrSize to match the
        // placement of bits on a big-endian system.
        if (ArgSize < IntptrSize)
          VAArgOffset += (IntptrSize - ArgSize);
      }
      Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
      VAArgOffset += ArgSize;
      VAArgOffset = alignTo(VAArgOffset, IntptrSize);
      if (!Base)
        continue;
      IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
    }

    Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
    // We reuse VAArgOverflowSizeTLS to hold the total vararg size here so that
    // we don't need to introduce a separate VAArgSizeTLS member.
    IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
  }

  void finalizeInstrumentation() override {
    assert(!VAArgSize && !VAArgTLSCopy &&
           "finalizeInstrumentation called twice");
    IRBuilder<> IRB(MSV.FnPrologueEnd);
    VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
    Value *CopySize = VAArgSize;

    if (!VAStartInstrumentationList.empty()) {
      // If there is a va_start in this function, make a backup copy of
      // va_arg_tls somewhere in the function entry block.
      VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
      VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
      IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
                       CopySize, kShadowTLSAlignment, false);

      Value *SrcSize = IRB.CreateBinaryIntrinsic(
          Intrinsic::umin, CopySize,
          ConstantInt::get(MS.IntptrTy, kParamTLSSize));
      IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
                       kShadowTLSAlignment, SrcSize);
    }

    // Instrument va_start.
    // Copy va_list shadow from the backup copy of the TLS contents.
    for (CallInst *OrigInst : VAStartInstrumentationList) {
      NextNodeIRBuilder IRB(OrigInst);
      Value *VAListTag = OrigInst->getArgOperand(0);
      Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
      Value *RegSaveAreaPtrPtr =
          IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
                             PointerType::get(*MS.C, 0));
      Value *RegSaveAreaPtr =
          IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
      Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
      const DataLayout &DL = F.getDataLayout();
      unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
      const Align Alignment = Align(IntptrSize);
      std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
          MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
                                 Alignment, /*isStore*/ true);
      IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
                       CopySize);
    }
  }
};

// ARM32, LoongArch64, MIPS and RISCV share the same calling conventions
// regarding VAArgs.
using VarArgARM32Helper = VarArgGenericHelper;
using VarArgRISCVHelper = VarArgGenericHelper;
using VarArgMIPSHelper = VarArgGenericHelper;
using VarArgLoongArch64Helper = VarArgGenericHelper;

/// A no-op implementation of VarArgHelper.
struct VarArgNoOpHelper : public VarArgHelper {
  VarArgNoOpHelper(Function &F, MemorySanitizer &MS,
                   MemorySanitizerVisitor &MSV) {}

  void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {}

  void visitVAStartInst(VAStartInst &I) override {}

  void visitVACopyInst(VACopyInst &I) override {}

  void finalizeInstrumentation() override {}
};

} // end anonymous namespace

static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
                                        MemorySanitizerVisitor &Visitor) {
  // Pick the target-specific VarArgHelper. Targets without a dedicated
  // implementation fall back to VarArgNoOpHelper, which does not propagate
  // vararg shadow and may therefore produce false reports.
  Triple TargetTriple(Func.getParent()->getTargetTriple());

  if (TargetTriple.getArch() == Triple::x86)
    return new VarArgI386Helper(Func, Msan, Visitor);

  if (TargetTriple.getArch() == Triple::x86_64)
    return new VarArgAMD64Helper(Func, Msan, Visitor);

  if (TargetTriple.isARM())
    return new VarArgARM32Helper(Func, Msan, Visitor, /*VAListTagSize=*/4);

  if (TargetTriple.isAArch64())
    return new VarArgAArch64Helper(Func, Msan, Visitor);

  if (TargetTriple.isSystemZ())
    return new VarArgSystemZHelper(Func, Msan, Visitor);

  // On PowerPC32 VAListTag is a struct
  // {char, char, i16 padding, char *, char *}.
  if (TargetTriple.isPPC32())
    return new VarArgPowerPC32Helper(Func, Msan, Visitor);

  if (TargetTriple.isPPC64())
    return new VarArgPowerPC64Helper(Func, Msan, Visitor);

  if (TargetTriple.isRISCV32())
    return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);

  if (TargetTriple.isRISCV64())
    return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);

  if (TargetTriple.isMIPS32())
    return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);

  if (TargetTriple.isMIPS64())
    return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);

  if (TargetTriple.isLoongArch64())
    return new VarArgLoongArch64Helper(Func, Msan, Visitor,
                                       /*VAListTagSize=*/8);

  return new VarArgNoOpHelper(Func, Msan, Visitor);
}

bool MemorySanitizer::sanitizeFunction(Function &F, TargetLibraryInfo &TLI) {
  if (!CompileKernel && F.getName() == kMsanModuleCtorName)
    return false;

  if (F.hasFnAttribute(Attribute::DisableSanitizerInstrumentation))
    return false;

  MemorySanitizerVisitor Visitor(F, *this, TLI);

  // Clear out memory attributes: the inserted shadow (and origin) accesses are
  // real loads and stores, so attributes like memory(none) or speculatable
  // would no longer be valid on the instrumented function.
  AttributeMask B;
  B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
  F.removeFnAttrs(B);

  return Visitor.runOnFunction();
}
