|  | 
| 1990 | 1990 |                             "attributes" : "ReadMem" | 
| 1991 | 1991 |                           }, | 
| 1992 | 1992 | 
 | 
|  | 1993 | +### ``llvm.genx.lsc.load.merge.*.<return type if not void>.<any type>.<any type>`` : lsc_load merge instructions | 
|  | 1994 | +### ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  | 1995 | +### | 
|  | 1996 | +### * ``llvm.genx.lsc.load.merge.slm`` : | 
|  | 1997 | +### * ``llvm.genx.lsc.load.merge.bti`` : | 
|  | 1998 | +### * ``llvm.genx.lsc.load.merge.stateless`` : | 
|  | 1999 | +### | 
|  | 2000 | +### * Exec_size ignored unless operation is transposed (DataOrder == Transpose) | 
|  | 2001 | +### * arg0: {1,32}Xi1 predicate (overloaded) | 
|  | 2002 | +### * arg1: i8 Subopcode, [MBZ] | 
|  | 2003 | +### * arg2: i8 Caching behavior for L1, [MBC] | 
|  | 2004 | +### * arg3: i8 Caching behavior for L3, [MBC] | 
|  | 2005 | +### * arg4: i16 Address scale, [MBC] | 
|  | 2006 | +### * arg5: i32 Immediate offset added to each address, [MBC] | 
|  | 2007 | +### * arg6: i8 The datum size, [MBC] | 
|  | 2008 | +### * arg7: i8 Number of elements to load per address (vector size), [MBC] | 
|  | 2009 | +### * arg8: i8 Indicates if the data is transposed during the transfer, [MBC] | 
|  | 2010 | +### * arg9: i8 Channel mask for quad versions, [MBC] | 
|  | 2011 | +### * arg10: {1,32}Xi{16,32,64} The vector register holding offsets (overloaded) | 
|  | 2012 | +###          for flat version Base Address + Offset[i] goes here | 
|  | 2013 | +### * arg11: i32 surface to use for this operation. This can be an immediate or a register | 
|  | 2014 | +###          for flat and bindless version pass zero here | 
|  | 2015 | +### * arg12: VXi{16,32,64} The data to merge into disabled channels (overloaded) | 
|  | 2016 | +### | 
|  | 2017 | +### * Return value: the value read, merged with arg12 by predicate | 
|  | 2018 | +### | 
|  | 2019 | +### Cache mappings are: | 
|  | 2020 | +### | 
|  | 2021 | +###   - 0 -> .df (default) | 
|  | 2022 | +###   - 1 -> .uc (uncached) | 
|  | 2023 | +###   - 2 -> .ca (cached) | 
|  | 2024 | +###   - 3 -> .wb (writeback) | 
|  | 2025 | +###   - 4 -> .wt (writethrough) | 
|  | 2026 | +###   - 5 -> .st (streaming) | 
|  | 2027 | +###   - 6 -> .ri (read-invalidate) | 
|  | 2028 | +### | 
|  | 2029 | +### Only certain combinations of CachingL1 with CachingL3 are valid on hardware. | 
|  | 2030 | +### | 
|  | 2031 | +### +---------+-----+-----------------------------------------------------------------------+ | 
|  | 2032 | +### |  L1     |  L3 | Notes                                                                 | | 
|  | 2033 | +### +---------+-----+-----------------------------------------------------------------------+ | 
|  | 2034 | +### | .df     | .df | default behavior on both L1 and L3 (L3 uses MOCS settings)            | | 
|  | 2035 | +### +---------+-----+-----------------------------------------------------------------------+ | 
|  | 2036 | +### | .uc     | .uc | uncached (bypass) both L1 and L3                                      | | 
|  | 2037 | +### +---------+-----+-----------------------------------------------------------------------+ | 
|  | 2038 | +### | .st     | .uc | streaming L1 / bypass L3                                              | | 
|  | 2039 | +### +---------+-----+-----------------------------------------------------------------------+ | 
|  | 2040 | +### | .uc     | .ca | bypass L1 / cache in L3                                               | | 
|  | 2041 | +### +---------+-----+-----------------------------------------------------------------------+ | 
|  | 2042 | +### | .ca     | .uc | cache in L1 / bypass L3                                               | | 
|  | 2043 | +### +---------+-----+-----------------------------------------------------------------------+ | 
|  | 2044 | +### | .ca     | .ca | cache in both L1 and L3                                               | | 
|  | 2045 | +### +---------+-----+-----------------------------------------------------------------------+ | 
|  | 2046 | +### | .st     | .ca | streaming L1 / cache in L3                                            | | 
|  | 2047 | +### +---------+-----+-----------------------------------------------------------------------+ | 
|  | 2048 | +### | .ri     | .ca | read-invalidate (e.g. last-use) on L1 loads / cache in L3             | | 
|  | 2049 | +### +---------+-----+-----------------------------------------------------------------------+ | 
|  | 2050 | +### | 
|  | 2051 | +### Immediate offset: the compiler may be able to fuse this add into the message; otherwise, | 
|  | 2052 | +### additional instructions are generated to honor the semantics. | 
|  | 2053 | +### This is the merging variant of the predicated loads: destination lanes disabled by the | 
|  | 2054 | +### predicate take their values from the additional input (arg12). | 
|  | 2055 | +### | 
|  | 2056 | +### Datum size mapping is: | 
|  | 2057 | +### | 
|  | 2058 | +###   - 1 = :u8 | 
|  | 2059 | +###   - 2 = :u16 | 
|  | 2060 | +###   - 3 = :u32 | 
|  | 2061 | +###   - 4 = :u64 | 
|  | 2062 | +###   - 5 = :u8u32 (load 8b, zero extend to 32b; store the opposite), | 
|  | 2063 | +###   - 6 = :u16u32 (load 16b, zero extend to 32b; store the opposite), | 
|  | 2064 | +###   - 7 = :u16u32h (load 16b into high 16 of each 32b; store the high 16) | 
|  | 2065 | +### | 
|  | 2066 | +    "lsc_load_merge_slm" : { "result" : "anyvector", | 
|  | 2067 | +                             "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], | 
|  | 2068 | +                             "attributes" : "ReadMem" | 
|  | 2069 | +                           }, | 
|  | 2070 | +    "lsc_load_merge_stateless" : { "result" : "anyvector", | 
|  | 2071 | +                                   "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], | 
|  | 2072 | +                                    "attributes" : "ReadMem" | 
|  | 2073 | +                                 }, | 
|  | 2074 | +    "lsc_load_merge_bindless" : { "result" : "anyvector", | 
|  | 2075 | +                                  "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], | 
|  | 2076 | +                                  "attributes" : "ReadMem" | 
|  | 2077 | +                                }, | 
|  | 2078 | +    "lsc_load_merge_bti" : { "result" : "anyvector", | 
|  | 2079 | +                             "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], | 
|  | 2080 | +                             "attributes" : "ReadMem" | 
|  | 2081 | +                           }, | 
|  | 2082 | +    "lsc_load_merge_quad_slm" : { "result" : "anyvector", | 
|  | 2083 | +                                  "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], | 
|  | 2084 | +                                  "attributes" : "ReadMem" | 
|  | 2085 | +                                }, | 
|  | 2086 | +    "lsc_load_merge_quad_stateless" : { "result" : "anyvector", | 
|  | 2087 | +                                        "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], | 
|  | 2088 | +                                        "attributes" : "ReadMem" | 
|  | 2089 | +                                      }, | 
|  | 2090 | +    "lsc_load_merge_quad_bindless" : { "result" : "anyvector", | 
|  | 2091 | +                                       "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], | 
|  | 2092 | +                                       "attributes" : "ReadMem" | 
|  | 2093 | +                                     }, | 
|  | 2094 | +    "lsc_load_merge_quad_bti" : { "result" : "anyvector", | 
|  | 2095 | +                                  "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], | 
|  | 2096 | +                                  "attributes" : "ReadMem" | 
|  | 2097 | +                                }, | 
|  | 2098 | + | 
| 1993 | 2099 | ### ``llvm.genx.lsc.store.*.<any type>.<any type>.<any vector>`` : lsc_store instructions | 
| 1994 | 2100 | ### ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
| 1995 | 2101 | ### | 
|  | 
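The cache-hint encoding and the merge-by-predicate behavior documented in the diff above can be sketched in plain Python. This is only an illustration of the documented semantics, not code from the intrinsics file: the names `CACHE_HINTS`, `VALID_LOAD_CACHE_PAIRS`, `is_valid_load_caching`, and `merge_by_predicate` are hypothetical, and the merge helper assumes one element per lane (vector size 1, no transpose).

```python
# Hypothetical helper names; only the numeric encodings and the valid
# (L1, L3) pairs are taken from the doc comment in the diff.
CACHE_HINTS = {
    0: ".df", 1: ".uc", 2: ".ca", 3: ".wb", 4: ".wt", 5: ".st", 6: ".ri",
}

# (L1, L3) combinations listed as valid for loads in the documented table.
VALID_LOAD_CACHE_PAIRS = {
    (".df", ".df"), (".uc", ".uc"), (".st", ".uc"), (".uc", ".ca"),
    (".ca", ".uc"), (".ca", ".ca"), (".st", ".ca"), (".ri", ".ca"),
}


def is_valid_load_caching(l1: int, l3: int) -> bool:
    """Return True if the encoded (arg2, arg3) pair appears in the table."""
    try:
        return (CACHE_HINTS[l1], CACHE_HINTS[l3]) in VALID_LOAD_CACHE_PAIRS
    except KeyError:
        # An encoding outside 0..6 is not a documented cache hint.
        return False


def merge_by_predicate(predicate, loaded, passthru):
    """Per-lane merge: enabled lanes take the loaded value, disabled lanes
    keep the corresponding element of the merge input (arg12).

    Assumes one element per lane; wider vector sizes are not modeled.
    """
    if not (len(predicate) == len(loaded) == len(passthru)):
        raise ValueError("all vectors must have the same lane count")
    return [ld if p else old for p, ld, old in zip(predicate, loaded, passthru)]
```

For example, `is_valid_load_caching(2, 2)` checks the `.ca`/`.ca` pair, and `merge_by_predicate([True, False], [1, 2], [9, 8])` keeps the second lane from the merge input.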