DeviceGroupedGemmTileLoop< ALayout, BLayout, DsLayout, ELayout, ADataType, BDataType, DsDataType, EDataType, AElementwiseOperation, BElementwiseOperation, CDEElementwiseOperation > Struct Template Reference
ck::tensor_operation::device::DeviceGroupedGemmTileLoop< ALayout, BLayout, DsLayout, ELayout, ADataType, BDataType, DsDataType, EDataType, AElementwiseOperation, BElementwiseOperation, CDEElementwiseOperation > Struct Template Reference
Grouped GEMM kernel using the output tile looping algorithm.
#include <device_grouped_gemm_tile_loop.hpp>
Inheritance diagram for ck::tensor_operation::device::DeviceGroupedGemmTileLoop< ALayout, BLayout, DsLayout, ELayout, ADataType, BDataType, DsDataType, EDataType, AElementwiseOperation, BElementwiseOperation, CDEElementwiseOperation >:
Additional Inherited Members | |
| Public Member Functions inherited from ck::tensor_operation::device::DeviceGroupedGemm< ALayout, BLayout, DsLayout, ELayout, ADataType, BDataType, DsDataType, EDataType, AElementwiseOperation, BElementwiseOperation, CDEElementwiseOperation > | |
| virtual std::unique_ptr< BaseArgument > | MakeArgumentPointer (std::vector< const void * > &p_a, std::vector< const void * > &p_b, std::vector< std::array< const void *, NumDTensor > > &p_ds, std::vector< void * > &p_e, std::vector< GemmDesc > &gemm_desc, AElementwiseOperation a_element_op, BElementwiseOperation b_element_op, CDEElementwiseOperation c_element_op)=0 |
| virtual std::unique_ptr< BaseInvoker > | MakeInvokerPointer ()=0 |
| virtual void | SetDeviceKernelArgs (BaseArgument *p_arg, void *p_dev_kernel_args, const void *p_host_kernel_args) const |
| Sets the device kernel arguments pointer and may copy data to device. | |
| virtual size_t | GetDeviceKernelArgSize (const BaseArgument *p_arg) const |
| Gets the device kernel argument size. | |
| Public Member Functions inherited from ck::tensor_operation::device::BaseOperator | |
| BaseOperator ()=default | |
| BaseOperator (const BaseOperator &)=default | |
| BaseOperator & | operator= (const BaseOperator &)=default |
| virtual bool | IsSupportedArgument (const BaseArgument *) |
| virtual std::string | GetTypeString () const |
| virtual std::string | GetInstanceString () const |
| virtual std::string | GetTypeIdName () const |
| virtual std::optional< std::string > | GetObjectName () const |
| virtual std::optional< std::string > | GetTemplateInfo () const |
| virtual std::string | GetTypeIdHashCode () const |
| virtual size_t | GetWorkSpaceSize (const BaseArgument *) const |
| virtual void | SetWorkSpacePointer (BaseArgument *p_arg, void *p_workspace, const StreamConfig &=StreamConfig{}) const |
| virtual | ~BaseOperator () |
| Static Public Attributes inherited from ck::tensor_operation::device::DeviceGroupedGemm< ALayout, BLayout, DsLayout, ELayout, ADataType, BDataType, DsDataType, EDataType, AElementwiseOperation, BElementwiseOperation, CDEElementwiseOperation > | |
| static constexpr index_t | NumDTensor |
Detailed Description
template<typename ALayout, typename BLayout, typename DsLayout, typename ELayout, typename ADataType, typename BDataType, typename DsDataType, typename EDataType, typename AElementwiseOperation, typename BElementwiseOperation, typename CDEElementwiseOperation>
struct ck::tensor_operation::device::DeviceGroupedGemmTileLoop< ALayout, BLayout, DsLayout, ELayout, ADataType, BDataType, DsDataType, EDataType, AElementwiseOperation, BElementwiseOperation, CDEElementwiseOperation >
Grouped GEMM kernel using output Tile Looping algorithm.
- This kernel does not require any knowledge of the input data sizes (GEMM M/N/K).
- It requires only the number of groups to launch. Other information, such as data pointers and GEMM sizes, is packed into the GEMM kernel arguments and may be entirely dynamic (known only at kernel run time).
- Note: This kernel does not support SplitK.
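The points above can be illustrated with a small host-side simulation. This is only a sketch of one plausible reading of "output tile looping" and is not taken from the kernel source: each workgroup starts at its own block index and strides by the grid size over a flattened list of output tiles spanning all groups, reading each group's sizes only while it iterates. The names `GemmShape`, `TileAssignment`, `CeilDiv`, and `TilesForBlock`, as well as the tile-size parameters, are hypothetical and do not belong to the Composable Kernel API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-group problem description. In a real kernel this data
// would be read from the kernel arguments at run time.
struct GemmShape {
    int M;
    int N;
    int K;
};

// Hypothetical record of one output tile assigned to a workgroup.
struct TileAssignment {
    int group;   // index of the GEMM within the group list
    int tile_m;  // tile row inside that GEMM's output
    int tile_n;  // tile column inside that GEMM's output
};

inline int CeilDiv(int a, int b) { return (a + b - 1) / b; }

// Simulates one persistent workgroup on the host. The workgroup walks the
// flattened list of output tiles across all groups, starting at block_id
// and advancing by grid_size. Only M and N determine the tiling; K is never
// divided among workgroups, which matches the absence of SplitK.
std::vector<TileAssignment> TilesForBlock(const std::vector<GemmShape>& shapes,
                                          int block_id,
                                          int grid_size,
                                          int m_per_block,
                                          int n_per_block)
{
    assert(grid_size > 0 && m_per_block > 0 && n_per_block > 0);

    std::vector<TileAssignment> assigned;
    int tile_id      = block_id; // next flattened tile for this workgroup
    int group_offset = 0;        // flattened index of the current group's first tile

    for (std::size_t g = 0; g < shapes.size(); ++g) {
        const int tiles_m     = CeilDiv(shapes[g].M, m_per_block);
        const int tiles_n     = CeilDiv(shapes[g].N, n_per_block);
        const int group_tiles = tiles_m * tiles_n;

        // Process every tile of this group that falls on this workgroup's stride.
        while (tile_id < group_offset + group_tiles) {
            const int local = tile_id - group_offset;
            assigned.push_back({static_cast<int>(g), local / tiles_n, local % tiles_n});
            tile_id += grid_size;
        }
        group_offset += group_tiles;
    }
    return assigned;
}
```

In this sketch the launch configuration (`grid_size`) is chosen independently of any GEMM size, and the group shapes are consulted only inside the loop, mirroring the statement that sizes and pointers may be known only at kernel run time.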