cudnn library (1)
8/10/2019 CUDNN Library (1)
www.nvidia.com cuDNN Library DU-06702-001_v6.5
Chapter 1. INTRODUCTION
NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks. It provides highly tuned implementations of routines arising frequently in DNN applications:
Convolution forward and backward, including cross-correlation
Pooling forward and backward
Softmax forward and backward
Neuron activations forward and backward:
Rectified linear (ReLU)
Sigmoid
Hyperbolic tangent (TANH)
Tensor transformation functions
cuDNN's convolution routines aim for performance competitive with the fastest GEMM (matrix multiply) based implementations of such routines while using significantly less memory.
cuDNN features customizable data layouts, supporting flexible dimension ordering, striding, and subregions for the 4D tensors used as inputs and outputs to all of its routines. This flexibility allows easy integration into any neural network implementation and avoids the input/output transposition steps sometimes necessary with GEMM-based convolutions.
cuDNN offers a context-based API that allows for easy multithreading and (optional) interoperability with CUDA streams.
Chapter 2. GENERAL DESCRIPTION
2.1. Programming Model
The cuDNN Library exposes a Host API but assumes that for operations using the GPU the data is directly accessible from the device.
The application must initialize the handle to the cuDNN library context by calling the cudnnCreate() function. Then, the handle is explicitly passed to every subsequent library function call that operates on GPU data. Once the application finishes using the library, it must call the function cudnnDestroy() to release the resources associated with the cuDNN library context. This approach allows the user to explicitly control the library setup when using multiple host threads and multiple GPUs. For example, the application can use cudaSetDevice() to associate different devices with different host threads and in each of those host threads it can initialize a unique handle to the cuDNN library context, which will use the particular device associated with that host thread. Then, the cuDNN library function calls made with different handles will automatically dispatch the computation to different devices. The device associated with a particular cuDNN context is assumed to remain unchanged between the corresponding cudnnCreate() and cudnnDestroy() calls. In order for the cuDNN library to use a different device within the same host thread, the application must set the new device to be used by calling cudaSetDevice() and then create another cuDNN context, which will be associated with the new device, by calling cudnnCreate().
2.2. Thread Safety
The library is thread safe and its functions can be called from multiple host threads, even with the same handle. When multiple threads share the same handle, extreme care needs to be taken when the handle configuration is changed because that change will potentially affect subsequent cuDNN calls in all threads. This is even more true for the destruction of the handle. It is therefore not recommended that multiple threads share the same cuDNN handle.
2.3. Reproducibility
By design, most cuDNN API routines from a given version generate the same bit-wise results at every run when executed on GPUs with the same architecture and the same number of SMs. However, bit-wise reproducibility is not guaranteed across versions, as the implementation of a given routine may change. With the current release, the following routines do not guarantee reproducibility because they use atomic add operations:
cudnnConvolutionBackwardFilter
cudnnConvolutionBackwardData
2.4. Requirements
cuDNN supports NVIDIA GPUs of compute capability 3.0 and higher and requires an NVIDIA Driver compatible with CUDA Toolkit 6.5.
Chapter 3. CUDNN DATATYPES REFERENCE
This chapter describes all the types and enums of the cuDNN library API.
3.1. cudnnHandle_t
cudnnHandle_t is a pointer to an opaque structure holding the cuDNN library context. The cuDNN library context must be created using cudnnCreate() and the returned handle must be passed to all subsequent library function calls. The context should be destroyed at the end using cudnnDestroy(). The context is associated with only one GPU device, the current device at the time of the call to cudnnCreate(). However, multiple contexts can be created on the same GPU device.
3.2. cudnnStatus_t
cudnnStatus_t is an enumerated type used for function status returns. All cuDNN library functions return their status, which can be one of the following values:
Value Meaning
CUDNN_STATUS_SUCCESS The operation completed successfully.
CUDNN_STATUS_NOT_INITIALIZED The cuDNN library was not initialized properly. This error is usually returned when a call to cudnnCreate() fails or when cudnnCreate() has not been called prior to calling another cuDNN routine. In the former case, it is usually due to an error in the CUDA Runtime API called by cudnnCreate() or by an error in the hardware setup.
CUDNN_STATUS_ALLOC_FAILED Resource allocation failed inside the cuDNN library. This is usually caused by an internal cudaMalloc() failure. To correct: prior to the function call, deallocate previously allocated memory as much as possible.
CUDNN_STATUS_BAD_PARAM An incorrect value or parameter was passed to the function. To correct: ensure that all the parameters being passed have valid values.
CUDNN_STATUS_ARCH_MISMATCH The function requires a feature absent from the current GPU device. Note that cuDNN only supports devices with compute capabilities greater than or equal to 3.0. To correct: compile and run the application on a device with appropriate compute capability.
CUDNN_STATUS_MAPPING_ERROR An access to GPU memory space failed, which is usually caused by a failure to bind a texture. To correct: prior to the function call, unbind any previously bound textures. Otherwise, this may indicate an internal error/bug in the library.
CUDNN_STATUS_EXECUTION_FAILED The GPU program failed to execute. This is usually caused by a failure to launch some cuDNN kernel on the GPU, which can occur for multiple reasons. To correct: check that the hardware, an appropriate version of the driver, and the cuDNN library are correctly installed. Otherwise, this may indicate an internal error/bug in the library.
CUDNN_STATUS_INTERNAL_ERROR An internal cuDNN operation failed.
CUDNN_STATUS_NOT_SUPPORTED The functionality requested is not presently supported by cuDNN.
CUDNN_STATUS_LICENSE_ERROR The functionality requested requires some license and an error was detected when trying to check the current licensing. This error can happen if the license is not present or is expired or if the environment variable NVIDIA_LICENSE_FILE is not set properly.
3.3. cudnnTensor4dDescriptor_t
cudnnTensor4dDescriptor_t is a pointer to an opaque structure holding the description of a generic 4D dataset. cudnnCreateTensor4dDescriptor() is used to create one instance, and cudnnSetTensor4dDescriptor() or cudnnSetTensor4dDescriptorEx() must be used to initialize this instance.
3.4. cudnnFilterDescriptor_t
cudnnFilterDescriptor_t is a pointer to an opaque structure holding the description of a filter dataset. cudnnCreateFilterDescriptor() is used to create one instance, and cudnnSetFilterDescriptor() must be used to initialize this instance.
3.5. cudnnConvolutionDescriptor_t
cudnnConvolutionDescriptor_t is a pointer to an opaque structure holding the description of a convolution operation. cudnnCreateConvolutionDescriptor() is used to create one instance, and cudnnSetConvolutionDescriptor() must be used to initialize this instance.
3.6. cudnnPoolingDescriptor_t
cudnnPoolingDescriptor_t is a pointer to an opaque structure holding the description of a pooling operation. cudnnCreatePoolingDescriptor() is used to create one instance, and cudnnSetPoolingDescriptor() must be used to initialize this instance.
3.7. cudnnDataType_t
cudnnDataType_t is an enumerated type indicating the data type to which a tensor descriptor or filter descriptor refers.
Value Meaning
CUDNN_DATA_FLOAT The data is 32-bit single-precision floating point (float).
CUDNN_DATA_DOUBLE The data is 64-bit double-precision floating point (double).
3.8. cudnnTensorFormat_t
cudnnTensorFormat_t is an enumerated type used by cudnnSetTensor4dDescriptor() to create a tensor with a pre-defined layout.
Value Meaning
CUDNN_TENSOR_NCHW This tensor format specifies that the data is laid out in the following order: image, feature maps, rows, columns. The strides are implicitly defined in such a way that the data are contiguous in memory with no padding between images, feature maps, rows, and columns; the columns are the inner dimension and the images are the outermost dimension.
CUDNN_TENSOR_NHWC This tensor format specifies that the data is laid out in the following order: image, rows, columns, feature maps. The strides are implicitly defined in such a way that the data are contiguous in memory with no padding between images, rows, columns, and feature maps; the feature maps are the inner dimension and the images are the outermost dimension.
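The implicit strides of the two pre-defined layouts can be illustrated with a small host-side sketch. It does not call cuDNN; the type Strides4d and the functions nchw_strides, nhwc_strides and offset4d are hypothetical names introduced only for this illustration, and strides are assumed to be counted in elements.

```c
#include <assert.h>

/* Hypothetical helper type (not part of cuDNN): element strides of a 4D tensor. */
typedef struct { int n, c, h, w; } Strides4d;

/* Packed NCHW: columns are the innermost dimension, images the outermost. */
Strides4d nchw_strides(int c, int h, int w) {
    assert(c > 0 && h > 0 && w > 0);
    Strides4d s = { c * h * w, h * w, w, 1 };
    return s;
}

/* Packed NHWC: feature maps are the innermost dimension, images the outermost. */
Strides4d nhwc_strides(int c, int h, int w) {
    assert(c > 0 && h > 0 && w > 0);
    Strides4d s = { h * w * c, 1, w * c, c };
    return s;
}

/* Linear element offset of coordinate (n, c, h, w) under the given strides. */
int offset4d(Strides4d s, int n, int c, int h, int w) {
    return n * s.n + c * s.c + h * s.h + w * s.w;
}
```

For a tensor with 3 feature maps of 4 rows by 5 columns, moving to the next feature map advances 20 elements in NCHW but only 1 element in NHWC, while the last element of the second image sits at the same offset in both layouts.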
3.9. cudnnAddMode_t
cudnnAddMode_t is an enumerated type used by cudnnAddTensor4d() to specify how a bias tensor is added to an input/output tensor.
Value Meaning
CUDNN_ADD_IMAGE or CUDNN_ADD_SAME_HW In this mode, the bias tensor is defined as one image with one feature map. This image will be added to every feature map of every image of the input/output tensor.
CUDNN_ADD_FEATURE_MAP or CUDNN_ADD_SAME_CHW In this mode, the bias tensor is defined as one image with multiple feature maps. This image will be added to every image of the input/output tensor.
CUDNN_ADD_SAME_C In this mode, the bias tensor is defined as one image with multiple feature maps of dimension 1x1; it can be seen as a vector of feature maps. Each feature map of the bias tensor will be added to the corresponding feature map of all height-by-width pixels of every image of the input/output tensor.
CUDNN_ADD_FULL_TENSOR In this mode, the bias tensor has the same dimensions as the input/output tensor. It will be added point-wise to the input/output tensor.
3.10. cudnnConvolutionMode_t
cudnnConvolutionMode_t is an enumerated type used by cudnnSetConvolutionDescriptor() to configure a convolution descriptor. The filter used for the convolution can be applied in two different ways, corresponding mathematically to a convolution or to a cross-correlation. (A cross-correlation is equivalent to a convolution with its filter rotated by 180 degrees.)
Value Meaning
CUDNN_CONVOLUTION In this mode, a convolution operation will be done when applying the filter to the images.
CUDNN_CROSS_CORRELATION In this mode, a cross-correlation operation will be done when applying the filter to the images.
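The relationship between the two modes can be checked with a plain C sketch on the host. It does not use cuDNN: cross_correlate2d, convolve2d and rotate180 are hypothetical names, the arrays are row-major single-channel images, and only the unpadded, stride-1 case is covered.

```c
#include <assert.h>

/* Unpadded 2D cross-correlation: out[y][x] = sum over i,j of in[y+i][x+j] * f[i][j]. */
void cross_correlate2d(const float *in, int ih, int iw,
                       const float *f, int fh, int fw, float *out) {
    assert(ih >= fh && iw >= fw);
    int oh = ih - fh + 1, ow = iw - fw + 1;
    for (int y = 0; y < oh; ++y)
        for (int x = 0; x < ow; ++x) {
            float acc = 0.0f;
            for (int i = 0; i < fh; ++i)
                for (int j = 0; j < fw; ++j)
                    acc += in[(y + i) * iw + (x + j)] * f[i * fw + j];
            out[y * ow + x] = acc;
        }
}

/* Unpadded 2D convolution: same loops, but the filter is read flipped in both axes. */
void convolve2d(const float *in, int ih, int iw,
                const float *f, int fh, int fw, float *out) {
    assert(ih >= fh && iw >= fw);
    int oh = ih - fh + 1, ow = iw - fw + 1;
    for (int y = 0; y < oh; ++y)
        for (int x = 0; x < ow; ++x) {
            float acc = 0.0f;
            for (int i = 0; i < fh; ++i)
                for (int j = 0; j < fw; ++j)
                    acc += in[(y + i) * iw + (x + j)]
                         * f[(fh - 1 - i) * fw + (fw - 1 - j)];
            out[y * ow + x] = acc;
        }
}

/* Rotating a row-major filter by 180 degrees reverses its flattened array. */
void rotate180(const float *f, int fh, int fw, float *out) {
    int count = fh * fw;
    for (int k = 0; k < count; ++k)
        out[k] = f[count - 1 - k];
}
```

Convolving an image with a filter and cross-correlating the same image with the rotated filter produce identical outputs, which is the equivalence stated in the parenthetical remark above.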
3.11. cudnnConvolutionPath_t
cudnnConvolutionPath_t is an enumerated type used by the helper routine cudnnGetOutputTensor4dDim() to select the results to output.
Value Meaning
CUDNN_CONVOLUTION_FWD cudnnGetOutputTensor4dDim() will return dimensions related to the output tensor of the forward convolution.
CUDNN_CONVOLUTION_WEIGHT_GRAD cudnnGetOutputTensor4dDim() will return the dimensions of the output filter produced while computing the gradients, which is part of the backward convolution.
CUDNN_CONVOLUTION_DATA_GRAD cudnnGetOutputTensor4dDim() will return the dimensions of the output tensor produced while computing the gradients, which is part of the backward convolution.
3.12. cudnnAccumulateResult_t
cudnnAccumulateResult_t is an enumerated type used by cudnnConvolutionForward(), cudnnConvolutionBackwardFilter() and cudnnConvolutionBackwardData() to specify whether those routines accumulate their results with the output tensor or simply write them to it, overwriting the previous value.
Value Meaning
CUDNN_RESULT_ACCUMULATE The results are accumulated with (added to the previous value of) the output tensor.
CUDNN_RESULT_NO_ACCUMULATE The results overwrite the output tensor.
3.13. cudnnSoftmaxAlgorithm_t
cudnnSoftmaxAlgorithm_t is used to select an implementation of the softmax function used in cudnnSoftmaxForward() and cudnnSoftmaxBackward().
Chapter 4. CUDNN API REFERENCE
This chapter describes the API of all the routines of the cuDNN library.
4.1. cudnnCreate
cudnnStatus_t cudnnCreate(cudnnHandle_t *handle)
This function initializes the cuDNN library and creates a handle to an opaque structure holding the cuDNN library context. It allocates hardware resources on the host and device and must be called prior to making any other cuDNN library calls. The cuDNN library context is tied to the current CUDA device. To use the library on multiple devices, one cuDNN handle needs to be created for each device. For a given device, multiple cuDNN handles with different configurations (e.g., different current CUDA streams) may be created. Because cudnnCreate allocates some internal resources, the release of those resources by calling cudnnDestroy will implicitly call cudaDeviceSynchronize; therefore, the recommended best practice is to call cudnnCreate/cudnnDestroy outside of performance-critical code paths. For multithreaded applications that use the same device from different threads, the recommended programming model is to create one (or a few, as is convenient) cuDNN handle(s) per thread and use that cuDNN handle for the entire life of the thread.
Return Value Meaning
CUDNN_STATUS_SUCCESS The initialization succeeded.
CUDNN_STATUS_NOT_INITIALIZED CUDA Runtime API initialization failed.
CUDNN_STATUS_ALLOC_FAILED The resources could not be allocated.
4.2. cudnnDestroy
cudnnStatus_t cudnnDestroy(cudnnHandle_t handle)
This function releases hardware resources used by the cuDNN library. This function is usually the last call with a particular handle to the cuDNN library. Because cudnnCreate allocates some internal resources, the release of those resources by calling cudnnDestroy will implicitly call cudaDeviceSynchronize; therefore, the recommended best practice is to call cudnnCreate/cudnnDestroy outside of performance-critical code paths.
Return Value Meaning
CUDNN_STATUS_SUCCESS The cuDNN context destruction was successful.
CUDNN_STATUS_NOT_INITIALIZED The library was not initialized.
4.3. cudnnSetStream
cudnnStatus_t cudnnSetStream(cudnnHandle_t handle, cudaStream_t streamId)
This function sets the cuDNN library stream, which will be used to execute all subsequent calls to the cuDNN library functions with that particular handle. If the cuDNN library stream is not set, all kernels use the default (NULL) stream. In particular, this routine can be used to change the stream between kernel launches and then to reset the cuDNN library stream back to NULL.
Return Value Meaning
CUDNN_STATUS_SUCCESS The stream was set successfully.
4.4. cudnnGetStream
cudnnStatus_t cudnnGetStream(cudnnHandle_t handle, cudaStream_t *streamId)
This function gets the cuDNN library stream, which is being used to execute all calls to the cuDNN library functions. If the cuDNN library stream is not set, all kernels use the default NULL stream.
Return Value Meaning
CUDNN_STATUS_SUCCESS The stream was returned successfully.
4.5. cudnnCreateTensor4dDescriptor
cudnnStatus_t cudnnCreateTensor4dDescriptor(cudnnTensor4dDescriptor_t *tensorDesc)
This function creates a Tensor4D descriptor object by allocating the memory needed to hold its opaque structure.
Return Value Meaning
CUDNN_STATUS_SUCCESS The object was created successfully.
CUDNN_STATUS_ALLOC_FAILED The resources could not be allocated.
This function initializes a previously created Tensor4D descriptor object, similarly to cudnnSetTensor4dDescriptor but with the strides explicitly passed as parameters. This can be used to lay out the 4D tensor in any order or simply to define gaps between dimensions.
At present, some cuDNN routines have limited support for strides; for example, wStride==1 is sometimes required. Those routines will return CUDNN_STATUS_NOT_SUPPORTED if a Tensor4D object with an unsupported stride is used. cudnnTransformTensor4d can be used to convert the data to a supported layout.
Param In/out Meaning
tensorDesc input/output Handle to a previously created tensor descriptor.
dataType input Data type.
n input Number of images.
c input Number of feature maps per image.
h input Height of each feature map.
w input Width of each feature map.
nStride input Stride between two consecutive images.
cStride input Stride between two consecutive feature maps.
hStride input Stride between two consecutive rows.
wStride input Stride between two consecutive columns.
The possible error values returned by this function and their meanings are listed below.
Return Value Meaning
CUDNN_STATUS_SUCCESS The object was set successfully.
CUDNN_STATUS_BAD_PARAM At least one of the parameters n, c, h, w or nStride, cStride, hStride, wStride is negative or dataType has an invalid enumerant value.
4.8. cudnnGetTensor4dDescriptor
cudnnStatus_t cudnnGetTensor4dDescriptor( cudnnTensor4dDescriptor_t tensorDesc,
    cudnnDataType_t *dataType,
    int *n, int *c, int *h, int *w,
    int *nStride, int *cStride, int *hStride, int *wStride )
This function queries the parameters of the previously initialized Tensor4D descriptor object.
Param In/out Meaning
tensorDesc input Handle to a previously initialized tensor descriptor.
dataType output Data type.
n output Number of images.
c output Number of feature maps per image.
h output Height of each feature map.
w output Width of each feature map.
nStride output Stride between two consecutive images.
cStride output Stride between two consecutive feature maps.
hStride output Stride between two consecutive rows.
wStride output Stride between two consecutive columns.
The possible error values returned by this function and their meanings are listed below.
Return Value Meaning
CUDNN_STATUS_SUCCESS The operation succeeded.
4.9. cudnnDestroyTensor4dDescriptor
cudnnStatus_t cudnnDestroyTensor4dDescriptor(cudnnTensor4dDescriptor_t tensorDesc)
This function destroys a previously created Tensor4D descriptor object.
Return Value Meaning
CUDNN_STATUS_SUCCESS The object was destroyed successfully.
4.10. cudnnTransformTensor4d
cudnnStatus_t cudnnTransformTensor4d( cudnnHandle_t handle,
    cudnnTensor4dDescriptor_t srcDesc, const void *srcData,
    cudnnTensor4dDescriptor_t destDesc, void *destData )
This function copies the data from one tensor to another tensor with a different layout. Those descriptors need to have the same dimensions but not necessarily the same strides. The input and output tensors must not overlap in any way (i.e., tensors cannot be transformed in place). This function can be used to convert a tensor with an unsupported format to a supported one.
Param In/out Meaning
handle input Handle to a previously created cuDNN context.
srcDesc input Handle to a previously initialized tensor descriptor.
srcData input Pointer to data of the tensor described by the srcDesc descriptor.
destDesc input Handle to a previously initialized tensor descriptor.
destData output Pointer to data of the tensor described by the destDesc descriptor.
The possible error values returned by this function and their meanings are listed below.
Return Value Meaning
CUDNN_STATUS_SUCCESS The function launched successfully.
CUDNN_STATUS_BAD_PARAM The dimensions n, c, h, w or the dataType of the two tensor descriptors are different.
CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.
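As a rough host-side model of what a layout transformation does, the sketch below copies each element from its source offset to its destination offset. It is not the library implementation: TensorLayout and transform_tensor4d are hypothetical names standing in for the opaque descriptors, the data type is fixed to float, and strides are assumed to be counted in elements.

```c
#include <assert.h>

/* Hypothetical stand-in for a tensor descriptor: dimensions and element strides. */
typedef struct {
    int n, c, h, w;
    int nStride, cStride, hStride, wStride;
} TensorLayout;

/* Copies src into dest element by element.
 * Returns 0 on success and -1 when the dimensions of the two layouts differ. */
int transform_tensor4d(const TensorLayout *src, const float *srcData,
                       const TensorLayout *dest, float *destData) {
    assert(srcData != destData); /* in-place transformation is not allowed */
    if (src->n != dest->n || src->c != dest->c ||
        src->h != dest->h || src->w != dest->w)
        return -1;
    for (int n = 0; n < src->n; ++n)
        for (int c = 0; c < src->c; ++c)
            for (int h = 0; h < src->h; ++h)
                for (int w = 0; w < src->w; ++w) {
                    int from = n * src->nStride + c * src->cStride
                             + h * src->hStride + w * src->wStride;
                    int to = n * dest->nStride + c * dest->cStride
                           + h * dest->hStride + w * dest->wStride;
                    destData[to] = srcData[from];
                }
    return 0;
}
```

Passing packed NCHW strides for the source and packed NHWC strides for the destination turns the copy into an NCHW-to-NHWC conversion; the dimension check mirrors the CUDNN_STATUS_BAD_PARAM condition listed above.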
4.11. cudnnAddTensor4d
cudnnStatus_t cudnnAddTensor4d( cudnnHandle_t handle,
    cudnnAddMode_t mode, const void *alpha,
    cudnnTensor4dDescriptor_t biasDesc, const void *biasData,
    cudnnTensor4dDescriptor_t srcDestDesc, void *srcDestData )
This function adds the scaled values of one tensor to another tensor. The mode parameter can be used to select different ways of performing the scaled addition. The amount of data described by the biasDesc descriptor must match exactly the amount of data needed to perform the addition. Therefore, the following conditions must be met:
Except for the CUDNN_ADD_SAME_C mode, the dimensions h, w of the two tensors must match.
In the case of CUDNN_ADD_IMAGE mode, the dimensions n, c of the bias tensor must be 1.
In the case of CUDNN_ADD_FEATURE_MAP mode, the dimension n of the bias tensor must be 1 and the dimension c of the two tensors must match.
In the case of CUDNN_ADD_FULL_TENSOR mode, the dimensions n, c of the two tensors must match.
In the case of CUDNN_ADD_SAME_C mode, the dimensions n, w, h of the bias tensor must be 1 and the dimension c of the two tensors must match.
Param In/out Meaning
handle input Handle to a previously created cuDNN context.
biasDesc input Handle to a previously initialized tensor descriptor.
mode input Addition mode that describes how the addition is performed.
alpha input Scalar factor to be applied to every data element of the bias tensor before it is added to the output tensor.
biasData input Pointer to data of the tensor described by the biasDesc descriptor.
srcDestDesc input/output Handle to a previously initialized tensor descriptor.
srcDestData input/output Pointer to data of the tensor described by the srcDestDesc descriptor.
The possible error values returned by this function and their meanings are listed below.
Return Value Meaning
CUDNN_STATUS_SUCCESS The function executed successfully.
CUDNN_STATUS_BAD_PARAM The dimensions n, c, h, w of the bias tensor refer to an amount of data that is incompatible with the mode parameter and the output tensor dimensions or the dataType of the two tensor descriptors are different.
CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.
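The four addition modes differ only in which bias element is paired with each destination element. The host-side sketch below makes that mapping explicit; it is an illustration rather than the library's code. AddMode and add_tensor4d are hypothetical names, both tensors are assumed to be packed NCHW floats, and the bias shape for each mode follows the conditions listed for this function.

```c
#include <assert.h>

/* Hypothetical mirror of the four addition modes (not the cuDNN enum). */
typedef enum { ADD_IMAGE, ADD_FEATURE_MAP, ADD_SAME_C, ADD_FULL_TENSOR } AddMode;

/* dest[img][fm][row][col] += alpha * bias[...], with dest of size n x c x h x w. */
void add_tensor4d(AddMode mode, float alpha, const float *bias,
                  float *dest, int n, int c, int h, int w) {
    assert(n > 0 && c > 0 && h > 0 && w > 0);
    for (int img = 0; img < n; ++img)
        for (int fm = 0; fm < c; ++fm)
            for (int row = 0; row < h; ++row)
                for (int col = 0; col < w; ++col) {
                    int b;
                    switch (mode) {
                    case ADD_IMAGE:        /* bias is 1 x 1 x h x w */
                        b = row * w + col;
                        break;
                    case ADD_FEATURE_MAP:  /* bias is 1 x c x h x w */
                        b = (fm * h + row) * w + col;
                        break;
                    case ADD_SAME_C:       /* bias is 1 x c x 1 x 1 */
                        b = fm;
                        break;
                    default:               /* full tensor: n x c x h x w */
                        b = ((img * c + fm) * h + row) * w + col;
                        break;
                    }
                    dest[((img * c + fm) * h + row) * w + col] += alpha * bias[b];
                }
}
```

With ADD_SAME_C, a two-element bias applied to a tensor with two feature maps adds the first bias value to every pixel of feature map 0 and the second to every pixel of feature map 1, which is the usual per-channel bias of a convolution layer.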
4.12. cudnnCreateFilterDescriptor
cudnnStatus_t cudnnCreateFilterDescriptor(cudnnFilterDescriptor_t *filterDesc)
This function creates a filter descriptor object by allocating the memory needed to hold its opaque structure.
Return Value Meaning
CUDNN_STATUS_SUCCESS The object was created successfully.
CUDNN_STATUS_ALLOC_FAILED The resources could not be allocated.
4.13. cudnnSetFilterDescriptor
cudnnStatus_t cudnnSetFilterDescriptor( cudnnFilterDescriptor_t filterDesc,
    cudnnDataType_t dataType, int k, int c, int h, int w )
This function initializes a previously created filter descriptor object. The filter layout must be contiguous in memory.
Param In/out Meaning
filterDesc input/output Handle to a previously created filter descriptor.
dataType input Data type.
k input Number of output feature maps.
c input Number of input feature maps.
h input Height of each filter.
w input Width of each filter.
The possible error values returned by this function and their meanings are listed below.
Return Value Meaning
CUDNN_STATUS_SUCCESS The object was set successfully.
CUDNN_STATUS_BAD_PARAM At least one of the parameters k, c, h, w is negative or dataType has an invalid enumerant value.
4.14. cudnnGetFilterDescriptor
cudnnStatus_t cudnnGetFilterDescriptor( cudnnFilterDescriptor_t filterDesc,
    cudnnDataType_t *dataType, int *k, int *c, int *h, int *w )
This function queries the parameters of the previously initialized filter descriptor object.
Param In/out Meaning
filterDesc input Handle to a previously created filter descriptor.
dataType output Data type.
k output Number of output feature maps.
c output Number of input feature maps.
h output Height of each filter.
w output Width of each filter.
The possible error values returned by this function and their meanings are listed below.
Return Value Meaning
CUDNN_STATUS_SUCCESS The object was set successfully.
4.15. cudnnDestroyFilterDescriptor
cudnnStatus_t cudnnDestroyFilterDescriptor(cudnnFilterDescriptor_t filterDesc)
This function destroys a previously created filter descriptor object.
Return Value Meaning
CUDNN_STATUS_SUCCESS The object was destroyed successfully.
4.16. cudnnCreateConvolutionDescriptor
cudnnStatus_t cudnnCreateConvolutionDescriptor(cudnnConvolutionDescriptor_t *convDesc)
This function creates a convolution descriptor object by allocating the memory needed to hold its opaque structure.
Return Value Meaning
CUDNN_STATUS_SUCCESS The object was created successfully.
CUDNN_STATUS_ALLOC_FAILED The resources could not be allocated.
4.17. cudnnSetConvolutionDescriptor
cudnnStatus_t cudnnSetConvolutionDescriptor( cudnnConvolutionDescriptor_t convDesc,
    cudnnTensor4dDescriptor_t inputTensorDesc,
    cudnnFilterDescriptor_t filterDesc,
    int pad_h, int pad_w, int u, int v,
    int upscalex, int upscaley,
    cudnnConvolutionMode_t mode )
This function initializes a previously created convolution descriptor object, according to an input tensor descriptor and a filter descriptor passed as parameters. This function assumes that the tensor and filter descriptors correspond to the forward convolution path and checks if their settings are valid. That same convolution descriptor can be reused in the backward path provided it corresponds to the same layer.
Param In/out Meaning
convDesc input/output Handle to a previously created convolution descriptor.
inputTensorDesc input Input tensor descriptor used for that layer on the forward path.
filterDesc input Filter descriptor used for that layer on the forward path.
This function initializes a previously created convolution descriptor object. It is similar to cudnnSetConvolutionDescriptor but every parameter of the convolution must be passed explicitly.
Param In/out Meaning
convDesc input/output Handle to a previously created convolution descriptor.
n input Number of images.
c input Number of input feature maps.
h input Height of each input feature map.
w input Width of each input feature map.
k input Number of output feature maps.
r input Height of each filter.
s input Width of each filter.
pad_h input Zero-padding height: number of rows of zeros implicitly concatenated onto the top and onto the bottom of input images.
pad_w input Zero-padding width: number of columns of zeros implicitly concatenated onto the left and onto the right of input images.
u input Vertical filter stride.
v input Horizontal filter stride.
upscalex input Upscale the input in x-direction.
upscaley input Upscale the input in y-direction.
mode input Selects between CUDNN_CONVOLUTION and CUDNN_CROSS_CORRELATION.
The possible error values returned by this function and their meanings are listed below.
Return Value Meaning
CUDNN_STATUS_SUCCESS The object was set successfully.
CUDNN_STATUS_BAD_PARAM At least one of the following conditions is met:
One of the parameters u, v is negative.
The parameter mode has an invalid enumerant value.
CUDNN_STATUS_NOT_SUPPORTED The parameter upscalex or upscaley is not 1.
4.19. cudnnGetOutputTensor4dDim
cudnnStatus_t cudnnGetOutputTensor4dDim( const cudnnConvolutionDescriptor_t convDesc,
    cudnnConvolutionPath_t path,
    int *n, int *c, int *h, int *w )
This function returns the dimensions of a convolution's output, given the convolution descriptor and the direction of the convolution. This function can help to set up the output tensor and allocate the proper amount of memory prior to launching the actual convolution.
Param In/out Meaning
convDesc input Handle to a previously created convolution descriptor.
path input Enumerant to specify the direction of the convolution.
n output Number of output images.
c output Number of output feature maps per image.
h output Height of each output feature map.
w output Width of each output feature map.
The possible error values returned by this function and their meanings are listed below.
Return Value Meaning
CUDNN_STATUS_BAD_PARAM The path parameter has an invalid enumerant value.
CUDNN_STATUS_SUCCESS The object was set successfully.
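This excerpt does not state how the forward output dimensions are derived. The sketch below assumes the conventional formula for a padded, strided convolution with upscale factors of 1, out = 1 + (in + 2*pad - filter) / stride with integer division; the library routine remains the authority on the actual values. conv_out_dim and forward_output_dims are hypothetical names, and the parameter letters follow the explicit convolution parameters listed earlier in this chapter (r and s for filter height and width, u and v for vertical and horizontal stride).

```c
#include <assert.h>

/* Output extent along one axis, assuming out = 1 + (in + 2*pad - filter) / stride. */
int conv_out_dim(int in, int pad, int filter, int stride) {
    assert(stride > 0 && filter > 0 && pad >= 0);
    assert(in + 2 * pad >= filter);
    return 1 + (in + 2 * pad - filter) / stride;
}

/* Forward path: one output image per input image, one output feature map per filter. */
void forward_output_dims(int n, int h, int w, int k, int r, int s,
                         int pad_h, int pad_w, int u, int v,
                         int *out_n, int *out_c, int *out_h, int *out_w) {
    *out_n = n;
    *out_c = k;
    *out_h = conv_out_dim(h, pad_h, r, u);
    *out_w = conv_out_dim(w, pad_w, s, v);
}
```

Under this assumption, a 224x224 input with a 7x7 filter, padding 3 and stride 2 yields a 112x112 output, and a 32x32 input with an unpadded 5x5 filter at stride 1 yields 28x28.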
4.20. cudnnDestroyConvolutionDescriptor
cudnnStatus_t cudnnDestroyConvolutionDescriptor(cudnnConvolutionDescriptor_t convDesc)
This function destroys a previously created convolution descriptor object.
Return Value Meaning
CUDNN_STATUS_SUCCESS The object was destroyed successfully.
4.21. cudnnConvolutionForward
cudnnStatus_t cudnnConvolutionForward( cudnnHandle_t handle,
    cudnnTensor4dDescriptor_t srcDesc, const void *srcData,
    cudnnFilterDescriptor_t filterDesc, const void *filterData,
    cudnnConvolutionDescriptor_t convDesc,
    cudnnTensor4dDescriptor_t destDesc, void *destData,
    cudnnAccumulateResult_t accumulate )
This function executes convolutions or cross-correlations over src using the specified filters, returning results in dest.
Param In/out Meaning
handle input Handle to a previously created cuDNN context.
srcDesc input Handle to a previously initialized tensor descriptor.
srcData input Data pointer to GPU memory associated with the tensor descriptor srcDesc.
filterDesc input Handle to a previously initialized filter descriptor.
filterData input Data pointer to GPU memory associated with the filter descriptor filterDesc.
convDesc input Previously initialized convolution descriptor.
destDesc input Handle to a previously initialized tensor descriptor.
destData input/output Data pointer to GPU memory associated with the tensor descriptor destDesc that carries the result of the convolution.
accumulate input Enumerant that specifies whether the convolution accumulates with or overwrites the output tensor.
The possible error values returned by this function and their meanings are listed below.
Return Value Meaning
CUDNN_STATUS_SUCCESS The operation was launched successfully.
CUDNN_STATUS_MAPPING_ERROR An error occurred during the texture binding of the filter data.
CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.
4.22. cudnnConvolutionBackwardBias
cudnnStatus_t cudnnConvolutionBackwardBias( cudnnHandle_t handle,
    cudnnTensor4dDescriptor_t srcDesc, const void *srcData,
    cudnnTensor4dDescriptor_t destDesc, void *destData,
    cudnnAccumulateResult_t accumulate )
This function computes the convolution gradient with respect to the bias, which is the sum of every element belonging to the same feature map across all of the images of the input tensor. Therefore, the number of elements produced is equal to the number of feature maps of the input tensor.
Param         In/out        Meaning
handle        input         Handle to a previously created cuDNN context.
srcDesc       input         Handle to the previously initialized input tensor descriptor.
srcData       input         Data pointer to GPU memory associated with the tensor descriptor srcDesc.
destDesc      input         Handle to the previously initialized output tensor descriptor.
destData      output        Data pointer to GPU memory associated with the output tensor descriptor destDesc.
accumulate    input         Enumerant that specifies whether the convolution accumulates with or overwrites the output tensor.
The possible error values returned by this function and their meanings are listed below.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The operation was launched successfully.
CUDNN_STATUS_BAD_PARAM           At least one of the following conditions is met:
                                 - One of the parameters n, h, w of the output tensor is not 1.
                                 - The numbers of feature maps of the input tensor and output tensor differ.
                                 - The dataType of the two tensor descriptors differs.
CUDNN_STATUS_NOT_SUPPORTED       At least one of the following conditions is met:
                                 - The width stride of the input tensor is not 1.
                                 - The height stride and the width of the input tensor differ.
                                 - The feature map stride of the output tensor is not 1.
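The reduction described for the bias gradient, one sum per feature map taken over all images and all spatial positions, can be sketched on the CPU as follows. The function ref_backward_bias is a hypothetical helper written for illustration; it assumes a fully packed NCHW input and a dest buffer holding one value per feature map.

```c
#include <assert.h>

/* Hypothetical CPU reference for the bias gradient, not cuDNN code.
 *   src:  n x c x h x w values in packed NCHW order
 *   dest: c values, one per feature map
 * accumulate != 0 adds to dest; accumulate == 0 overwrites it. */
void ref_backward_bias(const float *src, int n, int c, int h, int w,
                       float *dest, int accumulate)
{
    assert(n > 0 && c > 0 && h > 0 && w > 0);
    for (int ci = 0; ci < c; ++ci) {
        float sum = 0.0f;
        for (int ni = 0; ni < n; ++ni)
            for (int i = 0; i < h * w; ++i)
                sum += src[(ni * c + ci) * h * w + i];
        dest[ci] = accumulate ? dest[ci] + sum : sum;
    }
}
```

The output has exactly c elements, which matches the requirement above that n, h, and w of the output tensor all equal 1.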
4.23. cudnnConvolutionBackwardFilter
cudnnStatus_t cudnnConvolutionBackwardFilter(
    cudnnHandle_t handle,
    cudnnTensor4dDescriptor_t srcDesc,
    const void *srcData,
    cudnnTensor4dDescriptor_t diffDesc,
    const void *diffData,
    cudnnConvolutionDescriptor_t convDesc,
    cudnnFilterDescriptor_t gradDesc,
    void *gradData,
    cudnnAccumulateResult_t accumulate )
This function computes the convolution gradient with respect to the filter coefficients.
Param         In/out        Meaning
handle        input         Handle to a previously created cuDNN context.
srcDesc       input         Handle to a previously initialized tensor descriptor.
srcData       input         Data pointer to GPU memory associated with the tensor descriptor srcDesc.
diffDesc      input         Handle to the previously initialized input differential tensor descriptor.
diffData      input         Data pointer to GPU memory associated with the input differential tensor descriptor diffDesc.
convDesc      input         Previously initialized convolution descriptor.
gradDesc      input         Handle to a previously initialized filter descriptor.
gradData      input/output  Data pointer to GPU memory associated with the filter descriptor gradDesc that carries the result.
accumulate    input         Enumerant that specifies whether the convolution accumulates with or overwrites the output tensor.
The possible error values returned by this function and their meanings are listed below.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The operation was launched successfully.
CUDNN_STATUS_NOT_SUPPORTED       The requested operation is not currently supported in cuDNN. Your diffDesc is likely not in NCHW format.
CUDNN_STATUS_MAPPING_ERROR       An error occurs during the texture binding of the filter data.
CUDNN_STATUS_EXECUTION_FAILED    The function failed to launch on the GPU.
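As an illustration of what a filter gradient looks like numerically, the sketch below correlates the forward input with the incoming differential on the CPU. It is not the library's implementation: ref_backward_filter is a hypothetical name, and the single-image, unit-stride, no-padding, packed-NCHW setup is an assumption made to keep the example short.

```c
#include <assert.h>

/* Hypothetical CPU reference for the filter gradient, not cuDNN code.
 * Assumptions: one image, packed NCHW layout, unit stride, no padding.
 *   src:  c x h x w        input seen by the forward pass
 *   diff: k x oh x ow      incoming differential, oh = h-r+1, ow = w-s+1
 *   grad: k x c x r x s    filter gradient
 * accumulate != 0 adds to grad; accumulate == 0 overwrites it. */
void ref_backward_filter(const float *src, int c, int h, int w,
                         const float *diff, int k, int r, int s,
                         float *grad, int accumulate)
{
    assert(h >= r && w >= s);
    int oh = h - r + 1;
    int ow = w - s + 1;
    for (int ko = 0; ko < k; ++ko)
        for (int ci = 0; ci < c; ++ci)
            for (int fy = 0; fy < r; ++fy)
                for (int fx = 0; fx < s; ++fx) {
                    float sum = 0.0f;
                    for (int y = 0; y < oh; ++y)
                        for (int x = 0; x < ow; ++x)
                            sum += src[(ci * h + y + fy) * w + x + fx] *
                                   diff[(ko * oh + y) * ow + x];
                    float *g = &grad[((ko * c + ci) * r + fy) * s + fx];
                    *g = accumulate ? *g + sum : sum;
                }
}
```

Accumulation is useful here when gradients from several mini-batches, or several images, are summed into the same filter gradient buffer.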
4.24. cudnnConvolutionBackwardData
cudnnStatus_t cudnnConvolutionBackwardData(
    cudnnHandle_t handle,
    cudnnFilterDescriptor_t filterDesc,
    const void *filterData,
    cudnnTensor4dDescriptor_t diffDesc,
    const void *diffData,
    cudnnConvolutionDescriptor_t convDesc,
    cudnnTensor4dDescriptor_t gradDesc,
    void *gradData,
    cudnnAccumulateResult_t accumulate )
This function computes the convolution gradient with respect to the output tensor.
Param         In/out        Meaning
handle        input         Handle to a previously created cuDNN context.
filterDesc    input         Handle to a previously initialized filter descriptor.
filterData    input         Data pointer to GPU memory associated with the filter descriptor filterDesc.
diffDesc      input         Handle to the previously initialized input differential tensor descriptor.
diffData      input         Data pointer to GPU memory associated with the input differential tensor descriptor diffDesc.
convDesc      input         Previously initialized convolution descriptor.
gradDesc      input         Handle to the previously initialized output tensor descriptor.
gradData      input/output  Data pointer to GPU memory associated with the output tensor descriptor gradDesc that carries the result.
accumulate    input         Enumerant that specifies whether the convolution accumulates with or overwrites the output tensor.
The possible error values returned by this function and their meanings are listed below.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The operation was launched successfully.
CUDNN_STATUS_NOT_SUPPORTED       The requested operation is not currently supported in cuDNN. Your diffDesc is likely not in NCHW format.
CUDNN_STATUS_MAPPING_ERROR       An error occurs during the texture binding of the filter data or the input differential tensor data.
CUDNN_STATUS_EXECUTION_FAILED    The function failed to launch on the GPU.
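One plausible reading, assumed here rather than taken from the text, is that gradData receives the gradient for the tensor that the forward convolution read as its data input. Under that assumption, and for a single image with unit stride and no padding, the computation can be sketched on the CPU as below. The name ref_backward_data and the packed NCHW layout are hypothetical choices for the example.

```c
#include <assert.h>

/* Hypothetical CPU reference for the data gradient, not cuDNN code.
 * Assumptions: one image, packed NCHW layout, unit stride, no padding.
 *   filter: k x c x r x s
 *   diff:   k x oh x ow               incoming differential
 *   grad:   c x (oh+r-1) x (ow+s-1)   gradient for the forward input
 * accumulate != 0 adds to grad; accumulate == 0 overwrites it. */
void ref_backward_data(const float *filter, int k, int c, int r, int s,
                       const float *diff, int oh, int ow,
                       float *grad, int accumulate)
{
    assert(k > 0 && c > 0 && r > 0 && s > 0 && oh > 0 && ow > 0);
    int h = oh + r - 1;
    int w = ow + s - 1;
    if (!accumulate)
        for (int i = 0; i < c * h * w; ++i)
            grad[i] = 0.0f;
    /* Scatter each differential value back through the filter taps. */
    for (int ko = 0; ko < k; ++ko)
        for (int ci = 0; ci < c; ++ci)
            for (int y = 0; y < oh; ++y)
                for (int x = 0; x < ow; ++x)
                    for (int fy = 0; fy < r; ++fy)
                        for (int fx = 0; fx < s; ++fx)
                            grad[(ci * h + y + fy) * w + x + fx] +=
                                filter[((ko * c + ci) * r + fy) * s + fx] *
                                diff[(ko * oh + y) * ow + x];
}
```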
4.25. cudnnSoftmaxForward
cudnnStatus_t cudnnSoftmaxForward(
    cudnnHandle_t handle,
    cudnnSoftmaxAlgorithm_t algorithm,
    cudnnSoftmaxMode_t mode,
    cudnnTensor4dDescriptor_t srcDesc,
    const void *srcData,
    cudnnTensor4dDescriptor_t destDesc,
    void *destData )
This routine computes the softmax function.
Param         In/out        Meaning
handle        input         Handle to a previously created cuDNN context.
algorithm     input         Enumerant to specify the softmax algorithm.
mode          input         Enumerant to specify the softmax mode.
srcDesc       input         Handle to the previously initialized input tensor descriptor.
srcData       input         Data pointer to GPU memory associated with the tensor descriptor srcDesc.
destDesc      input         Handle to the previously initialized output tensor descriptor.
destData      output        Data pointer to GPU memory associated with the output tensor descriptor destDesc.
The possible error values returned by this function and their meanings are listed below.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The function launched successfully.
CUDNN_STATUS_BAD_PARAM           At least one of the following conditions is met:
                                 - The dimensions n, c, h, w of the input tensor and output tensors differ.
                                 - The datatype of the input tensor and output tensors differ.
                                 - The parameters algorithm or mode have an invalid enumerant value.
CUDNN_STATUS_EXECUTION_FAILED    The function failed to launch on the GPU.
4.26. cudnnSoftmaxBackward
cudnnStatus_t cudnnSoftmaxBackward(
    cudnnHandle_t handle,
    cudnnSoftmaxAlgorithm_t algorithm,
    cudnnSoftmaxMode_t mode,
    cudnnTensor4dDescriptor_t srcDesc,
    const void *srcData,
    cudnnTensor4dDescriptor_t srcDiffDesc,
    const void *srcDiffData,
    cudnnTensor4dDescriptor_t destDiffDesc,
    void *destDiffData )
This routine computes the gradient of the softmax function.
Param         In/out        Meaning
handle        input         Handle to a previously created cuDNN context.
algorithm     input         Enumerant to specify the softmax algorithm.
mode          input         Enumerant to specify the softmax mode.
srcDesc       input         Handle to the previously initialized input tensor descriptor.
srcData       input         Data pointer to GPU memory associated with the tensor descriptor srcDesc.
srcDiffDesc   input         Handle to the previously initialized input differential tensor descriptor.
srcDiffData   input         Data pointer to GPU memory associated with the tensor descriptor srcDiffDesc.
destDiffDesc  input         Handle to the previously initialized output differential tensor descriptor.
destDiffData  output        Data pointer to GPU memory associated with the output tensor descriptor destDiffDesc.
The possible error values returned by this function and their meanings are listed below.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The function launched successfully.
CUDNN_STATUS_BAD_PARAM           At least one of the following conditions is met:
                                 - The dimensions n, c, h, w of the srcDesc, srcDiffDesc and destDiffDesc tensors differ.
                                 - The strides nStride, cStride, hStride, wStride of the srcDesc and srcDiffDesc tensors differ.
                                 - The datatype of the three tensors differs.
CUDNN_STATUS_EXECUTION_FAILED    The function failed to launch on the GPU.
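The softmax gradient can be written purely in terms of the softmax result and the incoming differential, which would explain why no separate forward-input tensor appears in the signature. The sketch below assumes that srcData holds the softmax result y and srcDiffData holds the differential dy; that mapping is an assumption, and ref_softmax_backward is a hypothetical CPU helper.

```c
#include <assert.h>

/* Hypothetical CPU reference for the softmax gradient, not cuDNN code.
 *   y:  softmax result for one vector of c values
 *   dy: differential arriving for y
 *   dx: differential for the softmax input,
 *       dx[i] = y[i] * (dy[i] - sum_j y[j] * dy[j]) */
void ref_softmax_backward(const float *y, const float *dy, int c, float *dx)
{
    assert(c > 0);
    float dot = 0.0f;
    for (int j = 0; j < c; ++j)
        dot += y[j] * dy[j];
    for (int i = 0; i < c; ++i)
        dx[i] = y[i] * (dy[i] - dot);
}
```

Because the softmax outputs sum to 1, the entries of dx always sum to zero.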
4.27. cudnnCreatePoolingDescriptor
cudnnStatus_t cudnnCreatePoolingDescriptor( cudnnPoolingDescriptor_t *poolingDesc )
This function creates a pooling descriptor object by allocating the memory needed to hold its opaque structure.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The object was created successfully.
CUDNN_STATUS_ALLOC_FAILED        The resources could not be allocated.
4.28. cudnnSetPoolingDescriptor
cudnnStatus_t cudnnSetPoolingDescriptor(
    cudnnPoolingDescriptor_t poolingDesc,
    cudnnPoolingMode_t mode,
    int windowHeight,
    int windowWidth,
    int verticalStride,
    int horizontalStride )
This function initializes a previously created pooling descriptor object.
Param             In/out        Meaning
poolingDesc       input/output  Handle to a previously created pooling descriptor.
mode              input         Enumerant to specify the pooling mode.
windowHeight      input         Height of the pooling window.
windowWidth       input         Width of the pooling window.
verticalStride    input         Pooling vertical stride.
horizontalStride  input         Pooling horizontal stride.
The possible error values returned by this function and their meanings are listed below.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The object was set successfully.
CUDNN_STATUS_BAD_PARAM           At least one of the parameters windowHeight, windowWidth, verticalStride, horizontalStride is negative or mode has an invalid enumerant value.
4.29. cudnnGetPoolingDescriptor
cudnnStatus_t cudnnGetPoolingDescriptor(
    cudnnPoolingDescriptor_t poolingDesc,
    cudnnPoolingMode_t *mode,
    int *windowHeight,
    int *windowWidth,
    int *verticalStride,
    int *horizontalStride )
This function queries a previously created pooling descriptor object.
Param             In/out        Meaning
poolingDesc       input         Handle to a previously created pooling descriptor.
mode              output        Enumerant to specify the pooling mode.
windowHeight      output        Height of the pooling window.
windowWidth       output        Width of the pooling window.
verticalStride    output        Pooling vertical stride.
horizontalStride  output        Pooling horizontal stride.
The possible error values returned by this function and their meanings are listed below.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The object was queried successfully.
4.30. cudnnDestroyPoolingDescriptor
cudnnStatus_t cudnnDestroyPoolingDescriptor( cudnnPoolingDescriptor_t poolingDesc )
This function destroys a previously created pooling descriptor object.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The object was destroyed successfully.
4.31. cudnnPoolingForward
cudnnStatus_t cudnnPoolingForward(
    cudnnHandle_t handle,
    cudnnPoolingDescriptor_t poolingDesc,
    cudnnTensor4dDescriptor_t srcDesc,
    const void *srcData,
    cudnnTensor4dDescriptor_t destDesc,
    void *destData )
This function computes pooling of input values (i.e., the maximum or average of several adjacent values) to produce an output with smaller height and/or width.
Param         In/out        Meaning
handle        input         Handle to a previously created cuDNN context.
poolingDesc   input         Handle to a previously initialized pooling descriptor.
srcDesc       input         Handle to the previously initialized input tensor descriptor.
srcData       input         Data pointer to GPU memory associated with the tensor descriptor srcDesc.
destDesc      input         Handle to the previously initialized output tensor descriptor.
destData      output        Data pointer to GPU memory associated with the output tensor descriptor destDesc.
The possible error values returned by this function and their meanings are listed below.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The function launched successfully.
CUDNN_STATUS_BAD_PARAM           At least one of the following conditions is met:
                                 - The dimensions n, c of the input tensor and output tensors differ.
                                 - The datatype of the input tensor and output tensors differs.
CUDNN_STATUS_NOT_SUPPORTED       The wStride of the input tensor or output tensor is not 1.
CUDNN_STATUS_EXECUTION_FAILED    The function failed to launch on the GPU.
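To make the window and stride parameters concrete, the following CPU sketch applies max pooling to a single h x w plane. It is an illustration under stated assumptions, not the library's code: ref_pool_out_dim and ref_max_pool_forward are hypothetical names, and the output-size rule (in - window) / stride + 1 assumes no padding and windows that lie fully inside the input, which this section does not specify.

```c
#include <assert.h>

/* Hypothetical helper: output extent along one axis, assuming no padding
 * and only windows that fit entirely inside the input. */
int ref_pool_out_dim(int in, int window, int stride)
{
    assert(window > 0 && stride > 0 && in >= window);
    return (in - window) / stride + 1;
}

/* Hypothetical CPU reference for max pooling on one h x w plane,
 * not cuDNN code. dest must hold oh x ow values. */
void ref_max_pool_forward(const float *src, int h, int w,
                          int windowHeight, int windowWidth,
                          int verticalStride, int horizontalStride,
                          float *dest)
{
    int oh = ref_pool_out_dim(h, windowHeight, verticalStride);
    int ow = ref_pool_out_dim(w, windowWidth, horizontalStride);
    for (int oy = 0; oy < oh; ++oy)
        for (int ox = 0; ox < ow; ++ox) {
            int y0 = oy * verticalStride;
            int x0 = ox * horizontalStride;
            float best = src[y0 * w + x0];
            for (int ky = 0; ky < windowHeight; ++ky)
                for (int kx = 0; kx < windowWidth; ++kx) {
                    float v = src[(y0 + ky) * w + x0 + kx];
                    if (v > best)
                        best = v;
                }
            dest[oy * ow + ox] = best;
        }
}
```

With a 2 x 2 window and a stride of 2 in both directions, a 4 x 4 plane shrinks to 2 x 2, which is the "smaller height and/or width" mentioned above.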
4.32. cudnnPoolingBackward
cudnnStatus_t cudnnPoolingBackward(
    cudnnHandle_t handle,
    cudnnPoolingDescriptor_t poolingDesc,
    cudnnTensor4dDescriptor_t srcDesc,
    const void *srcData,
    cudnnTensor4dDescriptor_t srcDiffDesc,
    const void *srcDiffData,
    cudnnTensor4dDescriptor_t destDesc,
    const void *destData,
    cudnnTensor4dDescriptor_t destDiffDesc,
    void *destDiffData )
This function computes the gradient of a pooling operation.
Param         In/out        Meaning
handle        input         Handle to a previously created cuDNN context.
poolingDesc   input         Handle to the previously initialized pooling descriptor.
srcDesc       input         Handle to the previously initialized input tensor descriptor.
srcData       input         Data pointer to GPU memory associated with the tensor descriptor srcDesc.
srcDiffDesc   input         Handle to the previously initialized input differential tensor descriptor.
srcDiffData   input         Data pointer to GPU memory associated with the tensor descriptor srcDiffDesc.
destDesc      input         Handle to the previously initialized output tensor descriptor.
destData      input         Data pointer to GPU memory associated with the output tensor descriptor destDesc.
destDiffDesc  input         Handle to the previously initialized output differential tensor descriptor.
destDiffData  output        Data pointer to GPU memory associated with the output tensor descriptor destDiffDesc.
The possible error values returned by this function and their meanings are listed below.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The function launched successfully.
CUDNN_STATUS_BAD_PARAM           At least one of the following conditions is met:
                                 - The dimensions n, c, h, w of the srcDesc and srcDiffDesc tensors differ.
                                 - The strides nStride, cStride, hStride, wStride of the srcDesc and srcDiffDesc tensors differ.
                                 - The dimensions n, c, h, w of the destDesc and destDiffDesc tensors differ.
                                 - The strides nStride, cStride, hStride, wStride of the destDesc and destDiffDesc tensors differ.
                                 - The datatype of the four tensors differs.
CUDNN_STATUS_NOT_SUPPORTED       The wStride of the input tensor or output tensor is not 1.
CUDNN_STATUS_EXECUTION_FAILED    The function failed to launch on the GPU.
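A max-pooling gradient needs the values that were pooled in order to know where each maximum came from, which is one reason a backward routine may take data tensors as well as differentials. The CPU sketch below shows that routing for a single plane. The names x, dy, and dx and the function ref_max_pool_backward are hypothetical; how they map onto srcData, srcDiffData, destData, and destDiffData is deliberately not asserted, and the no-padding assumption is carried over for illustration only.

```c
#include <assert.h>

/* Hypothetical CPU reference for the max-pooling gradient on one plane,
 * not cuDNN code. Assumes no padding.
 *   x:  h x w values that the forward pass pooled over
 *   dy: oh x ow differential for the pooled result
 *   dx: h x w differential for x; each dy value is routed to the
 *       position of the maximum inside its window (first one on ties) */
void ref_max_pool_backward(const float *x, int h, int w,
                           int windowHeight, int windowWidth,
                           int verticalStride, int horizontalStride,
                           const float *dy, float *dx)
{
    assert(windowHeight > 0 && windowWidth > 0);
    assert(verticalStride > 0 && horizontalStride > 0);
    assert(h >= windowHeight && w >= windowWidth);
    int oh = (h - windowHeight) / verticalStride + 1;
    int ow = (w - windowWidth) / horizontalStride + 1;
    for (int i = 0; i < h * w; ++i)
        dx[i] = 0.0f;
    for (int oy = 0; oy < oh; ++oy)
        for (int ox = 0; ox < ow; ++ox) {
            int best = (oy * verticalStride) * w + ox * horizontalStride;
            for (int ky = 0; ky < windowHeight; ++ky)
                for (int kx = 0; kx < windowWidth; ++kx) {
                    int idx = (oy * verticalStride + ky) * w +
                              ox * horizontalStride + kx;
                    if (x[idx] > x[best])
                        best = idx;
                }
            dx[best] += dy[oy * ow + ox];
        }
}
```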
4.33. cudnnActivationForward
cudnnStatus_t cudnnActivationForward(
    cudnnHandle_t handle,
    cudnnActivationMode_t mode,
    cudnnTensor4dDescriptor_t srcDesc,
    const void *srcData,
    cudnnTensor4dDescriptor_t destDesc,
    void *destData )
This routine applies a specified neuron activation function element-wise over each input value.
Param         In/out        Meaning
handle        input         Handle to a previously created cuDNN context.
mode          input         Enumerant to specify the activation mode.
srcDesc       input         Handle to the previously initialized input tensor descriptor.
srcData       input         Data pointer to GPU memory associated with the tensor descriptor srcDesc.
destDesc      input         Handle to the previously initialized output tensor descriptor.
destData      output        Data pointer to GPU memory associated with the output tensor descriptor destDesc.
The possible error values returned by this function and their meanings are listed below.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The function launched successfully.
CUDNN_STATUS_BAD_PARAM           The parameter mode has an invalid enumerant value.
CUDNN_STATUS_NOT_SUPPORTED       At least one of the following conditions is met:
                                 - The dimensions n, c, h, w of the input tensor and output tensors differ.
                                 - The datatype of the input tensor and output tensors differs.
CUDNN_STATUS_EXECUTION_FAILED    The function failed to launch on the GPU.
4.34. cudnnActivationBackward
cudnnStatus_t cudnnActivationBackward(
    cudnnHandle_t handle,
    cudnnActivationMode_t mode,
    cudnnTensor4dDescriptor_t srcDesc,
    const void *srcData,
    cudnnTensor4dDescriptor_t srcDiffDesc,
    const void *srcDiffData,
    cudnnTensor4dDescriptor_t destDesc,
    const void *destData,
    cudnnTensor4dDescriptor_t destDiffDesc,
    void *destDiffData )
This routine computes the gradient of a neuron activation function.
Param         In/out        Meaning
handle        input         Handle to a previously created cuDNN context.
mode          input         Enumerant to specify the activation mode.
srcDesc       input         Handle to the previously initialized input tensor descriptor.
srcData       input         Data pointer to GPU memory associated with the tensor descriptor srcDesc.
srcDiffDesc   input         Handle to the previously initialized input differential tensor descriptor.
srcDiffData   input         Data pointer to GPU memory associated with the tensor descriptor srcDiffDesc.
destDesc      input         Handle to the previously initialized output tensor descriptor.
destData      input         Data pointer to GPU memory associated with the output tensor descriptor destDesc.
destDiffDesc  input         Handle to the previously initialized output differential tensor descriptor.
destDiffData  output        Data pointer to GPU memory associated with the output tensor descriptor destDiffDesc.
The possible error values returned by this function and their meanings are listed below.
Return Value                     Meaning
CUDNN_STATUS_SUCCESS             The function launched successfully.
CUDNN_STATUS_BAD_PARAM           The parameter mode has an invalid enumerant value.
CUDNN_STATUS_NOT_SUPPORTED       At least one of the following conditions is met:
                                 - The dimensions n, c, h, w of the four tensors differ.
                                 - The strides nStride, cStride, hStride, wStride of the input tensor and the input differential tensor differ.
                                 - The strides nStride, cStride, hStride, wStride of the output tensor and the output differential tensor differ.
                                 - The datatype of the four tensors differs.
CUDNN_STATUS_EXECUTION_FAILED    The function failed to launch on the GPU.
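For all three activation functions the derivative can be expressed through the activation result itself, so a gradient can be computed from the forward result y and the incoming differential dy alone. The sketch below takes that route on the CPU. It is a hypothetical helper, the enum does not reproduce cudnnActivationMode_t, and which of the four cuDNN tensors plays the role of y or dy is an assumption left open here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mode enum for this sketch; not the cuDNN enumerant. */
typedef enum { REF_SIGMOID, REF_RELU, REF_TANH } ref_activation_t;

/* Hypothetical CPU reference, not cuDNN code.
 *   y:  result of the forward activation
 *   dy: differential arriving for y
 *   dx: differential for the activation input
 * Derivatives in terms of y: sigmoid y*(1-y), ReLU 1 if y > 0 else 0,
 * tanh 1 - y*y. */
void ref_activation_backward(ref_activation_t mode, const float *y,
                             const float *dy, size_t count, float *dx)
{
    for (size_t i = 0; i < count; ++i) {
        switch (mode) {
        case REF_SIGMOID:
            dx[i] = dy[i] * y[i] * (1.0f - y[i]);
            break;
        case REF_RELU:
            dx[i] = y[i] > 0.0f ? dy[i] : 0.0f;
            break;
        case REF_TANH:
            dx[i] = dy[i] * (1.0f - y[i] * y[i]);
            break;
        default:
            assert(0 && "invalid activation mode");
        }
    }
}
```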
Chapter 5. ACKNOWLEDGMENTS
Some of the cuDNN library routines were derived from code developed by the University of Tennessee and are subject to the Modified Berkeley Software Distribution License as follows:
Copyright (c) 2010 The University of Tennessee.
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
 * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
 * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer listed in this license in the documentation and/or other materials provided with the distribution.
 * Neither the name of the copyright holders nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.