Binary Operations

dot

class auto_diff.OpDot(x: auto_diff.op.operation.Operation, y: auto_diff.op.operation.Operation, **kwargs)[source]

Bases: auto_diff.op.operation.Operation

Dot product of two tensors.

If either x or y is a scalar, the operation is equivalent to OpMultiply.
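
As a quick illustration of the scalar case (plain NumPy rather than auto_diff, since OpDot follows the semantics of np.dot):

    import numpy as np

    x = 3.0
    y = np.array([1.0, 2.0, 4.0])
    # With a scalar operand, dot degenerates to element-wise multiplication.
    assert np.allclose(np.dot(x, y), x * y)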

If both x and y are 1-D arrays, it is the inner product of the vectors, and the result is a scalar:

\[z = \sum_k x_{k} \cdot y_{k}\]

Partial derivatives of a single element:

\[\frac{\partial L}{\partial x_{i}} = \frac{\partial L}{\partial z} \cdot \frac{\partial z}{\partial x_i} = \frac{\partial L}{\partial z} \cdot y_i\]
\[\frac{\partial L}{\partial y_{i}} = \frac{\partial L}{\partial z} \cdot \frac{\partial z}{\partial y_i} = \frac{\partial L}{\partial z} \cdot x_i\]

Vector derivatives:

\[\frac{\partial L}{\partial x} = \frac{\partial L}{\partial z} \cdot y\]
\[\frac{\partial L}{\partial y} = \frac{\partial L}{\partial z} \cdot x\]

Note that since z is a scalar, the gradient of L with respect to z is also a scalar, so each vector derivative above is a scalar-vector product, which is again the scalar case of the dot operation.
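
A minimal NumPy sketch of this backward rule for the 1-D case, checked against a finite-difference approximation (grad_z, grad_x, and grad_y are illustrative names, not part of the auto_diff API):

    import numpy as np

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([4.0, 5.0, 6.0])
    z = np.dot(x, y)        # scalar result of the inner product

    grad_z = 1.0            # assume L = z, so dL/dz = 1
    grad_x = grad_z * y     # dL/dx = dL/dz * y
    grad_y = grad_z * x     # dL/dy = dL/dz * x

    # Finite-difference check on the first component of x.
    eps = 1e-6
    x_eps = x.copy()
    x_eps[0] += eps
    approx = (np.dot(x_eps, y) - z) / eps
    assert np.allclose(grad_x[0], approx, atol=1e-4)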

If both x and y are 2-D arrays, it is matrix multiplication, and the result is a 2-D array:

\[z_{ij} = \sum_{k} x_{ik} \cdot y_{kj}\]

Partial derivatives of a single element:

\[\begin{split}\begin{array}{rcl}
\displaystyle \frac{\partial L}{\partial x_{ij}}
&=& \displaystyle \sum_{a,b} \frac{\partial L}{\partial z_{ab}} \cdot \frac{\partial z_{ab}}{\partial x_{ij}} \\
&=& \displaystyle \sum_{a,b} \frac{\partial L}{\partial z_{ab}} \cdot \frac{\partial \left ( \sum_k x_{ak} \cdot y_{kb} \right )}{\partial x_{ij}} \\
&=& \displaystyle \sum_{b} \frac{\partial L}{\partial z_{ib}} \cdot \frac{\partial \left ( x_{ij} \cdot y_{jb} \right )}{\partial x_{ij}} \\
&=& \displaystyle \sum_{b} \frac{\partial L}{\partial z_{ib}} \cdot y_{jb} \\
&=& \displaystyle \sum_{k} \frac{\partial L}{\partial z_{ik}} \cdot y_{jk} \\
&=& \displaystyle \sum_{k} \left ( \frac{\partial L}{\partial Z} \right )_{ik} \cdot (Y^T)_{kj}
\end{array}\end{split}\]
\[\frac{\partial L}{\partial y_{ij}} = \sum_{k} (X^T)_{ik} \cdot \left ( \frac{\partial L}{\partial Z} \right )_{kj}\]

These element-wise results have the same form as the dot operation itself, so the matrix derivatives are:

\[\frac{\partial L}{\partial X} = \frac{\partial L}{\partial Z} \cdot Y^T\]
\[\frac{\partial L}{\partial Y} = X^T \cdot \frac{\partial L}{\partial Z}\]
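
The same rules in NumPy for the 2-D case; grad_z stands for an arbitrary upstream gradient dL/dZ (again an illustrative sketch, not the auto_diff API):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 3))
    y = rng.standard_normal((3, 4))
    z = np.dot(x, y)                       # shape (2, 4)

    grad_z = rng.standard_normal(z.shape)  # upstream gradient dL/dZ
    grad_x = np.dot(grad_z, y.T)           # dL/dX = dL/dZ . Y^T -> shape (2, 3)
    grad_y = np.dot(x.T, grad_z)           # dL/dY = X^T . dL/dZ -> shape (3, 4)

    # Each gradient matches the shape of the tensor it differentiates.
    assert grad_x.shape == x.shape and grad_y.shape == y.shape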

If x is an N-D tensor and y is an M-D tensor (M >= 2), it is a sum product over the last axis of x and the second-to-last axis of y (see the sketch after the next paragraph).

If x is an N-D tensor and y is a 1-D array, it is a sum product over the last axis of x and y. This is a special case of the previous condition: treat y as a K x 1 matrix and squeeze the result.
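
Both higher-dimensional conventions mirror np.dot; a short sketch of the shapes involved:

    import numpy as np

    x = np.ones((2, 3, 4))   # N-D tensor
    y = np.ones((5, 4, 6))   # M-D tensor with M >= 2
    # Sum product over the last axis of x and the second-to-last axis of y.
    assert np.dot(x, y).shape == (2, 3, 5, 6)

    v = np.ones(4)           # 1-D array
    # Sum product over the last axis of x and v; equivalent to treating v
    # as a 4 x 1 matrix and squeezing the trailing axis of the result.
    assert np.dot(x, v).shape == (2, 3)
    assert np.allclose(np.dot(x, v), np.dot(x, v.reshape(4, 1)).squeeze(-1))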

__init__(x: auto_diff.op.operation.Operation, y: auto_diff.op.operation.Operation, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.