Commit 36d28236 authored by nitinprakash96

add gradient tutorial

parent e03e7253
+31 −8
%% Cell type:markdown id: tags:

### TensorGraph Layers and TensorFlow eager

%% Cell type:markdown id: tags:

In this tutorial we will look at how TensorGraph layers work with TensorFlow eager execution.
But before that, let's see what exactly TensorFlow eager is.

%% Cell type:markdown id: tags:

Eager execution is an imperative, define-by-run interface where operations are executed immediately as they are called from Python. In other words, eager execution is a feature that makes TensorFlow execute operations immediately. Concrete values are returned instead of a computational graph to be executed later.
As a result, eager execution:
- Allows an imperative coding style, similar to NumPy
- Provides fast debugging with immediate run-time errors and integration with Python tools
- Offers strong support for higher-order gradients
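
To make the contrast concrete, here is a minimal pure-Python sketch, not TensorFlow code. The `Node` class and the `constant`, `add`, and `mul` helpers are hypothetical names invented for this illustration; they only mimic the idea of building a graph first and running it later, next to the eager style where every operation returns a value immediately.

```python
# Deferred ("graph") style: describe the computation first, evaluate it later.
# `Node` is a hypothetical stand-in for a node in a computational graph.
class Node:
    def __init__(self, fn, *parents):
        self.fn = fn
        self.parents = parents

    def run(self):
        # Evaluate the parent nodes first, then apply this node's function.
        return self.fn(*(p.run() for p in self.parents))


def constant(value):
    return Node(lambda: value)


def add(a, b):
    return Node(lambda x, y: x + y, a, b)


def mul(a, b):
    return Node(lambda x, y: x * y, a, b)


# Nothing is computed here; `graph_result` is only a recipe.
graph_result = mul(add(constant(1.0), constant(2.0)), constant(4.0))
deferred_value = graph_result.run()  # the computation happens only now

# Eager (define-by-run) style: each operation returns a concrete value at once.
eager_sum = 1.0 + 2.0
eager_value = eager_sum * 4.0

print(isinstance(graph_result, Node), deferred_value, eager_value)
```

In the deferred style an intermediate result cannot be inspected until the graph is run, which is what makes the eager style easier to debug.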

%% Cell type:code id: tags:

``` python
import tensorflow as tf
import tensorflow.contrib.eager as tfe
```

%% Output

    /home/nitin/anaconda3/envs/deepchem/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
      from ._conv import register_converters as _register_converters

%% Cell type:markdown id: tags:

After importing the necessary modules, we invoke `enable_eager_execution()` at program startup.

%% Cell type:code id: tags:

``` python
tfe.enable_eager_execution()
```

%% Cell type:markdown id: tags:

Enabling eager execution changes how TensorFlow functions behave. Tensor objects return concrete values instead of being symbolic references to nodes in a static computational graph (non-eager mode). As a result, eager execution should be enabled at the beginning of a program.

%% Cell type:markdown id: tags:

Note that with eager execution enabled, TensorFlow operations consume and return multi-dimensional arrays as `Tensor` objects, similar to NumPy `ndarray`s.

%% Cell type:markdown id: tags:

### Dense layer

%% Cell type:code id: tags:

``` python
import deepchem as dc
from deepchem.models.tensorgraph.layers import Dense
```

%% Cell type:code id: tags:

``` python
# Initialize a tensor of shape(2,2)
inputs = tf.constant([[1.0, 2.0], [4.0, 5.0]])
```

%% Cell type:code id: tags:

``` python
# Create a Dense layer; the argument is the number of output values
dense_layer = Dense(3)
```

%% Cell type:markdown id: tags:

`create_tensor()` is not itself a tensor operation; it is a method of `Dense`. It takes a list of tensors as a parameter, along with a boolean `reshape` which, when `True`, tries to reshape the inputs so that they all have the same shape.

%% Cell type:code id: tags:

``` python
x = dense_layer.create_tensor(in_layers = [inputs])
print(x)
```

%% Output

    tf.Tensor(
    [[ 2.4837663  2.5143049 -1.3554332]
     [ 4.81487    8.494966  -4.155171 ]], shape=(2, 3), dtype=float32)

%% Cell type:markdown id: tags:

The function call above performs the same action as the one below, because `create_tensor()` is invoked by the layer's `__call__()` method. This lets us pass the tensor directly as a parameter when constructing a TensorGraph layer. The printed values differ only because each `Dense(3)` instance has its own randomly initialized weights.
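
The delegation from `__call__()` to `create_tensor()` can be sketched with a toy class. `ToyLayer` is a hypothetical example written for this tutorial, not DeepChem's implementation; it uses a fixed scale factor instead of learned weights so that both call styles give identical results.

```python
class ToyLayer:
    """Hypothetical layer that scales every input value by a fixed factor."""

    def __init__(self, scale):
        self.scale = scale

    def create_tensor(self, in_layers):
        # Takes a list of inputs, mirroring the `in_layers` argument.
        (values,) = in_layers
        return [[self.scale * v for v in row] for row in values]

    def __call__(self, *inputs):
        # Calling the layer object forwards its arguments to create_tensor().
        return self.create_tensor(in_layers=list(inputs))


layer = ToyLayer(2.0)
data = [[1.0, 2.0], [4.0, 5.0]]

explicit = layer.create_tensor(in_layers=[data])  # explicit method call
called = layer(data)                              # same thing via __call__()

print(explicit == called)
```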

%% Cell type:code id: tags:

``` python
y = Dense(3)(inputs) # creates a layer that outputs a tensor of shape=(2,3)
print(y)
```

%% Output

    tf.Tensor(
    [[ 0.31004214 -1.9419584  -0.17766535]
     [ 0.36857128 -3.954214    0.10880685]], shape=(2, 3), dtype=float32)

%% Cell type:markdown id: tags:

### Example

%% Cell type:markdown id: tags:

The following code snippet shows the basic structure for creating a TensorGraph layer in TensorFlow eager mode.

%% Cell type:markdown id: tags:

Import all the necessary modules and `enable_eager_execution()`.

```python
import tensorflow as tf
import tensorflow.contrib.eager as tfe
import deepchem as dc
from deepchem.models.tensorgraph.layers import Dense

tfe.enable_eager_execution()

x = [[4.]]
output = Dense(3)(x)
print(output)
```

%% Cell type:markdown id: tags:

### Conv1D layer

%% Cell type:markdown id: tags:

`Dense` is one of the layers defined in DeepChem. There are several others, such as `Conv1D`, `Conv2D`, and `Conv3D`. We construct a `Conv1D` layer below.

%% Cell type:markdown id: tags:

This layer creates a convolution kernel that is convolved with the layer input over a single spatial (or temporal) dimension to produce a tensor of outputs.
When using this layer as the first layer in a model, provide an `input_shape` argument (a tuple of integers or `None`).

%% Cell type:markdown id: tags:

When `input_shape` is passed as a tuple of integers, e.g. (2, 3), it means we are passing a sequence of 2 three-dimensional vectors.
When it is passed as (None, 3), it means we want variable-length sequences of three-dimensional vectors.
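
As a rough illustration of what convolving over a single dimension means, here is a pure-Python sketch. The `conv1d_valid` helper and the kernel values are assumptions made up for this example; the sketch has no bias, padding, or strides and is not how DeepChem or TensorFlow implement the layer.

```python
def conv1d_valid(sequence, kernel):
    """Hypothetical helper: slide a kernel along a sequence of vectors.

    sequence: list of `steps` vectors, each holding `in_channels` values.
    kernel:   nested list indexed as [width][in_channels][out_channels].
    """
    width = len(kernel)
    in_channels = len(kernel[0])
    out_channels = len(kernel[0][0])
    outputs = []
    # Only positions where the kernel fits entirely inside the sequence.
    for start in range(len(sequence) - width + 1):
        step_out = [0.0] * out_channels
        for w in range(width):
            for c in range(in_channels):
                for o in range(out_channels):
                    step_out[o] += sequence[start + w][c] * kernel[w][c][o]
        outputs.append(step_out)
    return outputs


# A sequence of 2 three-dimensional vectors, i.e. an input shape of (2, 3).
short_seq = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
# A longer sequence with the same vector size, as permitted by (None, 3).
long_seq = short_seq + [[3.0, 3.0, 3.0], [1.0, 1.0, 0.0]]

# Kernel of width 2 that maps 3 input channels to 1 output channel.
kernel = [[[1.0], [0.0], [1.0]], [[0.5], [0.5], [0.0]]]

print(conv1d_valid(short_seq, kernel))
print(len(conv1d_valid(long_seq, kernel)))
```

The same kernel works on both sequences because it only depends on the vector size, not on the sequence length.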

%% Cell type:code id: tags:

``` python
from deepchem.models.tensorgraph.layers import Conv1D
```

%% Cell type:code id: tags:

``` python
conv_layer = Conv1D(2, 1)
z = conv_layer.create_tensor(in_layers = [inputs])
print(z)
```

%% Output

    tf.Tensor(
    [[[-1.3609632  -0.55372864]
      [-2.7219265  -1.1074573 ]]
    
     [[-5.443853   -2.2149146 ]
      [-6.8048162  -2.7686431 ]]], shape=(2, 2, 2), dtype=float32)

%% Cell type:markdown id: tags:

### Gradients

%% Cell type:markdown id: tags:

Finding gradients in eager mode is very similar to using the `autograd` API. The computational flow is clean and logical.
Because different operations can occur during each call, all forward operations are recorded to a tape, which is then played backwards when computing gradients. After the gradients have been computed, the tape is discarded.
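
The record-then-replay idea can be sketched in a few lines of plain Python. The `Var` and `Tape` classes below are hypothetical and only handle scalar addition and multiplication; they are meant to show the mechanism, not TensorFlow's actual tape.

```python
class Var:
    """Hypothetical scalar value tracked by the tape."""

    def __init__(self, value):
        self.value = value


class Tape:
    """Hypothetical minimal tape: records (output, inputs, local gradients)."""

    def __init__(self):
        self.records = []

    def mul(self, a, b):
        out = Var(a.value * b.value)
        # d(out)/da = b and d(out)/db = a
        self.records.append((out, (a, b), (b.value, a.value)))
        return out

    def add(self, a, b):
        out = Var(a.value + b.value)
        self.records.append((out, (a, b), (1.0, 1.0)))
        return out

    def gradient(self, target, source):
        grads = {id(target): 1.0}
        # Play the tape backwards, applying the chain rule at each record.
        for out, inputs, local_grads in reversed(self.records):
            upstream = grads.get(id(out), 0.0)
            for inp, local in zip(inputs, local_grads):
                grads[id(inp)] = grads.get(id(inp), 0.0) + upstream * local
        self.records = []  # the tape is discarded after use
        return grads.get(id(source))


tape = Tape()
x = Var(3.0)
y = tape.add(tape.mul(x, x), x)  # forward pass: y = x*x + x
dy_dx = tape.gradient(y, x)      # backward pass: 2*x + 1 at x = 3.0
print(y.value, dy_dx)
```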

%% Cell type:code id: tags:

``` python
def dense_squared(x):
  return Dense(1)(Dense(1)(inputs))

grad = tfe.gradients_function(dense_squared)

print(dense_squared(3.0))
print(grad(3.0))
```

%% Output

    tf.Tensor(
    [[-0.0588641 ]
     [-0.19230397]], shape=(2, 1), dtype=float32)
    [None]

%% Cell type:markdown id: tags:

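In the above example, the `gradients_function` call takes a Python function `dense_squared()` as an argument and returns a Python callable that computes the partial derivatives of `dense_squared()` with respect to its inputs. Note that `dense_squared()` as written uses the global `inputs` rather than its argument `x`, so the result does not depend on `x`; this is why the output has shape (2, 1) and the gradient is reported as `[None]`.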
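
The "function in, gradient function out" pattern can be imitated with finite differences. `numeric_gradients_function` below is a hypothetical stand-in written for illustration: it approximates derivatives numerically for scalar arguments, whereas the eager API differentiates the recorded operations. A function that never reads its argument gets a zero derivative here, where eager mode reports `None`.

```python
def numeric_gradients_function(f, eps=1e-6):
    """Hypothetical stand-in: returns a callable giving df/d(arg) per argument."""

    def grad_fn(*args):
        grads = []
        for i, value in enumerate(args):
            plus = list(args)
            minus = list(args)
            plus[i] = value + eps
            minus[i] = value - eps
            # Central difference approximation of the partial derivative.
            grads.append((f(*plus) - f(*minus)) / (2 * eps))
        return grads

    return grad_fn


def square(x):
    return x * x


def ignores_argument(x):
    # The body never reads `x`, so the output cannot depend on it.
    return 9.0


square_grads = numeric_gradients_function(square)(3.0)
ignored_grads = numeric_gradients_function(ignores_argument)(3.0)
print(square_grads, ignored_grads)
```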