Commit 04634cad authored by Atreya Majumdar's avatar Atreya Majumdar

Updated Docs

parent 48348574
+23 −13
@@ -9,11 +9,21 @@ except:
class ScaleNorm(nn.Module):
  """Apply Scale Normalization to input.

  All scale values are initialized to sqrt(d), where d is the model dimension.
  The ScaleNorm layer stores the square root of the scale as a learnable parameter and
  computes the L2 norm of the input tensor along its last dimension.
  The norm value is calculated as `sqrt(scale) / (input norm + eps)`,
  and the result is returned as `input_tensor * norm value`.

  References
  ----------
  .. [1] Lukasz Maziarka et al. "Molecule Attention Transformer" Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264
  
  Examples
  --------
  >>> import torch
  >>> import deepchem as dc
  >>> scale = 0.35
  >>> layer = dc.models.torch_models.ScaleNorm(scale)
  >>> input_tensor = torch.tensor([[1.269, 39.36], [0.00918, -9.12]])
  >>> output_tensor = layer(input_tensor)
  """

  def __init__(self, scale, eps=1e-5):