Commit 813b5e12 authored by micimize

make tokenizer code blocks non-doctests

parent 753d1ec0
+4 −4
@@ -24,13 +24,13 @@ SmilesTokenizer

The :code:`dc.feat.SmilesTokenizer` module inherits from the BertTokenizer class in transformers. It runs a WordPiece tokenization algorithm over SMILES strings using the SMILES tokenization regex developed by Schwaller et al.

The SmilesTokenizer employs an atom-wise tokenization strategy using the following regular expression:
The SmilesTokenizer employs an atom-wise tokenization strategy using the following regular expression: ::

>>> SMI_REGEX_PATTERN = "(\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\\\|\/|:|~|@|\?|>|\*|\$|\%[0-9]{2}|[0-9])"
    SMI_REGEX_PATTERN = "(\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\\\|\/|:|~|@|\?|>|\*|\$|\%[0-9]{2}|[0-9])"

To use, please install the transformers package using the following pip command:
To use, please install the transformers package using the following pip command: ::

>>> pip install transformers
    pip install transformers

References: