Skip to content
Unverified Commit 366c39e1 authored by Niklas Hölter's avatar Niklas Hölter Committed by GitHub
Browse files

Fix: Tokenizer is not able to encode triple bonds

Hi everyone,

i again found one minor bug in deepchems SMILES tokenizer. While tokenizing my dataset, i observed that the triple bond token ´#´ was not tokenized and instead simply leaved out by the SMILESTokenizer aswell as by the BasicSMILESTokenizer. I think this error occured because of the regex pattern, where a linebreak is placed directly in front of the ´#´. Removing the linebreak fixed it for me in a local copy of deepchem.
parent eab6ee63
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment