Clarified wording in vignettes
Fixed computation issue for collocations with a length other than 5
Fixed merge issue with special characters
Replaced non-ASCII characters in datasets