Utils

MordinezNLP.utils.ngram_iterator.ngram_iterator(string: str, ngram_len: int = 3) → list

Returns an iterator that yields the ngrams of the given string. Each yielded element has a length of ngram_len, and each successive element starts one character further along the string. For example, for the string “hello” with ngram_len set to 3, it yields “hel”, “ell”, “llo”.

Parameters
  • string (str) – string to iterate on

  • ngram_len (int) – length of each ngram

Returns

ngram - successive substrings of the input string, each ngram_len characters long

Return type

list

Example usage:

from MordinezNLP.utils import ngram_iterator

print(list(ngram_iterator("<hello>", 3))) # <- will print ['<he', 'hel', 'ell', 'llo', 'lo>']
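
The sliding-window behaviour described above can be sketched in plain Python. This is only an illustration, not the MordinezNLP implementation: the name sliding_ngrams is hypothetical, and returning nothing for strings shorter than ngram_len is an assumption that the library may not share.

```python
from typing import Iterator


def sliding_ngrams(text: str, ngram_len: int = 3) -> Iterator[str]:
    """Yield every contiguous substring of length ngram_len, left to right.

    Hypothetical stand-in for illustration; not the library's own code.
    """
    # The last valid start index is len(text) - ngram_len.
    # Assumption: if the text is shorter than ngram_len, nothing is yielded.
    for start in range(len(text) - ngram_len + 1):
        yield text[start:start + ngram_len]


print(list(sliding_ngrams("<hello>", 3)))  # ['<he', 'hel', 'ell', 'llo', 'lo>']
```

Wrapping the call in list() materialises the generator, which mirrors how the example above consumes ngram_iterator.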

MordinezNLP.utils.random_string.random_string(length: int = 64, choices_list: List[str] = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789') → str

Generates a random string composed of characters drawn from the choices_list argument.

Parameters
  • length (int) – length of generated string

  • choices_list (List[str]) – characters from which the random string is generated

Returns

Randomly generated string

Return type

str

Example usage:

from MordinezNLP.utils import random_string
import string

rs = random_string(32)
print(rs)

rs = random_string(10, string.digits)
print(rs)
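
For readers who want to see how such a helper might work, here is a minimal sketch built on the standard library's random.choices. The name random_string_sketch is hypothetical, and the use of random.choices is an assumption; the library may generate its strings differently. The default alphabet matches the one shown in the signature above.

```python
import random
import string
from typing import Sequence

# Same default alphabet as the documented signature: A-Z followed by 0-9.
DEFAULT_CHOICES = string.ascii_uppercase + string.digits


def random_string_sketch(length: int = 64,
                         choices: Sequence[str] = DEFAULT_CHOICES) -> str:
    """Build a string of `length` characters sampled with replacement.

    Hypothetical stand-in for illustration; not the library's own code.
    """
    # random.choices samples k items with replacement from the population.
    return "".join(random.choices(choices, k=length))


print(random_string_sketch(32))
print(random_string_sketch(10, string.digits))
```

The random module is not suitable for security-sensitive values such as tokens or passwords; the standard library's secrets module is the usual choice for those.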