pycommons.strings package

Common string handling routines.

Submodules

pycommons.strings.chars module

Constants for common characters.

pycommons.strings.chars.NBDASH: Final[str] = '‑': A non-breaking hyphen

pycommons.strings.chars.NBSP: Final[str] = '\xa0': A constant for non-breaking space

pycommons.strings.chars.NEWLINE: Final[str] = '\n\r\x85\u2028\u2029': A string of all characters that are line breaks.

pycommons.strings.chars.WHITESPACE: Final[str] = '\t\x0b\x0c \xa0\u1680\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u202f\u205f\u3000': A string of all characters that are non-line-breaking white space.

pycommons.strings.chars.WHITESPACE_OR_NEWLINE: Final[str] = '\t\n\x0b\x0c\r \x85\xa0\u1680\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u2028\u2029\u202f\u205f\u3000': A string of all white space and newline characters.
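
Since the three white space constants are plain strings of characters, they can be used with str.strip() or with the in operator. The following sketch copies the values from the listing above instead of importing pycommons; the helper is_blank is hypothetical and only serves as illustration.

```python
# Values copied from the listing above; nothing is imported from pycommons.
NEWLINE = "\n\r\x85\u2028\u2029"
WHITESPACE = ("\t\x0b\x0c \xa0\u1680\u2000\u2001\u2002\u2003\u2004\u2005"
              "\u2006\u2007\u2008\u2009\u200a\u202f\u205f\u3000")
WHITESPACE_OR_NEWLINE = ("\t\n\x0b\x0c\r \x85\xa0\u1680\u2000\u2001\u2002"
                         "\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a"
                         "\u2028\u2029\u202f\u205f\u3000")


def is_blank(text: str) -> bool:
    """Hypothetical helper: True if text has only white space or newlines."""
    return all(ch in WHITESPACE_OR_NEWLINE for ch in text)


# The combined constant is the sorted union of the two smaller ones.
union = "".join(sorted(set(WHITESPACE + NEWLINE)))
print(union == WHITESPACE_OR_NEWLINE)

# Strip non-line-breaking white space; stripping stops at the line break.
print(repr("\xa0 value\t\n".strip(WHITESPACE)))
print(is_blank(" \u2028\t"), is_blank(" x "))
```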

pycommons.strings.chars.subscript(s)

Transform a string into Unicode-based subscript.

All characters that can be represented as subscript in Unicode are translated to subscript. Notice that only a subset of the Latin characters can be converted to Unicode subscript. If any character cannot be translated, a KeyError is raised. White space is preserved.

Parameters:

s (str) – the string

Return type:

str

Returns:

the string in subscript

Raises:

KeyError – if a character cannot be converted
TypeError – if s is not a string

>>> subscript("a0= 4(e)")
'ₐ₀₌ ₄₍ₑ₎'

>>> try:
...     subscript("a0=4(e)Y")
... except KeyError as ke:
...     print(ke)
'Y'

>>> try:
...     subscript(None)
... except TypeError as te:
...     print(te)
descriptor '__iter__' requires a 'str' object but received a 'NoneType'

>>> try:
...     subscript(1)
... except TypeError as te:
...     print(te)
descriptor '__iter__' requires a 'str' object but received a 'int'
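
The behavior described above (per-character translation, KeyError for unmapped characters, white space preserved) can be sketched with a lookup table. This is a minimal sketch, not the pycommons implementation: the table _SUBSCRIPT_SKETCH is hypothetical and deliberately incomplete, covering only the digits, a few operators, and the letters a and e.

```python
# Hypothetical, incomplete table: digits, some operators, 'a' and 'e'.
_SUBSCRIPT_SKETCH = {str(d): chr(0x2080 + d) for d in range(10)}
_SUBSCRIPT_SKETCH.update({
    "+": "\u208a", "-": "\u208b", "=": "\u208c",
    "(": "\u208d", ")": "\u208e", "a": "\u2090", "e": "\u2091",
})


def subscript_sketch(s: str) -> str:
    """Map each character via the table; white space passes through."""
    return "".join(
        ch if ch.isspace() else _SUBSCRIPT_SKETCH[ch]  # KeyError if unmapped
        for ch in str.__iter__(s))  # TypeError if s is not a str


print(subscript_sketch("a0= 4(e)"))
```

Calling str.__iter__(s) instead of iter(s) is one way to obtain a TypeError for any non-string argument, including other iterables such as lists.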

pycommons.strings.chars.superscript(s)

Transform a string into Unicode-based superscript.

All characters that can be represented as superscript in Unicode are translated to superscript. Notice that only a subset of the Latin characters can be converted to Unicode superscript. If any character cannot be translated, a KeyError is raised. White space is preserved.

Parameters:

s (str) – the string

Return type:

str

Returns:

the string in superscript

Raises:

KeyError – if a character cannot be converted
TypeError – if s is not a string

>>> superscript("a0 =4(e)")
'ᵃ⁰ ⁼⁴⁽ᵉ⁾'

>>> try:
...     superscript("a0=4(e)Y")
... except KeyError as ke:
...     print(ke)
'Y'

>>> try:
...     superscript(None)
... except TypeError as te:
...     print(te)
descriptor '__iter__' requires a 'str' object but received a 'NoneType'

>>> try:
...     superscript(1)
... except TypeError as te:
...     print(te)
descriptor '__iter__' requires a 'str' object but received a 'int'

pycommons.strings.enforce module

Routines for checking whether a value is a non-empty string, with or without white space.

pycommons.strings.enforce.enforce_non_empty_str(value)

Enforce that a text is a non-empty string.

Parameters:

value (Any) – the value to be checked whether it is a non-empty string

Return type:

str

Returns:

the value, but as type str

Raises:

TypeError – if value is not a str
ValueError – if value is empty

>>> enforce_non_empty_str("1")
'1'
>>> enforce_non_empty_str(" 1 1 ")
' 1 1 '

>>> try:
...     enforce_non_empty_str("")
... except ValueError as ve:
...     print(ve)
Non-empty str expected, but got empty string.

>>> try:
...     enforce_non_empty_str(1)
... except TypeError as te:
...     print(te)
descriptor '__len__' requires a 'str' object but received a 'int'

>>> try:
...     enforce_non_empty_str(None)
... except TypeError as te:
...     print(te)
descriptor '__len__' requires a 'str' object but received a 'NoneType'

pycommons.strings.enforce.enforce_non_empty_str_without_ws(value)

Enforce that a text is a non-empty string without white space.

Parameters:

value (Any) – the value to be checked whether it is a non-empty string without any white space

Return type:

str

Returns:

the value, but as type str

Raises:

TypeError – if value is not a str
ValueError – if value is empty or contains any white space characters

>>> enforce_non_empty_str_without_ws("1")
'1'

>>> try:
...     enforce_non_empty_str_without_ws(" 1 1 ")
... except ValueError as ve:
...     print(ve)
No white space allowed in string, but got ' 1 1 '.

>>> try:
...     enforce_non_empty_str_without_ws("a\tb")
... except ValueError as ve:
...     print(ve)
No white space allowed in string, but got 'a\tb'.

>>> try:
...     enforce_non_empty_str_without_ws("012345678901234567890 12345678")
... except ValueError as ve:
...     print(ve)
No white space allowed in string, but got '012345678901234567890 12345678'.

>>> try:
...     enforce_non_empty_str_without_ws(
...         "012345678901234567890 1234567801234567890123456789012345678")
... except ValueError as ve:
...     print(str(ve)[10:])
pace allowed in string, but got '012345678901234567890 12345678...'.

>>> try:
...     enforce_non_empty_str_without_ws("")
... except ValueError as ve:
...     print(ve)
Non-empty str expected, but got empty string.

>>> try:
...     enforce_non_empty_str_without_ws(1)
... except TypeError as te:
...     print(te)
descriptor '__len__' requires a 'str' object but received a 'int'

>>> try:
...     enforce_non_empty_str_without_ws(None)
... except TypeError as te:
...     print(te)
descriptor '__len__' requires a 'str' object but received a 'NoneType'
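
The checks documented above can be sketched as follows. This is a minimal sketch under stated assumptions, not the library code: the name enforce_no_ws_sketch is hypothetical, white space is detected with str.isspace(), and the sketch does not shorten long strings in the error message the way the examples above do.

```python
def enforce_no_ws_sketch(value):
    """Return value if it is a non-empty str without white space."""
    if str.__len__(value) == 0:  # TypeError if value is not a str
        raise ValueError("Non-empty str expected, but got empty string.")
    if any(map(str.isspace, value)):
        raise ValueError(
            f"No white space allowed in string, but got {value!r}.")
    return value


print(enforce_no_ws_sketch("1"))
```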

pycommons.strings.string_conv module

Converting values to and from strings.

pycommons.strings.string_conv.bool_or_num_to_str(value)

Convert a bool or number to string.

Parameters: value (int | float | bool) – the number or bool
Return type: str
Returns: the string
Raises: TypeError – if value is neither a bool, a float, nor an int.

>>> bool_or_num_to_str(True)
'T'
>>> bool_or_num_to_str(False)
'F'
>>> bool_or_num_to_str(12.0)
'12'
>>> bool_or_num_to_str(12)
'12'
>>> bool_or_num_to_str(12.5)
'12.5'
>>> try:
...     bool_or_num_to_str("x")
... except TypeError as te:
...     print(te)
value should be an instance of float but is str, namely 'x'.

pycommons.strings.string_conv.bool_to_str(value)

Convert a Boolean value to a string.

This function is the inverse of str_to_bool().

Parameters: value (bool) – the Boolean value
Returns: “T” if value == True, “F” if value == False
Raises: TypeError – if value is not a bool
Return type: str

>>> print(bool_to_str(True))
T
>>> print(bool_to_str(False))
F

>>> try:
...     bool_to_str("t")
... except TypeError as te:
...     print(te)
value should be an instance of bool but is str, namely 't'.

>>> try:
...     bool_to_str(None)
... except TypeError as te:
...     print(te)
value should be an instance of bool but is None.

pycommons.strings.string_conv.datetime_to_date_str(date)

Convert a datetime object to a date string.

Parameters: date (datetime) – the date
Return type: str
Returns: the date string
Raises: TypeError – if date is not an instance of datetime.datetime.

>>> datetime_to_date_str(datetime(1999, 12, 21))
'1999‑12‑21'
>>> try:
...     datetime_to_date_str(None)
... except TypeError as te:
...     print(te)
date should be an instance of datetime.datetime but is None.
>>> try:
...     datetime_to_date_str(1)
... except TypeError as te:
...     print(te)
date should be an instance of datetime.datetime but is int, namely 1.

pycommons.strings.string_conv.datetime_to_datetime_str(dateandtime)

Convert a datetime object to a date-time string.

Parameters: dateandtime (datetime) – the date and time
Return type: str
Returns: the date-time string
Raises: TypeError – if dateandtime is not an instance of datetime.datetime.

>>> datetime_to_datetime_str(datetime(1999, 12, 21, 13, 42, 23))
'1999\u201112\u201121\xa013:42'
>>> from datetime import timezone
>>> datetime_to_datetime_str(datetime(1999, 12, 21, 13, 42,
...                                   tzinfo=timezone.utc))
'1999\u201112\u201121\xa013:42\xa0UTC'
>>> try:
...     datetime_to_datetime_str(None)
... except TypeError as te:
...     print(te)
dateandtime should be an instance of datetime.datetime but is None.
>>> try:
...     datetime_to_datetime_str(1)
... except TypeError as te:
...     print(str(te)[:60])
dateandtime should be an instance of datetime.datetime but i
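
The outputs above use the non-breaking hyphen (U+2011) and non-breaking space (U+00A0) as separators and omit the seconds. The following sketch reproduces the two documented results, assuming a strftime-based approach; the function name datetime_str_sketch is hypothetical, and the actual implementation may treat time zones differently (for example, zone names that themselves contain a hyphen).

```python
from datetime import datetime, timezone

NBDASH = "\u2011"  # non-breaking hyphen, same value as chars.NBDASH
NBSP = "\xa0"      # non-breaking space, same value as chars.NBSP


def datetime_str_sketch(dateandtime: datetime) -> str:
    """Format as date, time, and optional zone with non-breaking separators."""
    if not isinstance(dateandtime, datetime):
        raise TypeError(
            "dateandtime should be an instance of datetime.datetime")
    # %Z is empty for naive datetimes, so strip the trailing space.
    text = dateandtime.strftime("%Y-%m-%d %H:%M %Z").strip()
    return text.replace("-", NBDASH).replace(" ", NBSP)


print(repr(datetime_str_sketch(datetime(1999, 12, 21, 13, 42, 23))))
print(repr(datetime_str_sketch(
    datetime(1999, 12, 21, 13, 42, tzinfo=timezone.utc))))
```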

pycommons.strings.string_conv.float_to_str(value)

Convert float to a string.

The floating point value value is converted to a string.

Parameters:

value (float) – the floating point value

Return type:

str

Returns:

the string representation

Raises:

TypeError – if value is not a float
ValueError – if value is not-a-number (nan)

>>> float_to_str(1.3)
'1.3'
>>> float_to_str(1.0)
'1'
>>> float_to_str(1e-5)
'1e-5'

>>> try:
...     float_to_str(1)
... except TypeError as te:
...     print(te)
value should be an instance of float but is int, namely 1.

>>> try:
...     float_to_str(None)
... except TypeError as te:
...     print(te)
value should be an instance of float but is None.

>>> from math import nan
>>> try:
...     float_to_str(nan)
... except ValueError as ve:
...     print(ve)
nan => 'nan' is not a permitted float.

>>> from math import inf
>>> float_to_str(inf)
'inf'
>>> float_to_str(-inf)
'-inf'
>>> float_to_str(1e300)
'1e300'
>>> float_to_str(-1e300)
'-1e300'
>>> float_to_str(-1e-300)
'-1e-300'
>>> float_to_str(1e-300)
'1e-300'
>>> float_to_str(1e1)
'10'
>>> float_to_str(1e5)
'100000'
>>> float_to_str(1e10)
'10000000000'
>>> float_to_str(1e20)
'1e20'
>>> float_to_str(1e030)
'1e30'
>>> float_to_str(0.0)
'0'
>>> float_to_str(-0.0)
'0'
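
The examples above suggest that integral floats lose their trailing .0, that exponents are written without a plus sign or zero padding, and that negative zero becomes '0'. A minimal sketch that reproduces these documented outputs, assuming Python's repr() as the starting point (the name float_to_str_sketch is hypothetical and the real implementation may work differently):

```python
from math import isnan


def float_to_str_sketch(value: float) -> str:
    """Compact text form of a float, following the examples above."""
    if not isinstance(value, float):
        raise TypeError(
            f"value should be an instance of float but is {type(value)}.")
    if isnan(value):
        raise ValueError(f"{value!r} is not a permitted float.")
    if value == 0.0:  # also true for -0.0
        return "0"
    text = repr(value)
    if "e" in text:  # scientific notation such as '1e-05' or '1e+20'
        mantissa, exponent = text.split("e")
        return f"{mantissa}e{int(exponent)}"
    return text[:-2] if text.endswith(".0") else text


for number in (1.3, 1.0, 1e-5, 1e10, 1e20, -0.0):
    print(float_to_str_sketch(number))
```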

pycommons.strings.string_conv.int_or_none_to_str(value)

Convert an integer or None to a string.

If value is None, “” is returned. If value is an instance of bool, a TypeError is raised. If value is an int, str(value) is returned. Otherwise, a TypeError is raised.

Parameters: value (int | None) – the value
Return type: str
Returns: the string representation: “” if value is None, int.__str__(value) otherwise
Raises: TypeError – if value is a bool (notice that bool is a subclass of int) or any other non-int type.

>>> print(repr(int_or_none_to_str(None)))
''
>>> print(int_or_none_to_str(12))
12

>>> try:
...     int_or_none_to_str(True)
... except TypeError as te:
...     print(te)
value should be an instance of int but is bool, namely True.

>>> try:
...     int_or_none_to_str(False)
... except TypeError as te:
...     print(te)
value should be an instance of int but is bool, namely False.

>>> print(int_or_none_to_str(-10))
-10

>>> try:
...     int_or_none_to_str(1.0)
... except TypeError as te:
...     print(te)
value should be an instance of int but is float, namely 1.0.
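
Because bool is a subclass of int, isinstance(True, int) is True, so a plain isinstance check would accept Boolean values. The sketch below shows one way to reject them explicitly; the name int_or_none_to_str_sketch is hypothetical and the error text is only modeled on the examples above.

```python
def int_or_none_to_str_sketch(value) -> str:
    """Return '' for None and the decimal text for a true int."""
    if value is None:
        return ""
    # bool passes isinstance(value, int), so it is excluded first.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(
            "value should be an instance of int but is "
            f"{type(value).__name__}, namely {value!r}.")
    return int.__str__(value)


print(isinstance(True, int))
print(repr(int_or_none_to_str_sketch(None)))
print(int_or_none_to_str_sketch(-10))
```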

pycommons.strings.string_conv.num_or_none_to_str(value)

Convert a numerical type (int, float) or None to a string.

If value is None, then “” is returned. Otherwise, the result of num_to_str() is returned.

Parameters:

value (int | float | None) – the value

Return type:

str

Returns:

the string representation: “” if value is None, num_to_str(value) otherwise

Raises:

TypeError – if value is not None but is a bool (notice that bool is a subclass of int) or any other type that is neither int nor float.
ValueError – if value is not-a-number

>>> print(repr(num_or_none_to_str(None)))
''
>>> print(num_or_none_to_str(12))
12
>>> print(num_or_none_to_str(12.3))
12.3

>>> try:
...     num_or_none_to_str(True)
... except TypeError as te:
...     print(te)
value should be an instance of any in {float, int} but is bool, namely True.

>>> try:
...     num_or_none_to_str(False)
... except TypeError as te:
...     print(te)
value should be an instance of any in {float, int} but is bool, namely False.

>>> from math import nan
>>> try:
...     num_or_none_to_str(nan)
... except ValueError as ve:
...     print(ve)
nan => 'nan' is not a permitted float.

pycommons.strings.string_conv.num_to_str(value)

Transform a numerical value which is either int or float to a string.

If value is an instance of bool, a TypeError will be raised. If value is an instance of int, the result of its conversion via str will be returned. Otherwise, the result of float_to_str() is returned. This means that nan will yield a ValueError and anything that is neither an int nor a float will incur a TypeError.

Parameters:

value (int | float) – the value

Return type:

str

Returns:

the string

Raises:

TypeError – if value is a bool (notice that bool is a subclass of int) or any other type that is neither int nor float.
ValueError – if value is not-a-number

>>> num_to_str(1)
'1'
>>> num_to_str(1.5)
'1.5'

>>> try:
...     num_to_str(True)
... except TypeError as te:
...     print(te)
value should be an instance of any in {float, int} but is bool, namely True.

>>> try:
...     num_to_str(False)
... except TypeError as te:
...     print(te)
value should be an instance of any in {float, int} but is bool, namely False.

>>> try:
...     num_to_str("x")
... except TypeError as te:
...     print(te)
value should be an instance of float but is str, namely 'x'.

>>> try:
...     num_to_str(None)
... except TypeError as te:
...     print(te)
value should be an instance of float but is None.

>>> from math import inf, nan
>>> try:
...     num_to_str(nan)
... except ValueError as ve:
...     print(ve)
nan => 'nan' is not a permitted float.

>>> num_to_str(inf)
'inf'
>>> num_to_str(-inf)
'-inf'

pycommons.strings.string_conv.str_to_bool(value)

Convert a string to a boolean value.

This function is the inverse of bool_to_str().

Parameters:

value (str) – the string value

Returns:

True if value == “T”, False if value == “F”

Raises:

TypeError – if value is not a string
ValueError – if value is neither T nor F

Return type:

bool

>>> str_to_bool("T")
True
>>> str_to_bool("F")
False

>>> try:
...     str_to_bool("x")
... except ValueError as v:
...     print(v)
Expected 'T' or 'F', but got 'x'.

>>> try:
...     str_to_bool(1)
... except TypeError as te:
...     print(te)
descriptor '__len__' requires a 'str' object but received a 'int'

>>> try:
...     str_to_bool(None)
... except TypeError as te:
...     print(te)
descriptor '__len__' requires a 'str' object but received a 'NoneType'
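
Since str_to_bool() and bool_to_str() are documented as inverses of each other, a round trip should return the original value. A minimal sketch of both directions, with hypothetical function names and error texts modeled on the examples above:

```python
def bool_to_str_sketch(value: bool) -> str:
    """Encode True as 'T' and False as 'F'."""
    if not isinstance(value, bool):
        raise TypeError("value should be an instance of bool.")
    return "T" if value else "F"


def str_to_bool_sketch(value: str) -> bool:
    """Decode 'T' to True and 'F' to False."""
    if str.__len__(value) == 1:  # TypeError if value is not a str
        if value == "T":
            return True
        if value == "F":
            return False
    raise ValueError(f"Expected 'T' or 'F', but got {value!r}.")


# Round trip in both directions.
print(all(str_to_bool_sketch(bool_to_str_sketch(b)) is b
          for b in (True, False)))
print(all(bool_to_str_sketch(str_to_bool_sketch(s)) == s
          for s in ("T", "F")))
```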

pycommons.strings.string_conv.str_to_int_or_none(value)

Convert a string to an int or None.

If value is None, then None is returned. If value is empty or entirely composed of white space, None is returned. If value can be converted to an integer, then an int with the corresponding value is returned. Otherwise, a ValueError is raised.

Parameters:

value (str | None) – the string value, or None

Return type:

int | None

Returns:

the int or None

Raises:

TypeError – if value is neither a str nor None
ValueError – if value is a str but cannot be base-10 converted to an integer

>>> print(str_to_int_or_none(""))
None
>>> print(str_to_int_or_none("5"))
5
>>> print(str_to_int_or_none(None))
None
>>> print(str_to_int_or_none("  "))
None
>>> try:
...     print(str_to_int_or_none(1.3))
... except TypeError as te:
...     print(te)
value should be an instance of str but is float, namely 1.3.
>>> try:
...     print(str_to_int_or_none("1.3"))
... except ValueError as ve:
...     print(ve)
invalid literal for int() with base 10: '1.3'

pycommons.strings.string_conv.str_to_num(value)

Convert a string to an int or float.

If value is not an instance of str, a TypeError will be raised. If value can be converted to an integer, then an int with the corresponding value is returned. If value can be converted to a float, a float with the appropriate value is returned. Otherwise, a ValueError is raised.

Parameters:

value (str) – the string value

Return type:

int | float

Returns:

the int or float; integers are preferred wherever possible

Raises:

TypeError – if value is not a str
ValueError – if value is a str but cannot be converted to an integer (base-10) or converts to a float which is not a number

>>> print(type(str_to_num("15.0")))
<class 'int'>
>>> print(type(str_to_num("15.1")))
<class 'float'>
>>> str_to_num("inf")
inf
>>> str_to_num("  -inf  ")
-inf
>>> try:
...     str_to_num(21)
... except TypeError as te:
...     print(te)
descriptor 'strip' for 'str' objects doesn't apply to a 'int' object
>>> try:
...     str_to_num("nan")
... except ValueError as ve:
...     print(ve)
NaN is not permitted, but got 'nan'.
>>> try:
...     str_to_num("12-3")
... except ValueError as ve:
...     print(ve)
Invalid numerical value '12-3'.
>>> str_to_num("1e34423")
inf
>>> str_to_num("-1e34423")
-inf
>>> str_to_num("-1e-34423")
0
>>> str_to_num("1e-34423")
0
>>> try:
...     str_to_num("-1e-34e4423")
... except ValueError as ve:
...     print(ve)
Invalid numerical value '-1e-34e4423'.
>>> try:
...     str_to_num("T")
... except ValueError as ve:
...     print(ve)
Invalid numerical value 'T'.
>>> try:
...     str_to_num("F")
... except ValueError as ve:
...     print(ve)
Invalid numerical value 'F'.
>>> try:
...     str_to_num(None)
... except TypeError as te:
...     print(te)
descriptor 'strip' for 'str' objects doesn't apply to a 'NoneType' object
>>> try:
...     str_to_num("")
... except ValueError as ve:
...     print(ve)
Value '' becomes empty after stripping, cannot be converted to a number.
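
The conversion order described above (strip, try int, then float, reject nan, prefer int for integral results) can be sketched as follows. This is a sketch under assumptions, not the library code: the name str_to_num_sketch is hypothetical, and the rule "a finite float with an integral value is returned as int" is inferred from the examples with "15.0" and "1e-34423".

```python
from math import isfinite, isnan
from typing import Union


def str_to_num_sketch(value: str) -> Union[int, float]:
    """Convert text to int where possible, otherwise to float."""
    text = str.strip(value)  # TypeError if value is not a str
    if not text:
        raise ValueError(f"Value {value!r} becomes empty after stripping.")
    try:
        return int(text)
    except ValueError:
        pass  # not an integer literal, try float next
    try:
        number = float(text)
    except ValueError:
        raise ValueError(f"Invalid numerical value {value!r}.") from None
    if isnan(number):
        raise ValueError(f"NaN is not permitted, but got {value!r}.")
    if isfinite(number) and number.is_integer():
        return int(number)  # e.g. "15.0" or an underflow to zero
    return number


print(type(str_to_num_sketch("15.0")), type(str_to_num_sketch("15.1")))
```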

pycommons.strings.string_conv.str_to_num_or_none(value)

Convert a string to an int or float or None.

If value is None, then None is returned. If value is empty or entirely composed of white space, None is returned. If value can be converted to an integer, then an int with the corresponding value is returned. If value can be converted to a float, a float with the appropriate value is returned. Otherwise, a ValueError is raised.

Parameters:

value (str | None) – the string value

Return type:

int | float | None

Returns:

the int or float or None

Raises:

TypeError – if value is neither a str nor None
ValueError – if value is a str but cannot be converted to an integer (base-10) or converts to a float which is not a number

>>> print(type(str_to_num_or_none("15.0")))
<class 'int'>
>>> print(type(str_to_num_or_none("15.1")))
<class 'float'>
>>> str_to_num_or_none("inf")
inf
>>> str_to_num_or_none("  -inf  ")
-inf
>>> try:
...     str_to_num_or_none(21)
... except TypeError as te:
...     print(te)
descriptor 'strip' for 'str' objects doesn't apply to a 'int' object
>>> try:
...     str_to_num_or_none("nan")
... except ValueError as ve:
...     print(ve)
NaN is not permitted, but got 'nan'.
>>> try:
...     str_to_num_or_none("12-3")
... except ValueError as ve:
...     print(ve)
Invalid numerical value '12-3'.
>>> str_to_num_or_none("1e34423")
inf
>>> str_to_num_or_none("-1e34423")
-inf
>>> str_to_num_or_none("-1e-34423")
0
>>> str_to_num_or_none("1e-34423")
0
>>> try:
...     str_to_num_or_none("-1e-34e4423")
... except ValueError as ve:
...     print(ve)
Invalid numerical value '-1e-34e4423'.
>>> try:
...     str_to_num_or_none("T")
... except ValueError as ve:
...     print(ve)
Invalid numerical value 'T'.
>>> try:
...     str_to_num_or_none("F")
... except ValueError as ve:
...     print(ve)
Invalid numerical value 'F'.
>>> print(str_to_num_or_none(""))
None
>>> print(str_to_num_or_none(None))
None
>>> print(type(str_to_num_or_none("5.0")))
<class 'int'>
>>> print(type(str_to_num_or_none("5.1")))
<class 'float'>

pycommons.strings.tools module

Routines for handling strings.

pycommons.strings.tools.get_prefix_str(strings)

Compute the common prefix of an iterable of strings.

Parameters: strings (Union[str, Iterable[str]]) – the iterable of strings
Return type: str
Returns: the common prefix
Raises: TypeError – if the input is not a string or an iterable of strings, or if it contains a non-string element that is reached before the prefix is determined

Notice: If the prefix is determined to be the empty string, the search stops. Non-str items that follow later in strings may then not raise a TypeError.

>>> get_prefix_str(["abc", "acd"])
'a'
>>> get_prefix_str(["xyz", "gsdf"])
''
>>> get_prefix_str([])
''
>>> get_prefix_str(["abx"])
'abx'
>>> get_prefix_str(("abx", ))
'abx'
>>> get_prefix_str({"abx", })
'abx'
>>> get_prefix_str("abx")
'abx'
>>> get_prefix_str(("\\relative.path", "\\relative.figure",
...     "\\relative.code"))
'\\relative.'
>>> get_prefix_str({"\\relative.path", "\\relative.figure",
...     "\\relative.code"})
'\\relative.'

>>> try:
...     get_prefix_str(None)
... except TypeError as te:
...     print(te)
strings should be an instance of any in {str, typing.Iterable} but is None.

>>> try:
...     get_prefix_str(1)
... except TypeError as te:
...     print(str(te)[:60])
strings should be an instance of any in {str, typing.Iterabl

>>> try:
...     get_prefix_str(["abc", "acd", 2, "x"])
... except TypeError as te:
...     print(te)
descriptor '__len__' requires a 'str' object but received a 'int'

>>> try:
...     get_prefix_str(["abc", "acd", None, "x"])
... except TypeError as te:
...     print(te)
descriptor '__len__' requires a 'str' object but received a 'NoneType'

>>> get_prefix_str(["xyz", "gsdf", 5])
''
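
The early stop mentioned in the notice explains why the last example returns '' instead of raising a TypeError: once the prefix is empty, the remaining items are never inspected. A minimal sketch of such a loop, with the hypothetical name get_prefix_sketch (the actual implementation may differ in detail):

```python
from typing import Iterable, Union


def get_prefix_sketch(strings: Union[str, Iterable[str]]) -> str:
    """Longest common prefix; a single str is its own prefix."""
    if isinstance(strings, str):
        return strings
    prefix = None
    for current in strings:
        length = str.__len__(current)  # TypeError for non-str elements
        if prefix is None:
            prefix = current
        else:
            limit = min(len(prefix), length)
            i = 0
            while i < limit and prefix[i] == current[i]:
                i += 1
            prefix = prefix[:i]
        if not prefix:
            break  # prefix is empty: later items are not looked at
    return prefix or ""


print(repr(get_prefix_sketch(["abc", "acd"])))
print(repr(get_prefix_sketch(["xyz", "gsdf", 5])))
```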

pycommons.strings.tools.replace_regex(search, replace, inside)

Replace all occurrences of ‘search’ in ‘inside’ with ‘replace’.

This replacement procedure is done repetitively and recursively until no occurrence of search is found anymore. This, of course, may lead to an endless loop, so a ValueError is thrown if there are too many recursive replacements.

Parameters:

search (Union[str, Pattern]) – the regular expression to search, either a string or a pattern
replace (Union[str, Callable[[Match], str]]) – the string to replace it with, or a function receiving a re.Match instance and returning a replacement string
inside (str) – the string in which to search/replace

Return type:

str

Returns:

the new string after the recursive replacement

Raises:

TypeError – if any of the parameters is not of the right type
ValueError – if there are 100000 recursive replacements or more, indicating that there could be an endless loop

>>> replace_regex('[ \t]+\n', '\n', ' bla \nxyz\tabc\t\n')
' bla\nxyz\tabc\n'
>>> replace_regex('[0-9]A', 'X', '23A7AA')
'2XXA'
>>> from re import compile as cpx
>>> replace_regex(cpx('[0-9]A'), 'X', '23A7AA')
'2XXA'

>>> def __repl(a):
...     print(repr(a))
...     return "y"
>>> replace_regex("a.b", __repl, "albaab")
<re.Match object; span=(0, 3), match='alb'>
<re.Match object; span=(3, 6), match='aab'>
'yy'

>>> def __repl(a):
...     print(repr(a))
...     ss = a.group()
...     print(ss)
...     return "axb"
>>> replace_regex("aa.bb", __repl, "aaaaaxbbbbb")
<re.Match object; span=(3, 8), match='aaxbb'>
aaxbb
<re.Match object; span=(2, 7), match='aaxbb'>
aaxbb
<re.Match object; span=(1, 6), match='aaxbb'>
aaxbb
<re.Match object; span=(0, 5), match='aaxbb'>
aaxbb
'axb'

>>> replace_regex("aa.bb", "axb", "aaaaaxbbbbb")
'axb'
>>> replace_regex("aa.bb", "axb", "".join("a" * 100 + "y" + "b" * 100))
'axb'
>>> replace_regex("aa.bb", "axb",
...               "".join("a" * 10000 + "y" + "b" * 10000))
'axb'

>>> try:
...    replace_regex(1, "1", "2")
... except TypeError as te:
...    print(str(te)[0:60])
search should be an instance of any in {str, typing.Pattern}

>>> try:
...    replace_regex(None, "1", "2")
... except TypeError as te:
...    print(te)
search should be an instance of any in {str, typing.Pattern} but is None.

>>> try:
...    replace_regex("x", 2, "2")
... except TypeError as te:
...    print(te)
replace should be an instance of str or a callable but is int, namely 2.

>>> try:
...    replace_regex("x", None, "2")
... except TypeError as te:
...    print(te)
replace should be an instance of str or a callable but is None.

>>> try:
...    replace_regex(1, 1, "2")
... except TypeError as te:
...    print(str(te)[0:60])
search should be an instance of any in {str, typing.Pattern}

>>> try:
...    replace_regex("yy", "1", 3)
... except TypeError as te:
...    print(te)
inside should be an instance of str but is int, namely 3.

>>> try:
...    replace_regex("adad", "1", None)
... except TypeError as te:
...    print(te)
inside should be an instance of str but is None.

>>> try:
...    replace_regex(1, "1", 3)
... except TypeError as te:
...    print(str(te)[0:60])
search should be an instance of any in {str, typing.Pattern}

>>> try:
...    replace_regex(1, 3, 5)
... except TypeError as te:
...    print(str(te)[0:60])
search should be an instance of any in {str, typing.Pattern}

>>> try:
...     replace_regex("abab|baab|bbab|aaab|aaaa|bbbb", "baba",
...                   "ababababab")
... except ValueError as ve:
...     print(str(ve)[:50])
Too many replacements, pattern re.compile('abab|ba
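
The repeated replacement with a safety limit can be sketched with re.Pattern.subn(), which returns the new string together with the number of substitutions made. This is a sketch under assumptions: the name replace_regex_sketch and the parameter max_rounds are hypothetical, and the limit here counts passes over the string, which may not be how the library counts its 100000 replacements.

```python
import re
from typing import Callable, Pattern, Union


def replace_regex_sketch(
        search: Union[str, Pattern],
        replace: Union[str, Callable[[re.Match], str]],
        inside: str,
        max_rounds: int = 100_000) -> str:
    """Apply the substitution until the pattern no longer matches."""
    # re.compile accepts a str or an already compiled pattern and
    # raises TypeError for anything else.
    pattern = re.compile(search)
    for _ in range(max_rounds):
        inside, count = pattern.subn(replace, inside)
        if count == 0:  # nothing matched in this pass: done
            return inside
    raise ValueError(f"Too many replacements, pattern {pattern!r}.")


print(replace_regex_sketch("[0-9]A", "X", "23A7AA"))
print(replace_regex_sketch("aa.bb", "axb", "aaaaaxbbbbb"))
```

A replacement that recreates a match on every pass, such as replacing "a" with "a", never reaches a pass without matches and therefore runs into the limit.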

pycommons.strings.tools.replace_str(find, replace, src)

Perform a recursive replacement of strings.

This function replaces all occurrences of find in src with replace. If that produces new instances of find, these are replaced as well, unless doing so would not make the string shorter. In other words, the replacement continues only as long as the string becomes shorter, so occurrences of find can remain in the result when replace is not shorter than find.

See replace_regex() for regular-expression based replacements.

Parameters:

find (str) – the string to find
replace (str) – the string with which it will be replaced
src (str) – the string in which we search

Return type:

str

Returns:

the string src, with all occurrences of find replaced by replace

Raises:

TypeError – if any of the parameters are not strings

>>> replace_str("a", "b", "abc")
'bbc'
>>> replace_str("aa", "a", "aaaaa")
'a'
>>> replace_str("aba", "a", "abaababa")
'aa'
>>> replace_str("aba", "aba", "abaababa")
'abaababa'
>>> replace_str("aa", "aa", "aaaaaaaa")
'aaaaaaaa'
>>> replace_str("a", "aa", "aaaaaaaa")
'aaaaaaaaaaaaaaaa'
>>> replace_str("a", "xx", "aaaaaaaa")
'xxxxxxxxxxxxxxxx'

>>> try:
...     replace_str(None, "a", "b")
... except TypeError as te:
...     print(te)
replace() argument 1 must be str, not None

>>> try:
...     replace_str(1, "a", "b")
... except TypeError as te:
...     print(te)
replace() argument 1 must be str, not int

>>> try:
...     replace_str("a", None, "b")
... except TypeError as te:
...     print(te)
replace() argument 2 must be str, not None

>>> try:
...     replace_str("x", 1, "b")
... except TypeError as te:
...     print(te)
replace() argument 2 must be str, not int

>>> try:
...     replace_str("a", "v", None)
... except TypeError as te:
...     print(te)
descriptor '__len__' requires a 'str' object but received a 'NoneType'

>>> try:
...     replace_str("x", "xy", 1)
... except TypeError as te:
...     print(te)
descriptor '__len__' requires a 'str' object but received a 'int'
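
The "continue only while the string gets shorter" rule can be sketched as a small loop around str.replace(). The name replace_str_sketch is hypothetical; the sketch reproduces the results of the examples above but is not necessarily how the library implements the rule.

```python
def replace_str_sketch(find: str, replace: str, src: str) -> str:
    """Replace repeatedly as long as each pass shortens the string."""
    while True:
        # str.replace as an unbound method raises TypeError for
        # non-str arguments, including a non-str src.
        new = str.replace(src, find, replace)
        if len(new) >= len(src):
            return new  # no progress towards a shorter string: stop
        src = new


print(replace_str_sketch("aa", "a", "aaaaa"))
print(replace_str_sketch("aba", "a", "abaababa"))
print(replace_str_sketch("a", "aa", "aaaaaaaa"))
```

Because the loop stops as soon as a pass does not shorten the string, a replacement that is as long as or longer than find is applied exactly once.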

pycommons.strings.tools.split_str(source, split_by)

Split a string by the given other string.

The goal is to provide a less memory-intensive variant of the method str.split(). This routine iteratively divides a given string based on a splitting character or string. It may be useful when dealing with a very big source string that should be split into smaller strings one at a time. Instead of creating a list with many small strings, as str.split() does, it creates these strings iteratively.

Parameters:

source (str) – the source string
split_by (str) – the split string

Return type:

Generator[str, None, None]

Returns:

each split element

>>> list(split_str("", ""))
['']

>>> list(split_str("", "x"))
['']

>>> list(split_str("a", ""))
['a']

>>> list(split_str("abc", ""))
['a', 'b', 'c']

>>> list(split_str("a;b;c", ";"))
['a', 'b', 'c']

>>> list(split_str("a;b;c;", ";"))
['a', 'b', 'c', '']

>>> list(split_str(";a;b;;c;", ";"))
['', 'a', 'b', '', 'c', '']

>>> list(split_str("a;aaa;aba;aa;aca;a", "a;a"))
['', 'a', 'b', '', 'c', '']
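
A generator of this kind can be sketched with str.find(), which locates the next separator without building a list of all parts. The name split_str_sketch is hypothetical, and the handling of an empty split_by (yielding the single characters, or one empty string for an empty source) is inferred from the examples above.

```python
from typing import Generator


def split_str_sketch(source: str,
                     split_by: str) -> Generator[str, None, None]:
    """Yield the parts of source separated by split_by, one at a time."""
    if not split_by:
        # Assumption from the examples: an empty separator splits
        # into characters, and an empty source yields one ''.
        if source:
            yield from source
        else:
            yield ""
        return
    start = 0
    step = len(split_by)
    while True:
        end = source.find(split_by, start)
        if end < 0:
            yield source[start:]  # the remainder after the last separator
            return
        yield source[start:end]
        start = end + step


for part in split_str_sketch("a;b;c;", ";"):
    print(repr(part))
```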
