String

The string module contains helper functions for inspecting and manipulating strings. Let's take a look at each of them.

Is String Object

The is_string_object() function checks whether a given object is a string. It takes a single argument, obj, which is the object to check.

Let's see how this translates to code:

from everysk.core.string import is_string_object

is_string_object('Hello, World!')
True

The example above confirms that the object passed is a string. Now let's look at another scenario, this time with a different object type:

is_string_object(123)
False
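A check like this is handy for filtering mixed data. The sketch below is illustrative only: is_string_sketch is a hypothetical stand-in that assumes the check is equivalent to isinstance(obj, str), which the documentation above does not confirm.

```python
def is_string_sketch(obj):
    # Hypothetical stand-in, NOT the library function:
    # assumes the check boils down to isinstance(obj, str).
    return isinstance(obj, str)


values = ['Hello, World!', 123, None, b'raw bytes', 'abc']

# Keep only the items that are str instances.
only_strings = [value for value in values if is_string_sketch(value)]
print(only_strings)  # ['Hello, World!', 'abc']
```

Under this assumption a bytes object is not treated as a string; whether is_string_object() behaves the same way is not stated here.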

Convert to String

The to_string() function takes a value as its argument and returns that value converted to a string.

Let's look at a few examples:

from everysk.core.string import to_string

to_string(123)
'123'

to_string({'key': 'value'})
"{'key': 'value'}"

to_string([1,2,3])
'[1, 2, 3]'
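The outputs above match what Python's built-in str() returns for the same values. As a rough sketch, assuming plain str() semantics (to_string_sketch is a hypothetical stand-in, and the real function may handle some types differently), mixed values can be joined into a single line of text like this:

```python
def to_string_sketch(value):
    # Hypothetical stand-in, NOT the library function:
    # assumes the conversion follows the built-in str().
    return str(value)


row = [123, {'key': 'value'}, [1, 2, 3]]

# str.join() only accepts strings, so each item is converted first.
line = ' | '.join(to_string_sketch(item) for item in row)
print(line)  # 123 | {'key': 'value'} | [1, 2, 3]
```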

Normalize String

To better understand why this function matters, let's start with an example:

aa = b'\xc4\x81'.decode('utf-8')
bb = b'a\xcc\x84'.decode('utf-8')

When we print the values defined above we get the following:

aa
'ā'

bb
'ā'

They may look the same, but watch what happens when we compare the two values:

aa == bb
False

In the first variable the character is represented by a single Unicode code point, U+0101. In the second variable it is represented by two separate code points: U+0061 (the letter 'a') and U+0304 (the combining macron).
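The difference can be made visible with the standard library alone. The sketch below lists the code points in each string; describe is a helper name introduced here purely for illustration.

```python
import unicodedata

aa = b'\xc4\x81'.decode('utf-8')
bb = b'a\xcc\x84'.decode('utf-8')


def describe(text):
    # Return the code point and official Unicode name of every character.
    return [f'U+{ord(ch):04X} {unicodedata.name(ch)}' for ch in text]


print(len(aa), describe(aa))
# 1 ['U+0101 LATIN SMALL LETTER A WITH MACRON']
print(len(bb), describe(bb))
# 2 ['U+0061 LATIN SMALL LETTER A', 'U+0304 COMBINING MACRON']
```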

Similarly, consider the characters Ⅰ (U+2160), the Roman numeral one, and I (U+0049), the Latin capital letter I. On the surface they look the same, but they are in fact different characters, which can be a problem when comparing strings.

The normalize_string() function solves this problem by applying Unicode normalization, so that equivalent character sequences end up with the same representation. Let's circle back to our example and see it in full:

from everysk.core.string import normalize_string

aa = b'\xc4\x81'.decode('utf-8')
bb = b'a\xcc\x84'.decode('utf-8')

aa
'ā'

bb
'ā'

aa == bb
False

new_aa = normalize_string(aa)
new_bb = normalize_string(bb)
new_aa == new_bb
True
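For readers curious about what Unicode normalization does, the standard library exposes it through unicodedata.normalize(). The sketch below does not show how normalize_string() is implemented, and which normalization form the library uses is not stated here. It only illustrates the difference between canonical (NFC) and compatibility (NFKC) normalization on the characters discussed above.

```python
import unicodedata

aa = b'\xc4\x81'.decode('utf-8')   # precomposed U+0101
bb = b'a\xcc\x84'.decode('utf-8')  # U+0061 followed by U+0304

# Canonical composition merges 'a' + combining macron into U+0101.
nfc_equal = unicodedata.normalize('NFC', aa) == unicodedata.normalize('NFC', bb)
print(nfc_equal)  # True

# The Roman numeral one is only a compatibility equivalent of 'I',
# so NFC leaves it alone while NFKC maps it to the Latin letter.
roman_one = '\u2160'
nfc_roman = unicodedata.normalize('NFC', roman_one)
nfkc_roman = unicodedata.normalize('NFKC', roman_one)
print(nfc_roman == 'I')   # False
print(nfkc_roman == 'I')  # True
```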

Normalize String For Search

This function builds on normalize_string() and goes a step further: it also removes extra whitespace and converts the string to lowercase, which makes it suitable for comparing search terms.

Let's take a look at an example:

from everysk.core.string import normalize_string_to_search

normalize_string_to_search('  My Query   ')
'my query'
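The combination of steps can be sketched with the standard library. Everything here is an assumption made for illustration: normalize_for_search_sketch is a hypothetical name, the NFKC form is a guess, and only leading and trailing whitespace is stripped because that is all the example above demonstrates.

```python
import unicodedata


def normalize_for_search_sketch(text):
    # Hypothetical stand-in, NOT the library function.
    # Step 1: Unicode normalization (the NFKC form is an assumption).
    normalized = unicodedata.normalize('NFKC', text)
    # Step 2: strip surrounding whitespace and lowercase the result.
    return normalized.strip().lower()


print(repr(normalize_for_search_sketch('  My Query   ')))  # 'my query'

# Differently typed versions of the same query now compare equal.
same = normalize_for_search_sketch('MY QUERY') == normalize_for_search_sketch(' my query')
print(same)  # True
```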

Import From String

The import_from_string() function takes a dotted_path argument, which is the dotted path of a Python class or module, and returns the class imported from that path.

Internally it relies on the import_module() function from the importlib module to perform the import based on the string path provided.

Below is a usage example:

from everysk.core.string import import_from_string

import_from_string('my_module.module_name.ClassName')
my_module.module_name.ClassName

When the provided dotted path is not valid, or the module cannot be imported for some other reason, the function raises a ModuleNotFoundError.

import_from_string('invalid_module.UnknownClass')
ModuleNotFoundError: No module named 'invalid_module'
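The behaviour described in this section can be approximated with importlib directly. The following is a minimal sketch, not the library's implementation: import_from_string_sketch is a hypothetical name, and splitting the path on its last dot is an assumption about how the class name is separated from the module path.

```python
import importlib


def import_from_string_sketch(dotted_path):
    # Hypothetical stand-in, NOT the library function.
    # Split 'package.module.ClassName' into module path and attribute name.
    module_path, _, attribute = dotted_path.rpartition('.')
    # import_module raises ModuleNotFoundError for an unknown module.
    module = importlib.import_module(module_path)
    # getattr raises AttributeError if the module lacks that attribute.
    return getattr(module, attribute)


ordered_dict = import_from_string_sketch('collections.OrderedDict')
print(ordered_dict)  # <class 'collections.OrderedDict'>

try:
    import_from_string_sketch('invalid_module.UnknownClass')
except ModuleNotFoundError as error:
    message = str(error)
    print(message)  # No module named 'invalid_module'
```

Loading classes by path like this is useful when the class to use comes from configuration rather than from a hard-coded import.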