As a multidisciplinary employee, Ebbot can work in many positions, such as Customer Service or Helpdesk Digital Assistant. This means that he sometimes has to ask for your personal information to help you solve problems. Are you wondering whether it is safe to share your information with Ebbot? You can put your mind at ease: with Hello Ebbot's special dishwasher, sensitive personal data is censored and never saved in the database. How does this dishwasher work, and how did we make it possible? Keep reading and we will answer all your questions 👀

Handling numeric data

Conversations with Ebbot sometimes contain very sensitive information, such as credit card numbers or personal identity numbers. It is easy to see how dangerous it would be if people with bad intentions gained access to this kind of data. With the dishwasher, this information is censored and not saved in the database.

Compared to text data, numeric data is easier to sort out thanks to regular expressions (regex), which check whether a string contains a specified search pattern. For example, a pattern for phone numbers can be defined as:

phone = re.compile(r'([+]?46|0)(\s)(7[0-9])(\s)(\d{4})(\s)(\d{3})')

Here, the re.compile() method compiles a regex pattern into a reusable pattern object. Because we don't want to make this blog post too technical, we are not going to explain the meaning of the pattern inside the parentheses. But in case you are a tech enthusiast, here is a regex cheatsheet for you so you can understand Hello Ebbot's dishwasher mechanism better!
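For the curious, here is a minimal sketch of how a compiled pattern like this could be used to censor a message. It assumes the pattern is meant to contain the whitespace and digit escapes (\s and \d); the wash_phone_numbers helper and the [PHONE] placeholder are our own illustrative names, not necessarily what the dishwasher uses internally.

```python
import re

# Phone pattern from the post, with the \s and \d escapes written out.
# It expects a Swedish mobile number in spaced form, e.g. "+46 70 1234 567".
phone = re.compile(r'([+]?46|0)(\s)(7[0-9])(\s)(\d{4})(\s)(\d{3})')


def wash_phone_numbers(text, placeholder="[PHONE]"):
    """Replace every match of the phone pattern with a placeholder.

    Hypothetical helper for illustration only.
    """
    return phone.sub(placeholder, text)


print(wash_phone_numbers("Call me on +46 70 1234 567 tomorrow"))
```

Note that this pattern only matches numbers written with exactly this spacing, so a real dishwasher would likely need a few more patterns to catch other ways of writing a phone number.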

Washing personal text data using the dishwasher

With text data, we have emails, names and locations on our cleaning list. Emails are simple to wipe out using regex. Names and locations, however, are much harder to tackle, and we need a solid solution to filter out as many of them as possible.
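To give an idea of the email part, here is a rough sketch. The pattern below is a deliberately simple one we chose for illustration (it is not a full email validator and not necessarily the one used in the dishwasher), and wash_emails and [EMAIL] are hypothetical names.

```python
import re

# Simplified email pattern (an assumption for this sketch):
# a local part, an "@", then a domain with at least one dot.
email = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')


def wash_emails(text, placeholder="[EMAIL]"):
    """Replace anything that looks like an email address with a placeholder."""
    return email.sub(placeholder, text)


print(wash_emails("Contact anna.svensson@example.com please"))
```

A looser pattern like this one errs on the side of censoring too much rather than too little, which is usually the safer trade-off when the goal is to keep personal data out of a database.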

Fortunately, we are able to use a powerful, open-source pre-trained model released by the National Library of Sweden and fine-tuned specifically for named entity recognition. Simply put, this model allows us to detect entity types such as personal names, locations, events and organizations in a sentence.
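Here is a hedged sketch of how such a model could be plugged into the washing step. We assume the model is the one published on the Hugging Face Hub as KB/bert-base-swedish-cased-ner and that it is loaded through the transformers pipeline API; the post itself does not name the exact model or library. The function names and the choice to mask only PER and LOC labels are illustrative assumptions.

```python
def detect_entities(text):
    """Run a Swedish NER model over the text.

    Assumption: the model id below is the National Library of Sweden (KB)
    NER model on the Hugging Face Hub. Requires the `transformers` package
    and downloads the model on first use.
    """
    from transformers import pipeline  # imported lazily, only needed here

    ner = pipeline(
        "ner",
        model="KB/bert-base-swedish-cased-ner",
        tokenizer="KB/bert-base-swedish-cased-ner",
        aggregation_strategy="simple",  # merge word pieces into whole entities
    )
    return ner(text)


def mask_entities(text, entities, labels=("PER", "LOC")):
    """Replace detected entity spans with their label, e.g. "[PER]".

    `entities` is a list of dicts with "entity_group", "start" and "end"
    keys, which is the shape the aggregated pipeline output has.
    """
    # Work from right to left so earlier character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        if ent["entity_group"] in labels:
            tag = "[" + ent["entity_group"] + "]"
            text = text[:ent["start"]] + tag + text[ent["end"]:]
    return text


# Hand-written entity list standing in for real model output.
sample = "Anna bor i Stockholm"
fake_entities = [
    {"entity_group": "PER", "start": 0, "end": 4},
    {"entity_group": "LOC", "start": 11, "end": 20},
]
print(mask_entities(sample, fake_entities))
```

In a real setup, mask_entities(text, detect_entities(text)) would chain the two steps. Keeping detection and masking separate makes it easy to test the masking logic without downloading the model.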

Enough talking, let's see some action! Here are some examples from our demo web app.

Unlike our previous NLP projects, which are still in the testing stage, this dishwasher is already in use by one of our clients. 🥳 If you want to know more about this special dishwasher, feel free to contact us! In case you are curious about our digital employee Ebbot, you can read more about him here. See you next month in our May 2021 edition of Hello Ebbot's NLP blog 👋