Adult content filter
Our client manages a global telecommunications ecosystem with an exchange platform at its core, facilitating unlimited, multi-party chain trades, routing and financial settlements in real time. It streamlines business processes, reduces network infrastruc
Their patented blockchain technology, proprietary advanced algorithms and a pioneering globally distributed network enable operators, carriers and resellers to remove commercial inefficiencies, operational challenges and financial threats in today's and tomorrow's telecom market.
One of the main offerings of this ecosystem is an interactive trading floor that uses social networking tools and market intelligence to enable real-time communications, negotiations and instant decision-making. It delivers a centralized forum to evaluate, negotiate and create new business opportunities, and to capitalize on immediate needs.
One of the concerns of those responsible for the social network was to prevent users from uploading inappropriate content. However, we wanted to avoid a pre-moderation process, both because of its cost and, above all, because this type of filter causes users of a social network to disengage from it.
The goal was to create a filter for adult or otherwise inappropriate content that detects such content while remaining transparent to all users who upload appropriate content. The filter had to detect inappropriate content whether it was uploaded as an image or as text.
At a high level, the system is composed of a REST API that uses different services:
- Image filtering service: responsible for analyzing images. It classifies each image with the Cloud Vision API; if the image passes the filter, it checks whether the image contains text and, if it does, sends that text to the text filtering service. It then returns the result to the end user.
- Text filtering service: responsible for filtering text. This is the more complex part, because several APIs have to be chained in the right order. First, the system uses the Cloud Translation API to check whether the text is in English, so that the NLP API can be used directly. If it is not, the text is translated into English first, because the NLP API only classifies text into categories when the text is in English.
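The flow of the two services can be sketched as follows. This is a minimal illustration, not the project's actual code: the names `Verdict`, `filter_text`, `filter_image` and all the callable parameters are hypothetical, and the calls to the Google APIs are represented by injected functions rather than real client calls. The `/Adult` category prefix is an assumption about how the blocked category is identified.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    allowed: bool
    reason: str


def filter_text(
    text: str,
    detect_language: Callable[[str], str],       # stands in for language detection
    translate_to_english: Callable[[str], str],  # stands in for translation
    classify: Callable[[str], List[str]],        # stands in for content classification
    blocked_prefixes: Tuple[str, ...] = ("/Adult",),  # assumed category prefix
) -> Verdict:
    """Translate to English when needed, then reject text in a blocked category."""
    # Classification only works on English text, so translate first if needed.
    if detect_language(text) != "en":
        text = translate_to_english(text)
    for category in classify(text):
        if category.startswith(blocked_prefixes):
            return Verdict(False, f"blocked category: {category}")
    return Verdict(True, "ok")


def filter_image(
    image: bytes,
    is_unsafe: Callable[[bytes], bool],      # stands in for the safe search check
    extract_text: Callable[[bytes], str],    # stands in for text detection
    check_text: Callable[[str], Verdict],    # the text filtering service
) -> Verdict:
    """Reject unsafe images; otherwise run any embedded text through the text filter."""
    if is_unsafe(image):
        return Verdict(False, "image flagged by safe search")
    embedded = extract_text(image).strip()
    if embedded:
        return check_text(embedded)
    return Verdict(True, "ok")
```

Passing the API calls in as functions keeps the ordering logic (detect, translate, classify) testable without network access; in a real deployment each callable would wrap the corresponding API client.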
Of the existing APIs, the following are used:
- Vision API enables developers to understand the content of an image by encapsulating powerful machine learning models in an easy-to-use REST API. It quickly classifies images into thousands of categories (such as “sailboat”), detects individual objects and faces, and reads printed words contained within the images. You can build metadata on your image catalog, moderate offensive content, or enable new marketing scenarios through image sentiment analysis. We mainly use its safe search functionality, along with object detection.
- Cloud Natural Language API provides natural language understanding technologies to developers, including sentiment analysis, entity analysis, entity sentiment analysis, content classification, and syntax analysis. This API is part of the larger Cloud Machine Learning API family. It has a limitation: it can only classify text into categories (such as Adult) when the text is in English, so classification does not work for text in another language. That is why the text is first passed through the Cloud Translation API.
- Cloud Translation API provides a simple programmatic interface for translating an arbitrary string into any supported language using state-of-the-art Neural Machine Translation. It is highly responsive, so websites and applications can integrate with Translation API for fast, dynamic translation of source text from the source language to a target language (such as French to English). Language detection is also available in cases where the source language is unknown. The underlying technology is updated constantly to include improvements from Google research teams, which results in better translations and new languages and language pairs.
- The solution runs on Kubernetes Engine, so a Docker image is stored in Container Registry.
- The service runs on Kubernetes Engine, and traffic is forwarded to the container through the load balancer.
- Vision API is used with Safe Search.
- Cloud Natural Language API categorizes the content.
- Cloud Translation API is used to check the language of the text to be analyzed and to translate it into English if the source language is not English.
- Firebase contains a dictionary of more than 1,700 English words and expressions that are not allowed.
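As an illustration of how a dictionary of banned words and expressions might be applied to English text, here is a minimal sketch. It assumes the dictionary has already been loaded from the database into an in-memory collection; the function names `normalize` and `find_banned` and the matching rules (case-insensitive, whole words or phrases only) are assumptions, not the project's documented behavior.

```python
import re
from typing import Iterable, List


def normalize(text: str) -> str:
    """Lowercase and collapse every run of non-alphanumeric characters into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_banned(text: str, banned: Iterable[str]) -> List[str]:
    """Return the banned entries that occur in text as whole words or phrases."""
    # Pad with spaces so that entries only match on word boundaries.
    padded = f" {normalize(text)} "
    hits = []
    for entry in banned:
        needle = normalize(entry)
        if needle and f" {needle} " in padded:
            hits.append(entry)
    return sorted(hits)
```

Matching on whole words avoids flagging harmless words that merely contain a banned entry as a substring, at the cost of missing deliberately obfuscated spellings.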
After a month of testing, the new system had analyzed and filtered more than 500 comments, marking some of them as inappropriate content. All the filtered comments were manually reviewed, and more than 90% did in fact contain inappropriate images or phrases that any human moderator would have rejected.
Conversely, no inappropriate messages that should have been caught by the content filter have so far been found among the unfiltered messages.
Given these results, the client decided that it was no longer necessary to review comments manually, which freed up the daily time of one person previously dedicated to moderating comments on the social network.