The Source of all Knowledge

Alex Zhou, Director of Infra and Science 2019-03-11

In this Q&A article, MioTech’s Director of Data Infrastructure and Science, Alex Zhou, gives us the low-down on knowledge graphs and why they play a huge role in many intelligent systems today.

What is a knowledge graph?


In simple terms, a knowledge graph is a tool that represents the entities in a knowledge domain, and the relationships between them, as a graph in a network structure that can be visualized. Common entities include companies, people, places, and events. Common relationships include competition, foreign investment, and so on.
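
As a minimal sketch of this idea, a knowledge graph can be stored as a list of (subject, relationship, object) triples. The entity names, relationship labels, and helper functions below are made up for illustration and are not taken from MioTech's data or systems.

```python
from collections import defaultdict

# Hypothetical triples: (subject entity, relationship, object entity).
TRIPLES = [
    ("Company A", "competes_with", "Company B"),
    ("Company A", "invests_in", "Company C"),
    ("Person X", "is_ceo_of", "Company A"),
    ("Company C", "located_in", "Place Y"),
]

def build_index(triples):
    """Index outgoing edges by subject so lookups do not scan every triple."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[subject].append((relation, obj))
    return index

def relations_of(index, entity):
    """Return the (relationship, object) pairs that start at `entity`."""
    return list(index.get(entity, []))

index = build_index(TRIPLES)
print(relations_of(index, "Company A"))
```

Storing edges as triples keeps entities and relationships in one uniform structure, so adding a new kind of relationship only means adding more rows.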


Source: MioTech Report


Before search engines started using knowledge graphs, there was the “search string”. Could you explain what that was and why it was inefficient?


Before the emergence of knowledge graphs, search engines simply matched the text string entered by the user against the text on web pages to find relevant information.


For example, let’s say I wanted to know when “Captain Marvel” is playing in the cinemas. I would first have had to search for the cinema, then scroll through its website to find the right movie, and then find the showtimes. I would have had to navigate through multiple links before I finally got to what I initially wanted.


This simple way of searching has the advantage of finding exact matches quickly, but its limitations are obvious. First, merely matching text does not capture the user's true intention, which often results in unanswered or unfulfilled search queries. Second, simple string matching cannot extend the results to related information, which greatly reduces search efficiency.
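
The limitation can be seen in a small sketch of plain string matching. The page snippets and the `string_search` helper are hypothetical, and real search engines of that era were more elaborate than this.

```python
# Hypothetical page snippets; none of this is real search-engine data.
PAGES = {
    "page1": "Captain Marvel showtimes at Cinema One: 7pm and 9pm",
    "page2": "Brie Larson's new superhero film opens this week",
    "page3": "Marvel Studios announces release schedule",
}

def string_search(query, pages):
    """Return ids of pages whose text contains the query verbatim (case-insensitive)."""
    needle = query.lower()
    return sorted(pid for pid, text in pages.items() if needle in text.lower())

print(string_search("Captain Marvel", PAGES))
```

Only the page that repeats the query verbatim is returned. The second page is about the same film but is missed, because nothing tells the matcher that the two pieces of text refer to the same entity.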

Source: Company Report


Why is the knowledge graph more efficient? How is it better as AI infrastructure compared to the alternatives?


The knowledge graph is the knowledge base behind AI. Without the support of a knowledge graph, the amount of information that the AI can receive is insufficient. The two are interdependent.


Let's go back to the previous example of wanting to know when “Captain Marvel” is playing in the cinemas. With today’s knowledge graph structure, I will not only get an array of cinema choices and showtimes; they will be paired with the movie’s official website, perhaps a Rotten Tomatoes score, and perhaps a news article about gender representation in Hollywood. Moreover, based on my previous searches, it will rank the information by what I am most likely to click.


Our proprietary knowledge graph is the cornerstone of the graph query in AMI, MioTech’s virtual data scientist. If users want to know Apple's stock price, AMI’s graph query can draw on the knowledge graph to present all information related to Apple, such as financial reports, competitors, foreign investment activities, and the latest news, significantly improving search efficiency in a financial context.


How does a knowledge graph help AI discover hidden relationships?

Two examples illustrate this. First, in the field of anti-fraud, a knowledge graph lets us visualize patterns in historical fraudulent activity and evaluate whether other groups of transactions share a similar pattern. If they do, a fraud warning is triggered to alert users.
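
As a rough sketch of this kind of check, the example below flags accounts that share an attribute with an account already known to be fraudulent. The pattern itself, the account and attribute names, and the `flag_linked_accounts` helper are all assumptions made for illustration; the article does not describe the actual patterns MioTech uses.

```python
from collections import defaultdict

# Hypothetical edges: (account, shared attribute such as a device or phone number).
ACCOUNT_ATTRIBUTES = [
    ("acct_1", "device_A"),
    ("acct_2", "device_A"),
    ("acct_3", "device_B"),
    ("acct_4", "phone_9"),
    ("acct_2", "phone_9"),
]
KNOWN_FRAUD = {"acct_1"}

def flag_linked_accounts(edges, known_fraud):
    """Flag accounts that share at least one attribute with a known fraudulent account."""
    accounts_by_attribute = defaultdict(set)
    for account, attribute in edges:
        accounts_by_attribute[attribute].add(account)

    flagged = set()
    for accounts in accounts_by_attribute.values():
        if accounts & known_fraud:
            flagged |= accounts - known_fraud
    return flagged

print(sorted(flag_linked_accounts(ACCOUNT_ATTRIBUTES, KNOWN_FRAUD)))
```

Here only direct sharing is checked, so `acct_2` is flagged for sharing a device with `acct_1`, while `acct_4` (linked only through `acct_2`) is not. A real system would likely look at longer paths and richer patterns.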


Source: MioTech Report


Another common example is market research. A merger taking place in Los Angeles may have a considerable impact on competitors in London, suppliers in China, and the Taiwanese competitors of those Chinese suppliers. With the help of a knowledge graph, we can consider these seemingly isolated events as a whole.
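
One way to picture this is a breadth-first traversal that starts at the company involved in the event and collects everything reachable within a few hops. The companies, relationships, and the `affected_entities` helper below are hypothetical, and this is a sketch rather than a description of MioTech's graph query.

```python
from collections import deque

# Hypothetical, undirected business relationships between made-up companies.
EDGES = [
    ("LA Company", "London Competitor", "competitor"),
    ("LA Company", "Chinese Supplier", "supplier"),
    ("Chinese Supplier", "Taiwanese Competitor", "competitor"),
    ("Unrelated Co", "Another Co", "supplier"),
]

def affected_entities(edges, start, max_hops):
    """Breadth-first search returning {entity: hops from start} within max_hops."""
    neighbours = {}
    for a, b, _relation in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours.get(current, ()):
            if nxt not in hops:
                hops[nxt] = hops[current] + 1
                queue.append(nxt)
    del hops[start]  # the starting entity is not "affected" by itself
    return hops

print(affected_entities(EDGES, "LA Company", max_hops=2))
```

With a limit of two hops, the Taiwanese competitor of the Chinese supplier is reached even though it has no direct link to the company in Los Angeles, while unconnected companies are left out.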


Source: MioTech Report


How is MioTech’s knowledge graph different from Google’s?


From a data perspective, Google's knowledge graph can be called a "general knowledge graph". This means that Google's data is broader in coverage and caters to a general audience. MioTech's knowledge graph, by contrast, is tailored to the financial industry. All the data in our database is finance-related. In terms of financial data, we can say that our coverage is greater than Google's.

At the application level, MioTech’s knowledge graph goes deeper. We customize it for the different needs of the financial industry, such as equity penetration analysis and peer analysis. These are results that Google's knowledge graph cannot deliver.


What is critical when building MioTech’s knowledge graph?

Let’s compare it with Google. From a data perspective, Google needs its data to be more complete, but it does not place as high a requirement on the reliability of the data source. In the financial industry, there is a higher demand for accurate, reliable, real-time data, because it is directly linked to investment decisions.


What challenges do knowledge graphs currently face in terms of performance? How can they be improved?


The reliability and timeliness of data sources are the problems we currently need to solve.

To ensure the reliability of the data, we have designed a data testing framework. Data tests commonly check only a sample of the records. But to ensure the high accuracy the financial industry requires, we perform a full data scan. The disadvantage of this method is that it is slow, so we are using distributed systems and parallel computing to speed up the checks.
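
The trade-off can be sketched as follows: every record is checked rather than a sample, and the work is split into chunks that are processed in parallel. The records, the validation rule, and the function names are hypothetical, and a local thread pool stands in here for the distributed system mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical records and validation rule, for illustration only.
RECORDS = [
    {"id": 1, "ticker": "AAA", "price": 10.5},
    {"id": 2, "ticker": "", "price": 20.0},
    {"id": 3, "ticker": "CCC", "price": -1.0},
    {"id": 4, "ticker": "DDD", "price": 7.25},
]

def is_valid(record):
    """A record passes if it has a non-empty ticker and a positive price."""
    return bool(record["ticker"]) and record["price"] > 0

def check_chunk(chunk):
    """Return the ids of every invalid record in one chunk."""
    return [record["id"] for record in chunk if not is_valid(record)]

def full_scan(records, chunk_size=2, workers=2):
    """Check every record (no sampling), processing chunks in parallel."""
    chunks = [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(check_chunk, chunks)  # map preserves chunk order
        return [record_id for ids in results for record_id in ids]

print(full_scan(RECORDS))
```

A sampled test could miss either of the two bad records, whereas the full scan reports both. For CPU-heavy checks in Python, a process pool or a cluster framework would be the more natural choice than threads.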

The real-time nature of the data is another challenge we are currently facing. At present, most of the data used by the knowledge graph comes from traditional data providers, and manual data entry is still used. New events therefore often take a while to be reflected in the knowledge graph. We are currently developing a new AI model to address this. Once it is complete, we will no longer rely on manual labor; instead, the machine will determine the subject, event, and object, and each such combination will add a new relationship to the knowledge graph, allowing more hidden market signals or investment opportunities to be found. The model will take into account the source of the message, the number of media mentions, and the related context to ensure that the data, while real-time, is also accurate.

Training labels are also something we need to pay more attention to. The financial industry has higher requirements for labeling quality, and the labeling personnel need certain qualifications and a financial background, which comes at a substantial cost.


What is in store for MioTech’s knowledge graph?


In the future, MioTech’s knowledge graph will no longer need manual labeling. With the AI model I just mentioned, automatic labeling can be achieved.
