How Glass AI works

Frontiers of AI

One of the frontiers in AI is machine “understanding” of language: giving machines the power to understand not just individual words but entire sentences and paragraphs. After several years of R&D, we at Glass have invented AI technology that can read and interpret text at scale.

You can see Glass in action for yourself in our research demo. Here we have pointed the Glass AI at the open web and asked it to find and map organisations across the UK & US.

Making sense of text at scale

Our technology draws on a range of approaches from machine learning and computational linguistics. It combines semantic analysis and resource crawling at scale, supported by a topics ontology and automated on-boarding of entities:

  • Semantic analysis
    Glass detects entities (e.g. companies, people) and classifies content (e.g. news articles) in text with state-of-the-art precision. So when Glass visits a website or reads a news source, it decides for itself what is being talked about.

  • Resource crawling
    Glass is an intelligent crawler. It filters what it finds and follows only the links most likely to lead to data relevant to the results, much as a human would efficiently scan a website. This lets us extract information at large scale with high efficiency.

  • Topics ontology
    Glass builds a large topic map to help it understand content and to open up the data for further investigation. For example, the research demo contains around 300k related topics, which are continuously improved as Glass learns more.

  • Automated on-boarding of entities
    Everything in Glass is fully automated. For example, when detecting businesses, Glass automatically recognises the type of site and the names of the business, and then works out which sectors the business is in and where it is based.
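
To make the resource crawling idea above more concrete, here is a minimal sketch of a "best-first" crawl, in which the most promising links are followed before the others. It is an illustration only and not Glass's actual implementation: the in-memory SITE, the RELEVANT_TERMS keyword list and the function names score_link and focused_crawl are all assumptions made up for this example, and a simple keyword match stands in for whatever relevance model a real system would use.

```python
import heapq

# Hypothetical in-memory "website": URL -> (page text, [(anchor text, target URL)]).
# A real crawler would fetch and parse pages over HTTP instead.
SITE = {
    "/": ("Welcome to Acme Ltd",
          [("About us", "/about"), ("Privacy policy", "/privacy"), ("Blog", "/blog")]),
    "/about": ("Acme Ltd is a robotics company based in Leeds",
               [("Contact", "/contact"), ("Home", "/")]),
    "/privacy": ("Cookies and data protection", []),
    "/blog": ("Office party photos", [("Home", "/")]),
    "/contact": ("Our address: 1 Mill Lane, Leeds", []),
}

# Assumed stand-in for a relevance model: anchor words that tend to
# lead to pages describing what a business is and where it is based.
RELEVANT_TERMS = {"about", "contact", "company", "team"}


def score_link(anchor_text):
    """Count how many relevant terms appear in the link's anchor text."""
    words = set(anchor_text.lower().split())
    return len(words & RELEVANT_TERMS)


def focused_crawl(site, start, score, max_pages):
    """Visit up to max_pages pages, always taking the highest-scoring link next."""
    visited = []
    seen = {start}
    # heapq is a min-heap, so scores are negated; the counter breaks ties
    # in discovery order so that equal-scoring links stay first-in, first-out.
    frontier = [(0, 0, start)]
    counter = 1
    while frontier and len(visited) < max_pages:
        _, _, url = heapq.heappop(frontier)
        visited.append(url)
        _text, links = site[url]
        for anchor, target in links:
            if target in seen or target not in site:
                continue
            seen.add(target)
            heapq.heappush(frontier, (-score(anchor), counter, target))
            counter += 1
    return visited


print(focused_crawl(SITE, "/", score_link, max_pages=3))
# → ['/', '/about', '/contact']
```

With a budget of three pages, the sketch reaches the "About" and "Contact" pages and never spends a fetch on the privacy policy or the blog. That is the efficiency gain a focused crawler is after: the page budget goes to the links most likely to matter.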

Multiple data sources

Some 90% of the world's data is unstructured and spread across many platforms. Our AI makes sense of vast quantities of textual data, whether from websites, news, proprietary databases or other sources. The research demo has been set up to read the internet, where data is unstructured, fast-moving and hard to query at scale. In the demo we track any topic of interest across hundreds of millions of web pages and keep watch over millions of organisations.