Meta has open-sourced a machine-learning resource that could one day supplant Wikipedia as the world's biggest publicly available knowledge-verification database.
Dubbed Sphere, it can be used to perform knowledge-intensive natural language processing, or KI-NLP, we're told. In practical terms, that means it can be used to answer complicated questions using natural language, and find sources for claims.
In one example given, Sphere is asked, "Who is Joëlle Sambi Nzeba?" Wikipedia doesn't have an entry for her, but Sphere said she was "born in Belgium and grew up partly in Kinshasa (Congo). She currently lives in Brussels. She is a writer and slammer, alongside her activism in a feminist movement," and linked to a website where it got that information about her work.
Wikipedia has pretty much served as the corpus of record, Meta's eggheads wrote in a paper discussing the design of Sphere, claiming the volunteer-maintained uber-wiki is "accurate, well-structured, and small enough to use easily in testing environments."
Seeking to build something bigger and better than Wikipedia, though, Meta pulled together content from all over the web – sans wikipedia.org – to form a "universal, uncurated and unstructured knowledge source for multiple KI-NLP tasks at once." The result is Sphere, which is more or less a mountain of processed data that can be queried using a bunch of machine-learning tools.
The team adds that Sphere "can match and outperform baselines grounded in Wikipedia" on some tasks using the KILT AI benchmark. That is to say, on those tasks, systems drawing on Sphere do at least as well as ones built on Wikipedia's content.

Meta's AI-based Sphere 'may be the next big break in NLP'
Don't believe everything you read on the internet