Semantic Maps, Silk Data’s new revolutionary text processing engine, turns months of studying COVID-19 research papers into hours
For nearly two months, COVID-19 - a disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) - has been the central topic of public debate. Scientists, policy makers, and businesses worldwide have joined forces to study the virus and develop solutions that help people cope with its consequences.
With a substantial amount of data and research papers on COVID-19 collected and provided to experts in recent weeks, a new problem has arisen: processing such an enormous volume of unstructured information would take months, if not years, of human work. To help everyone involved in studying COVID-19, we at Silk Data are happy to introduce a free application, powered by our proprietary technology, that makes navigating large collections of texts and documents as easy as using a world map.
The novel method of representing collections of texts and documents – called ‘semantic maps’ – was first developed by Silk Data engineers in 2018. That same year the first prototype was born, based on hundreds of thousands of Wikipedia articles.
The outcome seemed revolutionary. A large part of humanity’s knowledge was transformed into a compact, semantically structured, navigable map where you can move easily and fluidly between topics.
Each area of the map had its own meaning, so you could take in the knowledge contained in the Wikipedia articles by exploring neighboring regions and seeing how the keywords gradually change.
After successfully experimenting with Wikipedia, we applied our technology to several other datasets: medical articles, economic studies, and legal documents. The best thing about semantic maps is that the technology can be applied to any collection of texts and text-based documents. Semantic maps let you quickly discover the most important topics and reveal the logical connections between words and subjects.
COVID-19 dataset and a free demo
To demonstrate the capabilities of our semantic mapping technology and to provide a free, versatile tool to everyone who works with information on the coronavirus, we processed the recently published COVID-19 research dataset. The COVID-19 Open Research Dataset (CORD-19) was released by a group of research institutions and contains more than 33,000 full-text research papers, over 1 GB of text in total.
With a semantic map, researchers, clinicians, and everyone else involved in studying the novel coronavirus can quickly find the information or research they need, explore how different terms and topics are connected, and turn vast amounts of data into structured, readily applicable knowledge. We also believe that our technology can help experts and institutions better coordinate their work in studying the data and developing treatments.
How semantic maps work
Our technology automatically detects relations between texts, maps texts to semantically related keywords, and then generates a visual representation of the data. Every dot on the semantic map represents a text excerpt from a document and is characterized by several words that are very close in meaning. To visualize the relations, the semantic map forms ‘clots’, or dense areas, that correspond to the most important topics in a collection of documents. The process is mostly automatic: a human is only needed to categorize the generated topics, to store or bookmark the knowledge, or to communicate results to other team members.
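To make the idea more concrete, here is a minimal toy sketch in Python. It is not Silk Data's proprietary method: it swaps in plain TF-IDF keyword weights and cosine similarity as a stand-in for the semantic representation, and it groups excerpts whose similarity exceeds a threshold as a rough analogue of a dense area on the map. The function names, the stopword list, the threshold, and the sample excerpts are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Illustrative stopword list (an assumption, not part of any real engine).
STOPWORDS = {"the", "of", "in", "a", "and", "to", "is", "for", "on", "with"}

def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]

def tfidf_vectors(excerpts):
    """Turn each excerpt into a sparse {word: weight} vector using smoothed TF-IDF."""
    docs = [Counter(tokenize(t)) for t in excerpts]
    doc_freq = Counter(w for d in docs for w in d)  # number of excerpts containing each word
    n = len(docs)
    return [
        {w: tf * (math.log((1 + n) / (1 + doc_freq[w])) + 1) for w, tf in d.items()}
        for d in docs
    ]

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v[w] for w, weight in u.items() if w in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def group_by_similarity(vectors, threshold=0.2):
    """Group excerpts linked, directly or through a chain, by similarity >= threshold."""
    unvisited = set(range(len(vectors)))
    groups = []
    for start in range(len(vectors)):
        if start not in unvisited:
            continue
        unvisited.discard(start)
        group, stack = [], [start]
        while stack:
            i = stack.pop()
            group.append(i)
            linked = [j for j in unvisited if cosine(vectors[i], vectors[j]) >= threshold]
            for j in linked:
                unvisited.discard(j)
                stack.append(j)
        groups.append(sorted(group))
    return groups

def top_keywords(vectors, members, k=3):
    """Sum the member vectors and return the k heaviest words (ties broken alphabetically)."""
    total = Counter()
    for i in members:
        total.update(vectors[i])
    return [w for w, _ in sorted(total.items(), key=lambda p: (-p[1], p[0]))[:k]]

# Made-up sample excerpts, for illustration only.
excerpts = [
    "Spike protein binds the ACE2 receptor",
    "ACE2 receptor expression in lung cells",
    "Face masks reduce transmission in hospitals",
    "Transmission in hospitals and protective masks",
]
vectors = tfidf_vectors(excerpts)
for members in group_by_similarity(vectors):
    print(members, top_keywords(vectors, members))
```

Each printed group plays the role of one dense area, and its top keywords play the role of the words that characterize that region. A real map would additionally need a two-dimensional layout step, which this sketch leaves out.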
Our free demonstration currently offers the following features:
- Visually exploring the text collection by examining the keywords and related CORD-19 papers for any region of the map.
- Zooming into any selected area of the map to drill down into narrower keywords and their relations.
- Selecting a keyword to see how it is used in different contexts, which are shown as different map regions.
- Opening the full text of a selected scholarly paper in the PubMed Central reader or on the publisher’s website.
For more details on our application, please consult the Tutorial.
Semantic mapping is a powerful engine that can be used to solve complex text exploration and processing tasks, and it can be applied to any type of unstructured, text-based document.
Such text processing applications include:
- Versatile search over documents, considering the word contexts (topics).
- Semi-supervised categorization of documents.
- Semantic search, using short text (or a question) as a query.
- Highlighting different topics in a document, helping experts read it faster and analyze it in more depth.
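As an illustration of the search use case with a short text or question as the query, the sketch below ranks documents by cosine similarity to the query. It uses plain bag-of-words vectors, which is a purely lexical stand-in: a real semantic search would compare meaning rather than exact words, and nothing here reflects how the Silk Data engine is implemented. The function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Count lowercase alphanumeric tokens in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(u, v):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * v[w] for w, count in u.items() if w in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def search(query, documents, top_n=3):
    """Return up to top_n (document index, score) pairs, best match first."""
    q = bag_of_words(query)
    scored = [(cosine(q, bag_of_words(d)), i) for i, d in enumerate(documents)]
    scored = [(s, i) for s, i in scored if s > 0]  # drop documents with no overlap
    scored.sort(key=lambda p: (-p[0], p[1]))
    return [(i, round(s, 3)) for s, i in scored[:top_n]]

# Made-up sample documents, for illustration only.
documents = [
    "Incubation period of the novel coronavirus",
    "Vaccine candidates in clinical trials",
    "How long is the incubation period",
]
print(search("What is the incubation period?", documents))
```

Documents that share no words with the query are dropped, which also shows the limit of the lexical stand-in: a paper that discusses the same subject in different words would be missed.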