Google makes datasets easier to find online

Researchers and academics will now have an easier time finding datasets online: Google’s Dataset Search is out of beta and includes new tools for filtering searches across close to 25m datasets.

Dataset Search first launched in 2018 as part of the company’s goal to put an end to the fragmentation of open-access data. 

While many universities, governments and labs publish their data online, it is often difficult to find using traditional search engines. However, by adding open source metadata tags to their web pages, these groups can have their data indexed by Google’s Dataset Search.
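As a rough illustration of what such tagging can look like, the Python sketch below builds a dataset description using the schema.org Dataset vocabulary in JSON-LD and wraps it in a script tag that could be placed in a page’s HTML. The article does not name the vocabulary, so the choice of schema.org is an assumption here, and the helper functions, the dataset name and the example.org URL are all hypothetical.

```python
import json


def dataset_jsonld(name, description, url, license_url=None, keywords=None):
    """Build a schema.org Dataset description as a JSON-LD dictionary.

    The field selection is an illustrative minimum, not an official checklist.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    # Optional properties are only added when the publisher supplies them.
    if license_url:
        record["license"] = license_url
    if keywords:
        record["keywords"] = list(keywords)
    return record


def to_script_tag(record):
    """Wrap the JSON-LD dictionary in a script tag for embedding in a web page."""
    body = json.dumps(record, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical dataset used purely for illustration.
example = dataset_jsonld(
    name="Example rainfall measurements",
    description="Hypothetical daily rainfall totals from a university weather station.",
    url="https://example.org/datasets/rainfall",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["weather", "rainfall"],
)
tag = to_script_tag(example)
print(tag)
```

Whether a page tagged this way is actually picked up depends on the search engine’s own crawling and validation, which this sketch does not cover.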

Although the search giant did not share any specific usage figures for Dataset Search, the company says that “hundreds of thousands of users” have tried it out since its launch and that the tool has received positive support from the scientific community.

The Verge spoke with Natasha Noy, a research scientist at Google AI who helped create the tool. She said that “most [data] repositories have been very responsive” and that Dataset Search has even encouraged older scientific institutions to take “publishing metadata more seriously”.

Now that the tool is out of beta, Google has added new features, including the ability to filter data by type (tables, images, text, etc.), by whether it is free to use and by the geographic area it covers. Dataset Search is also now available on mobile and has expanded dataset descriptions.

According to Google, the tool’s search engine covers almost 25m datasets, though this is only a “fraction of datasets on the web”. The largest topics indexed by Dataset Search include geosciences, biology and agriculture, while the most common queries include education, weather, cancer, crime, soccer and dogs.

Making data available to users is what Google does best and the company plans to continue to add more datasets to Dataset Search.

Via The Verge