Datasets

We provide some of the most useful and popular datasets from the LOD cloud in HDT format so that you can use them easily. If the dataset you need is not available here, you can create your own HDT file or ask the data provider to publish their datasets in HDT format for the whole community to enjoy.

We serve more than 7 billion triples in 50 GB of HDT.gz files; the same data in plain N-Triples would take more than 1 TB.
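To put these figures in perspective, the short sketch below derives the approximate size ratio and the average bytes per triple. It uses only the rounded numbers quoted above and assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes). Because both the triple count and the N-Triples size are stated as "more than", the results are rough estimates, not measurements. The function and constant names are illustrative.

```python
# Rounded figures quoted in the text; treat every result as a rough estimate.
TRIPLES = 7e9           # "more than 7 billion triples"
HDT_GZ_BYTES = 50e9     # 50 GB of HDT.gz files (assumption: decimal gigabytes)
NTRIPLES_BYTES = 1e12   # "more than 1 TB" of plain N-Triples (assumption: decimal terabytes)


def size_ratio(uncompressed_bytes: float, compressed_bytes: float) -> float:
    """How many times larger the uncompressed data is than the compressed data."""
    return uncompressed_bytes / compressed_bytes


def bytes_per_triple(total_bytes: float, triples: float) -> float:
    """Average number of bytes used to store one triple."""
    return total_bytes / triples


if __name__ == "__main__":
    ratio = size_ratio(NTRIPLES_BYTES, HDT_GZ_BYTES)
    print(f"N-Triples is roughly {ratio:.0f}x larger than HDT.gz")
    print(f"HDT.gz:    ~{bytes_per_triple(HDT_GZ_BYTES, TRIPLES):.1f} bytes per triple")
    print(f"N-Triples: ~{bytes_per_triple(NTRIPLES_BYTES, TRIPLES):.1f} bytes per triple")
```

With these inputs the ratio works out to about 20 to 1. The per-triple averages are sensitive to the rounding of the quoted totals.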

| Dataset | Size | Triples | Details | Provenance |
|---|---|---|---|---|
| Semantic Web Dog Food | 2.3 MB | 242K | 2012-11-28 dump | SWDF |
| DBLP Computer Science Bibliography | 286 MB | 55M | 2012-11-28 dump | Faceted DBLP project |
| Wiktionary English | 212 MB | 64M | | Wiktionary download page |
| Wiktionary French | 124 MB | 32M | | |
| Wiktionary German | 23 MB | 5M | | |
| Wiktionary Russian | 40 MB | 12M | | |
| WordNet 3.0 | 26 MB | 8M | All the Turtle files found in the Git repository as of 2013-03-20 | Princeton WordNet page |
| WordNet 3.1 | 23 MB | 5.5M | Generated from the 3.1 N-Triples dump on 2014-04-16 | Princeton WordNet 3.1 in RDF |
| Geonames | 344 MB | 123M | 2012-11-11 dump | Geonames official dump |
| LinkedGeoData | 461 MB | 129M | 2012-10-09 dump | LinkedGeoData download page |
| DBpedia 2015-04 English (new dataset) | 4.7 GB | 837M | All canonicalized datasets (and links to other datasets) together in one big file | Official DBpedia website |
| DBpedia 3.9 English | 2.4 GB | 474M | All canonicalized datasets together in one big file | Official DBpedia website |
| DBpedia 3.8 English | 2.8 GB | 431M | All canonicalized datasets together in one big file | Official DBpedia website |
| DBpedia 3.8 English, no long abstracts | 2.4 GB | 427M | A reduced version without long abstracts | |
| DBpedia 3.8 English, by section | | | One HDT file per section | |
| DBpedia 3.8 English, external links | | | Links to other datasets | |
| YAGO2s Knowledge Base | 903 MB | 159M | 2013-05-08 dump | TTL dump of Max Planck Institut Informatik |
| Freebase | 11 GB | 2067M | 2013-12-01 dump | The official Freebase dump in RDF |
| Freebase | 2.6 GB | 770M | 2012-11-09 dump, converted to RDF using Castagna’s Freebase2RDF | The official quadruples dump |
| Wikidata | 2.1 GB | 367M | 2014-05-26 dump | Wikidata RDF exports |

Disclaimer: These datasets were downloaded from the public websites of their respective maintainers and converted to HDT. We are not responsible for the accuracy of their content. Check the respective licenses on the maintainers' own sites for details on usage rights.

We would like to thank the Unit for Information Mining and Retrieval in DERI for allowing us to use their servers for converting these big datasets.