Entity Embeddings

Pre-trained entity embeddings are available for all data sets of the Data Set Knowledge Graph. They were trained with RDF2Vec (model: skip-gram, random walk, dim: 128, min count: 5, window: 5, epochs: 10).
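RDF2Vec works in two stages: it extracts random walks from the graph, then feeds those walks as "sentences" to a word2vec-style model (here skip-gram). The sketch below illustrates only the walk-extraction stage on a toy graph. The graph, the `linksTo` predicate, the function name, and the walk count and depth are hypothetical and chosen for illustration; the actual walk settings used for these embeddings are not stated here.

```python
import random

# Hypothetical toy graph: subject -> list of (predicate, object) edges.
TOY_GRAPH = {
    "DBpedia": [("linksTo", "Wikidata"), ("linksTo", "YAGO")],
    "Wikidata": [("linksTo", "DBpedia"), ("linksTo", "Freebase")],
    "YAGO": [("linksTo", "DBpedia")],
    "Freebase": [],
}


def random_walks(graph, start, num_walks, depth, rng):
    """Generate `num_walks` random walks from `start`, each following at most `depth` edges."""
    walks = []
    for _ in range(num_walks):
        walk = [start]
        node = start
        for _ in range(depth):
            edges = graph.get(node, [])
            if not edges:
                break  # dead end: the walk stops early
            predicate, node = rng.choice(edges)
            walk.extend([predicate, node])
        walks.append(walk)
    return walks


rng = random.Random(42)
# Three walks of depth 2 per entity (illustrative values only).
walks = [
    walk
    for entity in TOY_GRAPH
    for walk in random_walks(TOY_GRAPH, entity, num_walks=3, depth=2, rng=rng)
]
```

Each walk is a token sequence of alternating entities and predicates. In a full pipeline these sequences would be passed to a skip-gram model configured with the parameters listed above (dim: 128, window: 5, min count: 5, epochs: 10).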

Example: DBpedia vs. Wikidata

Distance between Wikidata and DBpedia: 0.0794
Similarity between Wikidata and DBpedia: 0.9206
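The two values above sum to 1 (0.0794 + 0.9206), which is what one would expect if the similarity is cosine similarity and the distance is defined as 1 minus that similarity. A minimal sketch of this relationship, assuming cosine-based measures and using made-up 3-dimensional vectors in place of the real 128-dimensional embeddings:

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Cosine distance, defined as 1 - cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


# Hypothetical toy vectors, not taken from the actual embeddings.
a = [1.0, 2.0, 0.5]
b = [0.9, 2.1, 0.7]

sim = cosine_similarity(a, b)
dist = cosine_distance(a, b)

# The reported figures satisfy the same identity (up to rounding).
reported_similarity = 0.9206
reported_distance = 0.0794
consistent = abs((1.0 - reported_similarity) - reported_distance) < 1e-9
```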

Top 10 most similar to DBpedia (according to most_similar()):
  • Freebase 0.9711188077926636
  • YAGO 0.9587806463241577
  • Wikidata 0.9205970168113708
  • BabelNet 0.8259806036949158
  • British National Corpus 0.8120617866516113
  • DrugBank 0.8102672100067139
  • ontology alignment 0.8098002076148987
  • Social Science Research Network 0.8095744848251343
  • Google Books 0.8072085380554199
  • Online Mendelian Inheritance in Man 0.8065714836120605

Position of Wikidata in this ranking: 3 (similarity)

Top 10 most similar to Wikidata (according to most_similar()):
  • YAGO 0.9815795421600342
  • Freebase 0.9716936349868774
  • DrugBank 0.9684451222419739
  • BabelNet 0.9648255109786987
  • ontology alignment 0.962468147277832
  • British National Corpus 0.9600380659103394
  • Online Mendelian Inheritance in Man 0.9587374329566956
  • Google Books 0.9566649198532104
  • ChEBI 0.9561185240745544
  • Chemical Entities of Biological Interest 0.9538220763206482

Position of DBpedia in this ranking: 46 (similarity)
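The similarity between two entities is symmetric, but their positions in each other's rankings need not be: Wikidata is 3rd for DBpedia, while DBpedia is only 46th for Wikidata, because Wikidata has many neighbors that are even closer to it. The sketch below reproduces this effect on toy data. The entity names `A` to `D`, the 2-dimensional vectors, and the helper functions are hypothetical; `most_similar` here is a simplified stand-in, not the library function of the same name.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(vectors, key):
    """Return (name, similarity) pairs for all other entities, most similar first."""
    query = vectors[key]
    scored = [
        (name, cosine_similarity(query, vec))
        for name, vec in vectors.items()
        if name != key
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def position(vectors, key, target):
    """1-based position of `target` in the similarity ranking of `key`."""
    ranking = [name for name, _ in most_similar(vectors, key)]
    return ranking.index(target) + 1


def at_angle(degrees):
    """Unit vector in 2D at the given angle (toy data helper)."""
    r = math.radians(degrees)
    return [math.cos(r), math.sin(r)]


# A sits apart; B, C and D form a tight cluster.
toy = {"A": at_angle(0), "B": at_angle(30), "C": at_angle(40), "D": at_angle(50)}

pos_b_for_a = position(toy, "A", "B")  # B is A's closest entity
pos_a_for_b = position(toy, "B", "A")  # but C and D are closer to B than A is
```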

Top 10 nearest neighbors of DBpedia (according to NearestNeighbors):
  • SwissLipids
  • CycleBase
  • Proteome Inc.
  • Information Artifact Ontology
  • Social Security Applications and Claims Index
  • SILVA ribosomal RNA database
  • Star Cluster Simulations
  • Ontobee
  • General Social Survey, 2006
  • Internet Broadway Database

Position of Wikidata in this ranking: 1274 (nearest neighbors)

Top 10 nearest neighbors of Wikidata (according to NearestNeighbors):
  • Information Artifact Ontology
  • Social Security Applications and Claims Index
  • SILVA ribosomal RNA database
  • Minutiae Sample
  • Orlando
  • GeoNames
  • Database of Vascular Plants of Canada
  • miRBase
  • Open PHACTS Discovery Platform
  • Micro-Loans

Position of DBpedia in this ranking: 1077 (nearest neighbors)
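The nearest-neighbor rankings differ sharply from the similarity rankings (positions 1274 and 1077 versus 3 and 46). One possible contributing factor, which is an assumption and not confirmed here, is the distance metric: if NearestNeighbors is scikit-learn's class used with its default settings, it ranks by Euclidean distance, which is sensitive to vector length, whereas a cosine-based ranking only considers direction. The sketch below shows, on hypothetical 2-dimensional toy vectors, that the two metrics can order neighbors differently, and that they agree once the vectors are normalized to unit length.

```python
import math


def norm(v):
    return math.sqrt(sum(a * a for a in v))


def normalize(v):
    """Scale v to unit length."""
    n = norm(v)
    return [a / n for a in v]


def cosine_similarity(u, v):
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_by_cosine(vectors, key):
    """Other entities ordered by decreasing cosine similarity to `key`."""
    others = [name for name in vectors if name != key]
    return sorted(others, key=lambda n: cosine_similarity(vectors[key], vectors[n]), reverse=True)


def rank_by_euclidean(vectors, key):
    """Other entities ordered by increasing Euclidean distance to `key`."""
    others = [name for name in vectors if name != key]
    return sorted(others, key=lambda n: euclidean_distance(vectors[key], vectors[n]))


# Hypothetical toy data: x points the same way as q but is much longer;
# y is close to q in space but points in a different direction.
toy = {"q": [1.0, 0.0], "x": [10.0, 0.0], "y": [0.9, 0.5]}

cosine_order = rank_by_cosine(toy, "q")
euclidean_order = rank_by_euclidean(toy, "q")

# After unit-length normalization, Euclidean ranking matches cosine ranking.
unit = {name: normalize(vec) for name, vec in toy.items()}
euclidean_order_normalized = rank_by_euclidean(unit, "q")
```

If comparable rankings are wanted, either normalizing the embeddings first or selecting a cosine metric for the nearest-neighbor search would be the usual options. Whether this accounts for the differences reported above cannot be determined from the listed results alone.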