SiMa: Effective and Efficient Matching Across Data

This repo contains the code that implements SiMa, as described in "SiMa: Effective and Efficient Matching Across Data Silos Using Graph Neural Networks".

Repo structure

  • src: Contains the Python source files used to develop SiMa and to produce the effectiveness results.
  • notebooks: Contains the matching_data_silos.ipynb notebook, which walks through the SiMa pipeline step by step. It runs SiMa on the data silo configuration derived from NYC OpenData (the one used in the paper) and shows how SiMa compares with COMA and Starmie in terms of effectiveness.
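
For readers who want a feel for what an effectiveness comparison involves before opening the notebook, here is a minimal sketch. It assumes effectiveness is measured as precision, recall and F1 over predicted column matches against a ground truth; this README does not state which metrics the repo uses. The function names and the column identifiers are hypothetical and are not taken from src.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Pair = Tuple[str, str]


def normalize(pairs: Iterable[Pair]) -> Set[FrozenSet[str]]:
    """Treat matches as unordered, so (a, b) and (b, a) count as the same match."""
    return {frozenset(p) for p in pairs}


def effectiveness(predicted: Iterable[Pair], ground_truth: Iterable[Pair]) -> Dict[str, float]:
    """Precision, recall and F1 of predicted column matches (assumed metrics)."""
    pred = normalize(predicted)
    truth = normalize(ground_truth)
    true_positives = len(pred & truth)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical column identifiers in the form silo.table.column.
    truth = [("siloA.trees.borough", "siloB.parks.boro"),
             ("siloA.trees.zip", "siloB.parks.zipcode")]
    predicted = [("siloB.parks.boro", "siloA.trees.borough"),
                 ("siloA.trees.status", "siloB.parks.name")]
    print(effectiveness(predicted, truth))
```

The same function can be applied to the matches produced by each method being compared, as long as all of them are scored against the same ground truth.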

Datasets and datafiles

Datasets and datafiles used can be found here.