Detecting data errors: Where are we and what needs to be done? Z Abedjan, X Chu, D Deng, RC Fernandez, IF Ilyas, M Ouzzani, P Papotti, ... Proceedings of the VLDB Endowment 9 (12), 993-1004, 2016 | 310 | 2016 |
Pass-Join: A partition-based method for similarity joins G Li, D Deng, J Wang, J Feng Proceedings of the VLDB Endowment 5 (3), 253-264, 2011 | 253 | 2011 |
JOSIE: Overlap set similarity search for finding joinable tables in data lakes E Zhu, D Deng, F Nargesian, RJ Miller Proceedings of the 2019 International Conference on Management of Data, 847-864, 2019 | 213 | 2019 |
The Data Civilizer System. D Deng, RC Fernandez, Z Abedjan, S Wang, M Stonebraker, ... 8th Biennial Conference on Innovative Data Systems Research (CIDR ’17), 2017 | 201 | 2017 |
String similarity search and join: a survey M Yu, G Li, D Deng, J Feng Frontiers of Computer Science, 1-19, 2015 | 178 | 2015 |
MassJoin: A MapReduce-based method for scalable string similarity joins D Deng, G Li, S Hao, J Wang, J Feng 2014 IEEE 30th International Conference on Data Engineering, 340-351, 2014 | 163 | 2014 |
Cost-effective crowdsourced entity resolution: A partial-order approach C Chai, G Li, J Li, D Deng, J Feng Proceedings of the 2016 ACM SIGMOD International Conference on Management …, 2016 | 114 | 2016 |
An efficient partition based method for exact set similarity joins D Deng, G Li, H Wen, J Feng Proceedings of the VLDB Endowment 9 (4), 360-371, 2015 | 108 | 2015 |
Top-k string similarity search with edit-distance constraints D Deng, G Li, J Feng, WS Li 2013 IEEE 29th International Conference on Data Engineering (ICDE), 925-936, 2013 | 81 | 2013 |
A pivotal prefix based filtering algorithm for string similarity search D Deng, G Li, J Feng Proceedings of the 2014 ACM SIGMOD International Conference on Management of …, 2014 | 76 | 2014 |
Distributed graph simulation: Impossibility and possibility W Fan, X Wang, Y Wu, D Deng Proceedings of the VLDB Endowment (PVLDB) 7 (12), 1083-1094, 2014 | 70 | 2014 |
Scalable column concept determination for web tables using large knowledge bases D Deng, Y Jiang, G Li, J Li, C Yu Proceedings of the VLDB Endowment 6 (13), 1606-1617, 2013 | 69 | 2013 |
Faerie: efficient filtering algorithms for approximate dictionary-based entity extraction G Li, D Deng, J Feng Proceedings of the 2011 ACM SIGMOD International Conference on Management of …, 2011 | 61 | 2011 |
Overlap set similarity joins with theoretical guarantees D Deng, Y Tao, G Li Proceedings of the 2018 International Conference on Management of Data, 905-920, 2018 | 58 | 2018 |
Efficient similarity join and search on multi-attribute data G Li, J He, D Deng, J Li Proceedings of the 2015 ACM SIGMOD International Conference on Management of …, 2015 | 53 | 2015 |
State-of-the-art in string similarity search and join S Wandelt, D Deng, S Gerdjikov, S Mishra, P Mitankin, M Patil, E Siragusa, ... ACM SIGMOD Record 43 (1), 64-76, 2014 | 53 | 2014 |
Two birds with one stone: An efficient hierarchical framework for top-k and threshold-based string similarity search J Wang, G Li, D Deng, Y Zhang, J Feng 2015 IEEE 31st International Conference on Data Engineering, 519-530, 2015 | 51 | 2015 |
SilkMoth: An efficient method for finding related sets with maximum matching constraints D Deng, A Kim, S Madden, M Stonebraker Proceedings of the VLDB Endowment 10 (10), 1082-1093, 2017 | 46 | 2017 |
Database decay and how to avoid it M Stonebraker, D Deng, ML Brodie 2016 IEEE International Conference on Big Data (Big Data), 7-16, 2016 | 40 | 2016 |
Efficient parallel partition-based algorithms for similarity search and join with edit distance constraints Y Jiang, D Deng, J Wang, G Li, J Feng Proceedings of the Joint EDBT/ICDT 2013 Workshops, 341-348, 2013 | 40 | 2013 |