
Generalized Pipeline Evaluation for Entity Matching
In Spring 2025, ReLU built on the findings of the Fall 2024 Entity Matching (EM) project to develop a system for finding effective EM pipelines across datasets.
Throughout the semester, we integrated Weights and Biases (wandb) with our EM pipelines to quantitatively assess different prompting techniques and pipeline designs, covering both single-step and multi-step approaches. We explored a wide range of strategies and models, and ran tests on datasets of varying complexity.
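As an illustration of what a quantitative comparison of pipelines can look like, the following minimal Python sketch scores candidate pipelines by precision, recall, and F1 on labeled record pairs. It is not the project's actual code: the two rule-based matchers, the record pairs, and all function names are hypothetical stand-ins for the prompt-based pipelines, and using F1 as the selection metric is an assumption.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Pair = Tuple[Record, Record]
Pipeline = Callable[[Record, Record], bool]


def f1_report(predictions: List[bool], labels: List[bool]) -> Dict[str, float]:
    """Compute precision, recall, and F1 for binary match decisions."""
    tp = sum(p and y for p, y in zip(predictions, labels))
    fp = sum(p and not y for p, y in zip(predictions, labels))
    fn = sum((not p) and y for p, y in zip(predictions, labels))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical stand-in pipelines; a real pipeline would call a model.
def exact_name_match(a: Record, b: Record) -> bool:
    return a["name"] == b["name"]


def normalized_name_match(a: Record, b: Record) -> bool:
    return a["name"].strip().lower() == b["name"].strip().lower()


def evaluate_pipelines(
    pipelines: Dict[str, Pipeline], pairs: List[Pair], labels: List[bool]
) -> Dict[str, Dict[str, float]]:
    """Run every pipeline on the same labeled pairs and collect its metrics."""
    results = {}
    for name, pipeline in pipelines.items():
        predictions = [pipeline(a, b) for a, b in pairs]
        results[name] = f1_report(predictions, labels)
        # With wandb installed and configured, the metrics could be tracked
        # here, e.g. by calling wandb.log(results[name]) inside a run that
        # was started with wandb.init(...).
    return results


# Made-up toy data: (record A, record B) pairs with ground-truth match labels.
PAIRS: List[Pair] = [
    ({"name": "Acme Energy"}, {"name": "acme energy "}),
    ({"name": "Nordic Oil"}, {"name": "Nordic Oil"}),
    ({"name": "Nordic Oil"}, {"name": "Nordic Gas"}),
    ({"name": "Delta Wind"}, {"name": "DELTA WIND"}),
]
LABELS: List[bool] = [True, True, False, True]

RESULTS = evaluate_pipelines(
    {"exact": exact_name_match, "normalized": normalized_name_match},
    PAIRS,
    LABELS,
)
# Pick the pipeline with the highest F1 on this dataset.
BEST = max(RESULTS, key=lambda name: RESULTS[name]["f1"])
```

Running every candidate on the same labeled pairs keeps the comparison fair, and repeating the evaluation per dataset shows whether a pipeline that does well on one dataset also does well on others.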
Our final deliverable was a system for finding high-performing EM pipelines quantitatively. Our report outlined our best-performing pipeline, the insights and test results we gathered on the effects of different strategies and techniques, and instructions for using the system to test a new dataset or strategy.
-
Rystad Energy is a leading global energy research and business intelligence firm, renowned for its extensive databases and in-depth analysis across the oil, gas, and renewable energy sectors. Since its founding 20 years ago, Rystad has delivered consulting and analytics services to a wide array of organizations and has established a worldwide presence, with offices in Oslo, New York, London, Singapore, Rio de Janeiro, Beijing, and Sydney.