As one of several publishers of the German Yellow Pages, we need to categorise new, very large datasets. We discuss our move from a rule-based system built on a knowledge graph towards a machine-learning approach, as well as the benefits of combining the two.
Using real-world data, we explain how each of these approaches works.