Automatic classification for occupation and cause of death strings

Dr Graham Kirby (2014)

Other information: The software uses Apache Mahout machine learning algorithms to classify previously unseen strings from historical records. The models are trained on a sample of records that have already been classified by hand. It has been used to assign HISCO codes to occupation strings and ICD-10 codes to cause-of-death strings. The software is open source.
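The train-then-classify idea described above can be illustrated with a minimal sketch. This is not the project's code and does not use Apache Mahout: it assumes a simple token-based naive Bayes classifier written in plain Python. The function names (`tokenize`, `train`, `classify`), the sample strings, and the labels are all hypothetical, and the labels are placeholders rather than real HISCO or ICD-10 codes.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; real records would need more cleaning.
    return text.lower().split()


def train(records):
    """Count tokens per label from (string, label) pairs."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in records:
        label_counts[label] += 1
        for token in tokenize(text):
            token_counts[label][token] += 1
            vocabulary.add(token)
    return label_counts, token_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-probability under naive Bayes."""
    label_counts, token_counts, vocabulary = model
    total_records = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_records)
        label_total = sum(token_counts[label].values())
        for token in tokenize(text):
            # Laplace (add-one) smoothing so unseen tokens do not zero out a label.
            numerator = token_counts[label][token] + 1
            score += math.log(numerator / (label_total + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical hand-classified sample; labels are placeholders, not real codes.
training_records = [
    ("farm labourer", "AGRICULTURE"),
    ("farmer", "AGRICULTURE"),
    ("hand loom weaver", "TEXTILES"),
    ("cotton weaver", "TEXTILES"),
]
model = train(training_records)
print(classify(model, "farm servant"))
```

In this sketch an unseen string such as "farm servant" is matched to a label through the tokens it shares with the hand-classified sample. A production system would need far more training data and more careful text normalisation.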

Available online: Link
