
Exploiting historical registers: Automatic methods for coding c19th and c20th cause of death descriptions to standard classifications
Digitising Scotland
Carson, J., Kirby, G., Dearle, A., Williamson, L., Garrett, E., Reid, A. & Dibben, C. (2013) NTTS (New Techniques and Technologies for Statistics) 15-17 March 2013
Other information:
The increasing availability of digitised registration records presents a significant opportunity for research. Returning to the original records allows researchers to classify descriptions, such as cause of death, according to modern medical understanding of illness and disease, rather than relying on the classifications made by registrars at the time.
Linking an individual’s records also allows the production of sparse life-course micro-datasets. Linking these further into family units makes it possible to reconstruct family structures and produce multi-generational studies. We describe work to develop a method for automatically coding the causes of death from 8.3 million Scottish death certificates to standard classifications. We have evaluated a range of approaches using text processing and supervised machine learning, obtaining accuracies between 72% and 96% on several test sets. We present results and speculate on further development that may be needed to classify the full data set.
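To make the general idea of supervised coding of free-text causes of death concrete, the following is a minimal sketch only. It is not the method evaluated in the paper, which the abstract does not specify. It assumes a bag-of-words multinomial naive Bayes classifier, and the training strings, category labels and function names (`tokenize`, `train`, `predict`, `accuracy`) are all invented for illustration; they are not taken from the Scottish records or from any standard classification.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy training data: (free-text description, illustrative category).
# Neither the strings nor the labels come from the real death certificates.
TOY_TRAINING = [
    ("phthisis pulmonalis", "tuberculosis"),
    ("pulmonary tuberculosis", "tuberculosis"),
    ("chronic bronchitis", "respiratory"),
    ("acute bronchitis", "respiratory"),
    ("heart disease", "circulatory"),
    ("valvular disease of heart", "circulatory"),
]


def tokenize(text):
    """Text processing step: lower-case and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count classes and per-class token frequencies from labelled examples."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-probability (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, labelled):
    """Fraction of labelled test descriptions assigned the expected class."""
    correct = sum(predict(model, text) == label for text, label in labelled)
    return correct / len(labelled)


model = train(TOY_TRAINING)
print(predict(model, "Bronchitis, acute"))
```

In practice a classifier like this would be trained on a large hand-coded sample and scored on held-out test sets, which is the kind of evaluation the accuracy figures above refer to.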
Conference proceeding/paper