Department of Theoretical and Applied Linguistics
English Faculty Building
University of Cambridge
9 West Road
Nigel’s research is in the broad area of Natural Language Processing and Computational Linguistics. It brings together computational techniques such as machine learning, syntactic parsing and concept understanding, with the aim of providing a machine-understandable semantic representation of text. This representation supports real-world tasks such as question answering and knowledge discovery from very large-scale data sources, e.g. the World Wide Web.
Nigel works in collaboration with colleagues from computer science, the life sciences and linguistics.
- Human language technologies
- Computational linguistics
- Machine learning
- Text/data mining
- Knowledge discovery
- Domain adaptation
- Question answering
2015 – 2020, SIPHS (EPSRC funded), Semantic interpretation of personal health messages
2012 – 2014, PhenoMiner (EC FP7 funded), Semantic mining of phenotype associations from the scientific literature
2006 – 2012, BioCaster (JST funded), Detecting public health rumors with a Web-based text mining system
Pilehvar, M. T. and Collier, N. (2016), De-conflating semantic representations of words by exploiting knowledge from semantic networks, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), Austin, USA, November 1st to 5th (in press).
Limsopatham, N. and Collier, N. (2016), Normalising medical concepts in social media texts by learning semantic representation, in Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2016), Berlin, Germany, August 1st to 7th, pp. 1014-1023.
Limsopatham, N. and Collier, N. (2015), Adapting phrase-based machine translation to normalize medical terms in social media messages, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, pp. 1675-1680.
Lofi, C., Nieke, C. and Collier, N. (2014), Discriminating rhetorical analogies in social media, Conference of the European Chapter of the Association for Computational Linguistics (EACL), Gothenburg, Sweden, April 26-30, pp. 560-568.
Collier, N., Tran, M., Le, H., Ha, Q., Oellrich, A. and Rebholz-Schuhmann, D. (2013), Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking, PLoS One 8(10): e72965.
Bao, Y., Collier, N. and Datta, A. (2013), A partially supervised cross-collection topic model for cross-domain text classification, ACM Conference on Information and Knowledge Management, San Francisco, USA, October 27-November 1, pp. 239-248.
Collier, N., Son, N. T., & Nguyen, N. M. (2011), OMG U got flu? Analysis of shared health messages for bio-surveillance. Journal of Biomedical Semantics, 2(S-5), S9.
Hay, S. I., Battle, K. E., Pigott, D. M., Smith, D. L., Moyes, C. L., Bhatt, S., Brownstein, J. S., Collier, N., Myers, M. F., George, D. B. & Gething, P. W. (2013), Global mapping of infectious disease. Philosophical Transactions of the Royal Society B: Biological Sciences, 368(1614), 20120250.
Lau, J. H., Collier, N., & Baldwin, T. (2012), On-line Trend Analysis with Topic Models: #twitter Trends Detection Topic Model Online. 24th International Conference on Computational Linguistics (COLING), Bombay, India, December 8-15, pp. 1519-1534.
Chanlekha, H., Kawazoe, A. & Collier, N. (2010), A framework for enhancing spatial and temporal granularity in report-based health surveillance systems. BMC Medical Informatics and Decision Making, 10(1), 1.
Collier, N. (2010), What’s unusual in online disease outbreak news? Journal of Biomedical Semantics, 1:2.