
English Faculty Building
University of Cambridge
9 West Road, Cambridge CB3 9DP
United Kingdom
Nigel’s main research interests centre on machine learning for Natural Language Processing (NLP). He is active in information extraction, text mining, NLP for the Web and social media, and NLP with ontologies. He has a special interest in applications to biology, medicine and public health.
He joined the Department of Theoretical and Applied Linguistics in 2015 as Director of Research in Computational Linguistics and is currently both a University Lecturer and an EPSRC Experienced Research Fellow. He has a joint affiliation with the Alan Turing Institute for data science and artificial intelligence, where he holds a fellowship. From 2012 to 2014 he was a Marie Curie Research Fellow at the European Bioinformatics Institute in Cambridge, and prior to this he was an Associate Professor at the National Institute of Informatics in Tokyo, where he led the NLP laboratory. He received his doctorate from the University of Manchester in 1996 and held post-doctoral positions at Toshiba Corporation and the University of Tokyo. His research has been funded by UK, EU and Japanese research councils (JSPS, JST, FP7, EPSRC, MRC).
Computational Linguistics
Natural Language Processing
Text mining
Veracity detection on the Web and social media
Grounding of language to ontologies
Deep learning models of natural language processing
2015 – 2020, SIPHS (EPSRC funded), Semantic interpretation of personal health messages
2012 – 2014, PhenoMiner (EC FP7 funded), Semantic mining of phenotype associations from the scientific literature
2006 – 2012, BioCaster (JST funded), Detecting public health rumors with a Web-based text mining system
Recent publications include:
- Pilehvar, M. T., Prokhorov, V., Kartsaklis, D. and Collier, N. (2018), “CARD-660: A Reliable Evaluation Framework for Rare Word Representation Models”, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium (in press).
- Kartsaklis, D., Pilehvar, M. T. and Collier, N. (2018), “Mapping Text to Knowledge Graph Entities using Multi-Sense LSTMs”, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium (in press).
- Le, H. Q., Can, D. C., Vu, T. S., Dang, T. H., Pilehvar, M. T. and Collier, N. (2018), “Large-scale Exploration of Neural Relation Classification Architectures”, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium (in press).
- Gritta, M., Pilehvar, M. T. and Collier, N. (2018), “Which Melbourne? Augmenting Geocoding with Maps”, in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, pp. 1285-1296.
- Conforti, C., Pilehvar, M. T. and Collier, N. (2018), “Towards Automatic Fake News Detection: Asymmetric Stance Detection in News Articles”, in Proceedings of the First Workshop on Fact Extraction and Verification at EMNLP 2018, Brussels, Belgium (in press).
- Conforti, C., Pilehvar, M. T. and Collier, N. (2018), “Modeling the Fake News Challenge as an Asymmetric Stance Detection Task”, in Proceedings of the 2nd International Workshop on Rumours and Deception in Social Media (RDSM) at CIKM 2018, Turin, Italy (in press).
- Gritta, M., Pilehvar, M. T., Limsopatham, N. and Collier, N. (2017), “Vancouver Welcomes You! Minimalist Location Metonymy Resolution”, in Proceedings of the Association for Computational Linguistics Annual Meeting (ACL 2017), Vancouver, Canada, pp. 1248-1259.
- Gritta, M., Pilehvar, M. T., Limsopatham, N. and Collier, N. (2017), “What’s missing in geographical parsing?”, Language Resources and Evaluation, pp. 1-21.
- Pilehvar, M. T., Camacho-Collados, J., Navigli, R. and Collier, N. (2017), “Towards a Seamless Integration of Word Senses into Downstream NLP Applications”, in Proceedings of the Association for Computational Linguistics Annual Meeting (ACL 2017), Vancouver, Canada, August, pp. 1857-1869.
- Pilehvar, M. T. and Collier, N. (2017), “Inducing Representations for Rare Words by Leveraging Lexical Resources”, in Proceedings of the European Chapter of the Association for Computational Linguistics (EACL), Valencia, Spain, pp. 388-393.