corpkit grew organically out of the code I had developed to make sense of the data I encountered in my research projects. I made a basic command line interface for interrogating, editing and plotting corpora, and described it alongside my research project findings whenever I presented my research. Many people seemed interested in some of the tools and methods, but most didn’t know how to program. So, for them, I’ve put together this graphical interface, as well as its documentation.

The best way to look at corpkit is as a rickety little open-source bridge between applied, systemic, corpus and computational linguistics.

Testers, contributors, helpers, friends

Special thanks to James Davidson, who has contributed documentation to the project. Also to Marvin Lam, Sigrid Klerke, and anybody else who has been using corpkit in its early, buggy stages without telling me. Stephen Skalicky’s comprehensive testing and feature suggestions have proven invaluable. There’s also Jin Huang, who absolutely nailed the “interro-gator” logo. Finally, Mick O’Donnell (creator of UAM Corpus Tool) deserves a lot of thanks, not only for his advice, but also for his tireless commitment to developing software for functional linguists.


I’m Daniel McDonald, a PhD student and Research Fellow at the University of Melbourne. Currently, though, I’m a visiting researcher at Saarland Uni, Germany, where I’m very lucky to get to work on some really awesome, geeky stuff.

In general, most of my research revolves around looking for lexicogrammatical and discourse-semantic patterns and changes in large datasets. More and more, though, I’m shifting toward the computational side of things, and am currently very interested in discourse/semantic annotation.

I’m fortunate enough to work on a number of cool projects related to corpus linguistics, computational linguistics and systemic functional linguistics, as well as open science more generally. I develop resources for teaching researchers to code, and teach linguistics (slides for my most recent class are available online). I’m on Twitter, too.