Quantum natural language processing toolkit released
Named after the late mathematician and linguist Joachim Lambek, the toolkit – called lambeq – is capable of converting sentences into a quantum circuit. It is designed to accelerate the development of practical, real-world QNLP applications, such as automated dialogue, text mining, language translation, text-to-speech, language generation and bioinformatics.
lambeq has been released on a fully open-sourced basis for the benefit of the world’s quantum computing community and the rapidly growing ecosystem of quantum computing researchers, developers, and users. lambeq works seamlessly with the company’s TKET, a leading quantum software development platform that is also fully open-sourced. This provides QNLP developers with access to the broadest possible range of quantum computers, says the company.
“Our team has been involved in foundational work that explores how quantum computers can be used to solve some of the most intractable problems in artificial intelligence,” says CQ’s Chief Scientist Bob Coecke. “This work was based on advances originally pioneered by me, Steve Clark, now CQ’s Head of AI, and others. NLP sits at the heart of these investigations. The release of lambeq is the natural next step after the publication a few months ago that provided details of the world’s first QNLP implementation by CQ on actual quantum computers, and our initial disclosure of the foundational principles in December 2019.”
“In various papers published over the course of the past year,” says Coecke, “we have not only provided details on how quantum computers can enhance NLP but also demonstrated that QNLP is ‘quantum native,’ meaning the compositional structure governing language is mathematically the same as that governing quantum systems. This will ultimately move the world away from the current paradigm of AI that relies on brute force techniques that are opaque and approximate.”
lambeq enables and automates the design and deployment of NLP experiments of the compositional-distributional (DisCo) type that CQ scientists have previously described. This means moving from syntax/grammar diagrams, which encode a text’s structure, to either (classical) tensor networks or quantum circuits implemented with TKET, ready to be optimised for machine learning tasks such as text classification. lambeq has a modular design so that users can swap components in and out of the model and have flexibility in architecture design.
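To make the step from grammar diagram to tensor network more concrete, the following is a minimal, self-contained Python sketch of the general DisCo idea. It does not use lambeq's own API. The lexicon, the helper names (`reduce_types`, `parse`, `sentence_meaning`) and all numbers are hypothetical and chosen purely for illustration. Each word gets a pregroup grammar type, adjacent types cancel until only the sentence type remains, and the same wiring tells us which tensor indices to contract.

```python
from itertools import product

# Hypothetical toy lexicon (illustration only, not lambeq's data model).
# A type is a list of (base, adjoint) pairs: -1 = left adjoint, 0 = plain, +1 = right adjoint.
LEXICON = {
    "Alice": [("n", 0)],
    "Bob": [("n", 0)],
    "loves": [("n", 1), ("s", 0), ("n", -1)],  # n.r s n.l, a transitive verb
}

def reduce_types(types):
    """Greedily cancel adjacent pairs (x, z)(x, z + 1), e.g. n . n.r and n.l . n."""
    stack = []
    for base, adj in types:
        if stack and stack[-1][0] == base and stack[-1][1] + 1 == adj:
            stack.pop()          # the two wires are joined by a cup
        else:
            stack.append((base, adj))
    return stack

def parse(sentence):
    """Return the types left over after reduction; [('s', 0)] means grammatical."""
    types = [t for word in sentence.split() for t in LEXICON[word]]
    return reduce_types(types)

def sentence_meaning(subj, verb, obj):
    """Contract subj[i] * verb[i][k][j] * obj[j] over the two noun wires i and j."""
    s_dim = len(verb[0])
    return [
        sum(subj[i] * verb[i][k][j] * obj[j]
            for i, j in product(range(len(subj)), range(len(obj))))
        for k in range(s_dim)
    ]

# Made-up word tensors: nouns are 2-d vectors, the verb has shape (noun, sentence, noun).
alice = [1.0, 0.0]
bob = [0.0, 1.0]
loves = [
    [[0.0, 0.9], [1.0, 0.1]],
    [[0.2, 0.0], [0.8, 1.0]],
]

print(parse("Alice loves Bob"))             # [('s', 0)]
print(sentence_meaning(alice, loves, bob))  # [0.9, 0.1]
```

In a quantum setting the word tensors would be replaced by parameterised circuits and the contractions by entangling measurements, but the grammatical wiring stays the same. The greedy reduction here is enough for this toy sentence and is not a general pregroup parser.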
lambeq is designed to remove the barriers to entry for practitioners and researchers who are focused on AI and human-machine interactions, potentially one of the most significant applications of quantum technologies. TKET has gained a worldwide user base now measured in the hundreds of thousands. lambeq, says the company, has the potential to become the most important toolkit for members of the quantum computing community seeking to engage with QNLP applications, which are among the most important markets for AI.
A key point that has become apparent recently is that QNLP will also be applicable to the analysis of symbol sequences that arise in genomics as well as in proteomics.
“There is a lot of interesting theoretical work on QNLP, but theory usually stands at some distance from practice,” says CQ senior scientist Dimitrios Kartsaklis, Ph.D. “With lambeq, we give researchers the opportunity to gain hands-on experience on experimental aspects of QNLP, which is currently completely unexplored ground. This is a crucial step towards reaching the point where practical, real-world NLP applications on quantum hardware become a reality.”
lambeq has been released as a conventional Python repository on GitHub. The quantum circuits generated by lambeq have so far been run on IBM quantum computers and Honeywell Quantum Solutions’ H series devices.
The toolkit is introduced in a technical report uploaded to arXiv: “lambeq: An Efficient High-Level Python Library for Quantum NLP.” For more, see the blog post “Quantum Natural Language Processing II.”