In this recipe, we’ll use ConceptNet’s natural language tools (and basically nothing else) to rephrase a sentence into various forms.
Start by loading the English natural language tools. Calling get_nl('en') returns an instance of csc.nl.NLTools:
>>> from csc.nl import get_nl
>>> en_nl = get_nl('en')
We’re going to do everything using the lemma_split() and lemma_combine() methods.
lemma_split() will separate text into a normalized form, containing the base forms of its content words, and a residue, describing the remaining text that surrounds those content words. These two examples will illustrate:
>>> en_nl.lemma_split("you can sit on a couch")
(u'sit couch', u'you can 1 on a 2')
>>> en_nl.lemma_split("you are sitting on a couch")
(u'sit couch', u'you are 1ing on a 2')
Notice that in these two examples, the normalized forms are the same but the residues are different.
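To make the relationship between the two halves concrete, here is a small sketch in plain Python that pairs each numbered slot in a residue with the word it points at in the normalized form. It does not use csc.nl at all. The slot_words() helper is hypothetical, and the slot format it assumes (a 1-based number, optionally followed by a suffix such as "ing") is inferred only from the two examples above, not from the library's documentation.

```python
import re


def slot_words(normalized, residue):
    """Pair each numbered slot in a residue with the lemma it refers to.

    Assumption: slots are 1-based integers, optionally followed by a
    lowercase suffix (as in "1ing"), matching the examples in this recipe.
    """
    lemmas = normalized.split()
    pairs = []
    for match in re.finditer(r'(\d+)([a-z]*)', residue):
        index = int(match.group(1))
        # Slot numbers count from 1, list indices from 0.
        pairs.append((match.group(0), lemmas[index - 1]))
    return pairs


print(slot_words('sit couch', 'you can 1 on a 2'))
print(slot_words('sit couch', 'you are 1ing on a 2'))
```

Both calls pair the first slot with "sit" and the second with "couch". Only the text around the slots, and the suffix attached to the first one, differs between the two residues.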
Using lemma_combine(), we can put these back together:
>>> en_nl.lemma_combine(u'sit couch', u'you are 1ing on a 2')
u'you are sitting on a couch'
By changing the residue while leaving the normalized form the same, we can rephrase text.
>>> en_nl.lemma_combine(u'sit couch', u'1 on the 2')
u'sit on the couch'
>>> en_nl.lemma_combine(u'sit couch', u'a 2 is something for 1ing on')
u'a couch is something for sitting on'
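The whole recipe can be wrapped in one small function. The sketch below is one possible way to do that, not part of csc.nl. The rephrase() helper and its parameter names are hypothetical. It assumes only that the object passed in provides lemma_split() and lemma_combine() with the behavior shown above.

```python
def rephrase(nl, text, residues):
    """Re-render `text` once for each residue template in `residues`.

    `nl` is assumed to be any object with lemma_split() and
    lemma_combine() methods that behave as in this recipe, such as
    the object returned by get_nl('en').
    """
    # Keep the normalized form and discard the sentence's own residue.
    normalized, _original_residue = nl.lemma_split(text)
    # Combine the same normalized form with each new residue in turn.
    return [nl.lemma_combine(normalized, residue) for residue in residues]
```

With the en_nl object loaded earlier, calling rephrase(en_nl, "you can sit on a couch", [u'1 on the 2', u'a 2 is something for 1ing on']) should produce the same two rephrasings as the individual lemma_combine() calls above. Each residue must use only slot numbers that exist in the normalized form of the input sentence.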