Generating Pcfg From Universal Tagset
I am trying to build a PCFG using the POS tags obtained from the code below:
from nltk.corpus import treebank
corpus = treebank.tagged_sents(tagset='universal')
tags = set()
for
Solution 1:
I found the answer to this question. Instead of using the fromstring method, build the PCFG object directly by passing an nltk.Nonterminal start symbol and a list of nltk.ProbabilisticProduction objects, as below:
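from nltk import Nonterminal as NT
from nltk import ProbabilisticProduction
from nltk.grammar import PCFG

# A production whose right-hand side contains only nonterminals
p1 = ProbabilisticProduction(NT('TS'), [NT('.'), NT('NT6')], prob=1)
# A terminal production: terminals are plain strings
p2 = ProbabilisticProduction(NT('NT6'), ['terminal'], prob=1)
start = NT('Q0')  # Q0 is the start symbol for my grammar (its rules are not shown in this snippet)
# PCFG takes the start symbol and a list of ProbabilisticProductions;
# the probabilities for each left-hand side must sum to 1
grammar = PCFG(start, [p1, p2])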
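To go from tagged sentences to a full list of productions, one option is to estimate rule probabilities by relative frequency. The following is only a rough sketch under several assumptions: the sample sentences, the helper names lexical_rule_probs and build_pcfg, and the flat "S -> tag sequence" rules are all made up for illustration and are not part of the original question. With the real data you would pass treebank.tagged_sents(tagset='universal') instead of the hand-made sample.

```python
from collections import Counter

# Hypothetical hand-made sample in the same (word, tag) format that
# treebank.tagged_sents(tagset='universal') returns.
tagged_sents = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB"), (".", ".")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB"), (".", ".")],
    [("the", "DET"), ("cat", "NOUN"), ("barks", "VERB"), (".", ".")],
]


def lexical_rule_probs(sents):
    """Estimate P(tag -> word) as count(tag, word) / count(tag)."""
    pair_counts = Counter((tag, word) for sent in sents for word, tag in sent)
    tag_counts = Counter(tag for sent in sents for _, tag in sent)
    return {
        (tag, word): n / tag_counts[tag]
        for (tag, word), n in pair_counts.items()
    }


def build_pcfg(sents, start="S"):
    """Build a toy PCFG: flat sentence rules plus tag -> word rules.

    The start symbol name "S" and the flat sentence rules are assumptions
    made for this sketch, not something the original answer prescribes.
    """
    # Imported here so the counting step above can run without NLTK.
    from nltk import PCFG, Nonterminal, ProbabilisticProduction

    productions = []

    # Sentence-level rules: start -> observed tag sequence.
    seq_counts = Counter(tuple(tag for _, tag in sent) for sent in sents)
    total = sum(seq_counts.values())
    for seq, n in seq_counts.items():
        rhs = [Nonterminal(tag) for tag in seq]
        productions.append(
            ProbabilisticProduction(Nonterminal(start), rhs, prob=n / total)
        )

    # Lexical rules: tag -> word, with the word as a plain-string terminal.
    for (tag, word), p in lexical_rule_probs(sents).items():
        productions.append(
            ProbabilisticProduction(Nonterminal(tag), [word], prob=p)
        )

    return PCFG(Nonterminal(start), productions)


probs = lexical_rule_probs(tagged_sents)
```

Calling build_pcfg(tagged_sents) should then give a PCFG object, because the probabilities for each left-hand side add up to 1, which the PCFG constructor checks. Note that every tag, including '.', is wrapped in Nonterminal directly, so no grammar string has to be parsed.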