I have an input text that should be run through a grammar, and the output should be every sequence in the text that the grammar matches. The problem is that the words for my lexical categories (DET, N, V) are stored in external list files, and I cannot find a way to load them into the grammar.
Example in pseudo-code:
Define the grammar (this is just one example):
grammar ("""
S -> NP VP
NP -> DET N
VP -> V N
DET -> **lista_det.txt**
N -> **lista_n.txt**
V -> **lista.txt** """)
Then print the parts of the text that match the grammar.
Example:
with open("corpus_risque.txt", "r") as f:
    texte = f.read()
grammar = nltk.parse_cfg("""
S -> NP VP
NP -> DET N
VP -> V N
DET -> lista_det.txt
N -> lista_n.txt
V -> lista.txt""")
parser = nltk.ChartParser(grammar)
parsed = parser.parse(texte.split())  # the parser expects a list of tokens, not a raw string
print(parsed)
Normally, grammars are written like this, with the words for each category listed inline:
grammar = nltk.parse_cfg("""
S -> NP VP
VP -> VBZ NP PP
PP -> IN NP
NP -> NNP | DT JJ NN NN | NN
NNP -> 'Python'
VBZ -> 'is'
DT -> 'a'
JJ -> 'good'
NN -> 'programming' | 'language' | 'research'
IN -> 'for'
""")
Is it possible to do what I want, that is, to fill DET, N and V from the external list files instead of typing the words into the grammar?
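One direction that might work, as a minimal sketch rather than a confirmed solution: read each list file in Python and generate the lexical rules as text before handing the grammar string to NLTK. The helper names (read_words, quote, lexical_rule, build_grammar_text, parse_tokens) are hypothetical, the list files are assumed to contain one word per line, and nltk.CFG.fromstring is the NLTK 3 name for what older releases call nltk.parse_cfg.

```python
from pathlib import Path

# Structural rules from the example grammar; the lexical rules are generated.
STRUCTURE = """\
S -> NP VP
NP -> DET N
VP -> V N
"""

def read_words(path):
    """Return the non-empty lines of a word-list file (assumed one word per line)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]

def quote(word):
    """Quote a terminal; switch to double quotes when the word has an apostrophe."""
    return f'"{word}"' if "'" in word else f"'{word}'"

def lexical_rule(symbol, words):
    """Build one production such as: DET -> 'le' | 'la'."""
    return f"{symbol} -> " + " | ".join(quote(w) for w in words)

def build_grammar_text(lexicon):
    """lexicon maps a category name (e.g. 'DET') to its list of words."""
    rules = [lexical_rule(symbol, words) for symbol, words in lexicon.items()]
    return STRUCTURE + "\n".join(rules)

def parse_tokens(grammar_text, tokens):
    """Parse a list of tokens and return all parse trees (requires NLTK 3)."""
    import nltk  # imported here so the helpers above work without NLTK
    grammar = nltk.CFG.fromstring(grammar_text)
    return list(nltk.ChartParser(grammar).parse(tokens))

# Usage with the file names from the question:
# lexicon = {
#     "DET": read_words("lista_det.txt"),
#     "N": read_words("lista_n.txt"),
#     "V": read_words("lista.txt"),
# }
# trees = parse_tokens(build_grammar_text(lexicon), sentence.split())
```

As far as I can tell, ChartParser rejects input containing words that the grammar does not cover, so the whole corpus probably cannot be parsed in one call; the text would likely need to be split into sentences or candidate word sequences first.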