TreeTagger – part-of-speech tagset for the French language


A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels that indicate the part of speech, and sometimes other grammatical categories (case, tense, etc.), of each token in a text corpus.

The French TreeTagger part-of-speech tagset is used in French corpora annotated with TreeTagger, a tool developed by Helmut Schmid in the Textual corpora project at the Institute for Computational Linguistics of the University of Stuttgart.


An example of a tag
In the CQL concordance search box, the query [tag="VER:cond"] finds all verbs in the conditional, e.g. serait, pourrait. (Note: make sure that you use straight double quotation marks.)
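The query above can also be built programmatically. The following is a minimal Python sketch, not part of TreeTagger or of any corpus tool: the helper names `straighten_quotes` and `tag_query` are hypothetical, and the sketch only assembles the query string. It also replaces curly quotation marks, which are easy to pick up when copying a query from a web page, with the straight ones that CQL expects.

```python
# Map typographic double quotes to the straight quote that CQL expects.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u201e": '"',  # low double quotation mark
})


def straighten_quotes(query: str) -> str:
    """Return the query with curly double quotes replaced by straight ones."""
    return query.translate(CURLY_TO_STRAIGHT)


def tag_query(tag: str) -> str:
    """Build a CQL token query that matches the given POS tag (hypothetical helper)."""
    return f'[tag="{tag}"]'


if __name__ == "__main__":
    print(tag_query("VER:cond"))
    print(straighten_quotes("[tag=\u201cVER:cond\u201d]"))
```

Both calls produce the same query string as the example above, ready to paste into the search box.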

Tagset

Tag        Description
ABR        abbreviation
ADJ        adjective
ADV        adverb
DET:ART    article
DET:POS    possessive pronoun (ma, ta, …)
INT        interjection
KON        conjunction
NAM        proper name
NOM        noun
NUM        numeral
PRO        pronoun
PRO:DEM    demonstrative pronoun
PRO:IND    indefinite pronoun
PRO:PER    personal pronoun
PRO:POS    possessive pronoun (mien, tien, …)
PRO:REL    relative pronoun
PRP        preposition
PRP:det    preposition plus article (au, du, aux, des)
PUN        punctuation
PUN:cit    punctuation citation
SENT       sentence tag
SYM        symbol
VER:cond   verb conditional
VER:futu   verb future
VER:impe   verb imperative
VER:impf   verb imperfect
VER:infi   verb infinitive
VER:pper   verb past participle
VER:ppre   verb present participle
VER:pres   verb present
VER:simp   verb simple past
VER:subi   verb subjunctive imperfect
VER:subp   verb subjunctive present

Source: https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/french-tagset.html
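To work with the tagset in a script, the table can be stored as a lookup dictionary. The following is a minimal Python sketch under stated assumptions: the input is taken to be tab-separated token, tag and lemma columns (the layout TreeTagger produces when run with its -token and -lemma options), the helper names `describe` and `parse_tagged_line` are hypothetical, and the sample sentence is made up for illustration rather than taken from real tagger output.

```python
# Tag descriptions copied from the tagset table.
FRENCH_TAGSET = {
    "ABR": "abbreviation",
    "ADJ": "adjective",
    "ADV": "adverb",
    "DET:ART": "article",
    "DET:POS": "possessive pronoun (ma, ta, …)",
    "INT": "interjection",
    "KON": "conjunction",
    "NAM": "proper name",
    "NOM": "noun",
    "NUM": "numeral",
    "PRO": "pronoun",
    "PRO:DEM": "demonstrative pronoun",
    "PRO:IND": "indefinite pronoun",
    "PRO:PER": "personal pronoun",
    "PRO:POS": "possessive pronoun (mien, tien, …)",
    "PRO:REL": "relative pronoun",
    "PRP": "preposition",
    "PRP:det": "preposition plus article (au, du, aux, des)",
    "PUN": "punctuation",
    "PUN:cit": "punctuation citation",
    "SENT": "sentence tag",
    "SYM": "symbol",
    "VER:cond": "verb conditional",
    "VER:futu": "verb future",
    "VER:impe": "verb imperative",
    "VER:impf": "verb imperfect",
    "VER:infi": "verb infinitive",
    "VER:pper": "verb past participle",
    "VER:ppre": "verb present participle",
    "VER:pres": "verb present",
    "VER:simp": "verb simple past",
    "VER:subi": "verb subjunctive imperfect",
    "VER:subp": "verb subjunctive present",
}


def describe(tag):
    """Return the description of a tag, or a fallback for tags not in the table."""
    return FRENCH_TAGSET.get(tag, "unknown tag")


def parse_tagged_line(line):
    """Split one 'token<TAB>tag<TAB>lemma' line into a (token, tag, lemma) tuple."""
    token, tag, lemma = line.rstrip("\n").split("\t")
    return token, tag, lemma


if __name__ == "__main__":
    # Hypothetical sample in the assumed three-column layout.
    sample = "Il\tPRO:PER\til\nserait\tVER:cond\têtre\n.\tSENT\t."
    for line in sample.splitlines():
        token, tag, lemma = parse_tagged_line(line)
        print(f"{token}\t{tag}\t{describe(tag)}")
```

A dictionary keeps the lookup exact and case-sensitive, which matters here because the tagset mixes upper and lower case (for example PRP:det).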

chakir.mahjoubi https://lexsense.net

Knowledge engineer with expertise in natural language processing. Chakir's work experience spans language corpus creation, software localisation, data lineage, patent translation, glossary creation and statistical analysis of experimentally obtained results.
