Part of Speech tagset for French language - Lexsense


The French TreeTagger part-of-speech tagset is used in French corpora annotated with TreeTagger, a tool developed by Helmut Schmid in the Textual Corpora project at the Institute for Computational Linguistics of the University of Stuttgart.

A part-of-speech (POS) tagset for French defines the categories of words that can occur in sentences and the labels that linguistic tools such as part-of-speech taggers assign to them. French tagsets can be quite detailed, and they vary with the linguistic framework used (e.g., Universal Dependencies, treebank-specific tags). Here’s an overview of the main categories. The labels shown are illustrative and largely follow Penn Treebank-style names; actual French tagsets, such as the TreeTagger tagset listed further down, use their own labels:

1. Nouns (N)

  • NN: Common noun (e.g., chat, maison).
  • NNS: Plural noun (e.g., chats, maisons).
  • NC: Common noun (nom commun), the label used in the French Treebank.
  • NOMPROP: Proper noun (e.g., Paris, Marie).

2. Pronouns (PRON)

  • PRP: Personal pronoun (e.g., je, tu, il).
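  • PRP$: Possessive form (e.g., mon, ton, son). In French grammar these are possessive determiners; the possessive pronouns proper are le mien, le tien, le sien.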
  • REFL: Reflexive pronoun (e.g., se, me).

3. Verbs (V)

  • VB: Base form of a verb (infinitive, e.g., manger).
  • VBD: Past tense verb (e.g., mangeait).
  • VBG: Gerund/participle form (e.g., mangeant).
  • VBN: Past participle (e.g., mangé).
  • VBP: Present tense verb, non-3rd person singular (e.g., je mange).
  • VBZ: 3rd person singular present (e.g., il mange).

4. Adjectives (ADJ)

  • JJ: Adjective (e.g., grand, beau).
  • JJR: Comparative adjective (e.g., plus grand).
  • JJS: Superlative adjective (e.g., le plus grand).

5. Adverbs (ADV)

  • RB: Adverb (e.g., rapidement, très).
  • RBR: Comparative adverb (e.g., plus rapidement).
  • RBS: Superlative adverb (e.g., le plus rapidement).

6. Determiners (DET)

  • DT: Determiner (e.g., le, une).
  • PDT: Predeterminer (e.g., tout in tout le monde).
  • WDT: Wh-determiner (e.g., quel).

7. Prepositions (PREP)

  • IN: Preposition (e.g., dans, avec).

8. Conjunctions (CONJ)

  • CC: Coordinating conjunction (e.g., et, mais).
  • IN: Subordinating conjunction (e.g., que, si).

9. Interjections (INTJ)

  • UH: Interjection (e.g., oh, aïe).

10. Auxiliary Verbs (AUX)

  • AUX: Auxiliary verb (e.g., être, avoir).
  • AUXP: Auxiliary in the past participle (e.g., été, eu).

11. Numbers (NUM)

  • CD: Cardinal number (e.g., un, deux).
  • OD: Ordinal number (e.g., premier, deuxième).

12. Symbols and Punctuation (SYM, PUNCT)

  • SYM: Symbol (e.g., &, %, $).
  • PUNCT: Punctuation (e.g., ., !, ?).

13. Other

  • X: Other, not classified (often used for words or expressions not fitting standard categories).
  • FW: Foreign word (e.g., pizza, software).

Tagset Variations

Different linguistic resources might have slight variations in the POS tagset. For example:

  • Universal Dependencies (UD): A standardized tagset for many languages, including French, which simplifies POS labels (e.g., NOUN, VERB, ADJ, ADV).
  • French Treebank: A detailed POS tagset that includes specific distinctions between types of nouns, verbs, and modifiers.
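  • CLAWS (Constituent Likelihood Automatic Word-tagging System): A tagger and family of fine-grained tagsets developed at Lancaster University for English. It is an example of a more detailed tagging scheme used in research and language technology development, not a French resource.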
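
To make the relationship between tagsets concrete, here is a minimal Python sketch that collapses some of the French TreeTagger tags listed in this article onto Universal Dependencies coarse labels. The names `TREETAGGER_TO_UD` and `to_ud` are hypothetical, and the correspondences are an approximation chosen for illustration, not an official conversion table.

```python
# Illustrative, unofficial mapping from French TreeTagger tags to UD coarse labels.
TREETAGGER_TO_UD = {
    "DET:POS": "DET",
    "NAM": "PROPN",
    "PRO:DEM": "PRON",
    "PRO:IND": "PRON",
    "PRO:PER": "PRON",
    "PRO:POS": "PRON",
    "PRO:REL": "PRON",
    "PUN:cit": "PUNCT",
    "SENT": "PUNCT",
}


def to_ud(tag: str) -> str:
    """Return an approximate UD label for a TreeTagger tag, or "X" if unmapped."""
    if tag in TREETAGGER_TO_UD:
        return TREETAGGER_TO_UD[tag]
    # Every VER:* tag is a verb form. UD labels some verbs AUX instead,
    # which this sketch does not try to decide.
    if tag.startswith("VER:"):
        return "VERB"
    return "X"


print(to_ud("VER:cond"))  # VERB
print(to_ud("NAM"))       # PROPN
```

Tags with no single coarse counterpart, such as PRP:det (a preposition fused with an article), are deliberately left unmapped here and fall through to "X".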

When working with French language processing, it’s essential to choose a tagset that matches your linguistic analysis needs or the specific tool you’re using.

As an example of using a tag, entering [tag="VER:cond"] in the CQL concordance search box finds all verbs in the conditional, e.g. serait, pourrait. (Note: make sure that you use straight double quotation marks.)
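
The query shape shown above can be generated programmatically. The following is a minimal sketch, assuming only the `[tag="..."]` form given in this article; the helper name `cql_tag_query` is hypothetical. It also removes curly or straight quotes accidentally pasted around the tag, since the query needs straight double quotation marks.

```python
def cql_tag_query(tag: str) -> str:
    """Build a CQL token query of the form [tag="..."] for one POS tag."""
    # Drop surrounding whitespace and any curly or straight double quotes
    # that were pasted along with the tag.
    cleaned = tag.strip().strip('\u201c\u201d"')
    return f'[tag="{cleaned}"]'


print(cql_tag_query("VER:cond"))              # [tag="VER:cond"]
print(cql_tag_query("\u201cVER:cond\u201d"))  # [tag="VER:cond"]
```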

French TreeTagger part-of-speech tagset

French Language Tagset

Tag        Description
DET:POS    possessive pronoun (ma, ta, …)
NAM        proper name
PRO:DEM    demonstrative pronoun
PRO:IND    indefinite pronoun
PRO:PER    personal pronoun
PRO:POS    possessive pronoun (mien, tien, …)
PRO:REL    relative pronoun
PRP:det    preposition plus article (au, du, aux, des)
PUN:cit    punctuation citation
SENT       sentence tag
VER:cond   verb conditional
VER:futu   verb future
VER:impe   verb imperative
VER:impf   verb imperfect
VER:infi   verb infinitive
VER:pper   verb past participle
VER:ppre   verb present participle
VER:pres   verb present
VER:simp   verb simple past
VER:subi   verb subjunctive imperfect
VER:subp   verb subjunctive present

Source: https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/french-tagset.html
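
Tags such as VER:cond combine a main category (VER) with a subtype (cond). The sketch below shows one way to work with them in Python, assuming tagged text in a three-column, tab-separated layout (token, tag, lemma). The sample lines are hand-written for illustration and are not real tagger output; the function names `parse_tagged` and `split_tag` are hypothetical.

```python
from collections import Counter

# Hand-written sample in token<TAB>tag<TAB>lemma layout (not real tagger output).
SAMPLE = (
    "Il\tPRO:PER\til\n"
    "serait\tVER:cond\têtre\n"
    "parti\tVER:pper\tpartir\n"
    ".\tSENT\t.\n"
)


def parse_tagged(text):
    """Return (token, tag, lemma) tuples, skipping lines that lack three columns."""
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) == 3:
            rows.append(tuple(parts))
    return rows


def split_tag(tag):
    """Split 'VER:cond' into ('VER', 'cond'); tags without a subtype get ''."""
    main, _, sub = tag.partition(":")
    return main, sub


rows = parse_tagged(SAMPLE)
# Collect every verb form regardless of tense or mood.
verbs = [token for token, tag, _ in rows if split_tag(tag)[0] == "VER"]
# Count tokens per main category.
counts = Counter(split_tag(tag)[0] for _, tag, _ in rows)

print(verbs)          # ['serait', 'parti']
print(counts["VER"])  # 2
```

Splitting on the first colon keeps the filter independent of the verb subtype, so the same code selects conditionals, participles, and any other VER form.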
