Part-of-Speech Hierarchy in Arabic Language

Preamble

Part-of-speech (POS) tagging is the process of labeling each word in a text with its part of speech, based on both its definition and its context. POS tagging is more complex for Arabic than for English because of Arabic's rich morphology. In Modern Standard Arabic (MSA), the part-of-speech hierarchy follows classical Arabic grammar, which classifies words into three primary categories: noun (اسم, Ism), verb (فعل, Fi‘l), and particle (حرف, Harf). The set of nominals includes nouns, pronouns, adjectives and adverbs. The particles include prepositions, conjunctions and interrogatives, as well as many others. This system forms the core structure of Arabic grammar; everything else is a more detailed subdivision of these main classes. The three categories are described in turn below.

1. Noun (اسم, Ism)

A noun in Arabic is a word that names a person, place, thing, or idea. The category also covers adjectives, pronouns and other subcategories. Nouns stand on their own and are not inflected for tense.

Characteristics of nouns:

  • Can take gender (masculine, feminine)
  • Can take number (singular, dual, plural)
  • Can take case endings: nominative (raf‘), accusative (nasb), genitive (jarr)
  • Can have definite (with definite article “ال”) or indefinite form (without the definite article)
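As a rough illustration of the definiteness feature, the sketch below checks whether a written form starts with the definite article “ال”. The function name `looks_definite` is hypothetical, and the check is a naive surface heuristic: it misfires on words whose own letters begin with the same two characters, and it does not see nouns that are definite for other reasons (for example proper nouns), so a real tagger would rely on morphological analysis instead.

```python
# Naive surface heuristic, for illustration only (not a morphological analysis).
DEFINITE_ARTICLE = "ال"


def looks_definite(word: str) -> bool:
    """Return True if the word starts with the definite article and has more letters after it."""
    return word.startswith(DEFINITE_ARTICLE) and len(word) > len(DEFINITE_ARTICLE)


print(looks_definite("الكتاب"))  # definite form, "the book"
print(looks_definite("كتاب"))    # indefinite form, "a book"
```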

Subcategories:

  • Proper nouns (اسم علم): Specific names of people, places, etc. (e.g., أحمد، القاهرة)
  • Common nouns (اسم عام): General nouns (e.g., كتاب، سيارة)
  • Pronouns (ضمائر): Personal pronouns (e.g., هو, أنا), demonstrative pronouns (e.g., هذا، تلك), and relative pronouns (e.g., الذي، التي)
  • Adjectives (صفة): Describes qualities or attributes of a noun (e.g., كبير، جميل)
  • Verbal nouns (Masdar, مصدر): Derived nouns that express the idea of the verb, functioning similarly to infinitives in English (e.g., قراءة — “reading”)

2. Verbs (فعل, Fi‘l)

Verbs in Arabic denote actions or states of being and are inflected for tense (past, present, and future), person, gender, and number.

  • Actions — things that are done by the subject (e.g., “to eat,” “to write”)
  • States of being — situations or conditions of the subject (e.g., “to be,” “to exist,” “to become”)

Action Verbs (فعل العمل)

These are verbs that express physical or mental actions. They indicate something that the subject is doing or has done.

Examples:

  • كتب (kataba) — “He wrote” (an action of writing)
  • أكل (akala) — “He ate” (an action of eating)
  • شرب (shariba) — “He drank” (an action of drinking)
  • ذهب (dhahaba) — “He went” (an action of going)

Stative Verbs (فعل الحالة)

Stative verbs express a state or condition of being, rather than an action. They indicate situations or states in which the subject exists, such as possession, emotions, or characteristics.

Examples:

  • كان (kaana) — “He was” or “It was” (indicating existence or a state in the past)
  • أصبح (asbaha) — “He became” (indicating a change of state)
  • ظل (dhalla) — “He remained” (indicating a continued state)
  • يوجد (yujad) — “There is” or “There exists” (indicating the state of existence)
  • ملك (malaka) — “He owned” (indicating the state of ownership)

These verbs typically answer the question “What is the subject like?” or “In what state is the subject?”

Characteristics of verbs:

  • Conjugated based on subject (person, gender, number)
  • Inflect for tense: past (ماضٍ), present (مضارع), and future (often indicated using particles like “سوف” or “سـ” before the present verb form)
  • Can be transitive (requires a direct object) or intransitive (does not require a direct object)
  • Infinitives (verbal nouns) are treated as part of nouns
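The future marking mentioned above can be sketched as plain string composition: the short particle attaches directly to the imperfect verb, while “سوف” is written as a separate word. The function name `future_form` and its `long_particle` parameter are assumptions made for this example, and the input is expected to be an already conjugated imperfect verb.

```python
def future_form(imperfect: str, long_particle: bool = False) -> str:
    """Build a future expression from an imperfect (present) verb form.

    long_particle=False attaches the prefix directly to the verb;
    long_particle=True writes the separate word before the verb.
    """
    if long_particle:
        return "سوف " + imperfect
    return "س" + imperfect


print(future_form("يكتب"))                      # prefixed form, "he will write"
print(future_form("يكتب", long_particle=True))  # two-word form, "he will write"
```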

Subcategories:

  • Perfect verbs (ماضٍ): Actions completed in the past (e.g., كتب — “he wrote”)
  • Imperfect verbs (مضارع): Ongoing or future actions (e.g., يكتب — “he is writing” or “he will write”)
  • Imperative verbs (أمر): Commands or requests (e.g., اكتب — “write!”)
  • Auxiliary verbs: Verbs like “كان” (to be) used to form complex tenses (e.g., كان يكتب — “he was writing”)

3. Particles (حروف, Huruf)

Particles are words that don’t fit into the noun or verb categories. They usually serve as connectors or modifiers, having no independent meaning but influencing the meaning of the sentence. Particles are not inflected and can be short, functional words.

Subcategories:

  • Prepositions (حروف الجر): Words like “في” (in), “على” (on), “من” (from), and “إلى” (to)
  • Conjunctions (حروف العطف): Words that connect clauses or words (e.g., و — “and”, أو — “or”, ثم — “then”)
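  • Negative particles (حروف نفي): Words that negate a verb or clause, such as “لا” (no, not), “لم” (did not), and “لن” (will not). The negator “ليس” (is not) serves the same purpose, although traditional grammar classifies it as a verb rather than a particle.
  • Interrogative particles: Words used to form questions, such as “هل” and “أ”, which introduce yes/no questions. Question words like “ماذا” (what?) serve a similar purpose, although traditional grammar classifies them as interrogative nouns.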
  • Conditional particles: Words used to form conditional statements (e.g., إذا — “if”, إن — “if”)
  • Emphatic particles: Words used for emphasis, such as “إنّ” (indeed) and “قد” (indeed or already, when placed before a perfect verb)
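One possible way to hold the three-way classification in code is a small dictionary that maps each main class to the subcategories listed in this article. The English labels and the helper `top_level_class` are hypothetical names chosen for this sketch; they are not a standard tagset.

```python
# Illustrative encoding of the hierarchy described in this article.
# The English labels are hypothetical, not a standard tagset.
POS_HIERARCHY = {
    "noun": ["proper_noun", "common_noun", "pronoun", "adjective", "verbal_noun"],
    "verb": ["perfect", "imperfect", "imperative", "auxiliary"],
    "particle": [
        "preposition", "conjunction", "negative",
        "interrogative", "conditional", "emphatic",
    ],
}


def top_level_class(subcategory: str) -> str:
    """Return the main class (noun, verb, or particle) that owns a subcategory."""
    for main_class, subcategories in POS_HIERARCHY.items():
        if subcategory in subcategories:
            return main_class
    raise KeyError(f"unknown subcategory: {subcategory}")


print(top_level_class("pronoun"))      # noun
print(top_level_class("preposition"))  # particle
```

A flat dictionary is enough here because the hierarchy is only two levels deep; a finer-grained tagset would probably call for a tree structure.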

Hierarchy and Interactions

The three main categories (nouns, verbs, and particles) form the foundation of the Arabic part-of-speech hierarchy. Each of these categories is further subdivided into more specific classes that govern their interactions. The hierarchical relationships between them can be summarized as:

Nouns can interact with:

  • Adjectives to describe or qualify them (e.g., سيارة كبيرة — “a big car”)
  • Pronouns to replace them (e.g., ذهب أحمد إلى المدرسة → هو ذهب إلى المدرسة — “Ahmed went to school” → “He went to school”)
  • Prepositions to form prepositional phrases (e.g., في البيت — “in the house”)

Verbs can interact with:

  • Nouns to form the subject and object of sentences (e.g., كتب أحمد الرسالة — “Ahmed wrote the letter”)
  • Pronouns as subjects or objects (e.g., “يكتب”, he is writing, where the subject pronoun is implied by the verb form)
  • Particles to modify or negate the action (e.g., “لا يكتب”, he is not writing)

Particles interact with:

  • Verbs or nouns to form various expressions, prepositional phrases, conjunctions, or questions.

Examples of the Hierarchy in Use:

Noun-Adjective Agreement:

  • The new book (الكتاب الجديد): Here, “الكتاب” (the book) is a noun, and “الجديد” (new) is an adjective. In Arabic, the adjective follows the noun and agrees with it in gender, number, and definiteness.
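The agreement rule in this example can be sketched as a comparison of feature bundles. The `Nominal` class and the `adjective_agrees` function are hypothetical names, and the feature values are entered by hand rather than derived from the word forms. The sketch implements only the simple rule stated above; it does not handle case agreement or the special pattern in which non-human plural nouns take feminine singular adjectives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nominal:
    """Hand-annotated features for a noun or adjective (illustrative only)."""
    form: str
    gender: str    # "masculine" or "feminine"
    number: str    # "singular", "dual", or "plural"
    definite: bool


def adjective_agrees(noun: Nominal, adjective: Nominal) -> bool:
    """Check agreement in gender, number, and definiteness."""
    return (
        noun.gender == adjective.gender
        and noun.number == adjective.number
        and noun.definite == adjective.definite
    )


book = Nominal("الكتاب", "masculine", "singular", True)
new_definite = Nominal("الجديد", "masculine", "singular", True)
new_indefinite = Nominal("جديد", "masculine", "singular", False)

print(adjective_agrees(book, new_definite))    # True
print(adjective_agrees(book, new_indefinite))  # False
```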

Verb-Noun Agreement:

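  • The boy read the book (قرأ الولد الكتاب): “قرأ” (read) is the verb, “الولد” (the boy) is the subject (a noun), and “الكتاب” (the book) is the object (a noun). In this verb-first order the verb agrees with the subject in person and gender and stays singular; full number agreement appears when the subject comes before the verb.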

Particle-Noun & Particle-Verb Relations:

  • In the house (في البيت): The preposition “في” (in) is a particle, and it governs the noun “البيت” (the house), which is in the genitive case.
  • He did not write (لم يكتب): The negative particle “لم” negates the verb “يكتب” (he writes), turning it into a past-tense negation.
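The two relations above can be recorded in a small lookup table that says which class a particle governs and which form it requires. The table name `GOVERNMENT` and the function `required_form` are hypothetical, and the table covers only a handful of particles. The label "jussive" for the verb form after “لم” comes from standard grammatical terminology rather than from this article.

```python
# Hypothetical lookup table: particle -> (class it governs, form it requires).
# Only a few particles are listed; this is not an exhaustive inventory.
GOVERNMENT = {
    "في": ("noun", "genitive"),
    "من": ("noun", "genitive"),
    "على": ("noun", "genitive"),
    "إلى": ("noun", "genitive"),
    "لم": ("verb", "jussive"),
}


def required_form(particle: str):
    """Return (governed class, required form), or None for unlisted particles."""
    return GOVERNMENT.get(particle)


print(required_form("في"))  # ('noun', 'genitive')
print(required_form("لم"))  # ('verb', 'jussive')
```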

Summary:

In MSA, the part-of-speech hierarchy consists of three overarching categories (nouns, verbs, and particles), each with further subcategories and variations based on function and on relationships with the other categories. Understanding this hierarchy supports the correct formation of sentences, agreement between words, and the expression of nuanced meanings. The framework is rooted in traditional Arabic grammar and still applies to Modern Standard Arabic.

chakir.mahjoubi https://lexsense.net

Chakir is a knowledge engineer with expertise in natural language processing. His work experience spans language corpus creation, software localisation, data lineage, patent translation, glossary creation and statistical analysis of experimentally obtained results.
