Theory of Automata

The theory of automata is the study of abstract machines and computational problems. The computation problems can be solved using these abstract machines. The abstract machine is called the automata or automaton.

Different names of Automata Theory are 

  • Theory of computation (TOC)
  • Automata Theory
  • Theory of Formal Languages and Computation
  • Formal languages, Automata, Grammars 

Here are the most important key terminologies explained in simple and clear wording

Theory of Automata - TOC key terminologies

1. Alphabet (Σ) and Symbol in Automata Theory

An alphabet denoted as “Σ” (also called sigma) is a nonempty finite set of symbols. The elements of the alphabet are called symbols. A symbol is a single element (atomic entity) of Length 1.

Theory of Automata - Alphabet and Symbols

Alphabet (Σ) in Automata

An alphabet (Σ) is a finite, non-empty set of symbols. Alphabets are classified based on the number of symbols they contain. Classifications are given below

  • Unary Alphabet (|Σ| = 1) : Alphabet containing one symbol.

    • Example: Σ = {a}
  • Binary Alphabet (|Σ| = 2): Alphabet containing two symbols.

    • Example: Σ = {0,1}

  • Ternary Alphabet (|Σ| = 3): Alphabet containing three symbols.

    • Example: Σ = {a,b,c}
  • Quaternary Alphabet (|Σ| = 4): Alphabet containing four symbols.

    • Example: Σ = {a,b,c,d}
  • k-ary Alphabet (|Σ| = k): Alphabet containing k symbols.

    • Example: Σ = {a₁, a₂, … , aₖ}

Symbols in Automata

Symbols are special notations used to represent alphabets, strings, languages, and operations in automata theory. The symbols a, b, c, …, x, y, z and 0, 1, 2, …, 9 are basic symbols used to form strings. In general, any characters, digits, or special symbols can be used as alphabet symbols to construct strings in automata theory.

Examples of possible symbols:

  • Lowercase letters: a, b, c, …, z
  • Uppercase letters: A, B, C, …, Z
  • Digits: 0, 1, 2, …, 9
  • Special symbols: +, −, *, /, @, #, $, %

All such symbols can be elements of an alphabet Σ, and strings are formed by combining these symbols in sequence.

Important:

Important Points

  • epsilon (ε) is a string of length 0, not an alphabet symbol.

  • ∅ is a set, not a string.

  • ε ≠ ∅

  • a ≠ A because Case matters: lowercase and uppercase letters are different symbols.

2. Strings in Automata Theory

A string (or word) is a finite sequence of symbols taken from an alphabet Σ.

Theory of Automata - Valid and invalid strings in TOC

Valid String Examples

A valid string must contain only symbols from the alphabet Σ.

  • If Σ = {0,1}, then 0, 01, 1101, 11010,…so on, are valid strings.
  • If Σ = {a,b}, then a, aaab, bbbba, ababba,…so on, are valid strings.
  • If Σ = {0,1, 2,3,…..,9}, then 12345, 56654, 11455, 8888888,…so on, are valid strings.
  • If Σ = {a,b,c,d,…,z}, then theory, beautiful, hello,…so on, are valid strings.

Invalid String Examples

A string is invalid if it contains symbols not present in the given alphabet Σ.

  • If Σ = {0,1}, then 012, 102, 2, 10a are invalid strings (because 2 and a are not in Σ).

  • If Σ = {a,b}, then abc, acb, a1b, cab are invalid strings (because c and 1 are not in Σ).

Order of symbols and repetition

In strings, both the order of symbols and the repetition of symbols matter.

  • Order Matters: Changing the sequence of symbols gives a different string.

    • Example: If Σ = {a, b} then “ab” ≠ “ba”

  • Repetition Matters: Repeating a symbol changes the string.

    • Example: If Σ = {a, b} then “a” ≠ “aa”

So, a string is defined by the exact sequence of symbols, including their order and how many times each appears.

Length of the String

The number of symbols in a string is called the length of the string. 

Theory of Automata - String Length in Automata

  • if w = 1011, then the length of the string (|w|) = 4.
  • The string with no symbols is called the empty string. It is represented by ε (called epsilon). The length of an empty string is 0

Specific Length of strings over the alphabet (Σ)

If the alphabet size = k and string length = n, the number of strings of length n = kⁿ. Look at the following diagram

Theory of Automata - Strings of specific length in Automata

Example: Let the alphabet be Σ = {a, b}, which means we can form strings using only the symbols a and b.

  • A string of length 0 is called the empty string and is written as ε.
  • For length 1, the possible strings are: a, b.
  • For length 2, the possible strings are: aa, ab, ba, bb.
  • For length 3, the possible strings are: aaa, aab, aba, abb, baa, bab, bba, bbb.
  • For length 4, the possible strings are: aaaa, aaab, aaba, aabb, abaa, abab, abba, abbb, baaa, baab, baba, babb, bbaa, bbab, bbba, bbbb.

In general, for alphabet size 2, the total number of strings of length n is 2ⁿ.

All possible strings over the alphabet (Σ)

The set of all possible strings that can be formed using symbols from an alphabet Σ is called Σ* (Kleene Star of Σ). It includes strings of every possible finite length, including the empty string ε.

Examples:

  • If Σ = {a}, then Σ* = {ε, a, aa, aaa, aaaa, …}

  • If Σ = {0,1}, then Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, 001, …}

  • If Σ = {a,b,c}, then Σ* = {ε, a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc, …}

Key Points:

  • Includes strings of length 0,1,2,3,…∞

  • The empty string ε is always included

  • The number of strings of length n over an alphabet of size k = kⁿ

  • Σ* contains all possible combinations of symbols in Σ.

Reversal of a String

Theory of Automata - Reversal of String

The reversal of a string is obtained by writing the symbols of the string in the opposite order.

  • Syntax: If w = a₁a₂a₃…aₙ, then the reversal of the string is written as wᴿ = aₙaₙ₋₁…a₂a₁.
  • Example: If w = abc, then its reversal is wᴿ = cba.

Concatenation of Strings

Concatenation is the operation of joining two strings together to form a new string.

Theory of Automata - Concatenation of String

  • Notation: If x and y are strings, their concatenation is written as xy.
  • Example: If x = ab and y = ba, then xy = abba. Similarly If x = ab, y = ba, and z = mn, then zxy = mnabba.
  • Length Rule: |xy| = |x| + |y| (the length of the new string is the sum of both lengths).
  • Order Matters: xy ≠ yx in general.
    • Example: Let x = ab and y = c
    • xy = abc
    • yx = cab
    • Since abc ≠ cab, the order of strings matters in concatenation.
  • Associative Property: (xy)z = x(yz).
  • Identity Element: ε (epsilon) acts as the identity element for concatenation.
  • Left Identity: εx = x.
  • Right Identity: xε = x.

Prefix, Suffix, Substring, Subsequence of string

  • A prefix of a string is any substring that starts from the first symbol of the string and may end at any position.
  • A suffix of a string is any substring that ends at the last symbol of the string and may start at any position within the string.
  • A substring of a string is a consecutive sequence of characters that appears anywhere inside the string.
  • A subsequence of a string is formed by deleting zero or more symbols from the string without changing the order of the remaining symbols.

Theory of Automata - Prefix, Suffix, Substring, Subsequence of String in TOC

Note: If a symbol repeats, subsequences that differ only in the position of identical symbols are considered the same. The total number of distinct subsequences is less than 2ⁿ.

Look at the following diagram, which shows the examples of prefix, suffix, and substring in automata theory

Theory of Automata - Prefix, Suffix, Substring, Subsequences of String

How to get Substring formula: n(n + 1) / 2

Let w = abcde (length n = 5). A substring is any continuous part of the string. To count them, we consider all possible starting positions.

  • Starting at position 1 (a): a, ab, abc, abcd, abcde → 5 substrings
  • Starting at position 2 (b): b, bc, bcd, bcde → 4 substrings
  • Starting at position 3 (c): c, cd, cde → 3 substrings
  • Starting at position 4 (d): d, de → 2 substrings
  • Starting at position 5 (e): e → 1 substring

Adding them gives 5 + 4 + 3 + 2 + 1 = 15 substrings. In general, the number of substrings of a string of length n is n(n+1)/2, which matches this example: 5(5+1)/2 = 15. Adding 1 in the end represents the ε (epsilon) in the formula.

Note:

  • Prefix, suffix, and substring always contain consecutive characters.
  • Every prefix and suffix is a substring, but not every substring is a prefix or suffix.
  • proper prefix, a proper suffix, and a proper substring do not contain ε (epsilon).
  • Proper Prefix / Proper Suffix ⊆ Substring ⊆ Subsequence

Set of all strings over the given alphabet

The set of all strings over a given alphabet Σ is denoted by Σ* (called the Kleene Star of Σ). It includes all possible strings of any finite length, including the empty string ε.

Properties:

  • Includes strings of length 0, 1, 2, 3, … infinitely.
  • Concatenation of any number of symbols from Σ forms a string in Σ*.

The content of Σ* depends on the number of symbols in the alphabet. Following diagram shows all

Theory of Automata - Set of All Strings Over the Given Alphabet

Observation: For alphabet size k, number of strings of length n = kⁿ, and Σ* is the union of strings of all lengths 0,1,2,… infinitely. This shows how Σ* grows rapidly as the alphabet size increases.

Power of a String

If w = ab, then the power of the string means repeated concatenation of w with itself.

Example:

  • w⁰ = ε
  • w¹ = ab
  • w² = abab
  • w³ = ababab
  • w⁴ = abababab
  • In general, wⁿ means concatenating the string w with itself n times.

Palindrome String

A palindrome string is a string that reads the same forward and backward (i.e., w = wᴿ).

  • Example: aba, aa, abba, madam, eye.

Set Vs String

Set: A collection of distinct elements where order does not matter, e.g., {a, b} = {b, a}.

String: An ordered sequence of symbols where order matters, e.g., “ab” ≠ “ba”.

Theory of Automata - set vs string

Here is a clear comparison table for Set vs String in TOC:

Basis Set String
Definition A collection of elements A sequence of symbols
Order Order does not matter, i.e., {a,b} = {b,a} Order matters, i.e., ab is not equal to ba
Repetition Duplicate elements not allowed Symbols can repeat, i.e., aababbba
Written As {a, b, c} abc
Example {a, b} ab
Length No concept of length, finite or infinite Has length which is always finite
Empty Form ∅ (empty set) ε (empty string)
Used For In TOC, set is used to represent alphabet symbols. used as the input sequence of symbols taken from an alphabet Σ

Exercise about strings in Automata

Question 1: Let Σ= {a}, How many Strings are there over Σ ?

Answer: If Σ = {a}, then there are infinitely many strings over Σ because we can form ε, a, aa, aaa, aaaa, and so on endlessly (since Σ* is infinite). For every length n ≥ 0, there is exactly one string (aⁿ), such as ε (length 0), a (length 1), aa (length 2), aaa (length 3), and so forth without end, but each string has finite length.

Question 02: Consider the strings over the alphabet {a,b,c}.

  1. How many strings of length 5 are there?
  2. How many strings of length n are there?
  3. How many strings of length 7 are there, in which the first, middle, and last symbols are b?
  4. How many strings of length at most 5 are there?

Answer:

  1. Each position has 3 choices. So, 35=243 strings is the answer
  2. Each of n positions has 3 choices. So, 3n
  3. Length = 7, Positions fixed are
    • 1st = b

    • 4th (middle) = b

    • 7th = b

The remaining positions are 4, therefore each has 3 choices. So, 34 = 81 strings

  1. v. Length ≤ 5 means: 0, 1, 2, 3, 4, 5
    • Total: 30+31+32+33+34+35 =1+3+9+27+81+243 364 strings
NOTE:

  • Every Individual String has a finite length. But there are infinitely many strings over any Σ.
  • Analogy: Each Movie has a finite length, but we can create infinitely many movies.

3. Language (L) in Automata Theory

Language: A language is a set of strings formed from an alphabet according to specific rules. Language is of two types in automata theory

  • Finite Language: Contains a limited number of strings, e.g., L = {a, ab, ba}.

  • Infinite Language: Contains infinitely many strings, e.g., L = {aⁿ | n ≥ 1}.

Finite languages are countable, while infinite languages grow without bound.

Examples of Language

Examples of finite languages are given in the following diagram

Theory of Automata - Finite Languages Examples In Automata

Examples of infinite languages are given in the following diagram

Theory of Automata - Infinite Languages Examples In Automata

Here are some examples of languages over Σ given in the following diagram. Let’s take Σ = {a, b}. A language over Σ is any set of strings formed from Σ, including finite or infinite sets.

Theory of Automata - Language in Automata

A language over Σ can be any subset of Σ* (L⊆Σ*), finite or infinite, empty, or the whole set.

Operations on Language

In Theory of Automata, a language is a set of strings over an alphabet Σ. Several operations can be performed on languages to create new languages. Here’s a concise explanation:

Theory of Automata - Operations on Language

  • Union (L₁ ∪ L₂) → All strings that belong to L₁ or L₂ or both.

    • Example: L₁ = {a, b}, L₂ = {b, c} → L₁ ∪ L₂ = {a, b, c}

  • Intersection (L₁ ∩ L₂) → All strings that belong to both L₁ and L₂.

    • Example: L₁ = {a, b}, L₂ = {b, c} → L₁ ∩ L₂ = {b}

  • Complement (Lᶜ) → All strings in Σ* that are not in L.

    • Example: Σ = {a, b}, L = {a} → Lᶜ = {ε, b, aa, ab, ba, bb, …}

  • Concatenation (L₁L₂) → All strings formed by joining a string from L₁ with a string from L₂.

    • Example: L₁ = {a, b}, L₂ = {c, d} → L₁L₂ = {ac, ad, bc, bd}
  • Difference (L₁ − L₂) → All strings in L₁ that are not in L₂.
    • Example: L₁ = {a, b, c}, L₂ = {b, c} → L₁ − L₂ = {a}

These operations allow constructing complex languages from simpler ones.

Reversal of a Language

The reversal of a language L, denoted Lᴿ, is the set of strings obtained by reversing every string in L.

Theory of Automata - Reversal of language

Mathematical notation: LR ={wR ∣ w∈L}

Where wᴿ is the reversal of string w.

Example: Let L = {ab, aab, baa}

  • Reverse each string:

    • (ab)ᴿ = ba

    • (aab)ᴿ = baa

    • (baa)ᴿ = aab

  • So, Lᴿ = {ba, baa, aab}

Notes:

  • Reversal preserves the number of strings but changes the order of symbols in each string.
  • If L contains ε, then εᴿ = ε, so it remains in Lᴿ.

This operation is often used in NFA to DFA conversions and studying palindromes in languages.

After reversal of a language:

  • L=LR → may be satisfied

  • L ≠ LR  → may also occur

So it is not necessary that L=LR

Theory of Automata - L = L^R may or may not satisfied

Palindrome Language

A palindrome language L is a language in which every string reads the same forward and backward (i.e., w = wᴿ).

Mathematical notation: L = { w ∈ Σ* ∣ w = wR }

Example: Let Σ = {a, b}

  • L = {ε, a, b, aa, bb, aba, bab, abba, baab, …}

Here, each string is the same when reversed, so all are palindromes.

Power / Exponential of a Language

The power of a language is a concept in formal languages that describes repeated concatenation of strings from the language.

Theory of Automata - Exponential of a Language In Automata

Where:

  • L1
  • , and so on.

Example 1:

Let L = {a, b}

  • L¹ = {a, b}
  • L² = {aa, ab, ba, bb}
  • L³ = {aaa, aab, aba, abb, baa, bab, bba, bbb}

Example 2:

Let L = {ab}

  • L² = {abab}
  • L³ = {ababab}

Kleene Closure/Star of a Language

The Kleene Closure of a language L, written as L*, is the set of all strings formed by concatenating zero or more strings from L, including the empty string ε for L⁰.

Theory of Automata - Kleene Closure In Automata

where:

  • L⁰ = {ε} (empty string)
  • L¹ = L
  • L² = L · L (concatenation of L with L)
  • so on…
  • It continues for all n ≥ 0 (infinitely many times).

Example: 1

Example: If L = {a}

  • L⁰ = {ε}
  • L¹ = {a}
  • L² = {aa}
  • L³ = {aaa}

So, L* = {ε, a, aa, aaa, aaaa, …}.

Example: 2

If L = {ab, c}

  • L⁰ = {ε}
  • L¹ = {ab, c}
  • L² = L.L= {abab, abc, cab, cc}
  • L³ =L.L.L= {ababab, ababc, abcab, abcc, cabab, cabc, ccab, ccc}

So, L* = {ε, ab, c, abab, abc, cab, cc, ababab, ababc, abcab, …}.

Kleene Plus of a Language

The Kleene Plus of a language L, written as L⁺, is the set of all strings formed by concatenating one or more strings from L.

Theory of Automata - Kleene Plus In Automata

It does not include the empty string ε unless ε is already in L.

Example 1

If L = {a}

  • L¹ = {a}

  • L² = {aa}

  • L³ = {aaa}

So, L⁺ = {a, aa, aaa, aaaa, …}

Example 2

If L = {ab, c}

  • L¹ = {ab, c}

  • L² = {abab, abc, cab, cc}

So, L⁺ = {ab, c, abab, abc, cab, cc, …}

Important:  Key Difference

  • Kleene Star (L*) = {ε} ∪ L⁺
  • Kleene Plus (L⁺) = L* − {ε}

Exercise of Language in automata

Question 1.

Let L be a language. If L = LR, then every string in L is a Palindrome ??

Solution:  

No, not necessarily.

If L = Lᴿ, it means the language is closed under reversal, not that every string is a palindrome.

Condition: L=LR⇒For every w∈L,  wR ∈ L

  • Counter Example: Let L = {ab, ba}
  • Reverse of ab = ba
  • Reverse of ba = ab

So Lᴿ = {ba, ab} = L, but ab ≠ ba, therefore not every string is a palindrome.

Conclusion: If L = Lᴿ, the language contains reverses of its strings, but each string itself does not have to be a palindrome

Question 02:

Let L = {ab, aa, baa}. Which of the following strings are in L*: abaabaaabaa, aaaabaaaa, baaaaabaaaab, baaaaabaa? Which strings are in L4?

Solution:

Given L = {ab, aa, baa}. The Kleene star (L*) contains all strings formed by concatenating the strings ab, aa, or baa any number of times. The language L⁴ contains strings formed by exactly four concatenations of elements from L.

Checking membership:

  • abaabaaabaa → ab | aa | baa | ab | aa → 5 parts → ∈ L* but ∉ L⁴
  • aaaabaaaa → aa | aa | baa | aa → 4 parts → ∈ L* and ∈ L⁴
  • baaaaabaaaab → baa | aa | ab | aa | ab → 5 parts → ∈ L* but ∉ L⁴
  • baaaaabaa → baa | aa | ab | aa → 4 parts → ∈ L* and ∈ L⁴

Result:

  • Strings in L*: abaabaaabaa, aaaabaaaa, baaaaabaaaab, baaaaabaa
  • Strings in L⁴: aaaabaaaa, baaaaabaa.

Question: 03

L* is infinite for every language?

Solution:

No, not always. The size of L*  depends on the language L.

  • If L = ∅ (empty language), then L* = {ε} → finite.
  • If L={ε}, then L*={ε} → finite.
  • If L contains any non-empty string, then L* generates strings of unlimited length → infinite.

Question 04:

Consider the languages L1 = and L2 = {a}. What will be the answer of L1.L2* U L1*?

Solution:

First find the required parts:

  • L₂* = {ε, a, aa, aaa, …}

  • L₁ · L₂* = ∅ · L₂* = (concatenation with empty language gives empty language)

  • L₁* = (∅)* = {ε}

Now compute the union:

  • L₁ · L₂* ∪ L₁*

  • ∅ ∪ {ε} = {ε}

Final Answer

L₁ · L₂* ∪ L₁* = {ε}.

Complement of a Language

The complement of a language L (written as L̅ or Lᶜ) is the set of all strings over the alphabet Σ that are not in L.

Formula: L̅ = Σ* − L

  • Σ* = all possible strings over alphabet Σ
  • Remove all strings that belong to L

Example : If Σ = {a, b} and L = {a, ab} find the complement of L

Solution:

All possible strings over Σ* are

  • Σ* = {ε, a, b, aa, ab, ba, bb, aaa, …}
  • L = {a, ab}

Then the complement of L is:

L̅ = Σ* − L

L̅ = {ε, b, aa, ba, bb, aaa, …}

(all strings except a and ab).

Example 2

If Σ = {a, b} and L = { aⁿ | n ≥ 1 }

Σ* = {ε, a, b, aa, ab, ba, bb, aaa, …}
L = {a, aa, aaa, aaaa, …}

Complement:

L̅ = Σ* − L

L̅ = {ε, b, ab, ba, bb, aba, bab, …}
(all strings not consisting only of a’s)

Example 3

If Σ = {a, b} and L = { (ab)ⁿ | n ≥ 1 }

Σ* = {ε, a, b, aa, ab, ba, bb, …}
L = {ab, abab, ababab, …}

Complement:

L̅ = Σ* − L

L̅ = {ε, a, b, aa, ba, bb, aba, …}
(all strings not in repeating “ab” pattern)

Example 4

If Σ = {0,1} and L = { 1ⁿ | n ≥ 1 }

Σ* = {ε, 0, 1, 00, 01, 10, 11, …}
L = {1, 11, 111, 1111, …}

Complement:

L̅ = Σ* − L

L̅ = {ε, 0, 00, 01, 10, 101, …}
(all strings not made only of 1’s)

Example 5

If Σ = {a, b} and L = { aⁿbⁿ | n ≥ 1 }

Σ* = {ε, a, b, aa, ab, ba, bb, …}
L = {ab, aabb, aaabbb, …}

Complement:

L̅ = Σ* − L

L̅ = {ε, a, b, aa, ba, bb, abb, aba, …}
(all strings not having equal a’s followed by b’s).

4. Power / Exponential of Alphabet

The power (or exponential) of an alphabet Σ, written as Σⁿ, is the set of all strings of length n formed using the symbols of Σ.

  • Σ⁰ = {ε} (string of length 0)
  • Σ¹ = Σ (all strings of length 1)
  • Σ² = all strings of length 2, and so on.

Example 1: If Σ = {a, b}

  • Σ⁰ = {ε}

  • Σ¹ = {a, b}

  • Σ² = {aa, ab, ba, bb}

  • Σ³ = {aaa, aab, aba, abb, baa, bab, bba, bbb}

  • Σ⁴ = all strings of length 4 over {a, b} (e.g., aaaa, aaab, aaba, …)

  • Σ⁵ = all strings of length 5 over {a, b} (e.g., aaaaa, aaaab, aaaba, …)

  • so on…

Example 2: If Σ = {a, b, c}

  • Σ⁰ = {ε}

  • Σ¹ = {a, b, c}

  • Σ² = {aa, ab, ac, ba, bb, bc, ca, cb, cc}

  • Σ³ = {aaa, aab, aac, aba, abb, abc, aca, acb, acc, …}

  • so on…

5. Kleene Star of the Alphabet vs Kleene Star of a Language

Kleene Star of an Alphabet (Σ*) means the set of all possible strings (including ε) formed using symbols of the alphabet Σ.

  • Example: If Σ = {a, b}
  • Σ* = {ε, a, b, aa, ab, ba, bb, aaa, aab, …}

Kleene Star of a Language (L*) means the set of all strings formed by concatenating strings from language L zero or more times.

  • Example: If L = {ab, c}
  • L* = {ε, ab, c, abab, abc, cab, cc, …}

Difference:

  • Σ* uses individual symbols of an alphabet to form strings.
  • L* uses complete strings from a language and concatenates them.

Exercise Questions

Question 01

Assume L is a language over alphabet Σ and ε denotes the empty string. Which of the following statements is/are always true?

  1. Σ* − {ε} = Σ⁺
  2. L* {ε} = L⁺
  3. Σ* = Σ⁺ ∪ {ε}
  4. L* = L* ∪ {ε}

Solution:

  1. Σ* − {ε} = Σ⁺True (Σ⁺ contains all strings of length ≥ 1).
  2. L* {ε} = L⁺False (since L*{ε} = L*, not L⁺).
  3. Σ* = Σ⁺ ∪ {ε}True (Σ* includes all strings plus ε).
  4. L* = L* ∪ {ε}True (ε is already in L*).

Correct statements are i, iii, and iv.

Question 02

Assume L is a language over alphabet Σ and ε denotes the empty string. Which of the following statements is/are always true?

  1. ε  L⁺
  2. ε ∈ L*
  3. ε ∈ Σ*
  4. ε Σ⁺

Solution:

Check each statement:

  • ε ∉ L⁺True
    L⁺ = L¹ ∪ L² ∪ … (one or more concatenations), so ε is not included unless ε is already in L.

  • ε ∈ L*True
    By definition L* = {ε} ∪ L⁺, so ε is always included.

  • ε ∈ Σ*True
    Σ* contains all strings over Σ including ε.

  • ε ∉ Σ⁺True
    Σ⁺ contains strings of length ≥ 1, so ε is not included.

Final Answer: All statements are true.

5. Grammar in Automata Theory

Grammar in Automata Theory is a formal system that defines rules for generating valid strings of a language. Let understand from very basic

i. Idea of Grammar

Just like English grammar has rules to form correct sentences, formal grammar in Automata Theory has rules to form valid strings of a language.

ii. Example from English Language

Rules:

  • Sentence → Noun + Verb

  • Noun → Ali | Sara

  • Verb → runs | eats

Generated Sentences:

  • Ali runs

  • Sara eats

Here, the rules generate valid sentences, just like grammar rules in English.

iii. Example from Automata Theory

Rules:

  • S → aSb

  • S → ab

Generated Strings:

  • ab

  • aabb

  • aaabbb

Here, the grammar rules generate valid strings of the language, similar to how English grammar generates sentences.

iv. Definition of Grammar

Grammar is a set of rules used to generate strings of a language.
It tells how symbols can be arranged to form valid strings.

A grammar is written as:

G = (V, Σ, P, S)

v. Components of Grammar

  • V → Variables / Non-terminals (symbols that can be replaced)

  • Σ → Terminals (actual symbols of the language)

  • P → Production rules (rules used to replace variables)

  • S → Start symbol (where generation begins)

vi. Formal Example

Let:

  • V = {S}

  • Σ = {a, b}

  • P = {S → aSb, S → ab}

  • S = Start symbol

Generated Strings:

  • ab

  • aabb

  • aaabbb

The grammar starts from S and applies production rules to generate strings.

6. Theory of Automata Notations

Notations in Automata Theory are symbols used to represent alphabets, strings, languages, and automata machines.

Common Notations:

  • Σ (Sigma) → Alphabet (finite set of input symbols), e.g., Σ = {a, b}

  • ε (Epsilon) → Empty string (string with length 0)

  • |w| → Length of string w

  • Σ* → Set of all possible strings over alphabet Σ

  • Σ⁺ → Set of all strings over Σ except ε

  • L → A language (set of strings)

  • → Empty language (contains no strings)

Automata Machine Notations:

  • Q → Finite set of states of the automaton

  • q₀Initial (start) state

  • F → Set of final / accepting states

  • δ (delta)Transition function that shows how the machine moves from one state to another.

Notations in Theory Of Automata

Important: Languages are generated according to their grammar and accepted or rejected by automata machines. Automata are abstract machines that accept or reject regular languages.

Note: “It is possible to express all regular languages using Regular Expressions.”

start…

7. Automata Machine and Types

An automaton (like a DFA or NFA) is a state machine that recognizes the language by processing input strings and decides whether they belong to a language.

It consists of:

  • States
  • Transitions based on input symbols
  • A start state
  • Accepting (final) state(s)

The automaton takes each input string and processes it through states. If it ends in an accepting state, the string is accepted; otherwise, it’s rejected.

Note: Some examples of automata machines are Finite Automaton (FA),  Pushdown Automaton (PDA), Linear Bounded Automaton (LBA), Turing Machine (TM), Moore machine, Mealy machine etc.

Validation of an Automata Machine

  • An automaton is designed to recognize a language (a set of valid strings).

  • To validate that the automaton is correct, we test it using strings:

    • It should accept all strings in the language.

    • It should reject all strings not in the language.

So, we validate the machine by checking how it behaves on many test strings — both valid and invalid — based on the definition of the target language.

 Example:

Let the language L = all binary strings that end in 1.

We can validate the DFA for this language using strings like:

  • “1” → Should be accepted

  • “101” → Accepted

  • “100” → Rejected

  • “” (empty string) → Rejected

If the DFA accepts all strings in L and rejects all others, then it is valid for language L.

Types of Automata Machines

The following table presents information about different types of automata and their respective recognized languages and computational abilities. Each automaton type ranges from regular to ω-regular language classes. The purpose of the table is to demonstrate the varying computational capacities of different automata models.

Automata Machine Recognized Languages Computational Power
Finite Automaton (FA) Regular Languages Limited to Regular Languages
Two-Way Finite Automaton Regular Languages Same as Finite Automaton
Mealy Machine Output depends on state and input Limited to Regular Languages
Moore Machine Output depends only on the state Limited to Regular Languages
Pushdown Automaton (PDA) Context-Free Languages More powerful than Finite Automaton
Linear Bounded Automaton (LBA) Context-Sensitive Languages More powerful than PDA
Nested Stack Automaton Certain types of nested structures More expressive than PDAs
Register Machine Integer functions Turing-equivalent
Turing Machine (TM) Recursively Enumerable Languages The most powerful model of computation
Multi-Tape Turing Machine Recursively Enumerable Languages Same as the Turing Machine
Non-deterministic Turing Machine (NDTM) Recursively Enumerable Languages Same as the Turing Machine
Deterministic Büchi Automaton ω-limit Languages Recognizes languages with infinite words satisfying Büchi acceptance condition
Nondeterministic Büchi Automaton ω-Regular Languages Recognizes languages with infinite words satisfying Büchi acceptance condition
Weighted Automaton Weighted Languages Assign weights to infinite words
Rabin Automaton, Streett Automaton, Parity Automaton, Muller Automaton ω-Regular Languages Recognize various classes of ω-regular languages

8. Chomsky Hierarchy

Proposed by Noam Chomsky.

Types:

  1. Type 0 → Recursively Enumerable

  2. Type 1 → Context Sensitive

  3. Type 2 → Context Free

  4. Type 3 → Regular

 

Circles and arrows form the automata machine, where Circles represent states, and arrows represent transitions. An automaton takes some input strings as input, and this input goes through some states and may be entered in the final state.

Theory of Automata Workflow

Grammar defines production rules, which generate strings (made from input symbols). These strings form a language, which can be accepted or rejected by an automata machine.

Elements in Theory of Automata

The following are some essential elements of the theory of Automata.

1. Grammar

A grammar is a formal set of rules used to generate strings in a language. It includes:

  • A start symbol
  • Non-Terminals (variables)
  • Terminals (alphabet symbols)
  • Production rules

Example:

Let the grammar be:

  • Start Symbol: S

  • Terminals: a, b

  • All Production Rules:

    • S → aS

    • S → bS

    • S → a

Can represented as S → aS | bS | a

Production rules are the specific rewriting rules in a grammar. They tell how to replace non-terminals with terminals or other non-terminals.

 From above grammar, rules are:

  • S → aS

  • S → bS

  • S → a

These allow generating strings like:

  • a

  • aa (S → aS → aa)

  • aba (S → aS → abS → aba)

Grammar is commonly used to represent a Set of Production Rules with the letter ‘P’.

Note: In all Production rules, P : S → aS | bS | a. Where S → a is one the production rule.

The followings are the major types of  grammar in Automata

  • Type-3: Regular Grammar – Used by Finite Automata to generate Regular Languages.

  • Type-2: Context-Free Grammar (CFG) – Used by Pushdown Automata to generate Context-Free Languages.

  • Type-1: Context-Sensitive Grammar (CSG) – Used by Linear Bounded Automata to generate Context-Sensitive Languages.

  • Type-0: Unrestricted Grammar – Used by Turing Machines to generate Recursively Enumerable Languages.

2. Input Strings

Input strings are sequences of terminal symbols (like a, b) that are derived by applying production rules starting from the start symbol. It is a collection of some symbols from the Sigma (∑). The symbol ‘W’ denotes the string.

String Examples (from the Grammar above):

  • a

  • aa

  • ababa

  • bba

These are input strings we may feed into an automaton for testing.

Important: Sigma (∑) contains a finite set of symbols known as an alphabet. Symbols involve letters, the alphabet, or any picture. Strings are made from input symbols (∑).

Examples: ∑ = {a, b} , ∑ = {A, B, C, D} , ∑ = {0, 1, ….., 5]  

If ∑ = {x, y}, then various strings that can be generated from ∑ are {x, y, xxx, xyxy, yy, xyy, …..,}.

An empty string, also known as an epsilon string, is a string that contains no symbols or has zero occurrences of symbols. Symbol ‘ε’ denotes the epsilon string. A string’s length refers to the number of characters it contains. Symbol ‘|w|’ denotes the length of that string. If the string is w= 010, its length is 3.

3. Language

A language is the set of all valid strings that can be generated by a grammar. Grammar uses production rules to create the set of strings. 

Note: Languages in automata are classified into four major types: Regular, Context-Free, Context-Sensitive, and Recursively Enumerable, based on the Chomsky Hierarchy.

Example 01:

The language L = { all strings over a and b that end in a }

So:

  • Valid strings (in the language): a, aa, aba, bba

  • Invalid strings: b, ab, bb

Example: 02

Language (L) = {Set of string of length 2} = {xx, yy, xy, yx}        //  Finite Language

Example: 03

L2 = {Set of all strings starts with ‘x’} = {x, xx, x, xyy, xyyyy, xyxyxyyyx, ………,}      //  Infinite Language

4. Automata Machine

An Automaton (plural: Automata) is a mathematical model of a machine that reads input strings from an alphabet and changes states according to rules to decide whether to accept or reject the string.

Formally, an automaton can be represented as a 5-tuple:

M = (Q, Σ, δ, q₀, F)

  • Q → Finite set of states

  • Σ → Input alphabet

  • δ → Transition function (Q × Σ → Q for deterministic machines)

  • q₀ → Start state (q₀ ∈ Q)

  • F → Set of accepting/final states (F ⊆ Q)

Types of Automata

1. Finite Automata (FA)

  • Machines with finite number of states

  • Deterministic FA (DFA) → Each input leads to exactly one next state

  • Non-deterministic FA (NFA) → An input can lead to multiple possible next states

2. Pushdown Automata (PDA)

  • Automata with stack memory

  • Can recognize context-free languages

3. Linear Bounded Automata (LBA)

  • Turing Machine variant with tape length limited by input size

  • Recognizes context-sensitive languages

4. Turing Machine (TM)

  • Most powerful automaton with infinite tape and read/write head

  • Recognizes recursively enumerable languages

Applications Theory of Automata

Following are the applications of thoery of automata 

  • Compiler Design: Utilizes automata theory for lexical analysis and parsing in compiler construction.
  • Parsing and Syntax Analysis: Uses context-free grammars and pushdown automata for analyzing syntax in compilers.
  • Lexical Analysis: Applies regular expressions and finite automata to identify tokens in programming language source code.
  • Natural Language Processing (NLP): Involves language recognition and syntax analysis for applications like speech recognition.
  • Software Engineering: Utilizes automata for specifying and verifying correctness in software systems.
  • Model Checking: Verifies system specifications to ensure correctness in hardware and software.
  • Network Protocol Design: Uses finite state machines to model and analyze network protocol behavior.
  • DNA Computing: Applies automata concepts for solving computational problems using DNA strands.
  • Formal Verification: Uses automata for verifying correctness in hardware and software systems.
  • Regular Expressions in Text Processing: Widely used for searching and data extraction tasks.
  • Databases: Applies finite automata and regular expressions for query processing and pattern matching.
  • VLSI Design: Utilizes automata theory in modeling and optimizing sequential circuits in VLSI.
  • Robotics: Implements finite state machines to model and control robot behavior based on sensory input.

"Your Support, Our Priority"

"We Make It Easy"