Compiler Design

C Program to Recognize Strings Under ‘a’, ‘ab+’, ‘abb’

C Program to Recognize Strings Under 'a*', 'a*b+', 'abb'

This C program is designed to recognize and classify strings according to three specific rules or patterns:

  1. a*: A string consisting of zero or more ‘a’ characters.
  2. a*b+: A string that starts with zero or more ‘a’ characters followed by one or more ‘b’ characters.
  3. abb: A string that follows exactly the pattern ‘abb’.

The program uses a finite state machine (FSM) approach where the input string is evaluated character by character, and the current state determines how the next character is processed.

#include<stdio.h>
#include<conio.h>
#include<string.h>

These header files are included for standard input/output functions, console input/output operations, and string manipulation.

Variables Used:
s[20]: An array to store the input string.
c: A character variable used to store the current character of the string being processed.
state: The current state of the FSM.
i: Index variable used to traverse the string.

Initialization and Input:

void main()
{
char s[20], c;
int state = 0, i = 0;
clrscr();
printf("n Enter a string:");
gets(s); // Gets the input string from the user
  • state = 0: The FSM starts in the initial state (state 0).
  • gets(s): Reads the string entered by the user. (Note: gets is outdated and can lead to buffer overflows. Use fgets instead in modern programs).

State Transition Logic:

The FSM uses a while loop to process each character in the input string. A switch-case construct handles the transitions between different states.

while(s[i] != '')
{
switch(state)
{
case 0: // Initial state
c = s[i++];
if (c == 'a')
state = 1; // Transition to state 1 if 'a' is found
else if (c == 'b')
state = 2; // Transition to state 2 if 'b' is found
else
state = 6; // Error state if neither 'a' nor 'b' is found
break;

In state 0:

  • If the current character is ‘a’, the FSM transitions to state 1.
  • If it’s ‘b’, it moves to state 2.
  • Any other character leads to state 6, which indicates an invalid input.

 case 1: // State for 'a'
c = s[i++];
if (c == 'a')
state = 3; // Remain in state 3 for multiple 'a's
else if (c == 'b')
state = 4; // Transition to state 4 when 'b' is found after 'a'
else
state = 6; // Error state
break;

In state 1:

  • If ‘a’ is found, it transitions to state 3 (to check for multiple ‘a’s).
  • If ‘b’ is found, it moves to state 4 (as per the pattern ‘a*b’).
 case 2: // State for 'b'
c = s[i++];
if (c == 'a')
state = 6; // Error if 'a' comes after 'b'
else if (c == 'b')
state = 2; // Stay in state 2 for multiple 'b's (for a*b+)
else
state = 6; // Error state
break;

In state 2:

  • If another ‘b’ is found, it stays in state 2 (for ‘a*b+’).
  • If ‘a’ appears after ‘b’, the string is invalid.
 case 3: // State for multiple 'a's
c = s[i++];
if (c == 'a')
state = 3; // Stay in state 3 for more 'a's
else if (c == 'b')
state = 2; // Go to state 2 when 'b' appears (for a*b+)
else
state = 6; // Error state
break;

In state 3:

  • It keeps checking for more ‘a’s, staying in the same state.
  • Once ‘b’ is found, it moves to state 2 (to complete ‘a*b+’).
 case 4: // State after 'a' followed by 'b'
c = s[i++];
if (c == 'a')
state = 6; // Error if 'a' appears after 'b'
else if (c == 'b')
state = 5; // Transition to state 5 for 'abb'
else
state = 6; // Error state
break;

In state 4:

  • It checks for the second ‘b’ in ‘abb’. If found, it moves to state 5.
 case 5: // State for 'abb'
c = s[i++];
if (c == 'a')
state = 6; // Error if 'a' appears after 'abb'
else if (c == 'b')
state = 2; // Stay in state 2 for more 'b's
else
state = 6; // Error state
break;

In state 5:

  • The FSM ensures that ‘abb’ is recognized and checks for further ‘b’s.
 case 6: // Error state
printf("n %s is not recognised.", s);
exit(0);
}
}

State 6 is the error state where the string is invalid according to the defined patterns.

Output:

Once the FSM finishes processing the input string, the final state determines which pattern the string matches:

if (state == 1)
printf("n %s is accepted under rule 'a*'", s);
else if ((state == 2) || (state == 4))
printf("n %s is accepted under rule 'a*b+'", s);
else if (state == 5)
printf("n %s is accepted under rule 'abb'", s);

  • If the final state is 1, the string is recognized as belonging to the a* pattern.
  • If the final state is 2 or 4, it matches a*b+.
  • If the final state is 5, it matches abb.

Conclusion

This program uses a finite state machine to recognize strings based on three specific patterns: a*, a*b+, and abb. It processes each character of the input string step by step, transitioning between states, and finally determines if the string matches one of the predefined rules. If the string does not match any rule, it is deemed unrecognized.

PROGRAM:
#include
#include
#include
#include
void main()
{
char s[20],c;
int state=0,i=0;
clrscr();
printf("n Enter a string:");
gets(s);
while(s[i]!='')
{
switch(state)
{
case 0: c=s[i++];
if(c=='a')
state=1;
else if(c=='b')
state=2;
else
state=6;
break;
case 1: c=s[i++];
if(c=='a')
state=3;
else if(c=='b')
state=4;
else
state=6;
break;
case 2: c=s[i++];
if(c=='a')
state=6;
else if(c=='b')
state=2;
else
state=6;
break;
case 3: c=s[i++];
if(c=='a')
state=3;
else if(c=='b')
state=2;
else
state=6;
break;
case 4: c=s[i++];
if(c=='a')
state=6;
else if(c=='b')
state=5;
else
state=6;
break;
case 5: c=s[i++];
if(c=='a')
state=6;
else if(c=='b')
state=2;
else
state=6;
break;
case 6: printf("n %s is not recognised.",s);
exit(0);
}
}
if(state==1)
printf("n %s is accepted under rule 'a'",s);
else if((state==2)||(state==4))
printf("n %s is accepted under rule 'a*b+'",s);
else if(state==5)
printf("n %s is accepted under rule 'abb'",s);
getch();
}
Input :
Enter a String: aaaabbbbb
Output:
aaaabbbbb is accepted under rule 'a*b+'
Input :
Enter a string: cdgs
Output:
cdgs is not recognized

Team Educate

About Author

Leave a comment

Your email address will not be published. Required fields are marked *

You may also like

Convert from NFA to DFA using Thompson’s rule for (a+b)*abb
Compiler Design

Convert from NFA to DFA using Thompson’s rule for (a+b)*abb

To convert the regular expression (a + b)*abb from an NFA to a DFA using Thompson’s construction, we will follow
C Program to Recognize Strings Under 'a*', 'a*b+', 'abb'
Compiler Design

C program to Design a lexical analyzer for given language and the lexical analyzer should ignore redundant spaces, tabs and new lines

This program reads a C source code file, tokenizes it into keywords, identifiers, and special characters, and counts the lines.