Compiler Design

C Program to Design a Lexical Analyzer for a Given Language That Ignores Redundant Spaces, Tabs, and New Lines


This program reads C source code from standard input, saves it to a file, tokenizes it into numbers, keywords, identifiers, and special characters, and counts the lines. It skips redundant spaces and tabs while processing the input, and uses new lines only to keep the line count.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

void keyword(const char *str) {
    // Check whether the given string is a keyword or an identifier
    if (strcmp("for", str) == 0 || strcmp("while", str) == 0 || strcmp("do", str) == 0 ||
        strcmp("int", str) == 0 || strcmp("float", str) == 0 || strcmp("char", str) == 0 ||
        strcmp("double", str) == 0 || strcmp("static", str) == 0 || strcmp("switch", str) == 0 ||
        strcmp("case", str) == 0) {
        printf("\n%s is a keyword", str);
    } else {
        printf("\n%s is an identifier", str);
    }
}

int main(void) {
    FILE *f1, *f2, *f3; // File pointers
    int c;              // int, not char, so that EOF can be detected reliably
    char str[32];       // Buffer for one identifier or keyword
    int num[100], lineno = 0, tokenvalue = 0, i = 0, j = 0, k = 0;

    printf("\nEnter the C program:");

    f1 = fopen("input", "w"); // Open input file to write
    if (f1 == NULL) {
        perror("input");
        return EXIT_FAILURE;
    }
    while ((c = getchar()) != EOF) {
        putc(c, f1); // Copy standard input to the file
    }
    fclose(f1);

    f1 = fopen("input", "r");       // Open input file to read
    f2 = fopen("identifier", "w");  // Open identifier file to write
    f3 = fopen("specialchar", "w"); // Open special character file to write
    if (f1 == NULL || f2 == NULL || f3 == NULL) {
        perror("fopen");
        return EXIT_FAILURE;
    }

    while ((c = getc(f1)) != EOF) {
        // Check if the character is a digit
        if (isdigit(c)) {
            tokenvalue = c - '0'; // Convert character to integer
            c = getc(f1);
            while (isdigit(c)) {
                tokenvalue = tokenvalue * 10 + (c - '0'); // Form the full number
                c = getc(f1);
            }
            if (i < 100) {
                num[i++] = tokenvalue; // Store the number if there is room
            }
            ungetc(c, f1); // Push back the character for further processing
        }
        // Check if the character is a letter
        else if (isalpha(c)) {
            putc(c, f2); // Write to identifier file
            c = getc(f1);
            while (isdigit(c) || isalpha(c) || c == '_' || c == '$') {
                putc(c, f2); // Write the remaining identifier characters
                c = getc(f1);
            }
            putc(' ', f2); // Separate identifiers with a space
            ungetc(c, f1); // Push back the character
        }
        // Skip spaces and tabs
        else if (c == ' ' || c == '\t') {
            continue;
        }
        // Count new lines
        else if (c == '\n') {
            lineno++;
        }
        // Handle special characters
        else {
            putc(c, f3);
        }
    }

    fclose(f2); // Close identifier file
    fclose(f3); // Close special character file
    fclose(f1); // Close input file

    // Print numbers found in the program
    printf("\nThe numbers in the program are: ");
    for (j = 0; j < i; j++) {
        printf("%d ", num[j]);
    }
    printf("\n");

    // Read identifiers and classify them
    f2 = fopen("identifier", "r");
    if (f2 == NULL) {
        perror("identifier");
        return EXIT_FAILURE;
    }
    k = 0;
    printf("The keywords and identifiers are: ");
    while ((c = getc(f2)) != EOF) {
        if (c != ' ') {
            if (k < (int)sizeof(str) - 1) {
                str[k++] = (char)c; // Collect characters, truncating overlong names
            }
        } else {
            str[k] = '\0'; // Null-terminate the string
            keyword(str);  // Check if it's a keyword or identifier
            k = 0;         // Reset index for next identifier
        }
    }
    fclose(f2); // Close identifier file

    // Read special characters
    f3 = fopen("specialchar", "r");
    if (f3 == NULL) {
        perror("specialchar");
        return EXIT_FAILURE;
    }
    printf("\nSpecial characters are: ");
    while ((c = getc(f3)) != EOF) {
        printf("%c", c); // Print special characters
    }
    printf("\n");
    fclose(f3); // Close special character file

    // Print the total number of lines
    printf("Total number of lines: %d\n", lineno);
    return 0;
}
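
The chained strcmp calls in keyword() can also be written as a table lookup, which makes the keyword set easier to extend. The following is a minimal sketch, not part of the original program: the names KEYWORDS and is_keyword are hypothetical, and the table holds the same ten keywords that keyword() tests for.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table holding the same ten keywords that keyword() checks. */
static const char *const KEYWORDS[] = {
    "for", "while", "do", "int", "float",
    "char", "double", "static", "switch", "case"
};

/* Returns 1 if word appears in KEYWORDS, 0 otherwise. */
int is_keyword(const char *word) {
    size_t count = sizeof KEYWORDS / sizeof KEYWORDS[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(KEYWORDS[i], word) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With this helper, keyword() could reduce to a single test on is_keyword(str) followed by the two printf calls. Adding a keyword such as "if" or "else" would then only require a new table entry.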


Explanation of the Code Components

  1. Header Files:

    • #include <stdio.h>: Required for input and output functions.
    • #include <stdlib.h>: Provides general utilities such as memory allocation and process control (for example, exit status values).
    • #include <string.h>: Contains functions for string manipulation.
    • #include <ctype.h>: Used for character classification functions (e.g., isdigit, isalpha).
  2. Keyword Function:

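    • This function checks whether a given string is a keyword or an identifier. It compares the string against ten known keywords (for, while, do, int, float, char, double, static, switch, case) and prints the appropriate message. Any word not in that list is reported as an identifier.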
  3. Main Function:

    • The program starts execution in the main() function.
    • File Handling: The program opens an input file for writing the user-provided C program, and later opens it for reading. It also creates two files for identifiers and special characters.
    • Character Processing:
      • Digits: If the character is a digit, the program builds the complete number by reading the digit characters that follow and stores it in the num array.
      • Identifiers: If the character is a letter, it reads the entire word (which can include digits, underscores, and dollar signs) and writes it to the identifier file, followed by a space.
      • Whitespace: The program skips spaces and tabs and counts new line characters.
      • Special Characters: Any other character (not a digit, letter, space, tab, or new line) is written to the special character file.
    • After processing the input file, it reads the identifier file to classify identifiers and keywords and prints them. Finally, it reads and prints any special characters.
  4. Final Output:

    • The program prints all numbers found in the C program, lists each word as a keyword or an identifier, shows the special characters, and prints the total number of lines in the input program.
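
The character-processing rules described above can also be expressed as a scanner that returns one token at a time from a string held in memory, without the intermediate files. This is a minimal sketch under that assumption, not the original author's method: TokenKind, Token, and next_token are hypothetical names introduced for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token classes; the original listing writes to files instead. */
typedef enum { TOK_END, TOK_NUMBER, TOK_WORD, TOK_SPECIAL } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start; /* first character of the token inside the input */
    size_t length;     /* number of characters in the token */
    int lines_skipped; /* new lines skipped before this token */
} Token;

/* Scans one token starting at *cursor and advances *cursor past it. */
Token next_token(const char **cursor) {
    const char *p = *cursor;
    Token tok = { TOK_END, NULL, 0, 0 };

    /* Skip spaces and tabs; count new lines. */
    while (*p == ' ' || *p == '\t' || *p == '\n') {
        if (*p == '\n') {
            tok.lines_skipped++;
        }
        p++;
    }

    tok.start = p;
    if (*p == '\0') {
        tok.kind = TOK_END;
    } else if (isdigit((unsigned char)*p)) {
        /* A number is a run of digits. */
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p)) p++;
    } else if (isalpha((unsigned char)*p)) {
        /* A word starts with a letter and may continue with
           letters, digits, '_' or '$', as in the listing. */
        tok.kind = TOK_WORD;
        while (isalnum((unsigned char)*p) || *p == '_' || *p == '$') p++;
    } else {
        /* Anything else is a one-character special symbol. */
        tok.kind = TOK_SPECIAL;
        p++;
    }
    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}
```

A caller would loop on next_token until it returns TOK_END, adding lines_skipped to a running line count and passing each TOK_WORD to the keyword check. Returning tokens in order keeps the original sequence of the input, which the file-based version loses by sorting tokens into separate files.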


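Sample run (this transcript classifies tokens into variables, operators, constants, keywords, special symbols, and comments, so it appears to come from a fuller analyzer than the listing above, which prints numbers, keywords and identifiers, special characters, and a line count):

Input: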
Enter Program $ for termination:
{
int a[3],t1,t2;
t1=2; a[0]=1; a[1]=2; a[t1]=3;
t2=-(a[2]+t1*6)/(a[2]-t1);
if t2>5 then
print(t2);
else {
int t3;
t3=99;
t2=-25;
print(-t1+t2*t3); /* this is a comment on 2 lines */
} endif
}
(Ctrl+Z)
Output:
 Variables : a[3] t1 t2 t3
Operator : - + * / >
Constants : 2 1 3 6 5 99 -25
Keywords : int if then else endif
Special Symbols : , ; ( ) { }
Comments : this is a comment on 2 lines
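
The sample input contains a block comment, which the listing does not treat specially: its characters would be split into words and special characters like any other text. One way to skip such comments is sketched below, assuming the input is held in a string. The function name skip_block_comment and its lines parameter are hypothetical and do not appear in the original program.

```c
/*
 * If p points at the start of a block comment, returns a pointer just past
 * its closing delimiter (or to the terminating '\0' if the comment is never
 * closed). Otherwise returns p unchanged. New lines inside the comment are
 * added to *lines so the line count stays accurate.
 */
const char *skip_block_comment(const char *p, int *lines) {
    if (p[0] != '/' || p[1] != '*') {
        return p; /* not a comment */
    }
    p += 2; /* step over the opening delimiter */
    while (*p != '\0') {
        if (p[0] == '*' && p[1] == '/') {
            return p + 2; /* step over the closing delimiter */
        }
        if (*p == '\n') {
            (*lines)++;
        }
        p++;
    }
    return p; /* unterminated comment: stop at end of input */
}
```

A scanner would call this before classifying the next character. If the returned pointer differs from the one passed in, the text in between is the comment and can be reported or discarded.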

Team Educate
