Book Awk Learners

By Wishal Jain
2nd February, 2024


CHAPTER 1: 

1.1 Getting Started 

The Structure of an AWK Program 

Running an AWK Program 

Errors 

1.2 Simple Output 

Printing Every Line 

Printing Certain Fields 

NF, the Number of Fields 

Computing and Printing 

Printing Line Numbers 

Putting Text in the Output 

1.3 Fancier Output 

Lining Up Fields  

Sorting the Output

1.4 Selection 

Selection by Comparison 

Selection by Computation 

Selection by Text Content

Combinations of Patterns

Data Validation 

BEGIN and END 

1.5 Computing with AWK 

Counting 

Computing Sums and Averages 

Handling Text 

String Concatenation 

Printing the Last Input Line 

Built-in Functions 

Counting Lines, Words, and Characters 

1.6 Control-Flow Statements 

If-Else Statement 

While Statement 

For Statement 

1.7 Arrays 

1.8 A Handful of Useful "One-liners" 

1.9 What Next? 

CHAPTER 2: THE AWK LANGUAGE 

The Input File countries 

Program Format 

2.1 Patterns 

BEGIN and END 

Expressions as Patterns 

String-Matching Patterns 

Regular Expressions 

Compound Patterns 

Range Patterns 

Summary of Patterns 

2.2 Actions 

Expressions 

Control-Flow Statements 

Empty Statement 

Arrays 

2.3 User-Defined Functions 

2.4 Output 

The print Statement 

Output Separators 

The printf Statement 

Output into Files 

Output into Pipes 

Closing Files and Pipes 

2.5 Input 

Input Separators 

Multiline Records 

The getline Function 

Command-Line Variable Assignments 

Command-Line Arguments 

2.6 Interaction with Other Programs 

The system Function 

Making a Shell Command from an AWK Program 

2.7 Summary

CHAPTER 3: DATA PROCESSING  

3.1 Data Transformation and Reduction

Summing Columns

Computing Percentages and Quantiles 

Numbers with Commas

Fixed-Field Input

Program Cross-Reference Checking 

Formatted Output 

3.2 Data Validation 

Balanced Delimiters 

Password-File Checking 

Generating Data-Validation Programs 

Which Version of AWK? 

3.3 Bundle and Unbundle 

3.4 Multiline Records 

Records Separated by Blank Lines 

Processing Multiline Records 

Records with Headers and Trailers 

Name-Value Data 

3.5 Summary 

CHAPTER 4: REPORTS AND DATABASES 

4.1 Generating Reports 

A Simple Report 

A More Complex Report 

4.2 Packaged Queries and Reports 

Form Letters 

4.3 A Relational Database System 

Natural Joins 

The relfile 

q, an awk-like query language 

qawk, a q-to-awk translator 

4.4 Summary 

CHAPTER 5: PROCESSING WORDS 

5.1 Random Text Generation 

Random Choices 

Cliche Generation 

Random Sentences 

5.2 Interactive Text-Manipulation 

Skills Testing: Arithmetic 

Skills Testing: Quiz 

5.3 Text Processing 

Word Counts 

Text Formatting 

Maintaining Cross-References in Manuscripts 

Making a KWIC Index 

Making Indexes 

5.4 Summary 

CHAPTER 6: LITTLE LANGUAGES

6.1 An Assembler and Interpreter 

6.2 A Language for Drawing Graphs 

6.3 A Sort Generator 

6.4 A Reverse-Polish Calculator 

6.5 An Infix Calculator 

6.6 Recursive-Descent Parsing 

6.7 Summary 

CHAPTER 7: EXPERIMENTS WITH ALGORITHMS 

7.1 Sorting 

Insertion Sort 

Quicksort 

Heapsort 

7.2 Profiling 

7.3 Topological Sorting 

Breadth-First Topological Sort

Depth-First Search

Depth-First Topological Sort

7.4 Make: A File Updating Program 

7.5 Summary 

CHAPTER 8: EPILOG 

8.1 AWK as a Language 

8.2 Performance 

8.3 Conclusion 

APPENDIX A: AWK SUMMARY

APPENDIX B: ANSWERS TO SELECTED EXERCISES

INDEX

___________________________________________________

               LET'S GET STARTED 

Awk is a convenient and expressive programming language that can be applied to a wide variety of computing and data-manipulation tasks. This chapter is a tutorial, designed to let you start writing your own programs as quickly as possible. Chapter 2 describes the whole language, and the remaining chapters show how awk can be used to solve problems from many different areas. Throughout the book, we have tried to pick examples that you should find useful, interesting, and instructive. 

1.1 Getting Started

Useful awk programs are often short, just a line or two. Suppose you have a file called emp.data that contains the name, pay rate in dollars per hour, and number of hours worked for your employees, one employee record per line, like this: 

    Beth    4.00    0
    Dan     3.75    0
    Kathy   4.00    10
    Mark    5.00    20
    Mary    5.50    22
    Susie   4.25    18

Now you want to print the name and pay (rate times hours) for everyone who worked more than zero hours. This is the kind of job that awk is meant for, so it's easy. Just type this command line: 

    awk '$3 > 0 { print $1, $2 * $3 }' emp.data

You should get this output: 

    Kathy 40
    Mark 100
    Mary 121
    Susie 76.5

This command line tells the system to run awk, using the program inside the quote characters, taking its data from the input file emp.data. The part inside the quotes is the complete awk program. It consists of a single pattern-action statement. The pattern, $3 > 0, matches every input line in which the third column, or field, is greater than zero, and the action

    { print $1, $2 * $3 }

prints the first field and the product of the second and third fields of each matched line. 

If you want to print the names of those employees who did not work, type this command line: 

    awk '$3 == 0 { print $1 }' emp.data

Here the pattern, $3 == 0, matches each line in which the third field is equal to zero, and the action

    { print $1 }

prints its first field.

As you read this book, try running and modifying the programs that are presented. Since most of the programs are short, you'll quickly get an understanding of how awk works. On a Unix system, the two transactions above would look like this on the terminal:

    $ awk '$3 > 0 { print $1, $2 * $3 }' emp.data
    Kathy 40
    Mark 100
    Mary 121
    Susie 76.5
    $ awk '$3 == 0 { print $1 }' emp.data
    Beth
    Dan
    $

The $ at the beginning of a line is the prompt from the system; it may be different on your machine. 
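To follow along, the sample file has to exist first. The following is a minimal sketch, assuming a POSIX shell and that it is acceptable to write a file named emp.data in the current directory; it recreates the data shown earlier and then runs the two programs.

```shell
# Create the sample file emp.data: name, hourly rate, hours worked.
# Assumption: overwriting emp.data in the current directory is fine.
cat > emp.data <<'EOF'
Beth    4.00    0
Dan     3.75    0
Kathy   4.00    10
Mark    5.00    20
Mary    5.50    22
Susie   4.25    18
EOF

# Name and pay for everyone who worked more than zero hours.
awk '$3 > 0 { print $1, $2 * $3 }' emp.data

# Names of employees who did not work.
awk '$3 == 0 { print $1 }' emp.data
```

The amount of white space between the columns of emp.data does not matter here, because awk by default treats any run of blanks or tabs as a single field separator.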

The Structure of an AWK Program 

Let's step back a moment and look at what is going on. In the command lines above, the parts between the quote characters are programs written in the awk programming language. Each awk program in this chapter is a sequence of one or more pattern-action statements: 

    pattern { action }
    pattern { action }

The basic operation of awk is to scan a sequence of input lines one after another, searching for lines that are matched by any of the patterns in the program. The precise meaning of the word "match" depends on the pattern in question; for patterns like $3 > 0, it means "the condition is true."

Every input line is tested against each of the patterns in turn. For each pattern that matches, the corresponding action (which may involve multiple steps) is performed. Then the next line is read and the matching starts over. This continues until all the input has been read. 
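The line-by-line, pattern-by-pattern behavior described above can be seen with a program that has two pattern-action statements. This is only an illustrative sketch: the file name sample.data and the labels "worked" and "rate>=5" are made up for the example and are not part of the original programs.

```shell
# sample.data is a hypothetical three-line file created for this sketch.
printf '%s\n' 'Beth 4.00 0' 'Mark 5.00 20' 'Kathy 4.00 10' > sample.data

# Each input line is tested against both patterns, in order.
# A line that matches both patterns triggers both actions;
# a line that matches neither produces no output.
awk '
$3 > 0  { print $1, "worked" }
$2 >= 5 { print $1, "rate>=5" }
' sample.data
```

With this data, Beth matches neither pattern and is not printed, Mark matches both and is printed twice, and Kathy matches only the first.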

The programs above are typical examples of patterns and actions. 

    $3 == 0 { print $1 }

is a single pattern-action statement; for every line in which the third field is zero, the first field is printed. Either the pattern or the action (but not both) in a pattern-action statement may be omitted. If a pattern has no action, for example, 

    $3 == 0

then each line that the pattern matches (that is, each line for which the condition is true) is printed. This program prints the two lines from the emp.data file where the third field is zero:

    Beth    4.00    0
    Dan     3.75    0

If there is an action with no pattern, for example,

    { print $1 }

then the action, in this case printing the first field, is performed for every input line. Since patterns and actions are both optional, actions are enclosed in braces to distinguish them from patterns. 

Running an AWK Program 

There are several ways to run an awk program. You can type a command line of the form 

    awk 'program' input files

to run the program on each of the specified input files. For example, you could type

    awk '$3 == 0 { print $1 }' file1 file2

to print the first field of every line of file1 and file2 in which the third field is zero.
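As a small sketch of running one program over several input files, the following creates two throwaway files and processes them in a single command. The contents of file1 and file2 are invented for the illustration.

```shell
# file1 and file2 are sample files created only for this sketch.
printf '%s\n' 'Beth 4.00 0' 'Kathy 4.00 10' > file1
printf '%s\n' 'Dan 3.75 0'  'Mark 5.00 20'  > file2

# awk reads file1, then file2, applying the same program to every line.
awk '$3 == 0 { print $1 }' file1 file2
```

The output lists Beth (from file1) followed by Dan (from file2), because the files are read in the order given on the command line.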

You can omit the input files from the command line and just type 

    awk 'program'

In this case awk will apply the program to whatever you type next on your terminal until you type an end-of-file signal (control-d on Unix systems). Here is a sample of a session on Unix:

    $ awk '$3 == 0 { print $1 }'
    Beth    4.00    0
    Beth
    Dan     3.75    0
    Dan
    Kathy   3.75    10
    Kathy   3.75    0
    Kathy

The lines that consist of a single name are what the computer printed; the other lines are what was typed. This behavior makes it easy to experiment with awk: type your program, then type data at it and see what happens. We again encourage you to try the examples and variations on them.
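Because awk reads its standard input when no input files are named, the same experiment can be run without typing at the terminal. The sketch below, which assumes a POSIX shell with printf, pipes three sample lines into the program instead.

```shell
# With no input files on the command line, awk reads standard input,
# so the data can come from a pipe instead of the keyboard.
printf '%s\n' 'Beth 4.00 0' 'Kathy 3.75 10' 'Kathy 3.75 0' |
awk '$3 == 0 { print $1 }'
```

Only the first and third lines have a zero in the third field, so the program prints Beth and then Kathy.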

Notice that the program is enclosed in single quotes on the command line. This protects characters like $ in the program from being interpreted by the shell and also allows the program to be longer than one line.

This arrangement is convenient when the program is short (a few lines). If the program is long, however, it is more convenient to put it into a separate file, say progfile, and type the command line

    awk -f progfile optional list of input files

The -f option instructs awk to fetch the program from the named file. Any filename can be used in place of progfile.
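A minimal sketch of the program-file approach follows. It assumes a POSIX shell and writes a file named progfile in the current directory; the sample data is invented and is piped in on standard input, so no input files are listed.

```shell
# Save the program in a file; "progfile" is just the name used in the text.
cat > progfile <<'EOF'
# Print the names of employees who did not work.
$3 == 0 { print $1 }
EOF

# Run the saved program on data piped to standard input.
printf '%s\n' 'Beth 4.00 0' 'Kathy 4.00 10' 'Dan 3.75 0' | awk -f progfile
```

Inside progfile the program is written without the surrounding single quotes, since the shell never sees its contents, and lines beginning with # are comments.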

Errors 

If you make an error in an awk program, awk will give you a diagnostic message. For example, if you mistype a brace, like this: 

    awk '$3 == 0 [ print $1 }' emp.data

you will get a message like this:

    awk: syntax error at source line 1
     context is
            $3 == 0 >>> [ <<<
            extra }
            missing ]
    awk: bailing out at source line 1

"Syntax error" means that you have made a grammatical error that was detected at the place marked by >>> <<<. "Bailing out" means that no recovery was attempted. Sometimes you get a little more help about what the error was, such as a report of mismatched braces or parentheses.

Because of the syntax error, awk did not try to execute this program. Some errors, however, may not be detected until your program is running. For example, if you attempt to divide a number by zero, awk will stop its processing and report the input line number and the line number in the program at which the division was attempted.

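A script can detect a rejected program by looking at awk's exit status. The sketch below rests on two assumptions that go beyond the text: that awk exits with a nonzero status after a syntax error, as POSIX specifies for errors, and that the diagnostic goes to standard error. The exact wording of the message differs between awk implementations, and err.txt is a file name chosen for the example.

```shell
# The square bracket is a deliberate typo for the opening brace.
# /dev/null serves as an empty input file; the program is rejected
# before any input is read.
status=0
awk '$3 == 0 [ print $1 }' /dev/null 2> err.txt || status=$?

echo "exit status: $status"   # nonzero when awk rejects the program
cat err.txt                   # diagnostic text; wording varies by implementation
```

Capturing the status with `|| status=$?` keeps the sketch usable in scripts that stop on the first failing command.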
