Built-in Functions in gawk - Shell Scripting

The gawk programming language provides quite a few built-in functions that perform common mathematical, string, and even time functions. You can utilize these functions in your gawk programs to help cut down on the coding requirements in your scripts. This section walks you through the different built-in functions available in gawk.

Mathematical functions

If you’ve done programming in any type of language, you’re probably familiar with using built-in functions in your code to perform common mathematical functions. The gawk programming language doesn’t disappoint those looking for advanced mathematical features.

Table below shows the mathematical built-in functions available in gawk.Table The gawk Mathematical Functions

While it does not have an extensive list of mathematical functions, gawk does provide some of the basic elements you need for standard mathematical processing. The int() function produces the integer portion of a value, but it doesn’t round the value. It behaves much like a floor function found in other programming languages. It produces the nearest integer to a value between the value and 0.

The gawk mathematical functions

This means that the int() function of the value 5.6 will return 5, while the int() function of the value −5.6 will return −5.
The rand() function is great for creating random numbers, but you’ll need to use a trick to get meaningful values. The rand() function returns a random number, but only between the values 0 and 1 (not including 0 or 1). To get a larger number, you’ll need to scale the returned value.

A common method for producing larger integer random numbers is to create an algorithm that uses the rand() function, along with the int() function: x = int(10 * rand())This returns a random integer value between (and including) 0 and 9. Just substitute the 10 in the equation with the upper limit value for your application, and you’re ready to go.

Be careful when using some of the mathematical functions, as the gawk programming language does have a limited range of numeric values it can work with. If you go over that range, you’ll get an error message:

$ gawk ’BEGIN{x=exp(100); print x}’
2.68812e+43
$ gawk ’BEGIN{x=exp(1000); print x}’
gawk: cmd. line:1: warning: exp argument 1000 is out of range
inf
$

The first example calculates the exponential of 100, which is a very large number but within the range of the system. The second example attempts to calculate the exponential of 1000, which goes over the numerical range limit of the system and produces an error message.

Besides the standard mathematical functions, gawk also provides a few functions for bitwise manipulating of data:

  • and(v1, v2): Performs a bitwise AND of values v1 and v2.
  • compl(val): Performs the bitwise complement of val.
  • lshift(val, count): Shifts the value val count number of bits left.
  • or(v1, v2): Performs a bitwise OR of values v1 and v2.
  • rshift(val, count): Shifts the value val count number of bits right.
  • xor(v1, v2): Performs a bitwise XOR of values v1 and v2.
  • The bit manipulation functions are useful when working with binary values in your data.

String functions

The gawk programming language also provides several functions you can use to manipulate string values, shown in Table below

Table The gawk String Functions.Some of the string functions are fairly self-explanatory:

$ gawk ’BEGIN{x = "testing"; print toupper(x); print length(x) }’
TESTING
7
$

However, some of the string functions can get pretty complicated. The asort and asorti functions are new gawk functions that allow you to sort an array variable based on either the data element values (asort) or the index values (asorti). Here’s an example of using asort:

$ gawk ’BEGIN{
› var["a"] = 1
› var["g"] = 2
› var["m"] = 3
› var["u"] = 4
› asort(var, test)
› for (i in test)
› print "Index:",i," - value:",test[i]
› }’
Index: 4 - value: 4
Index: 1 - value: 1
Index: 2 - value: 2
Index: 3 - value: 3
$The gawk string functions

The new array, test, contains the newly sorted data elements of the original array, but the index values are now changed to numerical values, indicating the proper sort order.

The split() function is a great way to push data fields into an array for further processing:

$ gawk ’BEGIN{ FS=","}{
› split($0, var)
› print var[1], var[5]
› }’ data1
data11 data15
data21 data25
data31 data35
$

The new array uses sequential numbers for the array index, starting with index value 1 containing the first data field.

Time functions

The gawk programming language contains a few functions to help you deal with time values, shown in Table below

Table The gawk Time Functions.The time functions are often used when working with log files that contain dates that you need to compare. By converting the text representation of a date to the epoch time (the number of seconds since January 1, 1970) you can easily compare dates.

Here’s an example of using the time functions in a gawk program:

$ gawk ’BEGIN{
› date = systime()
› day = strftime("%A, %B %d, %Y", date)
› print day
› }’
Friday, December 28, 2007The gawk time functions

This example uses the systime() function to retrieve the current epoch timestamp from the system, then uses the strftime() function to convert it into a human-readable format using the date shell command’s date format characters.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Shell Scripting Topics