UNIX Tutorial Eight

8.1 Advanced Text Processing

Calculating Paleolatitudes

awk includes a variety of built-in functions that can be used for more advanced calculations. To illustrate how these functions are used, we will use awk to convert the inclinations in our paleomagnetic database into paleolatitudes. You will recall that the equation for the relationship between paleoinclination (I) and paleolatitude (L) is

tan I = 2 tan (L)

Since we already have a database file with inclinations in the second column, we can rearrange this equation to solve for the quantity we want, L:

L = arctan ( 1/2 tan I )

To implement this equation using awk, we can use several trigonometric functions that are built into awk. Since this is our fourth group activity, please create a new act4 directory inside your groupwork directory and move into that act4 directory.

8.2 Trigonometric Functions

While awk has many functions, it unfortunately does not include a tangent function. However, it does have sine (sin), cosine (cos), and arctangent (atan2) built in. We can use those functions to calculate our values (recall that tan = sin/cos), but there are some important nuances you need to be aware of. First, all awk trigonometric functions work in radians rather than degrees: sin(angle) and cos(angle) expect angle to be in radians, and atan2 returns a value in radians. Second, atan2(y,x) takes two parameters and returns the arctangent of y/x. To avoid confusion, the easiest way to use atan2 is to give the value you want the arctangent of as the first parameter and 1 as the second parameter: atan2(value,1).
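As a quick check of both points, here is a minimal sketch that computes the tangent of 45 degrees as sin/cos and then converts the arctangent of 1 back to degrees. The file name trigcheck.txt is just a placeholder, and obtaining pi as atan2(0,-1) is an addition of this sketch rather than something the tutorial's own commands use. The lines beginning with # are explanatory comments and do not need to be typed.

```shell
# Convert 45 degrees to radians, take sin/cos for the tangent,
# then convert atan2(1,1) from radians back to degrees.
# pi = atan2(0,-1) is an assumption of this sketch, not part of the tutorial.
awk 'BEGIN{pi=atan2(0,-1); dtr=pi/180; a=45*dtr; printf "%.4f %.4f\n", sin(a)/cos(a), atan2(1,1)/dtr}' > trigcheck.txt
cat trigcheck.txt
```

This should print 1.0000 45.0000: the tangent of 45 degrees is 1, and the arctangent of 1 is 45 degrees once the result is divided by dtr.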

With those explanations out of the way, we can begin to construct our awk program. To show how the command line is put together, I will go through each part step by step. All of these parts will be combined into one command at the end, so you do not need to type anything until then, but I want to make sure you understand each part first. We begin the awk command by specifying the pattern matching part. As in the last tutorial, we want to skip the header line of our data file, so we begin with

awk 'NR>1

We are now ready for the action part of the awk program that will run when the pattern matching condition is met (after the first line). Using a similar approach to our last awk tutorial, we can establish a variable called dtr to store the information for converting degrees to radians.

{dtr=3.14/180;

Next we can add a command to store the inclination angle in radians

inc=$2*dtr;

Now we can calculate the tangent of the inclination angle

taninc=sin(inc)/cos(inc);

Then we can calculate the paleolatitude in radians using our equation

paleolat=atan2(taninc/2,1);

Finally, we can print the paleolatitude in degrees

print paleolat/dtr}'

All that's left to do is specify the input file name and send the output to a file

../act3/paleomag.txt >! paleolat.txt

The ! character acts as the clobber operator for our shell, because it allows you to overwrite a file. If you use the > character on its own to send output to a file that already exists, you will get an error (this protection comes from the shell's noclobber setting). If you use the two together as >!, you can send the output to that filename regardless of whether the file exists or not. You need to be careful when using the clobber, because you might overwrite a file by accident and lose the data inside.

If you now combine each of these steps on the command line, it should look like this

% awk 'NR>1{dtr=3.14/180; inc=$2*dtr; taninc=sin(inc)/cos(inc); paleolat=atan2(taninc/2,1); print paleolat/dtr}' ../act3/paleomag.txt >! paleolat.txt

Go ahead and run that command, then use cat to see what the output looks like. The first 3 lines of the 5-line file should look like this

9.11979
7.52539
-13.0659
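If you would like to convince yourself that the calculation behaves sensibly, one option is to run the same command on a small test file whose answers you can predict. The sketch below is only an illustration: the file names sample_paleomag.txt and sample_paleolat.txt, the column labels, and the three inclination values are all invented for this check and are not part of our database. The lines beginning with # are comments and do not need to be typed.

```shell
# Build a small test file: a header line plus three rows, inclination in column 2.
# The file names, column labels, and values are invented for this check.
printf 'site inc\nA 0\nB 45\nC -45\n' > sample_paleomag.txt
# Same calculation as the tutorial command, run on the test file.
awk 'NR>1{dtr=3.14/180; inc=$2*dtr; taninc=sin(inc)/cos(inc); paleolat=atan2(taninc/2,1); print paleolat/dtr}' sample_paleomag.txt > sample_paleolat.txt
cat sample_paleolat.txt
```

The first output line should be 0, since an inclination of 0 corresponds to a paleolatitude of 0, and the other two lines should come out close to 26.6 and -26.6, because the arctangent of 1/2 is about 26.6 degrees.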

Exercise 8.1

To prepare for our next tutorial, you need to generate a file that has age in the first column and paleolatitude in the second column. You should be able to adjust the command you just used to generate a new file (age-paleolat.txt). The calculations all remain the same; you just need to adjust the print part of the command so that it prints the age and then the paleolat/dtr value. Do you remember which column the age value is stored in?

Summary

sin(angle)        awk function that returns the sine of angle (angle in radians)
cos(angle)        awk function that returns the cosine of angle (angle in radians)
atan2(y,x)        awk function that returns the arctangent of y/x (result in radians)
command >! file   send output of command to file, overwriting file if it exists

 

brudzimr@muohio.edu, 28th August 2006