Functions in R

Functions play a very important role in R. On this page we summarise some of the important points about them.

The basic structure

A function is like a machine: it takes some inputs and produces an output. Here is a simple function in R:
myfun = function(x,y) {
  z = x+y
  cat('The difference = ',x-y,'\n')
  sin(z)
}
The name of the function is myfun (any name of your choice, as long as it does not clash with any keyword of R). The word function is a keyword. The inputs to the function are listed (comma-separated) inside the ( ... ). The parentheses are compulsory even if there are no inputs. The input names are dummy variables (they have nothing to do with other variables of the same name). More on this later. The inputs are called the arguments or the parameters of the function.

The body of the function comes next, inside a pair of braces (optional if the body consists of a single statement).
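As a minimal sketch of the points above, the snippet below calls the example function and stores what it returns. The names out, sq and hello are hypothetical, introduced only for illustration. It relies on the fact that an R function returns the value of the last statement it evaluates.

```r
# The example function from above
myfun = function(x,y) {
  z = x+y
  cat('The difference = ',x-y,'\n')
  sin(z)   # the value of the last statement is returned
}

out = myfun(5,3)   # prints the difference (2) and stores sin(8) in out

# A body with a single statement needs no braces
sq = function(t) t^2

# A function with no inputs still needs the parentheses
hello = function() cat('hello\n')
hello()
```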

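The body of the function consists of one or more statements, each of which serves one or more of these three purposes: computing intermediate values (like z = x+y above), producing a side effect such as printing (like the cat call above), and producing the value returned by the function (the value of the last statement, here sin(z)).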

Visibility of a variable

The variables outside a function, and those inside a function are linked only via the arguments and returned value. The following example illustrates this:
f = function(x) {
  y = x+1
  x = x-1
  2*x+y
}

x = 25
y = 34
z = 4
val = f(z)
x #prints 25
y #prints 34
z #prints 4
val #prints 11
If you desperately need a function that modifies the value of a variable outside it, then use the <<- assignment instead of =, as in the following example.
f = function(x) {
  y = x+1
  x <<- x-1  #The RHS x is the argument x, the LHS x is the global x. 
  2*x+y
}

x = 25
y = 34
z = 4
val = f(z)
x #prints 3 (modified)
y #prints 34 (unmodified)
z #prints 4 (unmodified)
val #prints 13 (inside f, the argument x is still 4)
But use the <<- assignment very sparingly, as its careless use generally leads to bugs that are hard to detect.
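One common way to avoid <<- altogether is to return the new value and let the caller do the assignment. The sketch below assumes a hypothetical helper named decrement. It is one possible alternative, not the only one.

```r
# decrement is a hypothetical helper: it only computes and returns a value
decrement = function(x) {
  x - 1
}

x = 25
z = 4
x = decrement(z)   # the caller, not the function, changes x
x #prints 3
z #prints 4 (unmodified)
```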

Returning multiple values

Functions in R can return only one object. In order to return multiple objects, you first need to pack them into a single object. Such a composite object is called a list.
f = function(n) {
  list(1:n, sum(1:n), sum((1:n)^2))
}
This function takes a positive integer n as input, and returns three things: the vector of integers from 1 to n, their sum, and the sum of their squares. The function is used like this:
val = f(10)
val[[1]]
val[[2]]
val[[3]]
Notice the use of [[ ... ]] to unpack a list. However, this is cumbersome. An easier way is to use names for the different fields in a list:
f = function(n) {
  list(all=1:n, sum=sum(1:n), sumOfSq=sum((1:n)^2))
}
Then you can use:
val = f(10)
val$all
val$sum
val$sumOfSq
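Named fields do not replace positional access: the same list can still be unpacked by position, and names() lists the available fields. A small sketch, reusing the function above:

```r
f = function(n) {
  list(all=1:n, sum=sum(1:n), sumOfSq=sum((1:n)^2))
}

val = f(10)
names(val)         # "all" "sum" "sumOfSq"
val[[2]]           # same as val$sum, which is 55
val[['sumOfSq']]   # same as val$sumOfSq, which is 385
```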

Passing arguments by position or name

Consider this example:
f = function(x,y,z) {
  cat('x =',x,'y =',y,'z =',z,'\n')
}
Here are two different ways to call this function:
f(3,4,5) #Output: x = 3 y = 4 z = 5
f(y=3,x=4,z=5) #Output: x = 4 y = 3 z = 5
The second form uses the names of the arguments, and is useful when there are many parameters and it is difficult to remember their order. Its usefulness increases even further when used along with default values:
f = function(x,y,z=3) {
  cat('x =',x,'y =',y,'z =',z,'\n')
}
Consider the following calls to this function:
f(3,4,5) #Output: x = 3 y = 4 z = 5
f(y=3,x=4) #Output: x = 4 y = 3 z = 3 (default value)
Most standard functions in R are of this type: they have many arguments (maybe even more than 50), most of which have default values. We care to remember the positions of only the first few (the most important ones) and the names of the next few (of secondary importance), and just use the defaults for the rest. An example is the plot function:
u = 1:10
v = 2*u+4
plot(u,v) #Passing by position (the first along horizontal axis,
          #the second along vertical)
plot(u,v,type='l', col='red') #Using names of arguments. 
                            #Many other arguments, like line
                            #types, line width etc are at their default values.
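Positional and named arguments can also be mixed in one call. R first matches the named arguments, and then fills the remaining parameters from the unnamed ones in order. The sketch below uses a hypothetical function g that returns its arguments as a named vector, so that the matching is easy to inspect.

```r
# g is a hypothetical function, used only to show how arguments are matched
g = function(x,y,z=3) {
  c(x=x, y=y, z=z)
}

g(3, z=10, y=4)  # x = 3, y = 4, z = 10
g(y=1, 2)        # x = 2, y = 1, z = 3 (default value)
```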

R as a functional programming language

Notice the similarity of the following two R snippets:
x = c(3,4,6,1)
and
x = function(t) t^2
The first makes x a vector of numbers, while the second makes it a function. The same assignment operation = is used in both cases. This points at a deep truth about R: it treats functions and data objects on an equal footing. You can do with functions pretty much whatever you can do with other objects: you can pass a function as an input to another function, return a function from a function, keep functions in a list, or create functions on the fly.
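A short sketch of two of these claims: functions stored in a list, and a function created on the fly without giving it a name. The name fs is hypothetical.

```r
# A list whose elements are functions; the third one is created on the fly
fs = list(sin, cos, function(t) t^2)

fs[[3]](4)                      # 16
sapply(fs, function(g) g(0))    # 0 1 0
```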

Passing functions as an input to another function

fsq = function(g,x) {
  g(g(x))
}
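Here is how such a function might be used. The first argument can be any function of one variable, given either by name or as an anonymous function created on the spot.

```r
fsq = function(g,x) {
  g(g(x))
}

fsq(sqrt, 16)               # sqrt(sqrt(16)), that is 2
fsq(function(t) t+1, 0)     # (0+1)+1, that is 2
```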

Returning a function from a function

compose = function(f,g) {
  function(x) {
    f(g(x))
  }
}
Here is how you use it:
sinOfCos = compose(sin,cos)
sinOfCos(1) #returns sin(cos(1))
This is of course of limited value. Here is a really useful one:
iter = function(f,n) {
  function(x) {
    for(i in 1:n) x = f(x)
    x
  }
}
You can numerically solve the equation $x = \cos x$ by applying the resulting function to a starting value, say 1:
iter(cos,100)(1) #approximately 0.739
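A quick way to check the answer is to substitute it back into the equation. The sketch below assumes the starting value 1, which is an arbitrary choice, and uses the hypothetical names solver and root.

```r
iter = function(f,n) {
  function(x) {
    for(i in 1:n) x = f(x)
    x
  }
}

solver = iter(cos,100)   # a function, not yet a number
root = solver(1)         # apply cos 100 times, starting from 1
root                     # approximately 0.739085
abs(root - cos(root))    # essentially zero, so root solves x = cos(x)
```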

The apply function

R allows multidimensional arrays (their modern popular name being tensor). A matrix is the two-dimensional case. The following line produces a $3\times4$ matrix consisting of the integers 1 to 12:
x = matrix(1:12, 3,4)
Now suppose we want to find row sums. Then you apply the sum function to the first dimension (i.e., rows):
apply(x,1,sum)
If you want to find column sums, then:
apply(x,2,sum)
You may replace sum by prod or any function that takes a vector in and produces a single number.
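The second argument of apply is also what makes it work for arrays with more than two dimensions, and the third argument may be a function written on the spot. A sketch, where the $3\times4\times10$ array a and the max-minus-min function are illustrative assumptions:

```r
x = matrix(1:12, 3, 4)      # filled column by column
apply(x, 1, sum)            # row sums: 22 26 30
apply(x, 2, sum)            # column sums: 6 15 24 33

# Any function from a vector to a single number will do
apply(x, 1, function(r) max(r) - min(r))   # 9 9 9

# A three-dimensional array: 10 slices, each a 3 by 4 matrix
a = array(1:120, c(3, 4, 10))
s = apply(a, 3, sum)        # one sum per slice, a vector of length 10
m = apply(a, c(1, 2), sum)  # sums over the third dimension, a 3 by 4 matrix
dim(m)                      # 3 4
```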