Functions in R
The basic structure
Visibility of a variable
Returning multiple values
Passing arguments by position or name
R as a functional programming language
Passing functions as an input to another function
Returning a function from a function
The apply function

Functions in R

Functions play a very important role in R. In this page we shall summarise some of the important points regarding them.

The basic structure

A function is like a machine. It has

zero or more input(s),
one or more output(s),
possibly some side-effect(s).

Here is a simple function in R:

myfun = function(x,y) {
  z = x+y
  cat("The difference = ",x-y,"\n")
  sin(z)
}

The name of the function is myfun (any name of your choice, as long as it does not clash with any keyword of R). The word function is a keyword. The inputs to the function are listed (comma-separated) inside the ( ... ). The parentheses are compulsory even if there are no inputs. The input names are dummy variables (they have nothing to do with other variables with the same name). More on this later. The inputs are called the arguments or the parameters of the function.

The body of the function comes next, inside a pair of braces (optional if the body consists of a single statement).

The body of the function consists of one or more statements, each of which serves one or more of these three purposes:

internal computation, e.g., the line z = x+y computes z, which is not visible from outside the function.
side-effect, e.g., the line cat("The difference = ",x-y,"\n") prints a line on the screen.
output, i.e., the last line outputs sin(z). The value of the last statement executed is always the output in R. If you want to produce output at any other line, use return(...).
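As a small sketch of return(...) at a line other than the last, consider the function below. The name safeSqrt is made up for this illustration only.

```r
# safeSqrt is a hypothetical name, used only for this illustration.
safeSqrt = function(x) {
  if (x < 0) {
    return(NA)   # early exit: nothing below this line is run
  }
  sqrt(x)        # last statement: its value is the output
}

safeSqrt(16)  # 4
safeSqrt(-1)  # NA
```

For a negative input the function stops at return(NA); otherwise the value of the last statement is returned as usual.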

Visibility of a variable

The variables outside a function, and those inside a function are linked only via the arguments and returned value. The following example illustrates this:

f = function(x) {
  y = x+1
  x = x-1
  2*x+y
}

x = 25
y = 34
z = 4
val = f(z)
x #prints 25
y #prints 34
z #prints 4
val #prints 11 (inside f: y = 5, x = 3, so 2*3+5)

If you desperately need a function that modifies the value of a variable outside it, then use the <<- assignment instead of =, as in the following example.

f = function(x) {
  y = x+1
  x <<- x-1  #The RHS x is the argument x, the LHS x is the global x. 
  2*x+y
}

x = 25
y = 34
z = 4
val = f(z)
x #prints 3 (modified: the argument x is 4, so the global x becomes 4-1)
y #prints 34 (unmodified)
z #prints 4 (unmodified)
val #prints 13 (inside f the argument x is still 4, so 2*4+5)

But use the <<- assignment very sparingly, as its careless use generally leads to bugs that are hard to detect.

Returning multiple values

Functions in R can return only one object. In order to return multiple objects, you first need to pack them into a single object. Such a composite object is called a list.

f = function(n) {
  list(1:n, sum(1:n), sum((1:n)^2))
}

This function takes a positive integer as input, and returns three things:

the array 1,2,...,n
$\sum_{i=1}^n i$
$\sum_{i=1}^n i^2$

The function is used like this:

val = f(10)
val[[1]]
val[[2]]
val[[3]]

Notice the use of [[...]] to unpack a list. However, this is cumbersome. An easier way is to use names for the different fields in a list:

f = function(n) {
  list(all=1:n, sum=sum(1:n), sumOfSq=sum((1:n)^2))
}

Then you can use:

val = f(10)
val$all
val$sum
val$sumOfSq
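The short sketch below (with n = 4, so that the numbers are easy to check by hand) shows that a named list can still be unpacked by position, and that the names themselves can be listed with names(...).

```r
f = function(n) {
  list(all=1:n, sum=sum(1:n), sumOfSq=sum((1:n)^2))
}

val = f(4)
names(val)        # "all" "sum" "sumOfSq"
val[[2]]          # 10, same as val$sum
val[["sumOfSq"]]  # 30, same as val$sumOfSq
```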

Passing arguments by position or name

Consider this example:

f = function(x,y,z) {
  cat('x =',x,'y =',y,'z =',z,'\n')
}

Here are two different ways to call this function:

f(3,4,5) #Output: x = 3 y = 4 z = 5
f(y=3,x=4,z=5) #Output: x = 4 y = 3 z = 5

The second form uses names of the arguments, and is useful when there are many parameters and it is difficult to remember their order. Its usefulness increases even further when used along with default values:

f = function(x,y,z=3) {
  cat('x =',x,'y =',y,'z =',z,'\n')
}

Consider the following calls to this function:

f(3,4,5) #Output: x = 3 y = 4 z = 5
f(y=3,x=4) #Output: x = 4 y = 3 z = 3 (default value)

Most standard functions in R are of this type: they have many arguments (maybe even more than 50), most of which have default values. We care to remember the positions of only the first few (the most important ones) and the names of the next few (of secondary importance), and just use the defaults for the rest. An example is the plot function:

u = 1:10
v = 2*u+4
plot(u,v) #Passing by position (the first along horizontal axis,
          #the second along vertical)
plot(u,v,type='l', col='red') #Using names of arguments.
                              #Many other arguments, like line
                              #types, line width etc are at their default values.
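Positions and names may also be mixed in a single call. The sketch below uses a made-up function describe (not a standard R function) that returns its text instead of printing it, so that the result is easy to inspect. R first matches the named arguments, and then fills the remaining parameters with the unnamed arguments in order.

```r
# describe is a hypothetical function, used only for this illustration.
describe = function(x,y,z=3) {
  paste('x =', x, 'y =', y, 'z =', z)
}

describe(3, z=10, y=4)  # "x = 3 y = 4 z = 10"
describe(y=4, 3)        # "x = 3 y = 4 z = 3" (z at its default value)
```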

R as a functional programming language

Notice the similarity of the following two R snippets:

x = c(3,4,6,1)

and

x = function(t) t^2

The first makes x a vector of numbers, while the second makes it a function. The same assignment operation = is used in both cases. This points at a deep truth about R: it treats functions and other objects on an equal footing. You can do with functions pretty much whatever you can do with other objects: you can pass a function as an input to another function, return a function from a function, have arrays of functions, or create functions on the fly.
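Here is a small sketch of the last two points. It assumes that a list is an acceptable stand-in for an "array of functions" (a list is how R holds a collection of functions); the name funs is chosen just for this illustration.

```r
# A list holding three functions; the third is created on the fly.
funs = list(sin, cos, function(t) t^2)

funs[[3]](5)                    # 25

# Call each function in the list with the same input, 0.
sapply(funs, function(g) g(0))  # 0 1 0

# A function created on the fly and used at once, without any name.
(function(t) t + 1)(10)         # 11
```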

Passing functions as an input to another function

fsq = function(g,x) {
  g(g(x))
}
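A possible way to use fsq is sketched below. The first argument may be a standard function like sqrt, or a function created on the fly.

```r
fsq = function(g,x) {
  g(g(x))
}

fsq(sqrt, 16)            # sqrt(sqrt(16)), i.e., 2
fsq(function(t) t^2, 3)  # (3^2)^2, i.e., 81
```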

Returning a function from a function

compose = function(f,g) {
  function(x) {
    f(g(x))
  }
}

Here is how you use it:

sinOfCos = compose(sin,cos)
sinOfCos(1) #returns sin(cos(1))

This is of course of limited value. Here is a really useful one:

iter = function(f,n) {
  function(x) {
    for(i in 1:n) x = f(x)
    x
  }
}

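The function returned by iter(cos,n) applies cos n times to its input. For large n the result settles near the solution of the equation $x = \cos x$, so you can solve this equation numerically by applying the returned function to some starting value, say 1:

iter(cos,100)(1) #approximately 0.739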
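One way to check the answer is sketched below. Since iter returns a function, that function has to be applied to a starting value (1 is an arbitrary choice here) to get a number r, and then cos(r) should be very close to r. The tolerance 1e-8 is also an arbitrary choice.

```r
iter = function(f,n) {
  function(x) {
    for(i in 1:n) x = f(x)
    x
  }
}

r = iter(cos,100)(1)    # apply cos 100 times, starting from 1
r                       # approximately 0.739
abs(cos(r) - r) < 1e-8  # TRUE: r (nearly) satisfies x = cos(x)
```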

The apply function

R allows multidimensional arrays (their modern popular name being tensors). The simplest example is a matrix, which is a 2-dimensional array. The following line produces a $3\times4$ matrix consisting of the integers 1 to 12:

x = matrix(1:12, 3,4)

Now suppose we want to find the row sums. Then we apply the sum function along the first dimension (i.e., rows):

apply(x,1,sum)

If you want to find column sums, then:

apply(x,2,sum)

You may replace sum by prod or any function that takes a vector in and produces a single number.
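The sketch below shows two variations. The first passes a function created on the fly (the range of each row) in place of sum. The second uses a 3-dimensional array, with arbitrarily chosen dimensions $2\times3\times4$, and sums over each slice along the third dimension.

```r
x = matrix(1:12, 3,4)

# Range (max minus min) of each row, using a function created on the fly.
apply(x, 1, function(v) max(v) - min(v))  # 9 9 9

# A 2 x 3 x 4 array consisting of the integers 1 to 24.
a = array(1:24, c(2,3,4))

# Sum of each of the four 2 x 3 slices along the third dimension.
apply(a, 3, sum)                          # 21 57 93 129
```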
