Functions play a very important role in R. In this page we shall
summarise some of the important points regarding them.
A function is like a machine. It has
- zero or more input(s),
- one or more output(s),
- possibly some side-effect(s).
Here is a simple function in R:
myfun = function(x,y) {
z = x+y
cat("The difference = ",x-y,"\n")
sin(z)
}
The name of the function is myfun (any name of your
choice, as long as it does not clash with any keyword of R). The
word function is a keyword. The inputs to the
function are listed (comma-separated) inside the (
... ). The parentheses are compulsory even if there
are no inputs. The input names are dummy variables (they have nothing
to do with other variables with the same name). More on this
later. The inputs are called the arguments or
the parameters of the function.
The body of the function comes next, inside a pair of braces
(optional if the body consists of a single statement).
The body of the function consists of one or more statements, each
of which serves one or more of these three purposes:
- internal computation, e.g., the line
z = x+y
computes z, which is not visible from outside the function.
- side-effect, e.g., the line
cat("The difference = ",x-y,"\n") prints a line on the screen.
- output, i.e., the last line outputs
sin(z). The value of the
last line is always the output of a function in R. If you want to produce
output at any other line, use return(...).
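As a small illustration of return(...), here is a sketch of a function that produces its output before reaching the last line. The name safe_sqrt is a hypothetical one chosen for this example only.

```r
# Hypothetical example: safe_sqrt is not one of the functions discussed above.
safe_sqrt = function(x) {
  if (x < 0) {
    return(NA)   # output produced before the last line
  }
  sqrt(x)        # otherwise the value of the last line is the output
}

safe_sqrt(9)    # 3
safe_sqrt(-1)   # NA
```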
Assignments made inside a function do not touch the variables outside it,
even when the names coincide. Values normally flow in through the arguments
and out through the returned value. The following example illustrates
this:
f = function(x) {
y = x+1
x = x-1
2*x+y
}
x = 25
y = 34
z = 4
val = f(z)
x #prints 25
y #prints 34
z #prints 4
val #prints 11
If you are in desperate need of a function that modifies the value of a
variable outside it, then use the <<- assignment instead
of = as in the following example.
f = function(x) {
y = x+1
x <<- x-1 #The RHS x is the argument x, the LHS x is the global x.
2*x+y
}
x = 25
y = 34
z = 4
val = f(z)
x #prints 3 (modified)
y #prints 34 (unmodified)
z #prints 4 (unmodified)
val #prints 13 (inside f, x is still the argument, 4)
But use the <<- assignment very sparingly, as its
careless use generally leads to bugs that are hard to detect.
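One common way to avoid <<- altogether is to return the new value and let the caller do the assignment. The sketch below assumes a hypothetical helper named decrement, which is not part of the example above.

```r
# Hypothetical helper: returns the new value instead of modifying a global.
decrement = function(x) {
  x - 1
}

x = 25
x = decrement(x)   # the caller decides which variable gets updated
x                  # 24
```

Written this way, every change to x is visible at the place where the function is called.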
Functions in R can return only one object. In order to return
multiple objects, you first need to pack them into a single
object. Such a composite object is called a list.
f = function(n) {
list(1:n, sum(1:n), sum((1:n)^2))
}
This function takes a positive integer as input, and returns
three things:
- the array 1,2,...,n
- $\sum_{i=1}^n i$
- $\sum_{i=1}^n i^2$
The function is used like this:
val = f(10)
val[[1]]
val[[2]]
val[[3]]
Notice the use of [[...]] to unpack
a list. However, this is cumbersome. An easier way
is to use names for the different fields in a list:
f = function(n) {
list(all=1:n, sum=sum(1:n), sumOfSq=sum((1:n)^2))
}
Then you can use:
val = f(10)
val$all
val$sum
val$sumOfSq
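The double-bracket operator also accepts names, and positions keep working for a named list. The following sketch repeats the function above so that it runs on its own, and uses the standard names function to list the fields.

```r
f = function(n) {
  list(all=1:n, sum=sum(1:n), sumOfSq=sum((1:n)^2))
}

val = f(10)
names(val)      # "all" "sum" "sumOfSq"
val[["sum"]]    # 55, same as val$sum
val[[3]]        # 385, positions still work for a named list
```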
Consider this example:
f = function(x,y,z) {
cat('x = ',x,'y = ',y,'z = ',z)
}
Here are two different ways to call this function:
f(3,4,5) #Output: x = 3 y = 4 z = 5
f(y=3,x=4,z=5) #Output: x = 4 y = 3 z = 5
The second form uses names of the arguments, and is useful when
there are many parameters and it is difficult to remember their
order. Its usefulness increases even further when used along with
default values:
f = function(x,y,z=3) {
cat('x = ',x,'y = ',y,'z = ',z)
}
Consider the following calls to this function:
f(3,4,5) #Output: x = 3 y = 4 z = 5
f(y=3,x=4) #Output: x = 4 y = 3 z = 3 (default value)
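Positions and names may also be mixed in a single call. The sketch below uses a hypothetical function g that returns a number (rather than printing), so that the effect of each call is easy to read off: the hundreds digit is x, the tens digit is y and the units digit is z.

```r
# Hypothetical function used only for illustration.
g = function(x, y, z=3) {
  100*x + 10*y + z
}

g(1, 2)          # 123: x and y by position, z at its default
g(1, z=5, y=2)   # 125: x by position, the rest by name
g(y=2, 1)        # 123: the named y is matched first, then 1 goes to x
```

R matches the named arguments first and then fills the remaining parameters from the unnamed values, in order.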
Most standard functions in R are of this type: they have
many arguments (maybe even more than 50), of which most
have default values. We care to remember the positions of only
the first few (the most important ones), and the names of the next few
(of secondary importance), and just use the defaults for the
rest. An example is the plot function:
u = 1:10
v = 2*u+4
plot(u,v) #Passing by position (the first along horizontal axis,
#the second along vertical)
plot(u,v,type='l', col='red') #Using names of arguments.
#Many other arguments, like line
#types, line width etc are at their default values.
Notice the similarity of the following two R snippets:
x = c(3,4,6,1)
and
x = function(t) t^2
The first makes x a vector of numbers, while the second makes
it a function. The same assignment operation = is
used in both the cases. This points at a deep truth about R: it
treats functions as objects, on an equal footing with data. You can do with
functions pretty much whatever you can do with other objects, e.g., you
can pass a function as an input to another function, or return a
function from a function, you can have lists of functions, you
can create functions on the fly.
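A minimal sketch of two of these ideas follows: functions stored together in a list, and a function created on the fly and passed straight to another function. The names funs, square, root and negate are arbitrary labels chosen for this example, and sapply is the standard R function that applies a function to each element.

```r
# A list of functions (the names are arbitrary labels).
funs = list(square = function(t) t^2, root = sqrt, negate = function(t) -t)

funs$square(3)        # 9
funs[["root"]](16)    # 4

# sapply takes a function as an input; here one is created on the fly.
sapply(1:4, function(t) 2*t + 1)    # 3 5 7 9

# Apply every function in the list to the same number.
sapply(funs, function(h) h(4))      # square = 16, root = 2, negate = -4
```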
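#Takes a function g as input and applies it twice to x.
fsq = function(g,x) {
g(g(x))
}
#Takes two functions as inputs and returns a new function.
compose = function(f,g) {
function(x) {
f(g(x))
}
}
Here is how you use compose: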
sinOfCos = compose(sin,cos)
sinOfCos(1) #returns sin(cos(1))
This is of course of limited value. Here is a really useful one:
iter = function(f,n) {
function(x) {
for(i in 1:n) x = f(x)
x
}
}
You can numerically solve the equation $x = \cos x$ using
this function (starting from, say, $x = 1$) as:
iter(cos,100)(1)
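Note that iter(cos,100) is itself a function, so it has to be called with a starting value to produce a number. The sketch below repeats the definition of iter so that it runs on its own, and checks how close the result is to satisfying the equation. The names solve_cos and root, and the starting value 1, are arbitrary choices made for this example.

```r
iter = function(f, n) {
  function(x) {
    for (i in 1:n) x = f(x)
    x
  }
}

solve_cos = iter(cos, 100)   # a function that applies cos 100 times
root = solve_cos(1)          # start the iteration from x = 1
root                         # approximately 0.7390851
abs(root - cos(root))        # very close to 0, so root nearly satisfies x = cos(x)
```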
R allows multidimensional arrays (their modern popular name
being tensors). A matrix is a two-dimensional array.
The following line produces a $3\times4$ matrix
consisting of the integers 1 to 12:
x = matrix(1:12, 3,4)
Now suppose we want to find row sums. Then you apply
the sum function to the first dimension (i.e.,
rows):
apply(x,1,sum)
If you want to find column sums, then:
apply(x,2,sum)
You may replace sum by prod or any
function that takes a vector in and produces a single number.
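The function passed to apply may also be written on the fly, and the same call pattern extends to arrays with more than two dimensions. The sketch below is only an illustration: the three-dimensional array a and its size are assumptions made for this example.

```r
x = matrix(1:12, 3, 4)

# The spread (max - min) of each row, using a function created on the fly.
apply(x, 1, function(r) max(r) - min(r))   # 9 9 9

# Assumed example of a three-dimensional array.
a = array(1:24, dim=c(2,3,4))
apply(a, 3, sum)              # one sum per 2 x 3 slice: 21 57 93 129
dim(apply(a, c(1,2), sum))    # 2 3: summing over the third dimension only
```

With a three-dimensional array and a single dimension in the second argument, the function receives a matrix rather than a vector, which sum handles without any change.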