locator() function allows you to interactively
select a point from a plot. To see its effect start a new plot:
plot(1:10, 1:10)This just creates scatterplot with the points ($(k,k)$ for $k=1,...,10.$ Now issue the command
locator(1)The prompt will go away, indicating that the program is running waiting for input from you. Visit the plot window, take your mouse over the plot region. You will notice the mouse cursor changing to a crosshair. Click at some point. Immediately the R prompt will return and coordinates (wrt the axis used in the plot) of the click point will be shown on screen. The
1 passed
to locator tells it to return after a single
click. If you want to click thrice, then
pass 3. Then after three clicks a list will be
returned with two fields in it: x
and y, each a vector of length 3. Here is how you
access these:
p = locator(3) p$x #All the x values p$y #All the y values p$x[1] #x component of the first click p$y[1] #y component of the first clickThe function
points adds points to
an existing plot. Points outside the current plot region
are silently ignored.
plot(1:10,1:10) points(5,6.col='red') points(c(2,4,2), c(3,9,1), col='blue') #Plots (2,3), (4,9) and (2,1)
EXERCISE:
Use the plot, locator
and points functions to draw a point at a clicked point.
EXERCISE: Repeat the above exercise, but for two clicks.
If you uselocator() with specifying the number of
clicks, then it allows any number of clicks (you have to
terminate by right-clicking). The length function
will return the length of a vector. For example,
x = c(3,4,-1,0,0) length(x) # returns 5.
EXERCISE:
Use locator() without specifying the number of
clicks. Click 5 times (and terminate by right-clicking). Then use
length to find the number of clicks.
someVector is any vector, then
for ( x in someVector ) {
blah blah using x
}
will carry out blah blah using x for each entry x
in someVector. For instance,
myVector = 3,4,7,-9,0)) {
for (x in myVector) {
print(sin(x))
}
will print the sine of all the values in the vector. Of course,
you could have achieved the same effect using just:
sin(myVector)This latter form is also more efficient. It automatically loops over all the entries. It is called an implicit loop, as opposed to an explicit loop, using
for. Explicit loops should be avoided if
implicit loops suffice. Explicit loops are needed only when repeating
some multi-step complicated computation.
In our case, we want to run a simulation many times. So use a
loop like for( i in 1:1000) {...} that will repeat
the ... part 1000 times.
EXERCISE: Use explicit loop to print the squares of the first 10 natural numbers. Then use implicit loop to achieve the same.
Often we need to save the result of the computation after each pass of the loop. To take a trivial example, suppose we again want to compute the sine value for each entry in a vector, but instead of printing the values on screen, we want to save them in another vector,y:
myVector = c(3,4,-9, 0, 3 , 7)
y = numeric(6) # 6 is the length of myVector
for (i in 1:6) {
y[i] = sin(myVector[i])
}
The line y = numeric(6) creates a numeric vector of
length 6, containing only zeroes. You also have used y =
rep(0,6) to acieve the same effect. If you want to avoid
explicit mention of 6,
then you should write y = numeric(length(myVector))
or y = rep(0,length(myVector)).
This trivial example is only for illustration purpose. In this
example the following implicit loop is preferable
y = sin(myVector)
EXERCISE:
Pick variable name that is not currently in use,
say ttt. Now execute the following code:
for( i in 1:10) {
ttt[i] = sin(i)
}
The error message will tell you why we need to start with a line
like ttt = numeric(10) or ttt = rep(0,10).
sample function. Here is the
most general form that need:
sample(c(1,1.5,2,3),2,rep=T,prob=c(3,2,5,6))Let`s understand this carefully. The first argument is a vector giving the sample space. Here it is $\{1,1.5,2,3\}.$ This could also be {Male, Female}, in which case the first input to
sample would have
been c('Male','Female'). The second input is the
number of times you want to carry out the experiment. The third
input rep=T says that the sampling is done with
replacement. The last input gives the probabilities (upto a
normalising factor which R will compute automatically).
This is the most general form. In most applications, however, the
sample space will be like $\{1,...,n\}$ for some $n.$
Also the probabilities will all be the same. Then we simply
write:
sample(n,2,rep=T)instead of the more elaborate
sample(1:n,2,rep=T,prob=rep(1,n)) # Here rep(1,n) means (1,...,1) n timesOn some rare ocassions, we shall need to draw a random sample without replacement, like drawing two persons randomly from $n$ persons (the two persons must be distinct):
sample(n,2)
EXERCISE: Roll a biased die 100 times, where $P(i) = P(7-i) \propto i$ for $i=1,2,3.$ Do not compute the constant of proportionality explicitly. Find the mean.
Thesample function is particularly suitable when
the probabilities are all equal, or are specified in some
patternless way. If, however, the probabilities vary according to
some standard distribution, then you will be better off using
some function specific to that distribution. R has most of the
standard distributions built in. To carry out those random
experiments the function always starts with the
letter r followed by an abbreivation of the
distribution, like rbinom for
binomial, rpois for Poisson, and so on. You need to
look up the help of those functions to learn about how the
specify the parameter values. For instance, commonly we talk
about the Binomial($n,p$)distribution. R calls $n$
the size and $p$ the prob.
The following lines create a barchart based on the relative frequency
distribution of a $Poisson(5)$ random sample of
size $1000.$
barplot(table(rpois(1000,lambda=5))/1000)
EXERCISE: Make a similar barplot based on a random sample of size 100 from a Binomial($10,0.3$) distribution. What is the sample mean?
x, some of entries are positive. We
may count all those using the following technique:
sum(x > 0)Details of why it works were given in the R tutorial. Here
> is a Boolean operator. Other
Boolean operators
include: < , <=, > , >=, ==,
!=. You may combine Boolean operations
using & (and), | (or)
and ! (not). For instance, to count all entries
lying in (3,10] we use:
sum(x > 3 & x <= 10)There is another Boolean operator
%in% which is the
"belongs to" operator. For instance, 4 %in%
c(3,6,4,1) returns TRUE, while 6 %in%
c(4,5,2) returns FALSE.
EXERCISE:
Roll a die 1000 times. Count the number of times you get an odd
number. You may use %in% or you may use the modulo
operator & (e.g., 5&3 is 2).
EXERCISE:
I roll two dice 100 times each:
x = sample(6,100,rep=T)
y = sample(6,100,rep=T)
Count the number of times the sum of the two outcomes exceed 7.
result = numeric(N)
for(i in 1:N) {
#Perform th random experiment once
#Do the math with the outcome
result = #Whatever you want to save
}
hist(result)
mean(result)
For instance, we may toss a coin 100 times in
each pass, and save the number of heads obtained. If we repeat
this 10 times, then we shall end up with a
vector nhead of length 10,
where nhead[i] stores the number of heads in
the $i$-th batch of 100 tosses. Here is a code to do this:
nhead = numeric(1000)
for( i in 1:1000) {
tosses = sample(c("Head","Tail"), 100, rep=T)
nhead = sum(tosses=="Head")
}
mean(nhead)
table(nhead) # frequency distribution
barplot(table(nhead))
EXERCISE: A Roulette wheel in a casino consists of a rotating wheel on which a ball rolls. The wheel has some boxes attached to its periphery, marked 0,1,2,...,32. When the wheel stops rolling, the ball drops in one of the boxes. The outcome of this random experiment is the number written on the box. I have bought a ticket priced Rs 100 to play this. I shall get Rs 500 if the number is divisible by 5. Simulate to find my expected payoff per game.
EXERCISE: Use simulation to approximate the expected value of $(X-\bar X)^3$ where $X\sim Poisson(3).$
EXERCISE: Use simulation to approximate the expected value of $(X-\bar X)^3$ where $X\sim Poisson(\lambda)$ for $\lambda \in [1,5].$ Make a plot of the expected value against $\lambda.$
[Hint: Take a grid of values of $\lambda$ in $[1,5]$ and apply the last exercise for each.