##  TRUE
Exercise A.2 Write a function that takes a numeric vector x and a threshold value as arguments and returns the vector of all values in x greater than the threshold. Test the function on seq(0, 1, 0.1) with threshold 0.3. Have the example from Exercise A.1 in mind.
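As a starting point, the filtering step can be sketched with logical indexing. The function name above_threshold and the argument name threshold are illustrative choices, not names prescribed by the exercise.

```r
# Minimal sketch: keep the entries of x that are strictly greater than
# the threshold. Both names are hypothetical.
above_threshold <- function(x, threshold) {
  x[x > threshold]
}

res <- above_threshold(seq(0, 1, 0.1), 0.3)
print(res)
```

When inspecting the result, compare it with what you would expect on paper. The entry printed as 0.3 is computed by seq in double precision, so the comparison with the literal 0.3 need not behave as exact arithmetic would suggest.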
Exercise A.3 Investigate how your function from Exercise A.2 treats missing values (NA), infinite values (Inf and -Inf) and the special value “Not a Number” (NaN). Rewrite your function (if necessary) to exclude all or some of such values from the returned vector.
Hint: Functions such as is.finite are useful.
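The behavior in question can be explored on a small test vector before touching the function itself. The vector below is made up for illustration, and the threshold 0.3 is reused from Exercise A.2.

```r
# A made-up vector containing all the special values from the exercise.
x <- c(0.1, 0.5, NA, NaN, Inf, -Inf)

# Comparisons with NA or NaN yield NA, and NA in a logical index
# produces an NA entry in the result.
cmp <- x > 0.3
raw <- x[cmp]

# is.finite() is FALSE for NA, NaN, Inf and -Inf, so combining it with
# the comparison drops all of them.
keep <- is.finite(x) & x > 0.3
clean <- x[keep]

print(cmp)
print(raw)
print(clean)
```

Whether infinite values should be excluded is a design decision. Filtering out only the NA entries of the comparison would keep Inf, which is arguably a legitimate value above any finite threshold.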
Histograms with non-equidistant breaks
The following three exercises will use a data set consisting of measurements of infrared emissions from objects outside of our galaxy. We will focus on the variable F12, which is the total 12 micron band flux density.
The purpose of these exercises is two-fold. First, you will get familiar with the data and see how different choices of visualizations using histograms can affect your interpretation of the data. Second, you will learn more about how to write functions in R and gain a better understanding of how they work.
Exercise A.4 Plot a histogram of log(F12) using the default value of the argument breaks. Experiment with alternative values of breaks.
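The breaks argument of hist accepts, among other things, a single number, the name of a rule, or an explicit vector of breakpoints. The sketch below illustrates these forms on simulated data, which is only a stand-in for the infrared data set. It uses plot = FALSE so that the computed cells can be inspected; drop that argument to draw the histograms.

```r
set.seed(1)
x <- log(rexp(200))  # simulated stand-in, NOT the F12 measurements

h_default <- hist(x, plot = FALSE)                # default rule (Sturges)
h_many    <- hist(x, breaks = 40, plot = FALSE)   # suggested number of cells
h_fd      <- hist(x, breaks = "FD", plot = FALSE) # Freedman-Diaconis rule

# Explicit breakpoints must cover the whole range of the data.
grid     <- seq(floor(min(x)), ceiling(max(x)), by = 0.5)
h_custom <- hist(x, breaks = grid, plot = FALSE)

print(length(h_default$counts))
print(length(h_many$counts))
print(length(h_custom$counts))
```

A single number is only a suggestion: hist passes it to pretty, so the actual number of cells may differ from the value supplied.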
Exercise A.5 Write your own function, called my_breaks, which takes two arguments, x (a vector) and h (a positive integer). Let h have default value 5. The function should first sort x into increasing order and then return the vector that: starts with the smallest entry in x; contains every \(h\)th unique entry from the sorted x; ends with the largest entry in x.
For example, if h = 2 and x = c(1, 3, 2, 5, 10, 11, 1, 1, 3) the function should return c(1, 3, 10, 11). To see this, first sort x, which gives the vector c(1, 1, 1, 2, 3, 3, 5, 10, 11), whose unique values are c(1, 2, 3, 5, 10, 11). Every second unique entry is c(1, 3, 10), and then the largest entry 11 is concatenated.
Hint: Functions such as unique can be useful.
Use your function to construct breakpoints for the histogram for different values of
h, and compare with the histograms obtained in Exercise A.4.
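The worked example for Exercise A.5 can be turned into code almost line by line. The following is one possible sketch, not the only reasonable implementation. It reads "every \(h\)th unique entry" as the unique values at positions 1, 1 + h, 1 + 2h, and so on, which is the reading that reproduces the example.

```r
my_breaks <- function(x, h = 5) {
  ux <- unique(sort(x))                       # sorted unique values
  breaks <- ux[seq(1, length(ux), by = h)]    # positions 1, 1 + h, 1 + 2h, ...
  largest <- ux[length(ux)]
  # Append the largest value unless it is already the last breakpoint.
  if (breaks[length(breaks)] < largest) {
    breaks <- c(breaks, largest)
  }
  breaks
}

x <- c(1, 3, 2, 5, 10, 11, 1, 1, 3)
print(my_breaks(x, h = 2))
```

Note that sort silently drops NA values by default, so this sketch ignores missing observations.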
Exercise A.6 If there are no ties in the data set, the function above will produce breakpoints with h observations in the interval between two consecutive breakpoints (except the last two perhaps). If there are ties, the function will by construction return unique breakpoints, but there may be more than h observations in some intervals.
The intention is now to rewrite my_breaks so that, if possible, each interval contains h observations. Modify the my_breaks function with this intention and so that it has the following properties:
- All breakpoints must be unique.
- The range of the breakpoints must cover the range of x.
- For two subsequent breakpoints, \(a\) and \(b\), there must be at least h observations in the interval \((a,b],\) provided h < length(x). (With the exception that for the first two breakpoints, the interval is \([a,b].\))
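To check a candidate implementation against the properties listed for Exercise A.6, it helps to count the observations per interval. The helper below is a hypothetical aid, not part of the exercise. It relies on cut, whose defaults give intervals of the form \((a,b]\), with include.lowest = TRUE closing the first interval to \([a,b]\).

```r
# Hypothetical helper: number of observations in each interval defined
# by consecutive breakpoints, using the (a, b] convention with the first
# interval closed at both ends.
count_per_interval <- function(x, breaks) {
  as.vector(table(cut(x, breaks = breaks, include.lowest = TRUE)))
}

x <- c(1, 3, 2, 5, 10, 11, 1, 1, 3)
counts <- count_per_interval(x, c(1, 3, 10, 11))
print(counts)

# The three listed properties, checked for these breakpoints and h = 2.
b <- c(1, 3, 10, 11)
print(anyDuplicated(b) == 0)                     # unique breakpoints
print(min(b) <= min(x) && max(b) >= max(x))      # range is covered
print(all(counts >= 2))                          # at least h per interval
```

For the breakpoints from the Exercise A.5 example, the last check fails because the final interval holds a single observation, which is exactly the situation the rewrite is meant to address.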
Functions and objects
Exercise A.7 Write a function called my_hist, which takes a single argument h and plots a histogram of log(F12). Extend the implementation so that any additional argument specified when calling my_hist is passed on to hist. Investigate and explain what happens when executing the following function calls.
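As a hedged illustration of argument forwarding (these are not the calls the exercise refers to), the sketch below passes extra arguments through ... to hist. It assumes that my_hist computes its breakpoints with my_breaks, and it uses simulated data as a stand-in for F12; both are assumptions made for the example.

```r
set.seed(1)
F12 <- rexp(200)  # simulated stand-in, NOT the infrared data

# Compact version of the breakpoint rule from Exercise A.5.
my_breaks <- function(x, h = 5) {
  ux <- unique(sort(x))
  breaks <- ux[seq(1, length(ux), by = h)]
  if (breaks[length(breaks)] < ux[length(ux)]) {
    breaks <- c(breaks, ux[length(ux)])
  }
  breaks
}

# F12 is a free variable, looked up where my_hist was defined.
# Anything supplied after h is handed to hist() unchanged.
my_hist <- function(h, ...) {
  hist(log(F12), breaks = my_breaks(log(F12), h), ...)
}

# plot = FALSE is not an argument of my_hist; it reaches hist() via ...
res <- my_hist(10, plot = FALSE)
print(length(res$counts))

# Graphical arguments travel the same way, for example:
# my_hist(10, col = "red", main = "log(F12)")
```

Because my_hist has only one named argument, the way positional and partially matched arguments are distributed between h and ... is worth examining closely.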
Exercise A.8 Modify your
my_hist function so that it returns an object of class
which is not plotted. Write a print method for objects of this class,
which prints just the number of cells.
Hint: It can be useful to know about the function
Exercise A.9 Write a summary method that returns a data frame with two columns containing the midpoints of the cells and the counts.
Exercise A.10 Write a plot method for objects of this class that uses ggplot2 for plotting the histogram.
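The mechanics of S3 methods used in these exercises can be sketched on a small example. The class name demo_hist is hypothetical and chosen only for this illustration, and the object is built from simulated data. A method is simply a function named generic.class.

```r
set.seed(1)
obj <- hist(log(rexp(100)), plot = FALSE)   # simulated stand-in data

# Prepend a hypothetical class so that our methods are found first.
class(obj) <- c("demo_hist", class(obj))

# print method: report only the number of cells.
print.demo_hist <- function(x, ...) {
  cat("Number of cells:", length(x$counts), "\n")
  invisible(x)
}

# summary method: one row per cell with its midpoint and count.
summary.demo_hist <- function(object, ...) {
  data.frame(mid = object$mids, count = object$counts)
}

print(obj)
s <- summary(obj)
print(head(s))
```

Returning the object invisibly from the print method follows the usual convention, so that printing at the console does not trigger a second print.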
Functions and environments
The following exercises assume that you have implemented a my_hist function as in Exercise A.7.
Exercise A.11 What happens if you remove the data and call my_hist subsequently? What is the environment of my_hist? Change it to a new environment, and assign (using the function assign) the data to a variable with an appropriate name in that environment. Once this is done, check what now happens when calling my_hist after the data is removed from the global environment.
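The tools involved in Exercise A.11 can be tried on a toy function first. In this sketch the function f and the variable dat are made-up names; f refers to dat as a free variable, in the same way a my_hist implementation might refer to the data.

```r
f <- function() mean(dat)   # dat is a free variable in f
dat <- c(1, 2, 3)

# A new environment holding its own copy of the data.
env <- new.env()
assign("dat", c(10, 20, 30), envir = env)

# Rebind the function so that free variables are looked up in env first.
environment(f) <- env

rm(dat)          # remove the data from where it was originally defined
val <- f()       # still works: dat is found in env
print(val)
print(ls(environment(f)))
```

The new environment's parent is the environment in which new.env was called, so functions such as mean are still found by following the chain of enclosing environments.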
Exercise A.12 Write a function that takes an argument x (the data) and returns a function, where the returned function takes an argument h (just as my_hist) and plots a histogram (just as my_hist). Because the return value is a function, we may refer to the function as a function factory.
What is the environment of the function created by the function factory? What is in the environment? Does it have any effect when calling the function whether the data is altered or removed from the global environment?
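A function factory of this kind can be sketched as follows. The name hist_factory is hypothetical, the data is simulated, and for brevity h is passed directly as the breaks argument of hist rather than through my_breaks; all three are simplifying assumptions.

```r
hist_factory <- function(x) {
  force(x)   # evaluate the argument now, so the data is captured
  function(h, ...) {
    hist(log(x), breaks = h, ...)
  }
}

set.seed(1)
dat <- rexp(100)          # simulated stand-in data
h_fun <- hist_factory(dat)
rm(dat)                   # the original variable is gone

# The returned function's environment is the execution environment of
# the factory call, which still holds x.
print(ls(environment(h_fun)))

res <- h_fun(10, plot = FALSE)
print(sum(res$counts))
```

The call to force guards against lazy evaluation: without it, x would only be evaluated the first time the returned function is called, by which point the original variable might have changed or been removed.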
Exercise A.13 Evaluate the following function call:
What is the type and class of tmp? What happens when plot(tmp, col = "red") is executed? How can you find help on what plot does with an object of this class? Specifically, how do you find the documentation for the argument col, which is not an argument of plot?
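The inspection steps can be sketched on an object returned by hist. That tmp is such an object is an assumption made for this illustration, and the data is simulated.

```r
set.seed(1)
tmp <- hist(log(rexp(100)), plot = FALSE)  # assumption: tmp is a hist() result

print(typeof(tmp))   # the underlying storage type
print(class(tmp))    # the S3 class used for method dispatch

# plot() is a generic; the method actually called is chosen by the class.
# getS3method() retrieves it even though it is not exported by name.
plot_method <- getS3method("plot", "histogram")
print(names(formals(plot_method)))

# The documentation of a method is found under generic.class, e.g.
# ?plot.histogram
```

Listing the formal arguments of the method shows which arguments it handles itself, as opposed to those it passes on through ....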