In this chapter we will work with the nycflights13 and tidyverse packages.
Q1. How does dplyr::near() work? Type near to see the source code. Is sqrt(2)^2 near 2?
Step 1. To begin, I asked Google: “How can I view the source code of a function in R?”.
Typing near (without parentheses) in the console prints the function's source:
function (x, y, tol = .Machine$double.eps^0.5)
{
abs(x - y) < tol
}
<bytecode: 0x000001b7c23a4650>
<environment: namespace:dplyr>
Typing View(near) in the RStudio console displays the same source in a separate tab.
Step 2. Compare the two vectors provided in Chapter 12.
Rewriting the vector x in a simpler form makes it easier to see how near() works.
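The original code chunk is not shown here, so the following is only a sketch. It assumes the vectors meant are the ones from the floating-point comparison example in the book's logical vectors chapter, and that y is the simplified form of x.

```r
# Vector from the chapter (assumed): each element is written as a computation.
x <- c(1 / 49 * 49, sqrt(2)^2)

# On paper, 1 / 49 * 49 is 1 and sqrt(2)^2 is 2, so x "should" equal y.
y <- c(1, 2)
```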
Are x and y identical?
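One way to check, sketched under the assumption that x and y are the chapter's computed vector and its simplified form c(1, 2):

```r
x <- c(1 / 49 * 49, sqrt(2)^2)
y <- c(1, 2)

# Element-wise comparison: one logical value per pair of elements.
x == y
#> [1] FALSE FALSE

# Whole-object comparison: a single TRUE or FALSE.
identical(x, y)
#> [1] FALSE
```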
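Not really! Computers store numbers with a fixed number of decimal places, so values such as 1/49 or sqrt(2) cannot be represented exactly, and calculations with them are very slightly off.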
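The tiny discrepancy becomes visible when R is asked to print more digits than it shows by default. This sketch again assumes x is the chapter's vector:

```r
x <- c(1 / 49 * 49, sqrt(2)^2)

# By default R rounds the printed values, which hides the error.
print(x)

# Asking for 16 significant digits reveals that neither element is exact.
print(x, digits = 16)
#> [1] 0.9999999999999999 2.0000000000000004
```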
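dplyr::near() ignores such small differences: it returns TRUE whenever the absolute difference between x and y is below tol, which defaults to .Machine$double.eps^0.5 (roughly 1.5e-8). So yes, sqrt(2)^2 is near 2.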
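To make this concrete, here is a minimal sketch. near_sketch is a hypothetical local copy of the function body printed in Step 1, defined only so the snippet runs without dplyr attached. With dplyr loaded, near() itself can be used in the same way.

```r
# Local copy of the source shown in Step 1 (the name near_sketch is illustrative).
near_sketch <- function(x, y, tol = .Machine$double.eps^0.5) {
  abs(x - y) < tol
}

x <- c(1 / 49 * 49, sqrt(2)^2)
y <- c(1, 2)

# Exact comparison fails, tolerant comparison succeeds.
x == y
#> [1] FALSE FALSE
near_sketch(x, y)
#> [1] TRUE TRUE

# The question from the exercise: is sqrt(2)^2 near 2?
near_sketch(sqrt(2)^2, 2)
#> [1] TRUE
```

Differences larger than the tolerance are still detected, for example near_sketch(1, 1.001) returns FALSE.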
Q2. Use mutate(), is.na(), and count() together to describe how the missing values in dep_time, sched_dep_time and dep_delay are connected.
One way to figure this out is:
nycflights13::flights |>
  mutate(
    missing_sched_dep_time = case_when(is.na(sched_dep_time) ~ TRUE, TRUE ~ FALSE),
    missing_dep_time = case_when(is.na(dep_time) ~ TRUE, TRUE ~ FALSE),
    missing_dep_delay = case_when(is.na(dep_delay) ~ TRUE, TRUE ~ FALSE)
  ) |>
  group_by(missing_sched_dep_time, missing_dep_time, missing_dep_delay) |>
  count()
# A tibble: 2 × 4
  missing_sched_dep_time missing_dep_time missing_dep_delay      n
  <lgl>                  <lgl>            <lgl>              <int>
1 FALSE                  FALSE            FALSE             328521
2 FALSE                  TRUE             TRUE                8255
Another way, without dplyr::mutate():
nycflights13::flights |>
  group_by(is.na(sched_dep_time), is.na(dep_time), is.na(dep_delay)) |>
  count()
# A tibble: 2 × 4
  `is.na(sched_dep_time)` `is.na(dep_time)` `is.na(dep_delay)`      n
  <lgl>                   <lgl>             <lgl>               <int>
1 FALSE                   FALSE             FALSE              328521
2 FALSE                   TRUE              TRUE                 8255
Every flight with a missing departure time (dep_time; 8,255 flights) also has a missing departure delay (dep_delay), which makes sense because a delay cannot be computed without an actual departure time. The scheduled departure time (sched_dep_time) is never missing.
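The same conclusion can be checked directly rather than read off a count table. The sketch below uses a small made-up data frame (toy, with hypothetical values) in place of nycflights13::flights so that it runs on its own.

```r
# Toy stand-in for nycflights13::flights (hypothetical values, three columns only).
toy <- data.frame(
  sched_dep_time = c(515, 529, 540, 1630),
  dep_time       = c(517, 533, NA, NA),
  dep_delay      = c(2, 4, NA, NA)
)

# TRUE when dep_time and dep_delay are missing in exactly the same rows.
same_rows_missing <- all(is.na(toy$dep_time) == is.na(toy$dep_delay))

# TRUE when the scheduled departure time is never missing.
sched_always_present <- !anyNA(toy$sched_dep_time)

same_rows_missing
sched_always_present
```

Replacing toy with nycflights13::flights applies the same two checks to the real data.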