ROOT 6.30.04 Reference Guide
df012_DefinesAndFiltersAsStrings.C
/// \file
/// \ingroup tutorial_dataframe
/// \notebook -nodraw
///
/// This tutorial illustrates how to save some typing when using RDataFrame
/// by passing Define and Filter expressions as strings, which are
/// just-in-time (jit) compiled at runtime.
///
/// \macro_code
/// \macro_output
///
/// \date October 2017
/// \author Guilherme Amadio

#include <ROOT/RDataFrame.hxx>

#include <iostream>

void df012_DefinesAndFiltersAsStrings()
{
   // We will inefficiently calculate an approximation of pi by generating
   // some data and doing very simple filtering and analysis on it.

   // We start by creating an empty dataframe where we will insert 10 million
   // random points in a square of side 2.0 (that is, with an inscribed circle
   // of radius 1.0).

   size_t npoints = 10000000;
   ROOT::RDataFrame tdf(npoints);

   // Define what we want inside the dataframe. We do not need to define p as an array,
   // but we do it here to demonstrate how to use jitting with RDataFrame.

   // NOTE: The loop variable below is named "w" rather than "x". Although it is
   // possible to write "for (auto&& x : p)", that would shadow the name of the
   // data column "x" and may cause compilation failures if the local variable
   // and the data column are of different types, or if the local x variable is
   // declared in the outermost scope of the generated lambda function.

   auto pidf = tdf.Define("x", "gRandom->Uniform(-1.0, 1.0)")
                  .Define("y", "gRandom->Uniform(-1.0, 1.0)")
                  .Define("p", "std::array<double, 2> v{x, y}; return v;")
                  .Define("r", "double r2 = 0.0; for (auto&& w : p) r2 += w*w; return sqrt(r2);");

   // Now we have a dataframe with columns x, y, p (which is a point based on x
   // and y), and the radius r = sqrt(x*x + y*y). In order to approximate pi, we
   // need to know how many of our data points fall inside the unit circle compared
   // with the total number of points. The ratio of the areas is
   //
   //     A_circle / A_square = (pi * r * r) / (l * l) = pi / 4, where r = 1.0 and l = 2.0
   //
   // Therefore, we can approximate pi with 4 times the number of points inside the
   // unit circle over the total number of points in our dataframe:

   auto incircle = *(pidf.Filter("r <= 1.0").Count());

   double pi_approx = 4.0 * incircle / npoints;

   std::cout << "pi is approximately equal to " << pi_approx << std::endl;
}
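
For readers who want to check the area-ratio argument without a ROOT installation, the same estimate can be sketched with the C++ standard library alone. This is not part of the tutorial and does not use RDataFrame: the helper names `radius` and `estimate_pi`, the `seed` parameter, and the choice of `std::mt19937` in place of `gRandom` are all assumptions made for illustration.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical helper (not a ROOT function): distance of a 2D point from the
// origin, using the same range-for pattern as the jitted "r" column.
double radius(const std::array<double, 2> &p)
{
   double r2 = 0.0;
   for (auto &&w : p)
      r2 += w * w;
   return std::sqrt(r2);
}

// Hypothetical helper: estimate pi from npoints uniform random points in the
// square [-1, 1) x [-1, 1). std::mt19937 stands in for gRandom here.
double estimate_pi(std::size_t npoints, std::uint32_t seed)
{
   std::mt19937 gen(seed);
   std::uniform_real_distribution<double> uniform(-1.0, 1.0);

   std::size_t incircle = 0;
   for (std::size_t i = 0; i < npoints; ++i) {
      const double x = uniform(gen);
      const double y = uniform(gen);
      const std::array<double, 2> p{x, y};
      if (radius(p) <= 1.0)
         ++incircle; // same selection as Filter("r <= 1.0")
   }

   // A_circle / A_square = pi / 4, so pi is roughly 4 * incircle / npoints.
   return 4.0 * static_cast<double>(incircle) / static_cast<double>(npoints);
}
```

Calling, for example, `estimate_pi(1000000, 42)` should return a value close to pi. The statistical uncertainty of such an estimate shrinks only as one over the square root of the number of points, which is why the tutorial calls the method inefficient.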