ROOT 6.30.04 Reference Guide
df012_DefinesAndFiltersAsStrings.py
## \file
## \ingroup tutorial_dataframe
## \notebook -nodraw
##
## This tutorial illustrates how to use the jit-compiling features of RDataFrame
## to define data using C++ code in a Python script.
##
## \macro_code
## \macro_output
##
## \date October 2017
## \author Guilherme Amadio

import ROOT

## We will inefficiently calculate an approximation of pi by generating
## some data and doing very simple filtering and analysis on it.

## We start by creating an empty dataframe where we will insert 10 million
## random points in a square of side 2.0 (that is, with an inscribed unit
## circle).

npoints = 10000000
tdf = ROOT.ROOT.RDataFrame(npoints)

## Define what data we want inside the dataframe. We do not need to define p
## as an array, but we do it here to demonstrate how to use jitting with RDataFrame.

pidf = tdf.Define("x", "gRandom->Uniform(-1.0, 1.0)") \
          .Define("y", "gRandom->Uniform(-1.0, 1.0)") \
          .Define("p", "std::array<double, 2> v{x, y}; return v;") \
          .Define("r", "double r2 = 0.0; for (auto&& w : p) r2 += w*w; return sqrt(r2);")

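## The jitted C++ strings above are evaluated once per dataframe entry. As a
## rough illustration of what each entry ends up holding, here is a plain-Python
## sketch of the same per-row computation. It does not use ROOT; the helper
## name make_row and the use of Python's random module in place of gRandom are
## assumptions made for this example only.

```python
import math
import random


def make_row(rng):
    """Mimic one dataframe entry: x, y, the point p and its radius r."""
    x = rng.uniform(-1.0, 1.0)             # plays the role of column "x"
    y = rng.uniform(-1.0, 1.0)             # plays the role of column "y"
    p = (x, y)                             # stands in for std::array<double, 2>
    r = math.sqrt(sum(w * w for w in p))   # same loop as the "r" expression
    return {"x": x, "y": y, "p": p, "r": r}


row = make_row(random.Random(42))
```

## Each key of the returned dictionary corresponds to one of the columns
## defined with Define() above.
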
## Now we have a dataframe with columns x, y, p (the point built from x and
## y), and the radius r = sqrt(x*x + y*y). To approximate pi, we need to know
## how many of our data points fall inside the circle of radius one compared
## with the total number of points. The ratio of the areas is
##
## A_circle / A_square = (pi * r*r) / (l * l) = pi / 4, where r = 1.0 and l = 2.0
##
## Therefore, we can approximate pi with 4 times the number of points inside
## the unit circle over the total number of points:

incircle = pidf.Filter("r <= 1.0").Count().GetValue()

pi_approx = 4.0 * incircle / npoints

print("pi is approximately equal to %g" % (pi_approx))
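
## As a cross-check of the filtering and counting logic above, the same
## estimate can be sketched in plain Python without ROOT. The function names
## estimate_pi and expected_std, the seed, and the smaller sample size are
## assumptions chosen for this example; the spread formula is the standard
## binomial standard deviation for a hit probability of pi/4.

```python
import math
import random


def estimate_pi(npoints, seed=1):
    """Estimate pi from the fraction of random points inside the unit circle."""
    rng = random.Random(seed)
    incircle = 0
    for _ in range(npoints):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if math.sqrt(x * x + y * y) <= 1.0:   # same cut as Filter("r <= 1.0")
            incircle += 1
    return 4.0 * incircle / npoints


def expected_std(npoints):
    """Expected statistical spread of the estimate for npoints samples."""
    p = math.pi / 4.0                          # probability of landing in the circle
    return 4.0 * math.sqrt(p * (1.0 - p) / npoints)


n = 100000
pi_check = estimate_pi(n)
print("pure-Python estimate: %g (expected spread about %g)" % (pi_check, expected_std(n)))
```

## The spread shrinks only as 1/sqrt(npoints), which is why this way of
## computing pi is inefficient even with 10 million points.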