[triqs/statistics] Tools for statistical analysis: binning, jackknife and autocorrelation time¶
Introduction¶
Given the statistical samples \(\lbrace x_i\rbrace _{i=0\dots N-1}\) and \(\lbrace y_i\rbrace _{i=0\dots N-1}\) of random variables \(X\) and \(Y\), one often wants to compute the estimate of the following observables:
\(\langle X \rangle\), \(\langle X\rangle/\langle Y \rangle\), \(\langle X \rangle^2\), or in general \(f(\langle X \rangle , \langle Y \rangle, \dots)\)
as well as the estimate of the errors:
\(\Delta\langle X \rangle\), \(\Delta\langle X\rangle /\langle Y \rangle\), \(\Delta\langle X\rangle ^2\) or \(\Delta f(\langle X \rangle , \langle Y \rangle, \dots)\)
The estimate of the expectation values is the empirical average :
\(\langle X \rangle \approx \frac{1}{N} \sum_{i=0}^{N-1} x_i\)
If the samples are independent from each other and \(f\) is a linear function of its variables (e.g \(f=Id\)):
\((\Delta \langle X \rangle)^2 \approx \frac{\frac{N-1}{N} \sigma^2({x})}{N}\)
where \(\sigma^2({x})\) is the empirical variance of the sample.
In the general case, however,
This library allows one to reliably compute the estimates of \(f(\langle X \rangle , \langle Y \rangle, \dots)\) and its error bar \(\Delta f(\langle X \rangle , \langle Y \rangle, \dots)\) in the general case.
Synopsis¶
- average_and_error takes an object with the Observable concept (see below) and returns a struct with two members val and error:
- val is the estimate of the expectation value of the random variable for a given sample of it
- error is the estimate of the error on this expectation value for the given sample
Concepts¶
TimeSeries¶
An object has the concept of a TimeSeries if it has the following member functions:
Return type | Name |
---|---|
value_type | operator[](int i) |
int | size() |
and the following member type:
Name | Property |
---|---|
value_type | belong to an algebra (has +,-,* operators) |
Observable¶
An object has the concept of an observable if it is a TimeSeries and has, additionally, the following member function:
Return type | Name |
---|---|
observable& | operator<<(T x) |
where T belongs to an algebra.
Example¶
#include <triqs/clef.hpp>
#include <triqs/statistics.hpp>
using namespace triqs::statistics;
int main() {
observable<double> X;
X << 1.0;
X << -1.0;
X << .5;
X << .0;
std::cout << average_and_error(X) << std::endl;
std::cout << average_and_error(X * X) << std::endl;
return 0;
}
Histogram¶
histogram is a lightweight object used to represent and to accumulate a histogram of a real random variable.
#include <triqs/statistics.hpp>
using namespace triqs::statistics;
int main() {
// Histogram with 21 bins over [0;10] range
histogram hist{0, 10, 21};
// General information about histogram
std::cout << "Number of bins = " << hist.size() << std::endl;
auto limits = hist.limits();
std::cout << "Histogram range [" << limits.first << ";" << limits.second << "]" << std::endl;
// Accumulate some value
for (double x : {-10.0, -0.05, 1.1, 2.0, 2.2, 2.9, 3.4, 5.0, 9.0, 10.0, 10.5, 12.1, 32.2}) hist << x;
// Print accumulated histogram
std::cout << "Histogram:\n" << hist << std::endl;
// Accumulated and lost samples
std::cout << "Accumulated data points: " << hist.n_data_pts() << std::endl;
std::cout << "Lost data points: " << hist.n_lost_pts() << std::endl;
// Direct access to histogram data
std::cout << "Histogram data: " << hist.data() << std::endl;
// Make normalized histogram (PDF)
std::cout << "PDF:\n" << pdf(hist) << std::endl;
// Make integrated and normalized histogram (CDF)
std::cout << "CDF:\n" << cdf(hist) << std::endl;
return 0;
}