{
    "content": [
        {
            "type": "text",
            "text": "# statistics (pydoc)\n\n**Summary:** statistics - Basic statistics module.\n\n## Section Outline\n\n- **NAME** (2 lines)\n- **MODULE REFERENCE** (8 lines)\n- **DESCRIPTION** (101 lines)\n- **CLASSES** (5 lines) — 2 subsections\n  - class NormalDist (140 lines)\n  - class StatisticsError (68 lines)\n- **FUNCTIONS** (1 lines) — 18 subsections\n  - correlation (15 lines)\n  - covariance (15 lines)\n  - fmean (8 lines)\n  - geometric_mean (11 lines)\n  - harmonic_mean (22 lines)\n  - linear_regression (23 lines)\n  - mean (15 lines)\n  - median (11 lines)\n  - median_grouped (25 lines)\n  - median_high (10 lines)\n  - median_low (10 lines)\n  - mode (21 lines)\n  - multimode (12 lines)\n  - pstdev (7 lines)\n  - pvariance (33 lines)\n  - quantiles (15 lines)\n  - stdev (7 lines)\n  - variance (36 lines)\n- **DATA** (2 lines)\n- **FILE** (3 lines)\n\n## Full Content\n\n### NAME\n\nstatistics - Basic statistics module.\n\n### MODULE REFERENCE\n\nhttps://docs.python.org/3.10/library/statistics.html\n\nThe following documentation is automatically generated from the Python\nsource files.  It may be incomplete, incorrect or include features that\nare considered implementation detail and may vary between Python\nimplementations.  
When in doubt, consult the module reference at the\nlocation listed above.\n\n### DESCRIPTION\n\nThis module provides functions for calculating statistics of data, including\naverages, variance, and standard deviation.\n\nCalculating averages\n--------------------\n\n==================  ==================================================\nFunction            Description\n==================  ==================================================\nmean                Arithmetic mean (average) of data.\nfmean               Fast, floating point arithmetic mean.\ngeometric_mean      Geometric mean of data.\nharmonic_mean       Harmonic mean of data.\nmedian              Median (middle value) of data.\nmedian_low          Low median of data.\nmedian_high         High median of data.\nmedian_grouped      Median, or 50th percentile, of grouped data.\nmode                Mode (most common value) of data.\nmultimode           List of modes (most common values of data).\nquantiles           Divide data into intervals with equal probability.\n==================  ==================================================\n\nCalculate the arithmetic mean (\"the average\") of data:\n\n>>> mean([-1.0, 2.5, 3.25, 5.75])\n2.625\n\n\nCalculate the standard median of discrete data:\n\n>>> median([2, 3, 4, 5])\n3.5\n\n\nCalculate the median, or 50th percentile, of data grouped into class intervals\ncentred on the data values provided. E.g. if your data points are rounded to\nthe nearest whole number:\n\n>>> median_grouped([2, 2, 3, 3, 3, 4])  #doctest: +ELLIPSIS\n2.8333333333...\n\nThis should be interpreted in this way: you have two data points in the class\ninterval 1.5-2.5, three data points in the class interval 2.5-3.5, and one in\nthe class interval 3.5-4.5. 
The median of these data points is 2.8333...\n\n\nCalculating variability or spread\n---------------------------------\n\n==================  =============================================\nFunction            Description\n==================  =============================================\npvariance           Population variance of data.\nvariance            Sample variance of data.\npstdev              Population standard deviation of data.\nstdev               Sample standard deviation of data.\n==================  =============================================\n\nCalculate the standard deviation of sample data:\n\n>>> stdev([2.5, 3.25, 5.5, 11.25, 11.75])  #doctest: +ELLIPSIS\n4.38961843444...\n\nIf you have previously calculated the mean, you can pass it as the optional\nsecond argument to the four \"spread\" functions to avoid recalculating it:\n\n>>> data = [1, 2, 2, 4, 4, 4, 5, 6]\n>>> mu = mean(data)\n>>> pvariance(data, mu)\n2.5\n\n\nStatistics for relations between two inputs\n-------------------------------------------\n\n==================  ====================================================\nFunction            Description\n==================  ====================================================\ncovariance          Sample covariance for two variables.\ncorrelation         Pearson's correlation coefficient for two variables.\nlinear_regression   Intercept and slope for simple linear regression.\n==================  ====================================================\n\nCalculate covariance, Pearson's correlation, and simple linear regression\nfor two inputs:\n\n>>> x = [1, 2, 3, 4, 5, 6, 7, 8, 9]\n>>> y = [1, 2, 3, 1, 2, 3, 1, 2, 3]\n>>> covariance(x, y)\n0.75\n>>> correlation(x, y)  #doctest: +ELLIPSIS\n0.31622776601...\n>>> linear_regression(x, y)  #doctest:\nLinearRegression(slope=0.1, intercept=1.5)\n\n\nExceptions\n----------\n\nA single exception is defined: StatisticsError is a subclass of ValueError.\n\n### 
CLASSES\n\nbuiltins.ValueError(builtins.Exception)\nStatisticsError\nbuiltins.object\nNormalDist\n\n#### class NormalDist\n\n|  NormalDist(mu=0.0, sigma=1.0)\n|\n|  Normal distribution of a random variable\n|\n|  Methods defined here:\n|\n|  __add__(x1, x2)\n|      Add a constant or another NormalDist instance.\n|\n|      If *other* is a constant, translate mu by the constant,\n|      leaving sigma unchanged.\n|\n|      If *other* is a NormalDist, add both the means and the variances.\n|      Mathematically, this works only if the two distributions are\n|      independent or if they are jointly normally distributed.\n|\n|  __eq__(x1, x2)\n|      Two NormalDist objects are equal if their mu and sigma are both equal.\n|\n|  __getstate__(self)\n|\n|  __hash__(self)\n|      NormalDist objects hash equal if their mu and sigma are both equal.\n|\n|  __init__(self, mu=0.0, sigma=1.0)\n|      NormalDist where mu is the mean and sigma is the standard deviation.\n|\n|  __mul__(x1, x2)\n|      Multiply both mu and sigma by a constant.\n|\n|      Used for rescaling, perhaps to change measurement units.\n|      Sigma is scaled with the absolute value of the constant.\n|\n|  __neg__(x1)\n|      Negates mu while keeping sigma the same.\n|\n|  __pos__(x1)\n|      Return a copy of the instance.\n|\n|  __radd__ = __add__(x1, x2)\n|\n|  __repr__(self)\n|      Return repr(self).\n|\n|  __rmul__ = __mul__(x1, x2)\n|\n|  __rsub__(x1, x2)\n|      Subtract a NormalDist from a constant or another NormalDist.\n|\n|  __setstate__(self, state)\n|\n|  __sub__(x1, x2)\n|      Subtract a constant or another NormalDist instance.\n|\n|      If *other* is a constant, translate by the constant mu,\n|      leaving sigma unchanged.\n|\n|      If *other* is a NormalDist, subtract the means and add the variances.\n|      Mathematically, this works only if the two distributions are\n|      independent or if they are jointly normally distributed.\n|\n|  __truediv__(x1, x2)\n|      Divide both mu and sigma by a constant.\n|\n|      Used for rescaling, perhaps to change 
measurement units.\n|      Sigma is scaled with the absolute value of the constant.\n|\n|  cdf(self, x)\n|      Cumulative distribution function.  P(X <= x)\n|\n|  inv_cdf(self, p)\n|      Inverse cumulative distribution function.  x : P(X <= x) = p\n|\n|      Finds the value of the random variable such that the probability of\n|      the variable being less than or equal to that value equals the given\n|      probability.\n|\n|      This function is also called the percent point function or quantile\n|      function.\n|\n|  overlap(self, other)\n|      Compute the overlapping coefficient (OVL) between two normal distributions.\n|\n|      Measures the agreement between two normal probability distributions.\n|      Returns a value between 0.0 and 1.0 giving the overlapping area in\n|      the two underlying probability density functions.\n|\n|          >>> N1 = NormalDist(2.4, 1.6)\n|          >>> N2 = NormalDist(3.2, 2.0)\n|          >>> N1.overlap(N2)\n|          0.8035050657330205\n|\n|  pdf(self, x)\n|      Probability density function.  P(x <= X < x+dx) / dx\n|\n|  quantiles(self, n=4)\n|      Divide into *n* continuous intervals with equal probability.\n|\n|      Returns a list of (n - 1) cut points separating the intervals.\n|\n|      Set *n* to 4 for quartiles (the default).  Set *n* to 10 for deciles.\n|      Set *n* to 100 for percentiles, which gives the 99 cut points that\n|      separate the normal distribution into 100 equal-sized groups.\n|\n|  samples(self, n, *, seed=None)\n|      Generate *n* samples for a given mean and standard deviation.\n|\n|  zscore(self, x)\n|      Compute the Standard Score.  
(x - mean) / stdev\n|\n|      Describes *x* in terms of the number of standard deviations\n|      above or below the mean of the normal distribution.\n|\n|  ----------------------------------------------------------------------\n|  Class methods defined here:\n|\n|  from_samples(data) from builtins.type\n|      Make a normal distribution instance from sample data.\n|\n|  ----------------------------------------------------------------------\n|  Readonly properties defined here:\n|\n|  mean\n|      Arithmetic mean of the normal distribution.\n|\n|  median\n|      Return the median of the normal distribution\n|\n|  mode\n|      Return the mode of the normal distribution\n|\n|      The mode is the value x at which the probability density\n|      function (pdf) takes its maximum value.\n|\n|  stdev\n|      Standard deviation of the normal distribution.\n|\n|  variance\n|      Square of the standard deviation.\n\n#### class StatisticsError\n\n|  Method resolution order:\n|      StatisticsError\n|      builtins.ValueError\n|      builtins.Exception\n|      builtins.BaseException\n|      builtins.object\n|\n|  Data descriptors defined here:\n|\n|  __weakref__\n|      list of weak references to the object (if defined)\n|\n|  ----------------------------------------------------------------------\n|  Methods inherited from builtins.ValueError:\n|\n|  __init__(self, /, *args, **kwargs)\n|      Initialize self.  See help(type(self)) for accurate signature.\n|\n|  ----------------------------------------------------------------------\n|  Static methods inherited from builtins.ValueError:\n|\n|  __new__(*args, **kwargs) from builtins.type\n|      Create and return a new object.  
See help(type) for accurate signature.\n|\n|  ----------------------------------------------------------------------\n|  Methods inherited from builtins.BaseException:\n|\n|  __delattr__(self, name, /)\n|      Implement delattr(self, name).\n|\n|  __getattribute__(self, name, /)\n|      Return getattr(self, name).\n|\n|  __reduce__(...)\n|      Helper for pickle.\n|\n|  __repr__(self, /)\n|      Return repr(self).\n|\n|  __setattr__(self, name, value, /)\n|      Implement setattr(self, name, value).\n|\n|  __setstate__(...)\n|\n|  __str__(self, /)\n|      Return str(self).\n|\n|  with_traceback(...)\n|      Exception.with_traceback(tb) --\n|      set self.__traceback__ to tb and return self.\n|\n|  ----------------------------------------------------------------------\n|  Data descriptors inherited from builtins.BaseException:\n|\n|  __cause__\n|      exception cause\n|\n|  __context__\n|      exception context\n|\n|  __dict__\n|\n|  __suppress_context__\n|\n|  __traceback__\n|\n|  args\n\n### FUNCTIONS\n\n#### correlation\n\nPearson's correlation coefficient\n\nReturn the Pearson's correlation coefficient for two inputs. Pearson's\ncorrelation coefficient *r* takes values between -1 and +1. It measures the\nstrength and direction of the linear relationship, where +1 means very\nstrong, positive linear relationship, -1 very strong, negative linear\nrelationship, and 0 no linear relationship.\n\n>>> x = [1, 2, 3, 4, 5, 6, 7, 8, 9]\n>>> y = [9, 8, 7, 6, 5, 4, 3, 2, 1]\n>>> correlation(x, x)\n1.0\n>>> correlation(x, y)\n-1.0\n\n#### covariance\n\nCovariance\n\nReturn the sample covariance of two inputs *x* and *y*. 
Covariance\nis a measure of the joint variability of two inputs.\n\n>>> x = [1, 2, 3, 4, 5, 6, 7, 8, 9]\n>>> y = [1, 2, 3, 1, 2, 3, 1, 2, 3]\n>>> covariance(x, y)\n0.75\n>>> z = [9, 8, 7, 6, 5, 4, 3, 2, 1]\n>>> covariance(x, z)\n-7.5\n>>> covariance(z, x)\n-7.5\n\n#### fmean\n\nConvert data to floats and compute the arithmetic mean.\n\nThis runs faster than the mean() function and it always returns a float.\nIf the input dataset is empty, it raises a StatisticsError.\n\n>>> fmean([3.5, 4.0, 5.25])\n4.25\n\n#### geometric_mean\n\nConvert data to floats and compute the geometric mean.\n\nRaises a StatisticsError if the input dataset is empty,\nif it contains a zero, or if it contains a negative value.\n\nNo special efforts are made to achieve exact results.\n(However, this may change in the future.)\n\n>>> round(geometric_mean([54, 24, 36]), 9)\n36.0\n\n#### harmonic_mean\n\nReturn the harmonic mean of data.\n\nThe harmonic mean is the reciprocal of the arithmetic mean of the\nreciprocals of the data.  It can be used for averaging ratios or\nrates, for example speeds.\n\nSuppose a car travels 40 km/hr for 5 km and then speeds-up to\n60 km/hr for another 5 km. What is the average speed?\n\n>>> harmonic_mean([40, 60])\n48.0\n\nSuppose a car travels 40 km/hr for 5 km, and when traffic clears,\nspeeds-up to 60 km/hr for the remaining 30 km of the journey. What\nis the average speed?\n\n>>> harmonic_mean([40, 60], weights=[5, 30])\n56.0\n\nIf ``data`` is empty, or any element is less than zero,\n``harmonic_mean`` will raise ``StatisticsError``.\n\n#### linear_regression\n\nSlope and intercept for simple linear regression.\n\nReturn the slope and intercept of simple linear regression\nparameters estimated using ordinary least squares. 
Simple linear\nregression describes the relationship between an independent variable\n*x* and a dependent variable *y* in terms of a linear function:\n\ny = slope * x + intercept + noise\n\nwhere *slope* and *intercept* are the regression parameters that are\nestimated, and noise represents the variability of the data that was\nnot explained by the linear regression (it is equal to the\ndifference between predicted and actual values of the dependent\nvariable).\n\nThe parameters are returned as a named tuple.\n\n>>> x = [1, 2, 3, 4, 5]\n>>> noise = NormalDist().samples(5, seed=42)\n>>> y = [3 * x[i] + 2 + noise[i] for i in range(5)]\n>>> linear_regression(x, y)  #doctest: +ELLIPSIS\nLinearRegression(slope=3.09078914170..., intercept=1.75684970486...)\n\n#### mean\n\nReturn the sample arithmetic mean of data.\n\n>>> mean([1, 2, 3, 4, 4])\n2.8\n\n>>> from fractions import Fraction as F\n>>> mean([F(3, 7), F(1, 21), F(5, 3), F(1, 3)])\nFraction(13, 21)\n\n>>> from decimal import Decimal as D\n>>> mean([D(\"0.5\"), D(\"0.75\"), D(\"0.625\"), D(\"0.375\")])\nDecimal('0.5625')\n\nIf ``data`` is empty, StatisticsError will be raised.\n\n#### median\n\nReturn the median (middle value) of numeric data.\n\nWhen the number of data points is odd, return the middle data point.\nWhen the number of data points is even, the median is interpolated by\ntaking the average of the two middle values:\n\n>>> median([1, 3, 5])\n3\n>>> median([1, 3, 5, 7])\n4.0\n\n#### median_grouped\n\nReturn the 50th percentile (median) of grouped continuous data.\n\n>>> median_grouped([1, 2, 2, 3, 4, 4, 4, 4, 4, 5])\n3.7\n>>> median_grouped([52, 52, 53, 54])\n52.5\n\nThis calculates the median as the 50th percentile, and should be\nused when your data is continuous and grouped. In the above example,\nthe values 1, 2, 3, etc. actually represent the midpoint of classes\n0.5-1.5, 1.5-2.5, 2.5-3.5, etc. 
The middle value falls somewhere in\nclass 3.5-4.5, and interpolation is used to estimate it.\n\nOptional argument ``interval`` represents the class interval, and\ndefaults to 1. Changing the class interval naturally will change the\ninterpolated 50th percentile value:\n\n>>> median_grouped([1, 3, 3, 5, 7], interval=1)\n3.25\n>>> median_grouped([1, 3, 3, 5, 7], interval=2)\n3.5\n\nThis function does not check whether the data points are at least\n``interval`` apart.\n\n#### median_high\n\nReturn the high median of data.\n\nWhen the number of data points is odd, the middle value is returned.\nWhen it is even, the larger of the two middle values is returned.\n\n>>> median_high([1, 3, 5])\n3\n>>> median_high([1, 3, 5, 7])\n5\n\n#### median_low\n\nReturn the low median of numeric data.\n\nWhen the number of data points is odd, the middle value is returned.\nWhen it is even, the smaller of the two middle values is returned.\n\n>>> median_low([1, 3, 5])\n3\n>>> median_low([1, 3, 5, 7])\n3\n\n#### mode\n\nReturn the most common data point from discrete or nominal data.\n\n``mode`` assumes discrete data, and returns a single value. 
This is the\nstandard treatment of the mode as commonly taught in schools:\n\n>>> mode([1, 1, 2, 3, 3, 3, 3, 4])\n3\n\nThis also works with nominal (non-numeric) data:\n\n>>> mode([\"red\", \"blue\", \"blue\", \"red\", \"green\", \"red\", \"red\"])\n'red'\n\nIf there are multiple modes with the same frequency, return the first one\nencountered:\n\n>>> mode(['red', 'red', 'green', 'blue', 'blue'])\n'red'\n\nIf *data* is empty, ``mode`` raises StatisticsError.\n\n#### multimode\n\nReturn a list of the most frequently occurring values.\n\nWill return more than one result if there are multiple modes\nor an empty list if *data* is empty.\n\n>>> multimode('aabbbbbbbbcc')\n['b']\n>>> multimode('aabbbbccddddeeffffgg')\n['b', 'd', 'f']\n>>> multimode('')\n[]\n\n#### pstdev\n\nReturn the square root of the population variance.\n\nSee ``pvariance`` for arguments and other details.\n\n>>> pstdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])\n0.986893273527251\n\n#### pvariance\n\nReturn the population variance of ``data``.\n\ndata should be a sequence or iterable of Real-valued numbers, with at least one\nvalue. The optional argument mu, if given, should be the mean of\nthe data. 
If it is missing or None, the mean is automatically calculated.\n\nUse this function to calculate the variance from the entire population.\nTo estimate the variance from a sample, the ``variance`` function is\nusually a better choice.\n\nExamples:\n\n>>> data = [0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25]\n>>> pvariance(data)\n1.25\n\nIf you have already calculated the mean of the data, you can pass it as\nthe optional second argument to avoid recalculating it:\n\n>>> mu = mean(data)\n>>> pvariance(data, mu)\n1.25\n\nDecimals and Fractions are supported:\n\n>>> from decimal import Decimal as D\n>>> pvariance([D(\"27.5\"), D(\"30.25\"), D(\"30.25\"), D(\"34.5\"), D(\"41.75\")])\nDecimal('24.815')\n\n>>> from fractions import Fraction as F\n>>> pvariance([F(1, 4), F(5, 4), F(1, 2)])\nFraction(13, 72)\n\n#### quantiles\n\nDivide *data* into *n* continuous intervals with equal probability.\n\nReturns a list of (n - 1) cut points separating the intervals.\n\nSet *n* to 4 for quartiles (the default).  Set *n* to 10 for deciles.\nSet *n* to 100 for percentiles, which gives the 99 cut points that\nseparate *data* into 100 equal-sized groups.\n\nThe *data* can be any iterable containing sample data.\nThe cut points are linearly interpolated between data points.\n\nIf *method* is set to *inclusive*, *data* is treated as population\ndata.  The minimum value is treated as the 0th percentile and the\nmaximum value is treated as the 100th percentile.\n\n#### stdev\n\nReturn the square root of the sample variance.\n\nSee ``variance`` for arguments and other details.\n\n>>> stdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])\n1.0810874155219827\n\n#### variance\n\nReturn the sample variance of data.\n\ndata should be an iterable of Real-valued numbers, with at least two\nvalues. The optional argument xbar, if given, should be the mean of\nthe data. If it is missing or None, the mean is automatically calculated.\n\nUse this function when your data is a sample from a population. 
To\ncalculate the variance from the entire population, see ``pvariance``.\n\nExamples:\n\n>>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]\n>>> variance(data)\n1.3720238095238095\n\nIf you have already calculated the mean of your data, you can pass it as\nthe optional second argument ``xbar`` to avoid recalculating it:\n\n>>> m = mean(data)\n>>> variance(data, m)\n1.3720238095238095\n\nThis function does not check that ``xbar`` is actually the mean of\n``data``. Giving arbitrary values for ``xbar`` may lead to invalid or\nimpossible results.\n\nDecimals and Fractions are supported:\n\n>>> from decimal import Decimal as D\n>>> variance([D(\"27.5\"), D(\"30.25\"), D(\"30.25\"), D(\"34.5\"), D(\"41.75\")])\nDecimal('31.01875')\n\n>>> from fractions import Fraction as F\n>>> variance([F(1, 6), F(1, 2), F(5, 3)])\nFraction(67, 108)\n\n### DATA\n\n__all__ = ['NormalDist', 'StatisticsError', 'correlation', 'covariance...\n\n### FILE\n\n/usr/lib/python3.10/statistics.py\n\n"
        }
    ],
    "structuredContent": {
        "command": "statistics",
        "section": "",
        "mode": "pydoc",
        "summary": "statistics - Basic statistics module.",
        "synopsis": null,
        "tldr_summary": null,
        "tldr_examples": [],
        "tldr_source": null,
        "flags": [],
        "examples": [],
        "see_also": [],
        "section_outline": [
            {
                "name": "NAME",
                "lines": 2,
                "subsections": []
            },
            {
                "name": "MODULE REFERENCE",
                "lines": 8,
                "subsections": []
            },
            {
                "name": "DESCRIPTION",
                "lines": 101,
                "subsections": []
            },
            {
                "name": "CLASSES",
                "lines": 5,
                "subsections": [
                    {
                        "name": "class NormalDist",
                        "lines": 140
                    },
                    {
                        "name": "class StatisticsError",
                        "lines": 68
                    }
                ]
            },
            {
                "name": "FUNCTIONS",
                "lines": 1,
                "subsections": [
                    {
                        "name": "correlation",
                        "lines": 15
                    },
                    {
                        "name": "covariance",
                        "lines": 15
                    },
                    {
                        "name": "fmean",
                        "lines": 8
                    },
                    {
                        "name": "geometric_mean",
                        "lines": 11
                    },
                    {
                        "name": "harmonic_mean",
                        "lines": 22
                    },
                    {
                        "name": "linear_regression",
                        "lines": 23
                    },
                    {
                        "name": "mean",
                        "lines": 15
                    },
                    {
                        "name": "median",
                        "lines": 11
                    },
                    {
                        "name": "median_grouped",
                        "lines": 25
                    },
                    {
                        "name": "median_high",
                        "lines": 10
                    },
                    {
                        "name": "median_low",
                        "lines": 10
                    },
                    {
                        "name": "mode",
                        "lines": 21
                    },
                    {
                        "name": "multimode",
                        "lines": 12
                    },
                    {
                        "name": "pstdev",
                        "lines": 7
                    },
                    {
                        "name": "pvariance",
                        "lines": 33
                    },
                    {
                        "name": "quantiles",
                        "lines": 15
                    },
                    {
                        "name": "stdev",
                        "lines": 7
                    },
                    {
                        "name": "variance",
                        "lines": 36
                    }
                ]
            },
            {
                "name": "DATA",
                "lines": 2,
                "subsections": []
            },
            {
                "name": "FILE",
                "lines": 3,
                "subsections": []
            }
        ]
    }
}