mizani - Man Page

Name

mizani — Mizani Documentation

Mizani is a Python library that provides the pieces necessary to create scales for a graphics system. It is based on the R scales package.

Contents

bounds - Limiting data values for a palette

Continuous variables have values anywhere in the range minus infinity to plus infinity. However, when creating a visual representation of these values, what usually matters is the relative difference between the values. This is where rescaling comes into play.

The values are mapped onto a range that a scale can deal with. For graphical representation that range tends to be [0, 1] or [0, n], where n is some number beyond which the plotted object would overflow the plotting area.

Although a scale may be able to handle the [0, n] range, it may be desirable to have a lower bound greater than zero. For example, if data values get mapped to zero on a scale whose graphical representation is the size/area/radius/length, some data will be invisible. The solution is to restrict the lower bound, e.g. [0.1, 1]. Similarly, you can restrict the upper bound using these functions.

mizani.bounds.censor(x: TFloatVector, range: tuple[float, float] = (0, 1), only_finite: bool = True) -> TFloatVector

Convert any values outside of range to a NULL type object.

Parameters
x (numpy:array_like)

Values to manipulate

range (python:tuple)

(min, max) giving desired output range

only_finite (bool)

If True (the default), will only modify finite values.

Returns
x (numpy:array_like)

Censored array

Notes

All values in x should be of the same type. The only_finite parameter is ignored for datetime and timedelta types.

The NULL type object depends on the type of values in x.

  • float : float('nan')
  • int : float('nan')
  • datetime.datetime : np.datetime64('NaT')
  • datetime.timedelta : np.timedelta64('NaT')

Examples

>>> a = np.array([1, 2, np.inf, 3, 4, -np.inf, 5])
>>> censor(a, (0, 10))
array([  1.,   2.,  inf,   3.,   4., -inf,   5.])
>>> censor(a, (0, 10), False)
array([ 1.,  2., nan,  3.,  4., nan,  5.])
>>> censor(a, (2, 4))
array([ nan,   2.,  inf,   3.,   4., -inf,  nan])
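The censoring rule in the examples above can be approximated in plain Python. This is a minimal sketch for float inputs only, not mizani's implementation; the name censor_sketch and the list return type are assumptions made for illustration.

```python
import math


def censor_sketch(values, limits=(0, 1), only_finite=True):
    """Replace out-of-range numbers with NaN (hypothetical helper)."""
    low, high = limits
    out = []
    for v in values:
        outside = v < low or v > high
        # With only_finite=True, infinite values are left untouched.
        if outside and (math.isfinite(v) or not only_finite):
            out.append(float("nan"))
        else:
            out.append(float(v))
    return out


a = [1, 2, math.inf, 3, 4, -math.inf, 5]
kept = censor_sketch(a, (2, 4))                         # infinities survive
dropped = censor_sketch(a, (0, 10), only_finite=False)  # infinities become NaN
```

With only_finite left at True the infinities pass through unchanged, which mirrors the last example above.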
mizani.bounds.expand_range(range: tuple[float, float], mul: float = 0, add: float = 0, zero_width: float = 1) -> tuple[float, float]

Expand a range with a multiplicative or additive constant

Parameters
range (python:tuple)

Range of data. Size 2.

mul (python:int | python:float)

Multiplicative constant

add (python:int | python:float | timedelta)

Additive constant

zero_width (python:int | python:float | timedelta)

Distance to use if range has zero width

Returns
out (python:tuple)

Expanded range

Notes

If expanding datetime or timedelta types, add and zero_width must be suitable timedeltas, i.e. you should not mix types between Numpy, Pandas and the datetime module.

Examples

>>> expand_range((3, 8))
(3, 8)
>>> expand_range((0, 10), mul=0.1)
(-1.0, 11.0)
>>> expand_range((0, 10), add=2)
(-2, 12)
>>> expand_range((0, 10), mul=.1, add=2)
(-3.0, 13.0)
>>> expand_range((0, 1))
(0, 1)

When the range has zero width

>>> expand_range((5, 5))
(4.5, 5.5)
mizani.bounds.rescale(x: FloatArrayLike, to: tuple[float, float] = (0, 1), _from: tuple[float, float] | None = None) -> NDArrayFloat

Rescale numeric vector to have specified minimum and maximum.

Parameters
x (numpy:array_like | numeric)

1D vector of values to manipulate.

to (python:tuple)

output range (numeric vector of length two)

_from (python:tuple)

input range (numeric vector of length two). If not given, is calculated from the range of x

Returns
out (numpy:array_like)

Rescaled values

Examples

>>> x = [0, 2, 4, 6, 8, 10]
>>> rescale(x)
array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
>>> rescale(x, to=(0, 2))
array([0. , 0.4, 0.8, 1.2, 1.6, 2. ])
>>> rescale(x, to=(0, 2), _from=(0, 20))
array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
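The examples above are consistent with a plain linear map from the input range onto the output range. The sketch below spells that arithmetic out in pure Python; rescale_sketch is a hypothetical name and the code is an illustration of the formula, not mizani's implementation.

```python
def rescale_sketch(x, to=(0, 1), _from=None):
    """Linearly map x from the range _from onto the range to."""
    lo, hi = (min(x), max(x)) if _from is None else _from
    # A zero-width input range (lo == hi) would divide by zero here;
    # this sketch does not handle that case.
    return [to[0] + (v - lo) / (hi - lo) * (to[1] - to[0]) for v in x]


x = [0, 2, 4, 6, 8, 10]
default = rescale_sketch(x)
wide_input = rescale_sketch(x, to=(0, 2), _from=(0, 20))
```

Because _from=(0, 20) is twice as wide as the data, the data only reaches the middle of the output range (0, 2).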
mizani.bounds.rescale_max(x: FloatArrayLike, to: tuple[float, float] = (0, 1), _from: tuple[float, float] | None = None) -> NDArrayFloat

Rescale numeric vector to have specified maximum.

Parameters
x (numpy:array_like)

1D vector of values to manipulate.

to (python:tuple)

output range (numeric vector of length two)

_from (python:tuple)

input range (numeric vector of length two). If not given, is calculated from the range of x. Only the 2nd (max) element is essential to the output.

Returns
out (numpy:array_like)

Rescaled values

Examples

>>> x = np.array([0, 2, 4, 6, 8, 10])
>>> rescale_max(x, (0, 3))
array([0. , 0.6, 1.2, 1.8, 2.4, 3. ])

Only the 2nd (max) element of each of the parameters to and _from is essential to the output.

>>> rescale_max(x, (1, 3))
array([0. , 0.6, 1.2, 1.8, 2.4, 3. ])
>>> rescale_max(x, (0, 20))
array([ 0.,  4.,  8., 12., 16., 20.])

If max(x) > _from[1] then values will be scaled beyond the requested maximum (to[1]).

>>> rescale_max(x, to=(1, 3), _from=(-1, 6))
array([0., 1., 2., 3., 4., 5.])

If the values are the same, they take on the requested maximum. This includes an array of all zeros.

>>> rescale_max(np.array([5, 5, 5]))
array([1., 1., 1.])
>>> rescale_max(np.array([0, 0, 0]))
array([1, 1, 1])
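The outputs above follow from dividing by the input maximum and multiplying by the output maximum. The following pure-Python sketch shows that arithmetic; rescale_max_sketch is a hypothetical name, and the all-zeros special case is deliberately left out.

```python
def rescale_max_sketch(x, to=(0, 1), _from=None):
    """Scale x so that the input maximum lands on to[1]."""
    top = max(x) if _from is None else _from[1]
    # Only the upper ends of `to` and `_from` take part in the result.
    # top == 0 (for example all-zero input) is not handled in this sketch.
    return [v / top * to[1] for v in x]


x = [0, 2, 4, 6, 8, 10]
plain = rescale_max_sketch(x, (0, 3))
overshoot = rescale_max_sketch(x, to=(1, 3), _from=(-1, 6))
```

In the second call the data maximum (10) is larger than _from[1] (6), so the largest result (5) exceeds to[1] (3).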
mizani.bounds.rescale_mid(x: FloatArrayLike, to: tuple[float, float] = (0, 1), _from: tuple[float, float] | None = None, mid: float = 0) -> NDArrayFloat

Rescale numeric vector to have specified minimum, midpoint, and maximum.

Parameters
x (numpy:array_like)

1D vector of values to manipulate.

to (python:tuple)

output range (numeric vector of length two)

_from (python:tuple)

input range (numeric vector of length two). If not given, is calculated from the range of x

mid (numeric)

mid-point of input range

Returns
out (numpy:array_like)

Rescaled values

Examples

>>> rescale_mid([1, 2, 3], mid=1)
array([0.5 , 0.75, 1.  ])
>>> rescale_mid([1, 2, 3], mid=2)
array([0. , 0.5, 1. ])

rescale_mid does not have the same signature as rescale and rescale_max. In cases where we need a compatible function with the same signature, we use a closure around the extra mid argument.

>>> def rescale_mid_compat(mid):
...     def _rescale(x, to=(0, 1), _from=None):
...         return rescale_mid(x, to, _from, mid=mid)
...     return _rescale
>>> rescale_mid2 = rescale_mid_compat(mid=2)
>>> rescale_mid2([1, 2, 3])
array([0. , 0.5, 1. ])
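The midpoint examples above are consistent with centring the data on mid and scaling by the larger of the two distances from mid to the ends of the input range. The sketch below is a pure-Python illustration under that assumption; rescale_mid_sketch is a hypothetical name, not part of mizani.

```python
def rescale_mid_sketch(x, to=(0, 1), _from=None, mid=0):
    """Map x so that `mid` lands on the centre of the output range."""
    lo, hi = (min(x), max(x)) if _from is None else _from
    # Largest distance from mid to either end of the input range.
    half = max(abs(lo - mid), abs(hi - mid))
    centre = (to[0] + to[1]) / 2
    return [(v - mid) / (2 * half) * (to[1] - to[0]) + centre for v in x]


low_mid = rescale_mid_sketch([1, 2, 3], mid=1)
centre_mid = rescale_mid_sketch([1, 2, 3], mid=2)
```

With mid=1 all the data lies on one side of the midpoint, so only the upper half of the output range is used.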
mizani.bounds.squish_infinite(x: FloatArrayLike, range: tuple[float, float] = (0, 1)) -> NDArrayFloat

Truncate infinite values to a range.

Parameters
x (numpy:array_like)

Values that should have infinities squished.

range (python:tuple)

The range onto which to squish the infinities. Must be of size 2.

Returns
out (numpy:array_like)

Values with infinities squished.

Examples

>>> arr1 = np.array([0, .5, .25, np.inf, .44])
>>> arr2 = np.array([0, -np.inf, .5, .25, np.inf])
>>> squish_infinite(arr1)
array([0.  , 0.5 , 0.25, 1.  , 0.44])
>>> squish_infinite(arr2, (-10, 9))
array([  0.  , -10.  ,   0.5 ,   0.25,   9.  ])
mizani.bounds.zero_range(x: tuple[Any, Any], tol: float = 2.220446049250313e-14) -> bool

Determine if range of vector is close to zero.

Parameters
x (numpy:array_like)

Value(s) to check. If it is an array_like, it should be of length 2.

tol (python:float)

Tolerance. Default tolerance is the machine epsilon times 10^2.

Returns
out (bool)

Whether x has zero range.

Examples

>>> zero_range([1, 1])
True
>>> zero_range([1, 2])
False
>>> zero_range([1, 2], tol=2)
True
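One way to read the three examples above is as a relative-difference test against tol. The sketch below is an assumption about how such a check could look for plain numbers; it is not taken from mizani's source, and zero_range_sketch is a hypothetical name.

```python
def zero_range_sketch(limits, tol=2.220446049250313e-14):
    """Return True if the two limits are equal within a relative tolerance."""
    low, high = sorted(limits)
    if low == high:
        return True
    smallest = min(abs(low), abs(high))
    if smallest == 0:
        # A relative difference against zero is undefined; treat as non-zero range.
        return False
    return abs(high - low) / smallest < tol


same = zero_range_sketch([1, 1])
different = zero_range_sketch([1, 2])
loose = zero_range_sketch([1, 2], tol=2)
```

A tolerance of 2 is far larger than the relative difference of 1 between the limits, which is why the last call reports a zero range.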
mizani.bounds.expand_range_distinct(range: tuple[float, float], expand: tuple[float, float] | tuple[float, float, float, float] = (0, 0, 0, 0), zero_width: float = 1) -> tuple[float, float]

Expand a range with multiplicative or additive constants

Similar to expand_range(), but the two sides of the range can be expanded using different constants.

Parameters
range (python:tuple)

Range of data. Size 2

expand (python:tuple)

Length 2 or 4. If length is 2, then the same constants are used for both sides. If length is 4, then the first two are the multiplicative (mul) and additive (add) constants for the lower limit, and the second two are the constants for the upper limit.

zero_width (python:int | python:float | timedelta)

Distance to use if range has zero width

Returns
out (python:tuple)

Expanded range

Examples

>>> expand_range_distinct((3, 8))
(3, 8)
>>> expand_range_distinct((0, 10), (0.1, 0))
(-1.0, 11.0)
>>> expand_range_distinct((0, 10), (0.1, 0, 0.1, 0))
(-1.0, 11.0)
>>> expand_range_distinct((0, 10), (0.1, 0, 0, 0))
(-1.0, 10)
>>> expand_range_distinct((0, 10), (0, 2))
(-2, 12)
>>> expand_range_distinct((0, 10), (0, 2, 0, 2))
(-2, 12)
>>> expand_range_distinct((0, 10), (0, 0, 0, 2))
(0, 12)
>>> expand_range_distinct((0, 10), (.1, 2))
(-3.0, 13.0)
>>> expand_range_distinct((0, 10), (.1, 2, .1, 2))
(-3.0, 13.0)
>>> expand_range_distinct((0, 10), (0, 0, .1, 2))
(0, 13.0)
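The arithmetic behind the examples above can be written out directly: each side moves outward by width * mul + add. The following is a minimal pure-Python sketch for numeric ranges; expand_range_distinct_sketch is a hypothetical name, and the zero-width branch (centring an interval of size zero_width on the value) is an assumption.

```python
def expand_range_distinct_sketch(limits, expand=(0, 0, 0, 0), zero_width=1):
    """Expand each side of a numeric range by its own (mul, add) pair."""
    if len(expand) == 2:
        # Same constants on both sides.
        expand = tuple(expand) * 2
    low, high = limits
    if low == high:
        # Assumed handling of a zero-width range.
        return (low - zero_width / 2, high + zero_width / 2)
    width = high - low
    mul_lo, add_lo, mul_hi, add_hi = expand
    return (low - (width * mul_lo + add_lo), high + (width * mul_hi + add_hi))


both = expand_range_distinct_sketch((0, 10), (0.1, 0))
upper_only = expand_range_distinct_sketch((0, 10), (0, 0, 0.1, 2))
```

Passing a length-2 tuple is equivalent to repeating it, so (0.1, 0) and (0.1, 0, 0.1, 0) give the same result.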
mizani.bounds.squish(x: FloatArrayLike, range: tuple[float, float] = (0, 1), only_finite: bool = True) -> NDArrayFloat

Squish values into range.

Parameters
x (numpy:array_like)

Values that should have out of range values squished.

range (python:tuple)

The range onto which to squish the values.

only_finite (bool)

If True (the default), only finite values are squished.

Returns
out (numpy:array_like)

Values with out of range values squished.

Examples

>>> squish([-1.5, 0.2, 0.8, 1.0, 1.2])
array([0. , 0.2, 0.8, 1. , 1. ])
>>> squish([-np.inf, -1.5, 0.2, 0.8, 1.0, np.inf], only_finite=False)
array([0. , 0. , 0.2, 0.8, 1. , 1. ])
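Squishing amounts to clamping each value to the nearest end of the range. The sketch below shows the idea in pure Python, including the only_finite switch; squish_sketch is a hypothetical name and the code is an illustration rather than mizani's implementation.

```python
import math


def squish_sketch(x, limits=(0, 1), only_finite=True):
    """Clamp values to limits, optionally leaving infinities alone."""
    low, high = limits
    out = []
    for v in x:
        if only_finite and math.isinf(v):
            out.append(v)  # infinities are kept as they are
        else:
            out.append(min(max(v, low), high))
    return out


finite = squish_sketch([-1.5, 0.2, 0.8, 1.0, 1.2])
everything = squish_sketch([-math.inf, -1.5, 0.2, 0.8, 1.0, math.inf], only_finite=False)
```

Unlike censoring, no value is discarded: every out-of-range value ends up on one of the two limits.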

breaks - Partitioning a scale for readability

All scales have a means by which the values that are mapped onto the scale are interpreted. Numeric digital scales put out numbers for direct interpretation, but most scales cannot do this. What they offer instead are named markers/ticks that aid in assessing the values, e.g. a common speedometer has ticks and values to help gauge the speed of the vehicle.

The named markers are what we call breaks. Properly calculated breaks make interpretation straightforward. These functions provide ways to calculate good (hopefully) breaks.

class mizani.breaks.breaks_log(n: int = 5, base: float = 10)

Integer breaks on log transformed scales

Parameters
n (python:int)

Desired number of breaks

base (python:int)

Base of logarithm

Examples

>>> x = np.logspace(3, 6)
>>> limits = min(x), max(x)
>>> breaks_log()(limits)
array([     1000,    10000,   100000,  1000000])
>>> breaks_log(2)(limits)
array([  1000, 100000])
>>> breaks_log()([0.1, 1])
array([0.1, 0.3, 1. , 3. ])
__call__(limits: tuple[float, float]) -> NDArrayFloat

Compute breaks

Parameters
limits (python:tuple)

Minimum and maximum values

Returns
out (numpy:array_like)

Sequence of break points

class mizani.breaks.breaks_symlog

Breaks for the Symmetric Logarithm Transform

Examples

>>> limits = (-100, 100)
>>> breaks_symlog()(limits)
array([-100,  -10,    0,   10,  100])
__call__(limits: tuple[float, float]) -> NDArrayFloat

Call self as a function.

class mizani.breaks.minor_breaks(n: int = 1)

Compute minor breaks

This is the naive method. It does not take into account the transformation.

Parameters
n (python:int)

Number of minor breaks between the major breaks.

Examples

>>> major = [1, 2, 3, 4]
>>> limits = [0, 5]
>>> minor_breaks()(major, limits)
array([0.5, 1.5, 2.5, 3.5, 4.5])
>>> minor_breaks()([1, 2], (1, 2))
array([1.5])

More than 1 minor break.

>>> minor_breaks(3)([1, 2], (1, 2))
array([1.25, 1.5 , 1.75])
>>> minor_breaks()([1, 2], (1, 2), 3)
array([1.25, 1.5 , 1.75])
__call__(major: FloatArrayLike, limits: tuple[float, float] | None = None, n: int | None = None) -> NDArrayFloat

Minor breaks

Parameters
major (numpy:array_like)

Major breaks

limits (numpy:array_like | python:None)

Limits of the scale. If array_like, must be of size 2. If None, then the minimum and maximum of the major breaks are used.

n (python:int)

Number of minor breaks between the major breaks. If None, then self.n is used.

Returns
out (numpy:array_like)

Minor breaks
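The naive method can be pictured as placing n evenly spaced points in every gap between major breaks, including the gaps next to the limits. The sketch below reproduces the examples above under the assumption that the major breaks are evenly spaced; minor_breaks_sketch is a hypothetical name, not mizani's code.

```python
def minor_breaks_sketch(major, limits=None, n=1):
    """Place n evenly spaced minor breaks between consecutive major breaks."""
    if limits is None:
        limits = (min(major), max(major))
    step = major[1] - major[0]  # assumes at least two evenly spaced majors
    # Pad one extra major break on each side so gaps near the limits are covered.
    padded = [major[0] - step, *major, major[-1] + step]
    minor = []
    for left in padded[:-1]:
        for k in range(1, n + 1):
            value = left + step * k / (n + 1)
            if limits[0] <= value <= limits[1]:
                minor.append(value)
    return minor


single = minor_breaks_sketch([1, 2, 3, 4], [0, 5])
triple = minor_breaks_sketch([1, 2], (1, 2), n=3)
```

Values that fall outside the limits are dropped, which is why the second call returns only the three breaks between 1 and 2.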

class mizani.breaks.minor_breaks_trans(trans: Trans, n: int = 1)

Compute minor breaks for transformed scales

The minor breaks are computed in data space. This together with major breaks computed in transform space reveals the non-linearity of a scale. See the log transforms created with log_trans() like log10_trans.

Parameters
trans (trans or type)

Trans object or trans class.

n (python:int)

Number of minor breaks between the major breaks.

Examples

>>> from mizani.transforms import sqrt_trans
>>> major = [1, 2, 3, 4]
>>> limits = [0, 5]
>>> t1 = sqrt_trans()
>>> t1.minor_breaks(major, limits)
array([1.58113883, 2.54950976, 3.53553391])

Changing to the regular minor_breaks method

>>> t2 = sqrt_trans()
>>> t2.minor_breaks = minor_breaks()
>>> t2.minor_breaks(major, limits)
array([0.5, 1.5, 2.5, 3.5, 4.5])

More than 1 minor break

>>> major = [1, 10]
>>> limits = [1, 10]
>>> t2.minor_breaks(major, limits, 4)
array([2.8, 4.6, 6.4, 8.2])
__call__(major: FloatArrayLike, limits: tuple[float, float] | None = None, n: int | None = None) -> NDArrayFloat

Minor breaks for transformed scales

Parameters
major (numpy:array_like)

Major breaks

limits (numpy:array_like | python:None)

Limits of the scale. If array_like, must be of size 2. If None, then the minimum and maximum of the major breaks are used.

n (python:int)

Number of minor breaks between the major breaks. If None, then self.n is used.

Returns
out (numpy:array_like)

Minor breaks

class mizani.breaks.breaks_date(n: int = 5, *, width: str | None = None)

Regularly spaced dates

Parameters
n

Desired number of breaks.

width (python:str | python:None)

An interval specification. Must be one of [second, minute, hour, day, week, month, year]. If None, the interval is determined automatically.

Examples

>>> from datetime import datetime
>>> limits = (datetime(2010, 1, 1), datetime(2026, 1, 1))

Default breaks will be regularly spaced but the spacing is automatically determined

>>> breaks = breaks_date(9)
>>> [d.year for d in breaks(limits)]
[2010, 2012, 2014, 2016, 2018, 2020, 2022, 2024, 2026]

Breaks at 4 year intervals

>>> breaks = breaks_date(width='4 year')
>>> [d.year for d in breaks(limits)]
[2010, 2014, 2018, 2022, 2026]
__call__(limits: tuple[datetime, datetime] | tuple[date, date]) -> Sequence[datetime]

Compute breaks

Parameters
limits (python:tuple)

Minimum and maximum datetime.datetime values.

Returns
out (numpy:array_like)

Sequence of break points.

class mizani.breaks.breaks_timedelta(n: int = 5, Q: Sequence[float] = (1, 2, 5, 10))

Timedelta breaks

Returns
out (python:callable() f(limits))

A function that takes a sequence of two datetime.timedelta values and returns a sequence of break points.

Examples

>>> from datetime import timedelta
>>> breaks = breaks_timedelta()
>>> x = [timedelta(days=i*365) for i in range(25)]
>>> limits = min(x), max(x)
>>> major = breaks(limits)
>>> [val.total_seconds() / (365*24*60*60) for val in major]
[0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
__call__(limits: tuple[Timedelta, Timedelta]) -> TimedeltaArrayLike

Compute breaks

Parameters
limits (python:tuple)

Minimum and maximum datetime.timedelta values.

Returns
out (numpy:array_like)

Sequence of break points.

class mizani.breaks.breaks_extended(n: int = 5, Q: Sequence[float] = (1, 5, 2, 2.5, 4, 3), only_inside: bool = False, w: Sequence[float] = (0.25, 0.2, 0.5, 0.05))

An extension of Wilkinson's tick position algorithm

Parameters
n (python:int)

Desired number of breaks

Q (python:list)

List of nice numbers

only_inside (bool)

If True, then all the breaks will be within the given range.

w (python:list)

Weights applied to the four optimization components (simplicity, coverage, density, and legibility). They should add up to 1.

References

  • Talbot, J., Lin, S., Hanrahan, P. (2010) An Extension of Wilkinson's Algorithm for Positioning Tick Labels on Axes, InfoVis 2010.

Additional Credit to Justin Talbot on whose code this implementation is almost entirely based.

Examples

>>> limits = (0, 9)
>>> breaks_extended()(limits)
array([  0. ,   2.5,   5. ,   7.5,  10. ])
>>> breaks_extended(n=6)(limits)
array([  0.,   2.,   4.,   6.,   8.,  10.])
__call__(limits: tuple[float, float]) -> NDArrayFloat

Calculate the breaks

Parameters
limits (array)

Minimum and maximum values.

Returns
out (numpy:array_like)

Sequence of break points.

labels - Labelling breaks

Scales have guides and these are what help users make sense of the data mapped onto the scale. Common examples of guides include the x-axis, the y-axis, the keyed legend and a colorbar legend. The guides have demarcations (breaks), some of which must be labelled.

The label_* functions below create functions that convert data values as understood by a specific scale and return string representations of those values. Manipulating the string representation of a value helps improve readability of the guide.

class mizani.labels.label_comma(accuracy: float | None = None, precision: int = 0, scale: float = 1, prefix: str = '', suffix: str = '', big_mark: str = ',', decimal_mark: str = '.', fill: str = '', style_negative: Literal['-', 'hyphen', 'parens'] = '-', style_positive: Literal['', '+', ' '] = '', align: Literal['<', '>', '=', '^'] = '>', width: int | None = None)

Labels of numbers with commas as separators

Parameters
precision (python:int)

Number of digits after the decimal point.

Examples

>>> label_comma()([1000, 2, 33000, 400])
['1,000', '2', '33,000', '400']
class mizani.labels.label_custom(fmt: str = '{}', style: Literal['old', 'new'] = 'new')

Creating a custom labelling function

Parameters
fmt (python:str, optional)

Format string. Default is the generic new style format braces, {}.

style ('new' | 'old')

Whether to use new style or old style formatting. New style uses the str.format() while old style uses %. The format string must be written accordingly.

Examples

>>> label = label_custom('{:.2f} USD')
>>> label([3.987, 2, 42.42])
['3.99 USD', '2.00 USD', '42.42 USD']
__call__(x: FloatArrayLike) -> Sequence[str]

Format a sequence of inputs

Parameters
x (array)

Input

Returns
out (python:list)

List of strings.
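The two formatting styles correspond to Python's str.format() and the % operator. The sketch below shows the same label written both ways; label_custom_sketch is a hypothetical stand-in written for illustration and is not mizani's class.

```python
def label_custom_sketch(fmt="{}", style="new"):
    """Return a function that formats each value with fmt."""
    if style == "new":
        return lambda xs: [fmt.format(x) for x in xs]
    if style == "old":
        return lambda xs: [fmt % x for x in xs]
    raise ValueError("style must be 'new' or 'old'")


values = [3.987, 2, 42.42]
new_style = label_custom_sketch("{:.2f} USD")(values)
old_style = label_custom_sketch("%.2f USD", style="old")(values)
```

Both calls produce the same labels; only the syntax of the format string differs.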

class mizani.labels.label_currency(accuracy: float | None = None, precision: int | None = None, scale: float = 1, prefix: str = '$', suffix: str = '', big_mark: str = '', decimal_mark: str = '.', fill: str = '', style_negative: Literal['-', 'hyphen', 'parens'] = '-', style_positive: Literal['', '+', ' '] = '', align: Literal['<', '>', '=', '^'] = '>', width: int | None = None)

Labelling currencies

Parameters
prefix (python:str)

What to put before the value.

Examples

>>> x = [1.232, 99.2334, 4.6, 9, 4500]
>>> label_currency()(x)
['$1.23', '$99.23', '$4.60', '$9.00', '$4500.00']
>>> label_currency(prefix='C$', precision=0, big_mark=',')(x)
['C$1', 'C$99', 'C$5', 'C$9', 'C$4,500']
mizani.labels.label_dollar

alias of label_currency

class mizani.labels.label_percent(accuracy: float | None = None, precision: int | None = None, scale: float = 100, prefix: str = '', suffix: str = '%', big_mark: str = '', decimal_mark: str = '.', fill: str = '', style_negative: Literal['-', 'hyphen', 'parens'] = '-', style_positive: Literal['', '+', ' '] = '', align: Literal['<', '>', '=', '^'] = '>', width: int | None = None)

Labelling percentages

Multiply by one hundred and display percent sign

Examples

>>> label = label_percent()
>>> label([.45, 9.515, .01])
['45%', '952%', '1%']
>>> label([.654, .8963, .1])
['65%', '90%', '10%']
class mizani.labels.label_scientific(digits: int = 3)

Scientific number labels

Parameters
digits (python:int)

Significant digits.

Notes

Be careful when using many digits (15+ on a 64 bit computer). Consider the machine epsilon.

Examples

>>> x = [.12, .23, .34, 45]
>>> label_scientific()(x)
['1.2e-01', '2.3e-01', '3.4e-01', '4.5e+01']
__call__(x: FloatArrayLike) -> Sequence[str]

Call self as a function.

class mizani.labels.label_date(fmt: str = '%Y-%m-%d', tz: tzinfo | None = None)

Datetime labels

Parameters
fmt (python:str)

Format string. See strftime.

tz (datetime.tzinfo, optional)

Time zone information. If none is specified, the time zone will be that of the first date. If the first date has no time zone information then a time zone is chosen by other means.

Examples

>>> from datetime import datetime
>>> x = [datetime(x, 1, 1) for x in [2010, 2014, 2018, 2022]]
>>> label_date()(x)
['2010-01-01', '2014-01-01', '2018-01-01', '2022-01-01']
>>> label_date('%Y')(x)
['2010', '2014', '2018', '2022']

Can format time

>>> x = [datetime(2017, 12, 1, 16, 5, 7)]
>>> label_date("%Y-%m-%d %H:%M:%S")(x)
['2017-12-01 16:05:07']

Time zones are respected

>>> from zoneinfo import ZoneInfo
>>> UTC = ZoneInfo('UTC')
>>> UG = ZoneInfo('Africa/Kampala')
>>> x = [datetime(2010, 1, 1, i) for i in [8, 15]]
>>> x_tz = [datetime(2010, 1, 1, i, tzinfo=UG) for i in [8, 15]]
>>> label_date('%Y-%m-%d %H:%M')(x)
['2010-01-01 08:00', '2010-01-01 15:00']
>>> label_date('%Y-%m-%d %H:%M')(x_tz)
['2010-01-01 08:00', '2010-01-01 15:00']

Format with a specific time zone

>>> label_date('%Y-%m-%d %H:%M', tz=UTC)(x_tz)
['2010-01-01 05:00', '2010-01-01 12:00']
>>> label_date('%Y-%m-%d %H:%M', tz='EST')(x_tz)
['2010-01-01 00:00', '2010-01-01 07:00']
__call__(x: Sequence[datetime]) -> Sequence[str]

Format a sequence of inputs

Parameters
x (array)

Input

Returns
out (python:list)

List of strings.

class mizani.labels.label_number(accuracy: float | None = None, precision: int | None = None, scale: float = 1, prefix: str = '', suffix: str = '', big_mark: str = '', decimal_mark: str = '.', fill: str = '', style_negative: Literal['-', 'hyphen', 'parens'] = '-', style_positive: Literal['', '+', ' '] = '', align: Literal['<', '>', '=', '^'] = '>', width: int | None = None)

Labelling numbers

Parameters
precision (python:int)

Number of digits after the decimal point.

suffix (python:str)

What to put after the value.

big_mark (python:str)

The thousands separator. This is usually a comma or a dot.

decimal_mark (python:str)

What to use to separate the decimal digits.

Examples

>>> label_number()([.654, .8963, .1])
['0.65', '0.90', '0.10']
>>> label_number(accuracy=0.0001)([.654, .8963, .1])
['0.6540', '0.8963', '0.1000']
>>> label_number(precision=4)([.654, .8963, .1])
['0.6540', '0.8963', '0.1000']
>>> label_number(prefix="$")([5, 24, -42])
['$5', '$24', '-$42']
>>> label_number(suffix="s")([5, 24, -42])
['5s', '24s', '-42s']
>>> label_number(big_mark="_")([1e3, 1e4, 1e5, 1e6])
['1_000', '10_000', '100_000', '1_000_000']
>>> label_number(width=3)([1, 10, 100, 1000])
['  1', ' 10', '100', '1000']
>>> label_number(align="^", width=5)([1, 10, 100, 1000])
['  1  ', ' 10  ', ' 100 ', '1000 ']
>>> label_number(style_positive=" ")([5, 24, -42])
[' 5', ' 24', '-42']
>>> label_number(style_positive="+")([5, 24, -42])
['+5', '+24', '-42']
>>> label_number(prefix="$", style_negative="parens")([5, 24, -42])
['$5', '$24', '($42)']
__call__(x: FloatArrayLike) -> Sequence[str]

Call self as a function.

class mizani.labels.label_log(base: float = 10, exponent_limits: tuple[int, int] = (-4, 4), mathtex: bool = False)

Log number labels

Parameters
base (python:int)

Base of the logarithm. Default is 10.

exponent_limits (python:tuple)

limits (int, int) where, if any of the powers of the numbers falls outside, the labels will be in exponent form. This only applies for base 10.

mathtex (bool)

If True, return the labels in mathtex format as understood by Matplotlib.

Examples

>>> label_log()([0.001, 0.1, 100])
['0.001', '0.1', '100']
>>> label_log()([0.0001, 0.1, 10000])
['1e-4', '1e-1', '1e4']
>>> label_log(mathtex=True)([0.0001, 0.1, 10000])
['$10^{-4}$', '$10^{-1}$', '$10^{4}$']
__call__(x: FloatArrayLike) -> Sequence[str]

Format a sequence of inputs

Parameters
x (array)

Input

Returns
out (python:list)

List of strings.

class mizani.labels.label_timedelta(units: DurationUnit | None = None, show_units: bool = True, zero_has_units: bool = True, usetex: bool = False, space: bool = True, use_plurals: bool = True)

Timedelta labels

Parameters
units (python:str, optional)

The units in which the breaks will be computed. If None, they are decided automatically. Otherwise, the value should be one of:

'ns'    # nanoseconds
'us'    # microseconds
'ms'    # milliseconds
's'     # seconds
'min'   # minute
'h'     # hour
'day'   # day
'week'  # week
'month' # month
'year'  # year
show_units (bool)

Whether to append the units symbol to the values.

zero_has_units (bool)

If True, a value of zero is labelled with units.

usetex (bool)

If True, the microseconds identifier string is rendered with the Greek letter mu. Default is False.

space (bool)

If True add a space between the value and the units

use_plurals (bool)

If True, when the value is not 1 and the units are one of week, month and year, the plural form of the unit is used, e.g. 2 weeks.

Examples

>>> from datetime import timedelta
>>> x = [timedelta(days=31*i) for i in range(5)]
>>> label_timedelta()(x)
['0 months', '1 month', '2 months', '3 months', '4 months']
>>> label_timedelta(use_plurals=False)(x)
['0 month', '1 month', '2 month', '3 month', '4 month']
>>> label_timedelta(units='day')(x)
['0 days', '31 days', '62 days', '93 days', '124 days']
>>> label_timedelta(units='day', zero_has_units=False)(x)
['0', '31 days', '62 days', '93 days', '124 days']
>>> label_timedelta(units='day', show_units=False)(x)
['0', '31', '62', '93', '124']
__call__(x: TimedeltaArrayLike) -> Sequence[str]

Call self as a function.

class mizani.labels.label_pvalue(accuracy: float = 0.001, add_p: float = False)

p-values labelling

Parameters
accuracy (python:float)

Number to round to

add_p (bool)

Whether to prepend "p=" or "p<" to the output

Examples

>>> x = [.90, .15, .015, .009, 0.0005]
>>> label_pvalue()(x)
['0.9', '0.15', '0.015', '0.009', '<0.001']
>>> label_pvalue(0.1)(x)
['0.9', '0.1', '<0.1', '<0.1', '<0.1']
>>> label_pvalue(0.1, True)(x)
['p=0.9', 'p=0.1', 'p<0.1', 'p<0.1', 'p<0.1']
__call__(x: FloatArrayLike) -> Sequence[str]

Format a sequence of inputs

Parameters
x (array)

Input

Returns
out (python:list)

List of strings.
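The examples above suggest a simple rule: values below accuracy are reported as "<accuracy", and the rest are rounded to the number of decimals implied by accuracy. The sketch below implements that rule in pure Python; label_pvalue_sketch is a hypothetical name and the rounding details are an assumption, not mizani's implementation.

```python
import math


def label_pvalue_sketch(accuracy=0.001, add_p=False):
    """Return a function that formats p-values to the given accuracy."""
    # Number of decimals implied by accuracy, e.g. 0.001 -> 3.
    digits = max(0, -math.floor(math.log10(accuracy)))

    def label(xs):
        out = []
        for x in xs:
            if x < accuracy:
                out.append(("p<" if add_p else "<") + format(accuracy, "g"))
            else:
                out.append(("p=" if add_p else "") + format(round(x, digits), "g"))
        return out

    return label


x = [.90, .15, .015, .009, 0.0005]
fine = label_pvalue_sketch()(x)
coarse = label_pvalue_sketch(0.1, add_p=True)(x)
```

The "g" format drops trailing zeros, so 0.9 is shown as '0.9' rather than '0.900'.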

class mizani.labels.label_ordinal(prefix: str = '', suffix: str = '', big_mark: str = '')

Ordinal number labelling

Parameters
prefix (python:str)

What to put before the value.

suffix (python:str)

What to put after the value.

big_mark (python:str)

The thousands separator. This is usually a comma or a dot.

Examples

>>> label_ordinal()(range(8))
['0th', '1st', '2nd', '3rd', '4th', '5th', '6th', '7th']
>>> label_ordinal(suffix=' Number')(range(11, 15))
['11th Number', '12th Number', '13th Number', '14th Number']
__call__(x: FloatArrayLike) -> Sequence[str]

Call self as a function.
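The English ordinal suffix depends on the last digit, except that numbers ending in 11, 12 and 13 take "th". The sketch below shows that rule in pure Python; ordinal_sketch is a hypothetical helper for illustration and ignores the prefix, suffix and big_mark options.

```python
def ordinal_sketch(n):
    """Return n with its English ordinal suffix, e.g. 2 -> '2nd'."""
    # 11, 12 and 13 (and 111, 112, ...) take "th" despite their last digit.
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


first_eight = [ordinal_sketch(i) for i in range(8)]
teens = [ordinal_sketch(i) for i in range(11, 15)]
```

The teens case is the one that a naive last-digit lookup gets wrong.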

class mizani.labels.label_bytes(symbol: Literal['auto'] | BytesSymbol = 'auto', units: Literal['binary', 'si'] = 'binary', fmt: str = '{:.0f} ')

Labelling byte numbers

Parameters
symbol (python:str)

Valid symbols are "B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", and "YB" for SI units, and the "iB" variants for binary units. Default is "auto" where the symbol to be used is determined separately for each value of x.

units ("binary" | "si")

Which unit base to use, 1024 for "binary" or 1000 for "si".

fmt (python:str, optional)

Format string. Default is {:.0f}.

Examples

>>> x = [1000, 1000000, 4e5]
>>> label_bytes()(x)
['1000 B', '977 KiB', '391 KiB']
>>> label_bytes(units='si')(x)
['1 kB', '1 MB', '400 kB']
__call__(x: FloatArrayLike) -> Sequence[str]

Call self as a function.
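The difference between the two unit systems in the examples above comes down to dividing by powers of 1024 or 1000. The following pure-Python sketch of the "auto" symbol choice is an illustration only; label_bytes_sketch is a hypothetical name and the unit-selection rule is an assumption that happens to reproduce the examples.

```python
def label_bytes_sketch(values, units="binary", fmt="{:.0f} "):
    """Format byte counts, picking a unit separately for each value."""
    if units == "binary":
        base = 1024
        symbols = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]
    else:
        base = 1000
        symbols = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
    labels = []
    for v in values:
        power = 0
        # Pick the largest unit that keeps the scaled value at or above 1.
        while abs(v) >= base ** (power + 1) and power < len(symbols) - 1:
            power += 1
        labels.append(fmt.format(v / base**power) + symbols[power])
    return labels


x = [1000, 1000000, 4e5]
binary = label_bytes_sketch(x)
si = label_bytes_sketch(x, units="si")
```

A value of 1000 stays in bytes under binary units because it is below 1024, but becomes 1 kB under SI units.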

palettes - Mapping values onto the domain of a scale

Palettes are the link between data values and the values along the dimension of a scale. Before a collection of values can be represented on a scale, they are transformed by a palette. This transformation is known as mapping. Values are mapped onto a scale by a palette.

Scales tend to have restrictions on the magnitude of quantities that they can intelligibly represent. For example, the size of a point should be significantly smaller than the plot panel onto which it is plotted or else it would be hard to compare two or more points. Therefore palettes must be created that enforce such restrictions. This is the reason for the *_pal functions that create and return the actual palette functions.

mizani.palettes.hls_palette(n_colors: int = 6, h: float = 0.01, l: float = 0.6, s: float = 0.65) -> Sequence[tuple[float, float, float]]

Get a set of evenly spaced colors in HLS hue space.

h, l, and s should be between 0 and 1

Parameters
n_colors (python:int)

number of colors in the palette

h (python:float)

first hue

l (python:float)

lightness

s (python:float)

saturation

Returns
palette (python:list)

List of colors as RGB tuples of floats.

SEE ALSO:

hsluv_palette

Make a palette using evenly spaced circular hues in the HSLuv system.

Examples

>>> len(hls_palette(2))
2
>>> len(hls_palette(9))
9
mizani.palettes.hsluv_palette(n_colors: int = 6, h: float = 0.01, s: float = 0.9, l: float = 0.65) -> Sequence[tuple[float, float, float]]

Get a set of evenly spaced colors in HSLuv hue space.

h, s, and l should be between 0 and 1

Parameters
n_colors (python:int)

number of colors in the palette

h (python:float)

first hue

s (python:float)

saturation

l (python:float)

lightness

Returns
palette (python:list)

List of colors as RGB tuples of floats.

SEE ALSO:

hls_palette

Make a palette using evenly spaced circular hues in the HSL system.

Examples

>>> len(hsluv_palette(3))
3
>>> len(hsluv_palette(11))
11
class mizani.palettes.rescale_pal(range: tuple[float, float] = (0.1, 1))

Rescale the input to the specific output range.

Useful for alpha, size, and continuous position.

Parameters
range (python:tuple)

Range of the scale

Returns
out (function)

Palette function that takes a sequence of values in the range [0, 1] and returns values in the specified range.

Examples

>>> palette = rescale_pal()
>>> palette([0, .2, .4, .6, .8, 1])
array([0.1 , 0.28, 0.46, 0.64, 0.82, 1.  ])

The returned palette expects inputs in the [0, 1] range. Any value outside those limits is clipped to range[0] or range[1].

>>> palette([-2, -1, 0.2, .4, .8, 2, 3])
array([0.1 , 0.1 , 0.28, 0.46, 0.82, 1.  , 1.  ])
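The clipping and rescaling described above can be sketched as a closure that first clamps the input to [0, 1] and then maps it linearly onto the output range. This is a pure-Python illustration; rescale_pal_sketch is a hypothetical name, not mizani's class.

```python
def rescale_pal_sketch(limits=(0.1, 1)):
    """Return a palette mapping values in [0, 1] onto limits."""
    low, high = limits

    def palette(xs):
        # Clamp to [0, 1] first, then apply the linear map.
        clipped = [min(max(x, 0), 1) for x in xs]
        return [low + c * (high - low) for c in clipped]

    return palette


palette = rescale_pal_sketch()
inside = palette([0, .2, .4, .6, .8, 1])
outside = palette([-2, -1, 0.2, .4, .8, 2, 3])
```

Creating the palette once and calling it many times is the pattern shared by the *_pal utilities in this section.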
class mizani.palettes.area_pal(range: tuple[float, float] = (1, 6))

Point area palette (continuous).

Parameters
range (python:tuple)

Numeric vector of length two, giving range of possible sizes. Should be greater than 0.

Returns
out (function)

Palette function that takes a sequence of values in the range [0, 1] and returns values in the specified range.

Examples

>>> x = np.arange(0, .6, .1)**2
>>> palette = area_pal()
>>> palette(x)
array([1. , 1.5, 2. , 2.5, 3. , 3.5])

The results are equidistant because the input x is in area space, i.e. it is squared.
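The equidistant output follows if the palette takes the square root of its input before rescaling. The sketch below illustrates that in pure Python; area_pal_sketch is a hypothetical name, and the square-root step is inferred from the example rather than taken from mizani's source.

```python
import math


def area_pal_sketch(limits=(1, 6)):
    """Return a palette that maps sqrt(x), for x in [0, 1], onto limits."""
    low, high = limits
    return lambda xs: [low + math.sqrt(x) * (high - low) for x in xs]


# Squared inputs, so the square roots are evenly spaced: 0, 0.1, ..., 0.5
x = [(0.1 * i) ** 2 for i in range(6)]
sizes = area_pal_sketch()(x)
```

Because the square root undoes the squaring, the sizes step evenly from 1 to 3.5.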

class mizani.palettes.abs_area(max: float)

Point area palette (continuous), with area proportional to value.

Parameters
max (python:float)

A number representing the maximum size

Returns
out (function)

Palette function that takes a sequence of values in the range [0, 1] and returns values in the range [0, max].

Examples

>>> x = np.arange(0, .8, .1)**2
>>> palette = abs_area(5)
>>> palette(x)
array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5])

Compared to area_pal(), abs_area() will handle values in the range [-1, 0] without returning np.nan. Values whose absolute value is greater than 1 will be clipped to the maximum.

class mizani.palettes.grey_pal(start: float = 0.2, end: float = 0.8)

Utility for creating continuous grey scale palette

Parameters
start (python:float)

grey value at low end of palette

end (python:float)

grey value at high end of palette

Returns
out (function)

Continuous color palette that takes a single int parameter n and returns n equally spaced colors.

Examples

>>> palette = grey_pal()
>>> palette(5)
['#333333', '#737373', '#989898', '#b4b4b4', '#cccccc']
class mizani.palettes.hue_pal(h: float = 0.01, l: float = 0.6, s: float = 0.65, color_space: Literal['hls', 'hsluv'] = 'hls')

Utility for making hue palettes for color schemes.

Parameters
h (python:float)

first hue. In the [0, 1] range

l (python:float)

lightness. In the [0, 1] range

s (python:float)

saturation. In the [0, 1] range

color_space ('hls' | 'hsluv')

Color space to use for the palette. hls for https://en.wikipedia.org/wiki/HSL_and_HSV or hsluv for https://www.hsluv.org/.

Returns
out (function)

A discrete color palette that takes a single int parameter n and returns n equally spaced colors. Though the palette is continuous, it varies the hue, which makes it good for categorical data. However, if n is large enough the colors show continuity.

Examples

>>> hue_pal()(5)
['#db5f57', '#b9db57', '#57db94', '#5784db', '#c957db']
>>> hue_pal(color_space='hsluv')(5)
['#e0697e', '#9b9054', '#569d79', '#5b98ab', '#b675d7']
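The idea of equally spaced hues can be sketched with the standard library colorsys module. This is a rough illustration for the 'hls' color space only, not mizani's implementation; the helper name hue_palette_sketch and the rounding to 8-bit channels are assumptions.

```python
import colorsys


def hue_palette_sketch(n, h=0.01, lightness=0.6, saturation=0.65):
    """Return n hex colors whose hues are evenly spaced around the color wheel."""
    colors = []
    for i in range(n):
        hue = (h + i / n) % 1  # step the hue, wrapping around at 1
        r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
        # Assumed conversion: scale each channel to 0-255 and round.
        colors.append(
            "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
        )
    return colors


colors = hue_palette_sketch(5)
print(colors)
```

With the default arguments this sketch yields the same five colors as the hue_pal()(5) example above. Because only the hue changes, neighbouring colors are easy to tell apart for small n and blend into a continuum as n grows.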
class mizani.palettes.brewer_pal(type: ColorScheme | ColorSchemeShort = 'seq', palette: int | str = 1, direction: Literal[1, -1] = 1)

Utility for making a brewer palette

Parameters
type ('sequential' | 'qualitative' | 'diverging')

Type of palette. Sequential, Qualitative or Diverging. The following abbreviations may be used, seq, qual or div.

palette (python:int | python:str)

Which palette to choose from. If it is an integer, it must be in the range [0, m], where m depends on the number of sequential, qualitative or diverging palettes. If it is a string, then it is the name of the palette.

direction (python:int)

The order of colors in the scale. If -1, the order of colors is reversed. The default is 1.

Returns
out (function)

A color palette that takes a single int parameter n and returns n colors. The maximum value of n varies depending on the parameters.

Examples

>>> brewer_pal()(5)
['#EFF3FF', '#BDD7E7', '#6BAED6', '#3182BD', '#08519C']
>>> brewer_pal('qual')(5)
['#7FC97F', '#BEAED4', '#FDC086', '#FFFF99', '#386CB0']
>>> brewer_pal('qual', 2)(5)
['#1B9E77', '#D95F02', '#7570B3', '#E7298A', '#66A61E']
>>> brewer_pal('seq', 'PuBuGn')(5)
['#F6EFF7', '#BDC9E1', '#67A9CF', '#1C9099', '#016C59']

The available color names for each palette type can be obtained using the following code:

from mizani._colors.brewer import get_palette_names

print(get_palette_names("sequential"))
print(get_palette_names("qualitative"))
print(get_palette_names("diverging"))
class mizani.palettes.gradient_n_pal(colors: Sequence[str], values: Sequence[float] | None = None)

Create an n-color gradient palette

Parameters
colors (python:list)

list of colors

values (python:list, optional)

list of points in the range [0, 1] at which to place each color. Must be the same size as colors. Defaults to evenly spaced colors.

Returns
out (function)

Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps the value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].

Examples

>>> palette = gradient_n_pal(['red', 'blue'])
>>> palette([0, .25, .5, .75, 1])
['#ff0000', '#bf0040', '#7f0080', '#4000bf', '#0000ff']
>>> palette([-np.inf, 0, np.nan, 1, np.inf])
[None, '#ff0000', None, '#0000ff', None]
class mizani.palettes.cmap_pal(name: str)

Create a continuous palette using a colormap

Parameters
name (python:str)

Name of colormap

Returns
out (function)

Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps the value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].

Examples

>>> palette = cmap_pal('viridis')
>>> palette([.1, .2, .3, .4, .5])
['#482475', '#414487', '#355f8d', '#2a788e', '#21918c']
class mizani.palettes.cmap_d_pal(name: str)

Create a discrete palette from a colormap

Parameters
name (python:str)

Name of colormap

Returns
out (function)

A discrete color palette that takes a single int parameter n and returns n colors. The maximum value of n varies depending on the parameters.

Examples

>>> palette = cmap_d_pal('viridis')
>>> palette(5)
['#440154', '#3b528b', '#21918c', '#5ec962', '#fde725']
class mizani.palettes.desaturate_pal(color: str, prop: float, reverse: bool = False)

Create a palette that desaturates a color by some proportion

Parameters
color (color)

html color name, hex, rgb-tuple

prop (python:float)

saturation channel of color will be multiplied by this value

reverse (bool)

Whether to reverse the palette.

Returns
out (function)

Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps the value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].

Examples

>>> palette = desaturate_pal('red', .1)
>>> palette([0, .25, .5, .75, 1])
['#ff0000', '#e21d1d', '#c53a3a', '#a95656', '#8c7373']
class mizani.palettes.manual_pal(values: Sequence[Any])

Create a palette from a list of values

Parameters
values (python:sequence)

Values that will be returned by the palette function.

Returns
out (function)

A palette function that takes a single int parameter n and returns n values.

Examples

>>> palette = manual_pal(['a', 'b', 'c', 'd', 'e'])
>>> palette(3)
['a', 'b', 'c']
mizani.palettes.xkcd_palette(colors: Sequence[str]) -> Sequence[RGBHexColor]

Make a palette with color names from the xkcd color survey.

See xkcd for the full list of colors: http://xkcd.com/color/rgb/

Parameters
colors (python:list of strings)

List of keys in the mizani.colors.xkcd_rgb dictionary.

Returns
palette (python:list)

List of colors as RGB hex strings.

Examples

>>> palette = xkcd_palette(['red', 'green', 'blue'])
>>> palette
['#E50000', '#15B01A', '#0343DF']
>>> from mizani._colors.named_colors import XKCD
>>> list(sorted(XKCD.keys()))[:4]
['xkcd:acid green', 'xkcd:adobe', 'xkcd:algae', 'xkcd:algae green']
mizani.palettes.crayon_palette(colors: Sequence[str]) -> Sequence[RGBHexColor]

Make a palette with color names from Crayola crayons.

The colors come from http://en.wikipedia.org/wiki/List_of_Crayola_crayon_colors

Parameters
colors (python:list of strings)

List of keys in the mizani.colors.crayloax_rgb dictionary.

Returns
palette (python:list)

List of colors as RGB hex strings.

Examples

>>> palette = crayon_palette(['almond', 'silver', 'yellow'])
>>> palette
['#EED9C4', '#C9C0BB', '#FBE870']
>>> from mizani._colors.named_colors import CRAYON
>>> list(sorted(CRAYON.keys()))[:3]
['crayon:almond', 'crayon:antique brass', 'crayon:apricot']
class mizani.palettes.cubehelix_pal(start: int = 0, rotation: float = 0.4, gamma: float = 1.0, hue: float = 0.8, light: float = 0.85, dark: float = 0.15, reverse: bool = False)

Utility for creating a discrete palette from the cubehelix system.

This produces a colormap with linearly-decreasing (or increasing) brightness. That means that information will be preserved if printed to black and white or viewed by someone who is colorblind.

Parameters
start (python:float (0 <= start <= 3))

The hue at the start of the helix.

rotation (python:float)

Rotations around the hue wheel over the range of the palette.

gamma (python:float (0 <= gamma))

Gamma factor to emphasize darker (gamma < 1) or lighter (gamma > 1) colors.

hue (python:float (0 <= hue <= 1))

Saturation of the colors.

dark (python:float (0 <= dark <= 1))

Intensity of the darkest color in the palette.

light (python:float (0 <= light <= 1))

Intensity of the lightest color in the palette.

reverse (bool)

If True, the palette will go from dark to light.

Returns
out (function)

Continuous color palette that takes a single int parameter n and returns n equally spaced colors.

References

Green, D. A. (2011). "A colour scheme for the display of astronomical intensity images". Bulletin of the Astronomical Society of India, Vol. 39, p. 289-295.

Examples

>>> palette = cubehelix_pal()
>>> palette(5)
['#edd1cb', '#d499a7', '#aa678f', '#6e4071', '#2d1e3e']
mizani.palettes.identity_pal() -> Callable[[T], T]

Create a palette that maps values onto themselves

Returns
out (function)

Palette function that takes a value or sequence of values and returns the same values.

Examples

>>> palette = identity_pal()
>>> palette(9)
9
>>> palette([2, 4, 6])
[2, 4, 6]
class mizani.palettes.none_pal

Discrete palette that returns only None values

transforms - Transforming variables, scales and coordinates

"The Grammar of Graphics (2005)" by Wilkinson, Anand and Grossman describes three types of transformations.

  • Variable transformations - Used to make statistical operations on variables appropriate and meaningful. They are also used to create new variables.
  • Scale transformations - Used to make statistical objects displayed on dimensions appropriate and meaningful.
  • Coordinate transformations - Used to manipulate the geometry of graphics to help perceive relationships and find meaningful structures for representing variations.

Variable and scale transformations are similar in that they lead to plotted objects that are indistinguishable. Typically, variable transformation is done outside the graphics system and so the system cannot provide transformation-specific guides and decorations for the plot. The trans class is aimed at being useful for scale and coordinate transformations.

class mizani.transforms.asn_trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = True, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Arc-sin square-root Transformation

transform_is_linear: bool = True

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.

transform(x: FloatArrayLike) -> NDArrayFloat

Transform of x

inverse(x: FloatArrayLike) -> NDArrayFloat

Inverse of x

class mizani.transforms.atanh_trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = True, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Inverse hyperbolic tangent (arc-tanh) Transformation

transform_is_linear: bool = True

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.

transform(x)

Transform of x

inverse(x)

Inverse of x

class mizani.transforms.boxcox_trans(p: float, offset: int = 0, *, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Boxcox Transformation

The Box-Cox transformation is a flexible transformation, often used to transform data towards normality.

The Box-Cox power transformation (type 1) requires strictly positive values and takes the following form for y \gt 0:

y^{(\lambda)} = \frac{y^\lambda - 1}{\lambda}

When \lambda = 0, the natural log transform is used.

Parameters
p (python:float)

Transformation exponent \lambda.

offset (python:int)

Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 0. modulus_trans() sets the default to 1.

SEE ALSO:

modulus_trans()

References

transform(x: FloatArrayLike) -> NDArrayFloat

Transform of x

inverse(x: FloatArrayLike) -> NDArrayFloat

Inverse of x
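The Box-Cox formula described above can be sketched in plain Python for a single value. This is an illustration under stated assumptions, not mizani's implementation: the function names are hypothetical, the offset is assumed to be added to y before the power is taken, and the natural log is used when the exponent is 0.

```python
import math


def boxcox_sketch(y, lam, offset=0):
    """Box-Cox transform of y with exponent lam (assumes y + offset > 0)."""
    if lam == 0:
        return math.log(y + offset)  # limiting case of the power formula
    return ((y + offset) ** lam - 1) / lam


def boxcox_inverse_sketch(z, lam, offset=0):
    """Undo boxcox_sketch."""
    if lam == 0:
        return math.exp(z) - offset
    return (z * lam + 1) ** (1 / lam) - offset


z = boxcox_sketch(4, 0.5)
print(z, boxcox_inverse_sketch(z, 0.5))
```

The log branch is the limit of the power formula as the exponent approaches 0, which is why the two branches join smoothly.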

class mizani.transforms.modulus_trans(p: float, offset: int = 1, *, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Modulus Transformation

The modulus transformation generalises Box-Cox to work with both positive and negative values.

When \lambda \neq 0

y^{(\lambda)} = sign(y) * \frac{(|y| + 1)^\lambda - 1}{\lambda}

and when \lambda = 0

y^{(\lambda)} =  sign(y) * \ln{(|y| + 1)}

Parameters
p (python:float)

Transformation exponent \lambda.

offset (python:int)

Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 1. boxcox_trans() sets the default to 0.

SEE ALSO:

boxcox_trans()

References

transform(x: FloatArrayLike) -> NDArrayFloat

Transform of x

inverse(x: FloatArrayLike) -> NDArrayFloat

Inverse of x
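The modulus formulas above can be sketched for a single value in plain Python. The function names are hypothetical, and the constant 1 in the formulas is assumed to be the offset parameter; this is an illustration, not mizani's implementation.

```python
import math


def modulus_sketch(y, lam, offset=1):
    """Modulus transform: Box-Cox applied to |y| + offset, with the sign of y restored."""
    sign = math.copysign(1.0, y)
    if lam == 0:
        return sign * math.log(abs(y) + offset)
    return sign * ((abs(y) + offset) ** lam - 1) / lam


def modulus_inverse_sketch(z, lam, offset=1):
    """Undo modulus_sketch."""
    sign = math.copysign(1.0, z)
    if lam == 0:
        return sign * (math.exp(abs(z)) - offset)
    return sign * ((abs(z) * lam + 1) ** (1 / lam) - offset)


print(modulus_sketch(3, 0.5), modulus_sketch(-3, 0.5))
```

Because the magnitude and the sign are handled separately, the transform is symmetric about zero and maps 0 to 0 when the offset is 1.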

class mizani.transforms.datetime_trans(tz: tzinfo | str | None = None, *, domain: DomainType = (datetime.datetime(1, 1, 1, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='UTC')), datetime.datetime(9999, 12, 31, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='UTC'))), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Datetime Transformation

Parameters
tz (python:str | ZoneInfo)

Timezone information

Examples

>>> from zoneinfo import ZoneInfo
>>> UTC = ZoneInfo("UTC")
>>> EST = ZoneInfo("EST")
>>> t = datetime_trans(EST)
>>> x = [datetime(2022, 1, 20, tzinfo=UTC)]
>>> x2 = t.inverse(t.transform(x))
>>> list(x) == list(x2)
True
>>> x[0].tzinfo == x2[0].tzinfo
False
>>> x[0].tzinfo.key
'UTC'
>>> x2[0].tzinfo.key
'EST'
breaks_func: BreaksFunction

Callable to calculate breaks

format_func: FormatFunction

Function to format breaks

transform(x: DatetimeArrayLike) -> NDArrayFloat

Transform from date to a numerical format

The transform values have a unit of [days].

inverse(x: FloatArrayLike) -> NDArrayDatetime

Transform to date from numerical format

property tzinfo

Alias of tz

diff_type_to_num(x: TimedeltaArrayLike) -> FloatArrayLike

Convert timedelta to numerical format

The timedeltas are converted to a unit of [days].

class mizani.transforms.exp_trans(base: float = 2.718281828459045, *, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Create an exponential transform class for base

This is the inverse of the log transform.

Parameters
base (python:float)

Base of the logarithm

Returns
out (type)

Exponential transform class

transform(x)

Transform of x

inverse(x)

Inverse of x

class mizani.transforms.identity_trans(transform_is_linear: bool = True, *, domain: DomainType = (-inf, inf), breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Identity Transformation

Examples

The default trans returns one minor break between every pair of major breaks

>>> major = [0, 1, 2]
>>> t = identity_trans()
>>> t.minor_breaks(major)
array([0.5, 1.5])

Create a trans that returns 4 minor breaks

>>> t = identity_trans(minor_breaks_func=minor_breaks(4))
>>> t.minor_breaks(major)
array([0.2, 0.4, 0.6, 0.8, 1.2, 1.4, 1.6, 1.8])
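How minor breaks relate to major breaks can be sketched in plain Python. The helper minor_breaks_sketch is hypothetical and only handles the simple linear case shown in these examples; mizani's own minor_breaks callable may treat limits and edge cases differently.

```python
def minor_breaks_sketch(major, n=1):
    """Place n evenly spaced minor breaks between each pair of major breaks."""
    minor = []
    for left, right in zip(major, major[1:]):
        step = (right - left) / (n + 1)
        minor.extend(left + step * k for k in range(1, n + 1))
    return minor


major = [0, 1, 2]
one_each = minor_breaks_sketch(major)        # one break per interval
four_each = minor_breaks_sketch(major, n=4)  # four breaks per interval
print(one_each, [round(v, 10) for v in four_each])
```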
transform_is_linear: bool = True

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.

transform(x)

Transform of x

inverse(x)

Inverse of x

class mizani.transforms.log10_trans(base: float = 10, *, domain: DomainType = (2.2250738585072014e-308, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Log 10 Transformation

class mizani.transforms.log1p_trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Log plus one Transformation

transform(x)

Transform of x

inverse(x)

Inverse of x

class mizani.transforms.log2_trans(base: float = 2, *, domain: DomainType = (2.2250738585072014e-308, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Log 2 Transformation

class mizani.transforms.log_trans(base: float = 2.718281828459045, *, domain: DomainType = (2.2250738585072014e-308, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Create a log transform class for base

Parameters
base (python:float)

Base for the logarithm. If None, then the natural log is used.

Returns
out (type)

Log transform class

transform(x)

Transform of x

inverse(x)

Inverse of x

class mizani.transforms.logit_trans

Logit Transformation

class mizani.transforms.probability_trans(distribution: str, *args, **kwargs)

Probability Transformation

Parameters
distribution (python:str)

Name of the distribution. Valid distributions are listed at scipy.stats. Any of the continuous or discrete distributions.

args (python:tuple)

Arguments passed to the distribution functions.

kwargs (python:dict)

Keyword arguments passed to the distribution functions.

Notes

Make sure that the distribution is a good enough approximation for the data. When this is not the case, computations may run into errors. Absence of any errors does not imply that the distribution fits the data.

transform(x: FloatArrayLike) -> NDArrayFloat

Transform of x

inverse(x: FloatArrayLike) -> NDArrayFloat

Inverse of x

class mizani.transforms.probit_trans

Probit Transformation

class mizani.transforms.reverse_trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = True, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Reverse Transformation

transform_is_linear: bool = True

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.

transform(x)

Transform of x

inverse(x)

Inverse of x

class mizani.transforms.sqrt_trans(*, domain: DomainType = (0, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Square-root Transformation

transform(x)

Transform of x

inverse(x)

Inverse of x

class mizani.transforms.symlog_trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <mizani.breaks.breaks_symlog object>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Symmetric Log Transformation

The symmetric logarithmic transformation is defined as

f(x) = log(x+1) for x >= 0
       -log(-x+1) for x < 0

It can be useful for data that has a wide range of both positive and negative values (including zero).

breaks_func: BreaksFunction = <mizani.breaks.breaks_symlog object>

Callable to calculate breaks

transform(x: FloatArrayLike) -> NDArrayFloat

Transform of x

inverse(x: FloatArrayLike) -> NDArrayFloat

Inverse of x
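The piecewise definition of the symmetric log can be written compactly for a single value. This is a plain-Python sketch with hypothetical function names; it follows the formula above and is not taken from mizani's source.

```python
import math


def symlog_sketch(x):
    """log(x + 1) for x >= 0 and -log(-x + 1) for x < 0."""
    # log1p(|x|) gives the magnitude; copysign restores the sign of x.
    return math.copysign(math.log1p(abs(x)), x)


def symlog_inverse_sketch(z):
    """Undo symlog_sketch."""
    return math.copysign(math.expm1(abs(z)), z)


for value in (-1000.0, -1.0, 0.0, 1.0, 1000.0):
    print(value, symlog_sketch(value))
```

Zero maps to zero and the function is odd, so large positive and negative values are compressed by the same amount.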

class mizani.transforms.timedelta_trans(*, domain: DomainType = (datetime.timedelta(days=-999999999), datetime.timedelta(days=999999999, seconds=86399, microseconds=999999)), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Timedelta Transformation

breaks_func: BreaksFunction

Callable to calculate breaks

format_func: FormatFunction

Function to format breaks

transform(x: TimedeltaArrayLike) -> NDArrayFloat

Transform from Timedelta to numerical format

The transform values have a unit of [days]

inverse(x: FloatArrayLike) -> Sequence[pd.Timedelta]

Transform to Timedelta from numerical format

diff_type_to_num(x: TimedeltaArrayLike) -> FloatArrayLike

Convert timedelta to numerical format

The timedeltas are converted to a unit of [days].

class mizani.transforms.pd_timedelta_trans(*, domain: DomainType = (Timedelta('-106752 days +00:12:43.145224193'), Timedelta('106751 days 23:47:16.854775807')), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Pandas timedelta Transformation

class mizani.transforms.pseudo_log_trans(sigma: float = 1, base: float = 2.718281828459045, *, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Pseudo-log transformation

A transformation mapping numbers to a signed logarithmic scale with a smooth transition to linear scale around 0.

Parameters
sigma (python:float)

Scaling factor for the linear part.

base (python:int)

Approximate logarithm used. If None, then the natural log is used.

transform(x: FloatArrayLike) -> NDArrayFloat

Transform of x

inverse(x: FloatArrayLike) -> NDArrayFloat

Inverse of x

minor_breaks(major: FloatArrayLike, limits: tuple[float, float] | None = None, n: int | None = None) -> NDArrayFloat

Calculate minor_breaks

class mizani.transforms.reciprocal_trans(*, domain: DomainType = (-inf, inf), transform_is_linear: bool = False, breaks_func: BreaksFunction = <factory>, format_func: FormatFunction = <factory>, minor_breaks_func: MinorBreaksFunction | None = None)

Reciprocal Transformation

transform(x: FloatArrayLike) -> NDArrayFloat

Transform of x

inverse(x: FloatArrayLike) -> NDArrayFloat

Inverse of x

class mizani.transforms.trans(*, domain: 'DomainType' = (-inf, inf), transform_is_linear: 'bool' = False, breaks_func: 'BreaksFunction' = <factory>, format_func: 'FormatFunction' = <factory>, minor_breaks_func: 'MinorBreaksFunction | None' = None)
transform_is_linear: bool = False

Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.

breaks_func: BreaksFunction

Callable to calculate breaks

format_func: FormatFunction

Function to format breaks

minor_breaks_func: MinorBreaksFunction | None = None

Callable to calculate minor breaks

abstract transform(x: TFloatArrayLike) -> TFloatArrayLike

Transform of x

abstract inverse(x: TFloatArrayLike) -> TFloatArrayLike

Inverse of x

property domain_is_numerical: bool

Return True if transformation acts on numerical data. e.g. int, float, and imag are numerical but datetime is not.

minor_breaks(major: FloatArrayLike, limits: tuple[float, float] | None = None, n: int | None = None) -> NDArrayFloat

Calculate minor_breaks

breaks(limits: DomainType) -> NDArrayFloat

Calculate breaks in data space and return them in transformed space.

Expects limits to be in transform space; this is the same space in which the domain is specified.

This method wraps around breaks_() to ensure that the calculated breaks are within the domain of the transform. This is helpful in cases where an aesthetic requests breaks with limits expanded for some padding, yet the expansion goes beyond the domain of the transform. For example, for a probability transform the breaks will be in the domain [0, 1] despite any outward limits.

Parameters
limits (python:tuple)

The scale limits. Size 2.

Returns
out (numpy:array_like)

Major breaks
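The effect of keeping breaks inside the domain can be sketched as follows. Both function names are hypothetical, and the break function is a stand-in for whatever breaks_func a transform is configured with; this is an illustration of the idea, not mizani's implementation.

```python
def breaks_within_domain(limits, domain, breaks_func):
    """Compute breaks for limits, keeping only those inside the domain."""
    # Trim padded limits back to the domain before computing breaks.
    low = max(limits[0], domain[0])
    high = min(limits[1], domain[1])
    return [b for b in breaks_func((low, high)) if domain[0] <= b <= domain[1]]


def five_even_breaks(limits):
    """Hypothetical break function: five evenly spaced values across the limits."""
    low, high = limits
    return [low + (high - low) * k / 4 for k in range(5)]


# Limits padded beyond a probability-like domain of [0, 1].
breaks = breaks_within_domain((-0.1, 1.1), (0, 1), five_even_breaks)
print(breaks)
```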

format(x: Any) -> Sequence[str]

Format breaks

When subclassing, you can override this function, or you can just define format_func.

diff_type_to_num(x: Any) -> FloatArrayLike

Convert the difference between two points in the domain to a numeric

This function is necessary for some arithmetic operations in the transform space of a domain when the difference between any two points in that domain is not numeric.

For example, for a domain of datetime value types, the difference on the domain is of type timedelta. In this case this function should expect timedeltas and convert them to float values that are compatible (same units) with the transformed values of datetimes.

Parameters
x

Differences
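The datetime case can be sketched with the standard library. The helper names and the choice of 1970-01-01 UTC as the reference point are assumptions made for illustration; the only property that matters is that datetimes and timedeltas end up in the same unit (days), so they can be added in transform space.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_DAY = 24 * 60 * 60
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # assumed reference point


def datetime_to_days(d):
    """Transform a datetime to a float number of days since EPOCH."""
    return (d - EPOCH).total_seconds() / SECONDS_PER_DAY


def timedelta_to_days(td):
    """Convert a difference between datetimes to the same unit (days)."""
    return td.total_seconds() / SECONDS_PER_DAY


start = datetime(2022, 1, 20, tzinfo=timezone.utc)
step = timedelta(hours=36)

# Adding in transform space agrees with adding in the datetime domain.
shifted = datetime_to_days(start) + timedelta_to_days(step)
print(shifted, datetime_to_days(start + step))
```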

mizani.transforms.gettrans(t: str | Type[trans] | trans | None = None)

Return a trans object

Parameters
t (python:str | type | trans)

Name of transformation function. If None, returns an identity transform.

Returns

out (trans)

scale - Implementing a scale

According to "On the theory of scales of measurement" by S.S. Stevens, scales can be classified in four ways -- nominal, ordinal, interval and ratio. Using current (2016) terminology, nominal data is made up of unordered categories, ordinal data is made up of ordered categories and the two can be classified as discrete. On the other hand both interval and ratio data are continuous.

The scale classes below show how the rest of the Mizani package can be used to implement the two categories of scales. The key tasks are training and mapping and these correspond to the train and map methods.

To train a scale on data means to make the scale learn the limits of the data. Training is elaborate enough to be worthy of a dedicated method for two reasons:

  • Practical -- data may be split up across more than one object, yet all will be represented by a single scale.
  • Conceptual -- training is a key action that may need to be inserted into multiple locations of the data processing pipeline before a graphic can be created.

To map data onto a scale means to associate data values with values (potential readings) on a scale. This is perhaps the most important concept underpinning a scale.

The apply methods are simple examples of how to put it all together.
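Training and mapping for a continuous scale can be sketched in plain Python. The function names are hypothetical and the sketch is simplified: the palette here is called with one rescaled float at a time, out-of-bounds values are replaced by na_value, and the limits are assumed to span a non-zero range.

```python
def train_continuous(new_data, old=None):
    """Extend the known limits (min, max) so that they cover new_data."""
    values = list(new_data)
    if old is not None:
        values.extend(old)
    return (min(values), max(values))


def map_continuous(x, palette, limits, na_value=None):
    """Rescale x from limits onto [0, 1], then ask the palette for output values."""
    low, high = limits
    out = []
    for v in x:
        if v < low or v > high:
            out.append(na_value)  # values beyond the limits are censored
        else:
            out.append(palette((v - low) / (high - low)))
    return out


# Data arriving in two pieces is represented by a single scale.
limits = train_continuous([3, 7])
limits = train_continuous([1, 5], old=limits)

# A palette that turns [0, 1] into sizes between 1 and 6.
sizes = map_continuous([1, 4, 7, 9], lambda t: 1 + 5 * t, limits)
print(limits, sizes)
```

Training only ever widens the limits, which is what lets it be called repeatedly at different points in a pipeline.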

class mizani.scale.scale_continuous

Continuous scale

classmethod apply(x: FloatArrayLike, palette: ContinuousPalette, na_value: Any = None, trans: Trans | None = None) -> NDArrayFloat

Scale data continuously

Parameters
x (numpy:array_like)

Continuous values to scale

palette (python:callable() f(x))

Palette to use

na_value (object)

Value to use for missing values.

trans (trans)

How to transform the data before scaling. If None, no transformation is done.

Returns
out (numpy:array_like)

Scaled values

classmethod train(new_data: FloatArrayLike, old: tuple[float, float] | None = None) -> tuple[float, float]

Train a continuous scale

Parameters
new_data (numpy:array_like)

New values

old (numpy:array_like)

Old range

Returns
out (python:tuple)

Limits (range) of the scale

classmethod map(x: FloatArrayLike, palette: ContinuousPalette, limits: tuple[float, float], na_value: Any = None, oob: Callable[[TVector], TVector] = <function censor>) -> NDArrayFloat

Map values to a continuous palette

Parameters
x (numpy:array_like)

Continuous values to scale

palette (python:callable() f(x))

palette to use

na_value (object)

Value to use for missing values.

oob (python:callable() f(x))

Function to deal with values that are beyond the limits

Returns
out (numpy:array_like)

Values mapped onto a palette

class mizani.scale.scale_discrete

Discrete scale

classmethod apply(x: AnyArrayLike, palette: DiscretePalette, na_value: Any = None)

Scale data discretely

Parameters
x (numpy:array_like)

Discrete values to scale

palette (python:callable() f(x))

Palette to use

na_value (object)

Value to use for missing values.

Returns
out (numpy:array_like)

Scaled values

classmethod train(new_data: AnyArrayLike, old: Sequence[Any] | None = None, drop: bool = False, na_rm: bool = False) -> Sequence[Any]

Train a discrete scale

Parameters
new_data (numpy:array_like)

New values

old (numpy:array_like)

Old range. List of values known to the scale.

drop (bool)

Whether to drop (not include) unused categories

na_rm (bool)

If True, remove missing values. Missing values are either NaN or None.

Returns
out (python:list)

Values covered by the scale

classmethod map(x: AnyArrayLike, palette: DiscretePalette, limits: Sequence[Any], na_value: Any = None) -> AnyArrayLike

Map values to a discrete palette

Parameters
palette (python:callable() f(x))

palette to use

x (numpy:array_like)

Discrete values to scale

na_value (object)

Value to use for missing values.

Returns
out (numpy:array_like)

Values mapped onto a palette
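The discrete counterpart of training and mapping can be sketched in the same way. The function names are hypothetical and the behaviour is simplified: new values are appended in first-seen order, and the drop and na_rm options are not modelled. mizani's own train method may order values differently.

```python
def train_discrete(new_data, old=None):
    """Append values not yet known to the scale, keeping first-seen order."""
    limits = list(old) if old is not None else []
    for v in new_data:
        if v not in limits:
            limits.append(v)
    return limits


def map_discrete(x, palette, limits, na_value=None):
    """Give each known value the palette entry at its position in limits."""
    outputs = palette(len(limits))  # a discrete palette returns n values
    lookup = dict(zip(limits, outputs))
    return [lookup.get(v, na_value) for v in x]


limits = train_discrete(["b", "a"], old=["a"])
shapes = map_discrete(
    ["a", "b", "z"], lambda n: ["o", "s", "^"][:n], limits, na_value="?"
)
print(limits, shapes)
```

Values that the scale was never trained on, such as "z" here, fall back to na_value.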

Installation

mizani can be installed in a couple of ways depending on purpose.

Official release installation

For a normal user, it is recommended to install the official release.

$ pip install mizani

Development installation

To do any development you have to clone the mizani source repository and install the package in development mode. These commands do all of that:

$ git clone https://github.com/has2k1/mizani.git
$ cd mizani
$ pip install -e .

If you only want to use the latest development sources and do not care about having a cloned repository, e.g. if a bug you care about has been fixed but an official release has not come out yet, then use this command:

$ pip install git+https://github.com/has2k1/mizani.git

Changelog

v0.13.1

2024-12-10

Enhancements

  • Type checking passes with numpy 2.2.

v0.13.0

2024-10-24

API Changes

  • Support for numpy timedelta64 has been removed. It was not well supported in the first place, so removing it should be of no consequence.
  • mizani.transforms.trans_new function has been deprecated.

Enhancements

  • mizani.breaks.breaks_date has been slightly improved for the case when it generates monthly breaks.

New

  • trans gained a new method diff_type_to_num that should be helpful with some arithmetic operations for non-numeric domains.

v0.12.2

2024-09-04

Bug Fixes

  • Fixed squish and squish_infinite to work for non-writeable pandas series. This was broken with numpy 2.1.0.

v0.12.1

2024-08-19

Enhancements

  • Renamed "husl" color palette type to "hsluv". "husl" is the old name; it still works, although it is not part of the API.

v0.12.0

2024-07-30

API Changes

  • mizani now requires python 3.9 and above.

Bug Fixes

  • Fixed bug where a date with a timezone could lose the timezone. #45.

v0.11.4

2024-05-24

Bug Fixes

  • Fixed squish and squish_infinite so that they do not reuse numpy arrays. The user's object is not modified.

    This also prevents exceptions where the numpy array backs a pandas object and it is protected by copy-on-write.

v0.11.3

2024-05-09

Bug Fixes

  • Fixed bug in calculating monthly breaks where no dates were returned when the limits are narrow and do not align with the start and end of the month. (#42)

v0.11.2

2024-04-26

Bug Fixes

  • Added the ability to create reversed colormap for cmap_pal and cmap_d_pal using the matplotlib convention of name_r.

v0.11.1

2024-03-27

Bug Fixes

  • Fixed mizani.palettes.brewer_pal to return exact colors when the number of requested colors is less than or equal to the number in the palette.
  • Added all matplotlib colormaps and made them available from cmap_pal and cmap_d_pal (#39).

New

  • Added breaks_symlog to calculate breaks for the symmetric logarithm transformation.

Changes

  • The default big_mark for label_number has been changed from a comma to nothing.

v0.11.0

2024-02-12

Enhancements

  • Removed FutureWarnings when using pandas 2.1.0

New

  • Added breaks_symlog to calculate breaks for the symmetric logarithm transformation.

Changes

  • The default big_mark for label_number has been changed from a comma to nothing.

v0.10.0

2023-07-28

API Changes

  • mpl_format has been removed, number_format takes its place.
  • mpl_breaks has been removed, extended_breaks has always been the default and it is sufficient.
  • matplotlib has been removed as a dependency of mizani.
  • mizani now requires python 3.9 and above.
  • The units parameter of timedelta_format now accepts the values "min", "day", "week", "month", instead of "m", "d", "w", "M".
  • The naming convention for break formatting methods has changed from *_format to label_*. Specifically these methods have been renamed.

    • comma_format is now label_comma
    • custom_format is now label_custom
    • currency_format is now label_currency
    • dollar_format is now label_dollar
    • percent_format is now label_percent
    • scientific_format is now label_scientific
    • date_format is now label_date
    • number_format is now label_number
    • log_format is now label_log
    • timedelta_format is now label_timedelta
    • pvalue_format is now label_pvalue
    • ordinal_format is now label_ordinal
    • number_bytes_format is now label_bytes
  • The naming convention for break calculating methods has changed from *_breaks to breaks_*. Specifically these methods have been renamed.

    • log_breaks is now breaks_log
    • trans_minor_breaks is now minor_breaks_trans
    • date_breaks is now breaks_date
    • timedelta_breaks is now breaks_timedelta
    • extended_breaks is now breaks_extended
  • dataspace_is_numerical has changed to domain_is_numerical and it is now determined dynamically.
  • The default minor_breaks for all transforms that are not linear are now calculated in dataspace, but only if the dataspace is numerical.
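
The two renaming conventions above can be applied mechanically when migrating older code. The following is a minimal sketch, assuming a hypothetical helper named new_name that is not part of mizani. It encodes only the two irregular renames listed above and treats everything else by the prefix/suffix rule:

```python
# Hypothetical migration helper; not part of mizani.

# Renames that do not follow the plain suffix-to-prefix swap (from the lists above).
SPECIAL_CASES = {
    "number_bytes_format": "label_bytes",
    "trans_minor_breaks": "minor_breaks_trans",
}

# These were removed in v0.10.0 rather than renamed.
REMOVED = {"mpl_format", "mpl_breaks"}


def new_name(old: str) -> str:
    """Return the v0.10.0 name for an older formatter or breaks name."""
    if old in REMOVED:
        raise ValueError(f"{old} was removed, not renamed")
    if old in SPECIAL_CASES:
        return SPECIAL_CASES[old]
    if old.endswith("_format"):
        return "label_" + old[: -len("_format")]
    if old.endswith("_breaks"):
        return "breaks_" + old[: -len("_breaks")]
    # Any other name is assumed to be unchanged.
    return old


print(new_name("percent_format"))  # label_percent
print(new_name("log_breaks"))      # breaks_log
```

A helper like this is only useful for search-and-replace style migrations; it does not check that the resulting name actually exists in the installed version.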

New

  • symlog_trans for symmetric log transformation

v0.9.2

2023-05-25

Bug Fixes

  • Fixed regression bug in date_format where it could not deal with the UTC timezone from timezone (#30).

v0.9.1

2023-05-19

Bug Fixes

  • Fixed bug in date_format to handle datetime sequences within the same timezone but a mixed daylight saving state. (plotnine #687)

v0.9.0

2023-04-15

API Changes

  • palettable dropped as a dependency.

Bug Fixes

  • Fixed bug in datetime_trans where a pandas series with an index that did not start at 0 could not be transformed.
  • Install tzdata on pyodide/emscripten (#27).

v0.8.1

2022-09-28

Bug Fixes

  • Fixed regression bug in log_format where formatting for bases 2, 8 and 16 would fail if the values were float-integers.

Enhancements

  • log_format now uses exponent notation for bases other than base 10.

v0.8.0

2022-09-26

API Changes

  • The lut parameter of cmap_pal and cmap_d_pal has been deprecated and will be removed in a future version.
  • datetime_trans gained parameter tz that controls the timezone of the transformation.
  • log_format gained boolean parameter mathtex for TeX values as understood by matplotlib instead of values in scientific notation.

Bug Fixes

  • Fixed bug in zero_range where uint64 values would cause a RuntimeError.

v0.7.4

2022-04-02

API Changes

  • comma_format is now imported automatically when using *.
  • Fixed issue with scale_discrete so that if you train on data with NaN and specify an old range that also has NaN, the resulting range does not include two NaN values.

v0.7.3

(2020-10-29)

Bug Fixes

  • Fixed log_breaks for narrow range if base=2 (#76).

v0.7.2

(2020-10-29)

Bug Fixes

  • Fixed bug in rescale_max() to properly handle values whose maximum is zero (#16).

v0.7.1

(2020-06-05)

Bug Fixes

  • Fixed regression in mizani.scales.scale_discrete.train() when training on values with some categoricals that have common elements.

v0.7.0

(2020-06-04)

Bug Fixes

  • Fixed issue with mizani.formatters.log_breaks where non-linear breaks could not be generated if the limits were greater than the largest integer sys.maxsize.
  • Fixed mizani.palettes.gradient_n_pal() to return nan for nan values.
  • Fixed mizani.scales.scale_discrete.train() when training categoricals to maintain the order. (plotnine #381)

v0.6.0

(2019-08-15)

New

  • Added pvalue_format
  • Added ordinal_format
  • Added number_bytes_format
  • Added pseudo_log_trans()
  • Added reciprocal_trans
  • Added modulus_trans()

Enhancements

  • mizani.breaks.date_breaks now supports intervals in the order of seconds.

  • mizani.palettes.brewer_pal now supports a direction argument to control the order of the returned colors.

API Changes

  • boxcox_trans() now only accepts positive values. For both positive and negative values, modulus_trans() has been added.
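
The difference between the two transformations can be sketched with their textbook definitions. The functions below are illustrative stand-ins written in plain Python, not mizani's implementation, and they do not cover any additional parameters that boxcox_trans() and modulus_trans() may accept:

```python
import math


def boxcox(x: float, lam: float) -> float:
    """Textbook Box-Cox transform; defined only for positive x."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(x)
    return (x**lam - 1) / lam


def modulus(x: float, lam: float) -> float:
    """Textbook modulus transform; defined for any real x."""
    sign = math.copysign(1.0, x)
    if lam == 0:
        return sign * math.log(abs(x) + 1)
    return sign * ((abs(x) + 1) ** lam - 1) / lam


print(modulus(3, 1))   # 3.0
print(modulus(-3, 1))  # -3.0
```

The modulus form works on the magnitude of the value and restores its sign afterwards, which is why it accepts zero and negative inputs where Box-Cox does not.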

v0.5.4

(2019-03-26)

Enhancements

  • mizani.formatters.log_format now does a better job of approximating labels for numbers like 3.000000000000001e-05.

API Changes

  • exponent_threshold parameter of mizani.formatters.log_format has been deprecated.

v0.5.3

(2018-12-24)

API Changes

  • Log transforms now default to base - 2 minor breaks. So base 10 has 8 minor breaks and 9 partitions, base 8 has 6 minor breaks and 7 partitions, ..., base 2 has 0 minor breaks and a single partition.
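
The counts above follow from placing base - 2 breaks between consecutive powers of the base. The following is a minimal sketch, assuming the minor breaks are evenly spaced in data space between two major breaks. The function name log_minor_breaks is hypothetical and not part of mizani:

```python
def log_minor_breaks(base: int, exponent: int) -> list[float]:
    """Minor breaks between base**exponent and base**(exponent + 1).

    Assumes even spacing in data space; uses base - 2 minor breaks,
    which splits the interval into base - 1 partitions.
    """
    lo = base**exponent
    hi = base ** (exponent + 1)
    n = base - 2
    step = (hi - lo) / (n + 1)
    return [lo + step * i for i in range(1, n + 1)]


print(log_minor_breaks(10, 0))  # [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
print(log_minor_breaks(2, 0))   # []
```

With base 10 this gives the familiar 2, 3, ..., 9 ticks between 1 and 10, and with base 2 there is no room for a minor break between consecutive powers.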

v0.5.2

(2018-10-17)

Bug Fixes

  • Fixed issue where some functions that took pandas series would return output where the index did not match that of the input.

v0.5.1

(2018-10-15)

Bug Fixes

  • Fixed issue with log_breaks, so that it does not fail needlessly when the limits are in the (0, 1) range.

Enhancements

  • Changed log_format to return better formatted breaks.

v0.5.0

(2018-11-10)

API Changes

  • Support for python 2 has been removed.
  • __call__() and trans_minor_breaks.__call__() now accept an optional parameter n, which is the number of minor breaks between any two major breaks.
  • The parameter nan_value has been renamed to na_value.
  • The parameter nan_rm has been renamed to na_rm.

Enhancements

  • Better support for handling missing values when training discrete scales.
  • Changed the algorithm for log_breaks, it can now return breaks that do not fall on the integer powers of the base.

v0.4.6

(2018-03-20)

  • Added squish

v0.4.5

(2018-03-09)

  • Added identity_pal
  • Added cmap_d_pal

v0.4.4

(2017-12-13)

  • Fixed date_format to respect the timezones of the dates (#8).

v0.4.3

(2017-12-01)

  • Changed date_breaks to have more variety in the spacing between the breaks.
  • Fixed date_format to respect time part of the date (#7).

v0.4.2

(2017-11-06)

  • Fixed (regression) break calculation for the non ordinal transforms.

v0.4.1

(2017-11-04)

  • trans objects can now be instantiated with parameters that override attributes of the instance. The default methods for computing breaks and minor breaks on the transform instance are not class attributes, so they can be modified without global repercussions.

v0.4.0

(2017-10-24)

API Changes

  • Breaks and formatter generating functions have been converted to classes, with a __call__ method. How they are used has not changed, but this makes them more flexible.
  • ExtendedWilkson class has been removed. extended_breaks() now contains the implementation of the break calculating algorithm.
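
A minimal sketch of why such a conversion leaves usage unchanged: a function that returns a formatter and a class with a __call__ method are invoked the same way, but the class exposes its settings as attributes. The names percent_closure and PercentLabel are hypothetical and are not mizani's API:

```python
def percent_closure(digits=0):
    """Function style: return a formatter that closes over its settings."""

    def format_values(values):
        return [f"{v * 100:.{digits}f}%" for v in values]

    return format_values


class PercentLabel:
    """Class style: the same formatter, with settings stored on the instance."""

    def __init__(self, digits=0):
        self.digits = digits

    def __call__(self, values):
        return [f"{v * 100:.{self.digits}f}%" for v in values]


values = [0.25, 0.5]
print(percent_closure()(values))  # ['25%', '50%']
print(PercentLabel()(values))     # ['25%', '50%']
```

With the class form, a caller can inspect or change a setting such as digits after construction, or subclass the formatter, which a closure does not allow as easily.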

v0.3.4

(2017-09-12)

  • Fixed issue where some formatter methods failed if passed an empty breaks argument.
  • Fixed issue with log_breaks() where if the limits were within the same order of magnitude the calculated breaks were always the ends of the order of magnitude.

    Now log_breaks()((35, 50)) returns [35,  40,  45,  50] as breaks instead of [1, 100].

v0.3.3

(2017-08-30)

  • Fixed SettingWithCopyWarnings in squish_infinite().
  • Added log_format().

API Changes

  • log_trans now uses log_format() as the formatting method.

v0.3.2

(2017-07-14)

  • Added expand_range_distinct()

v0.3.1

(2017-06-22)

  • Fixed bug where using log_breaks() with Numpy 1.13.0 led to a ValueError.

v0.3.0

(2017-04-24)

  • Added xkcd_palette(), a palette that selects from 954 named colors.
  • Added crayon_palette(), a palette that selects from 163 named colors.
  • Added cubehelix_pal(), a function that creates a continuous palette from the cubehelix system.
  • Fixed bug where a color palette would raise an exception when passed a single scalar value instead of a list-like.
  • extended_breaks() and mpl_breaks() now return a single break if the limits are equal. Previously, one ran into an overflow and the other returned a sequence filled with n copies of the same limit.

API Changes

  • mpl_breaks() now returns a function that (strictly) expects a tuple with the minimum and maximum values.

v0.2.0

(2017-01-27)

  • Fixed bug in censor() where a sequence of values with an irregular index would lead to an exception.
  • Fixed boundary issues due to internal loss of precision in ported function seq().
  • Added mizani.breaks.extended_breaks() which computes breaks using a modified version of Wilkinson's tick algorithm.
  • Changed the default function mizani.transforms.trans.breaks_() used by mizani.transforms.trans to compute breaks from mizani.breaks.mpl_breaks() to mizani.breaks.extended_breaks().
  • mizani.breaks.timedelta_breaks() now uses mizani.breaks.extended_breaks() internally instead of mizani.breaks.mpl_breaks().
  • Added manual palette function mizani.palettes.manual_pal().
  • Requires pandas version 0.19.0 or higher.

v0.1.0

(2016-06-30)

First public release

Author

Hassan Kibirige

Info

Dec 10, 2024 0.13.1 Mizani