mizani - Man Page
Name
mizani — Mizani Documentation
Mizani is a Python library that provides the pieces necessary to create scales for a graphics system. It is based on the R Scales package.
Contents
bounds - Limiting data values for a palette
Continuous variables can take values anywhere from minus infinity to plus infinity. However, when creating a visual representation of these values, what usually matters is the relative difference between them. This is where rescaling comes into play.
The values are mapped onto a range that a scale can deal with. For graphical representation that range tends to be [0, 1] or [0, n], where n is some number that makes the plotted object overflow the plotting area.
Although a scale may be able to handle the [0, n] range, it may be desirable to have a lower bound greater than zero. For example, if data values get mapped to zero on a scale whose graphical representation is the size/area/radius/length, some data will be invisible. The solution is to restrict the lower bound, e.g. [0.1, 1]. Similarly, you can restrict the upper bound using these functions.
- mizani.bounds.censor(x: TFloatVector, range: TupleFloat2 = (0, 1), only_finite: bool = True) -> TFloatVector
Convert any values outside of range to a NULL type object.
- Parameters
- x
numpy:array_like Values to manipulate
- range
python:tuple (min, max) giving desired output range
- only_finite
bool If True (the default), will only modify finite values.
- Returns
- x
numpy:array_like Censored array
Notes
All values in x should be of the same type. only_finite parameter is not considered for Datetime and Timedelta types.
The NULL type object depends on the type of values in x.
- float : float('nan')
- int : float('nan')
- datetime.datetime : np.datetime64(NaT)
- datetime.timedelta : np.timedelta64(NaT)
Examples
>>> a = np.array([1, 2, np.inf, 3, 4, -np.inf, 5])
>>> censor(a, (0, 10))
array([ 1., 2., inf, 3., 4., -inf, 5.])
>>> censor(a, (0, 10), False)
array([ 1., 2., nan, 3., 4., nan, 5.])
>>> censor(a, (2, 4))
array([ nan, 2., inf, 3., 4., -inf, nan])
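The censoring rule can be illustrated without mizani. The following is a minimal pure-Python sketch for float input only, not the library's implementation; censor_sketch is a hypothetical name.

```python
import math


def censor_sketch(values, limits=(0, 1), only_finite=True):
    """Replace values outside `limits` with NaN (hypothetical helper)."""
    lo, hi = limits
    out = []
    for v in values:
        outside = v < lo or v > hi
        # With only_finite=True, infinities are left untouched.
        if only_finite and math.isinf(v):
            outside = False
        out.append(math.nan if outside else float(v))
    return out


result = censor_sketch([1, 2, math.inf, 3, 4, -math.inf, 5], (2, 4))
```

Here the first and last entries of result are NaN, while the infinities survive because only_finite defaults to True.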
- mizani.bounds.expand_range(range: TupleFloat2, mul: float = 0, add: float = 0, zero_width: float = 1) -> TupleFloat2
Expand a range with a multiplicative or additive constant
- Parameters
- range
python:tuple Range of data. Size 2.
- mul
python:int | python:float Multiplicative constant
- add
python:int | python:float | timedelta Additive constant
- zero_width
python:int | python:float | timedelta Distance to use if range has zero width
- Returns
- out
python:tuple Expanded range
Notes
If expanding datetime or timedelta types, add and zero_width must be suitable timedeltas, i.e. you should not mix types between NumPy, pandas and the datetime module.
Examples
>>> expand_range((3, 8))
(3, 8)
>>> expand_range((0, 10), mul=0.1)
(-1.0, 11.0)
>>> expand_range((0, 10), add=2)
(-2, 12)
>>> expand_range((0, 10), mul=.1, add=2)
(-3.0, 13.0)
>>> expand_range((0, 1))
(0, 1)
When the range has zero width
>>> expand_range((5, 5))
(4.5, 5.5)
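The arithmetic behind these results can be sketched as follows. This is an illustration inferred from the examples above, for plain numbers only; expand_range_sketch is a hypothetical name, and the exact zero-width test used by the library is an assumption here.

```python
def expand_range_sketch(limits, mul=0, add=0, zero_width=1):
    """Widen (low, high) by mul * width + add on each side (hypothetical helper)."""
    low, high = limits
    width = high - low
    if width == 0:
        # A zero-width range is centred inside an interval of size zero_width.
        return (low - zero_width / 2, high + zero_width / 2)
    pad = mul * width + add
    return (low - pad, high + pad)


both = expand_range_sketch((0, 10), mul=0.1, add=2)
flat = expand_range_sketch((5, 5))
```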
- mizani.bounds.rescale(x: FloatArrayLike, to: TupleFloat2 = (0, 1), _from: TupleFloat2 | None = None) -> NDArrayFloat
Rescale numeric vector to have specified minimum and maximum.
- Parameters
- x
numpy:array_like | numeric 1D vector of values to manipulate.
- to
python:tuple output range (numeric vector of length two)
- _from
python:tuple input range (numeric vector of length two). If not given, is calculated from the range of x
- Returns
- out
numpy:array_like Rescaled values
Examples
>>> x = [0, 2, 4, 6, 8, 10]
>>> rescale(x)
array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
>>> rescale(x, to=(0, 2))
array([0. , 0.4, 0.8, 1.2, 1.6, 2. ])
>>> rescale(x, to=(0, 2), _from=(0, 20))
array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
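The examples are consistent with a plain linear map from the input range onto the output range. A minimal sketch, assuming a non-zero input span (rescale_sketch is a hypothetical name, not the library function):

```python
def rescale_sketch(values, to=(0, 1), _from=None):
    """Linearly map values from the range _from onto the range to."""
    if _from is None:
        _from = (min(values), max(values))
    from_lo, from_hi = _from
    to_lo, to_hi = to
    span = from_hi - from_lo  # assumed non-zero in this sketch
    return [(v - from_lo) / span * (to_hi - to_lo) + to_lo for v in values]


x = [0, 2, 4, 6, 8, 10]
unit = rescale_sketch(x)
wide = rescale_sketch(x, to=(0, 2), _from=(0, 20))
```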
- mizani.bounds.rescale_max(x: FloatArrayLike, to: TupleFloat2 = (0, 1), _from: TupleFloat2 | None = None) -> NDArrayFloat
Rescale numeric vector to have specified maximum.
- Parameters
- x
numpy:array_like 1D vector of values to manipulate.
- to
python:tuple output range (numeric vector of length two)
- _from
python:tuple input range (numeric vector of length two). If not given, is calculated from the range of x. Only the 2nd (max) element is essential to the output.
- Returns
- out
numpy:array_like Rescaled values
Examples
>>> x = np.array([0, 2, 4, 6, 8, 10])
>>> rescale_max(x, (0, 3))
array([0. , 0.6, 1.2, 1.8, 2.4, 3. ])
Only the 2nd (max) element of each of the parameters to and _from is essential to the output.
>>> rescale_max(x, (1, 3))
array([0. , 0.6, 1.2, 1.8, 2.4, 3. ])
>>> rescale_max(x, (0, 20))
array([ 0., 4., 8., 12., 16., 20.])
If max(x) > _from[1] then values will be scaled beyond the requested maximum (to[1]).
>>> rescale_max(x, to=(1, 3), _from=(-1, 6))
array([0., 1., 2., 3., 4., 5.])
If the values are all the same, they take on the requested maximum. This includes an array of all zeros.
>>> rescale_max(np.array([5, 5, 5]))
array([1., 1., 1.])
>>> rescale_max(np.array([0, 0, 0]))
array([1, 1, 1])
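The outputs above match a simple ratio, x / _from[1] * to[1]. The sketch below shows that ratio only; rescale_max_sketch is a hypothetical name, and the all-zeros special case described above is deliberately not handled.

```python
def rescale_max_sketch(values, to=(0, 1), _from=None):
    """Scale values so that the input maximum lands on to[1]."""
    from_max = max(values) if _from is None else _from[1]
    # Only the upper ends of `to` and `_from` enter the calculation.
    return [v / from_max * to[1] for v in values]


x = [0, 2, 4, 6, 8, 10]
scaled = rescale_max_sketch(x, (0, 3))
beyond = rescale_max_sketch(x, to=(1, 3), _from=(-1, 6))
```

In beyond, the largest value is 5 rather than 3, because the input maximum (10) exceeds _from[1] (6).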
- mizani.bounds.rescale_mid(x: FloatArrayLike, to: TupleFloat2 = (0, 1), _from: TupleFloat2 | None = None, mid: float = 0) -> NDArrayFloat
Rescale numeric vector to have specified minimum, midpoint, and maximum.
- Parameters
- x
numpy:array_like 1D vector of values to manipulate.
- to
python:tuple output range (numeric vector of length two)
- _from
python:tuple input range (numeric vector of length two). If not given, is calculated from the range of x
- mid
numeric mid-point of input range
- Returns
- out
numpy:array_like Rescaled values
Examples
>>> rescale_mid([1, 2, 3], mid=1)
array([0.5 , 0.75, 1. ])
>>> rescale_mid([1, 2, 3], mid=2)
array([0. , 0.5, 1. ])
rescale_mid does not have the same signature as rescale and rescale_max. In cases where we need a compatible function with the same signature, we use a closure around the extra mid argument.
>>> def rescale_mid_compat(mid):
...     def _rescale(x, to=(0, 1), _from=None):
...         return rescale_mid(x, to, _from, mid=mid)
...     return _rescale
>>> rescale_mid2 = rescale_mid_compat(mid=2)
>>> rescale_mid2([1, 2, 3])
array([0. , 0.5, 1. ])
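An alternative to a hand-written closure is functools.partial from the standard library. The sketch below pairs it with a stand-in for rescale_mid whose formula is inferred from the two examples above; rescale_mid_sketch is a hypothetical name and the formula is an assumption, not the library source.

```python
from functools import partial


def rescale_mid_sketch(values, to=(0, 1), _from=None, mid=0):
    """Rescale so that `mid` maps to the centre of the output range."""
    if _from is None:
        _from = (min(values), max(values))
    # The input extent is made symmetric around mid.
    extent = 2 * max(abs(_from[0] - mid), abs(_from[1] - mid))
    centre = (to[0] + to[1]) / 2
    return [(v - mid) / extent * (to[1] - to[0]) + centre for v in values]


# partial fixes mid while keeping the (x, to, _from) call signature.
rescale_mid2 = partial(rescale_mid_sketch, mid=2)
```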
- mizani.bounds.squish_infinite(x: FloatArrayLike, range: TupleFloat2 = (0, 1)) -> NDArrayFloat
Truncate infinite values to a range.
- Parameters
- x
numpy:array_like Values that should have infinities squished.
- range
python:tuple The range onto which to squish the infinites. Must be of size 2.
- Returns
- out
numpy:array_like Values with infinites squished.
Examples
>>> arr1 = np.array([0, .5, .25, np.inf, .44])
>>> arr2 = np.array([0, -np.inf, .5, .25, np.inf])
>>> squish_infinite(arr1)
array([0. , 0.5 , 0.25, 1. , 0.44])
>>> squish_infinite(arr2, (-10, 9))
array([ 0. , -10. , 0.5 , 0.25, 9. ])
- mizani.bounds.zero_range(x: tuple[Any, Any], tol: float = 2.220446049250313e-14) -> bool
Determine if range of vector is close to zero.
- Parameters
- x
numpy:array_like Value(s) to check. If it is an array_like, it should be of length 2.
- tol
python:float Tolerance. Default tolerance is the machine epsilon times 10^2.
- Returns
- out
bool Whether x has zero range.
Examples
>>> zero_range([1, 1])
True
>>> zero_range([1, 2])
False
>>> zero_range([1, 2], tol=2)
True
- mizani.bounds.expand_range_distinct(range: TupleFloat2, expand: TupleFloat2 | TupleFloat4 = (0, 0, 0, 0), zero_width: float = 1) -> TupleFloat2
Expand a range with multiplicative or additive constants
Similar to expand_range() but the two sides of the range are expanded using different constants
- Parameters
- range
python:tuple Range of data. Size 2
- expand
python:tuple Length 2 or 4. If length is 2, then the same constants are used for both sides. If length is 4, then the first two are the multiplicative (mul) and additive (add) constants for the lower limit, and the second two are the constants for the upper limit.
- zero_width
python:int | python:float | timedelta Distance to use if range has zero width
- Returns
- out
python:tuple Expanded range
Examples
>>> expand_range_distinct((3, 8))
(3, 8)
>>> expand_range_distinct((0, 10), (0.1, 0))
(-1.0, 11.0)
>>> expand_range_distinct((0, 10), (0.1, 0, 0.1, 0))
(-1.0, 11.0)
>>> expand_range_distinct((0, 10), (0.1, 0, 0, 0))
(-1.0, 10)
>>> expand_range_distinct((0, 10), (0, 2))
(-2, 12)
>>> expand_range_distinct((0, 10), (0, 2, 0, 2))
(-2, 12)
>>> expand_range_distinct((0, 10), (0, 0, 0, 2))
(0, 12)
>>> expand_range_distinct((0, 10), (.1, 2))
(-3.0, 13.0)
>>> expand_range_distinct((0, 10), (.1, 2, .1, 2))
(-3.0, 13.0)
>>> expand_range_distinct((0, 10), (0, 0, .1, 2))
(0, 13.0)
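How the 2-tuple and 4-tuple forms of expand relate can be sketched in a few lines. This is an illustration for plain numbers based on the examples above; expand_range_distinct_sketch is a hypothetical name and the zero-width handling is an assumption.

```python
def expand_range_distinct_sketch(limits, expand=(0, 0, 0, 0), zero_width=1):
    """Expand each side of (low, high) with its own (mul, add) pair."""
    if len(expand) == 2:
        # A pair means the same constants on both sides.
        expand = tuple(expand) * 2
    mul_lo, add_lo, mul_hi, add_hi = expand
    low, high = limits
    width = high - low
    if width == 0:
        return (low - zero_width / 2, high + zero_width / 2)
    return (low - (mul_lo * width + add_lo), high + (mul_hi * width + add_hi))


lower_only = expand_range_distinct_sketch((0, 10), (0.1, 0, 0, 0))
upper_only = expand_range_distinct_sketch((0, 10), (0, 0, 0.1, 2))
```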
- mizani.bounds.squish(x: FloatArrayLike, range: TupleFloat2 = (0, 1), only_finite: bool = True) -> NDArrayFloat
Squish values into range.
- Parameters
- x
numpy:array_like Values that should have out of range values squished.
- range
python:tuple The range onto which to squish the values.
- only_finite
bool When True, only squishes finite values.
- Returns
- out
numpy:array_like Values with out of range values squished.
Examples
>>> squish([-1.5, 0.2, 0.8, 1.0, 1.2])
array([0. , 0.2, 0.8, 1. , 1. ])
>>> squish([-np.inf, -1.5, 0.2, 0.8, 1.0, np.inf], only_finite=False)
array([0. , 0. , 0.2, 0.8, 1. , 1. ])
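Squishing is a clamp onto the range, with infinities optionally exempt. A minimal pure-Python sketch (squish_sketch is a hypothetical name, not the library function):

```python
import math


def squish_sketch(values, limits=(0, 1), only_finite=True):
    """Clamp values onto `limits`; optionally leave infinities alone."""
    lo, hi = limits
    out = []
    for v in values:
        if only_finite and math.isinf(v):
            out.append(float(v))  # infinities pass through unchanged
        else:
            out.append(float(min(max(v, lo), hi)))
    return out


finite_only = squish_sketch([-math.inf, -1.5, 0.2, 1.2])
everything = squish_sketch([-math.inf, -1.5, 0.2, math.inf], only_finite=False)
```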
breaks - Partitioning a scale for readability
All scales have a means by which the values that are mapped onto the scale are interpreted. Numeric digital scales put out numbers for direct interpretation, but most scales cannot do this. What they offer is named markers/ticks that aid in assessing the values, e.g. the common speedometer will have ticks and values to help gauge the speed of the vehicle.
The named markers are what we call breaks. Properly calculated breaks make interpretation straightforward. These functions provide ways to calculate good (hopefully) breaks.
- class mizani.breaks.breaks_log(n: int = 5, base: float = 10)
Integer breaks on log transformed scales
- Parameters
- n
python:int Desired number of breaks
- base
python:int Base of logarithm
Examples
>>> x = np.logspace(3, 6)
>>> limits = min(x), max(x)
>>> breaks_log()(limits)
array([ 1000, 10000, 100000, 1000000])
>>> breaks_log(2)(limits)
array([ 1000, 100000])
>>> breaks_log()([0.1, 1])
array([0.1, 0.3, 1. , 3. ])
- __call__(limits: TupleFloat2) -> NDArrayFloat
Compute breaks
- Parameters
- limits
python:tuple Minimum and maximum values
- Returns
- out
numpy:array_like Sequence of breaks points
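As a rough illustration of integer breaks on a log scale, the sketch below places a break at every integer power of the base that covers the limits. It is not the breaks_log algorithm: log_breaks_sketch is a hypothetical name, the n parameter is ignored, and the rounding guard is an assumption added to absorb floating-point error in the logarithm.

```python
import math


def log_breaks_sketch(limits, base=10):
    """Breaks at integer powers of `base` spanning (low, high)."""
    low, high = limits
    # Round before floor/ceil so that, e.g., log(1000, 10) counts as exactly 3.
    first = math.floor(round(math.log(low, base), 9))
    last = math.ceil(round(math.log(high, base), 9))
    return [base ** e for e in range(first, last + 1)]


decades = log_breaks_sketch((1000, 1000000))
```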
- class mizani.breaks.breaks_symlog
Breaks for the Symmetric Logarithm Transform
- Parameters
- n
python:int Desired number of breaks
- base
python:int Base of logarithm
Examples
>>> limits = (-100, 100)
>>> breaks_symlog()(limits)
array([-100, -10, 0, 10, 100])
- __call__(limits: TupleFloat2) -> NDArrayFloat
Call self as a function.
- class mizani.breaks.minor_breaks(n: int = 1)
Compute minor breaks
This is the naive method. It does not take into account the transformation.
- Parameters
- n
python:int Number of minor breaks between the major breaks.
Examples
>>> major = [1, 2, 3, 4]
>>> limits = [0, 5]
>>> minor_breaks()(major, limits)
array([0.5, 1.5, 2.5, 3.5, 4.5])
>>> minor_breaks()([1, 2], (1, 2))
array([1.5])
More than 1 minor break.
>>> minor_breaks(3)([1, 2], (1, 2))
array([1.25, 1.5 , 1.75])
>>> minor_breaks()([1, 2], (1, 2), 3)
array([1.25, 1.5 , 1.75])
- __call__(major: FloatArrayLike, limits: TupleFloat2 | None = None, n: int | None = None) -> NDArrayFloat
Minor breaks
- Parameters
- major
numpy:array_like Major breaks
- limits
numpy:array_like | python:None Limits of the scale. If array_like, must be of size 2. If None, then the minimum and maximum of the major breaks are used.
- n
python:int Number of minor breaks between the major breaks. If None, then self.n is used.
- Returns
- out
numpy:array_like Minor breaks
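The naive method can be sketched as evenly dividing each gap between major breaks, including the gaps next to the limits. The version below reproduces the documented examples but is only an illustration: minor_breaks_sketch is a hypothetical name, and it assumes at least two evenly spaced major breaks.

```python
def minor_breaks_sketch(major, limits=None, n=1):
    """Place n evenly spaced minor breaks between consecutive major breaks."""
    if limits is None:
        limits = (min(major), max(major))
    step = major[1] - major[0]
    # Pad with one extra major break on each side so that the gaps next
    # to the limits also receive minor breaks.
    padded = [major[0] - step, *major, major[-1] + step]
    minor = []
    for left, right in zip(padded, padded[1:]):
        for k in range(1, n + 1):
            minor.append(left + (right - left) * k / (n + 1))
    # Keep only the minor breaks that fall inside the limits.
    return [m for m in minor if limits[0] <= m <= limits[1]]


halves = minor_breaks_sketch([1, 2, 3, 4], [0, 5])
quarters = minor_breaks_sketch([1, 2], (1, 2), 3)
```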
- class mizani.breaks.minor_breaks_trans(trans: Trans, n: int = 1)
Compute minor breaks for transformed scales
The minor breaks are computed in data space. This, together with major breaks computed in transform space, reveals the nonlinearity of a scale. See the log transforms created with log_trans() like log10_trans.
- Parameters
- trans
trans or type Trans object or trans class.
- n
python:int Number of minor breaks between the major breaks.
Examples
>>> from mizani.transforms import sqrt_trans
>>> major = [1, 2, 3, 4]
>>> limits = [0, 5]
>>> t1 = sqrt_trans()
>>> t1.minor_breaks(major, limits)
array([1.58113883, 2.54950976, 3.53553391])
Changing to the regular minor_breaks method
>>> t2 = sqrt_trans()
>>> t2.minor_breaks = minor_breaks()
>>> t2.minor_breaks(major, limits)
array([0.5, 1.5, 2.5, 3.5, 4.5])
More than 1 minor break
>>> major = [1, 10]
>>> limits = [1, 10]
>>> t2.minor_breaks(major, limits, 4)
array([2.8, 4.6, 6.4, 8.2])
- __call__(major: FloatArrayLike, limits: TupleFloat2 | None = None, n: int | None = None) -> NDArrayFloat
Minor breaks for transformed scales
- Parameters
- major
numpy:array_like Major breaks
- limits
numpy:array_like | python:None Limits of the scale. If array_like, must be of size 2. If None, then the minimum and maximum of the major breaks are used.
- n
python:int Number of minor breaks between the major breaks. If None, then self.n is used.
- Returns
- out
numpy:array_like Minor breaks
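The numbers in the sqrt_trans example are consistent with the following reading: major breaks are taken back to data space, evenly divided there, and the results are transformed again. This is an inference from the documented output rather than a description of the implementation, and sqrt_minor_breaks_sketch is a hypothetical name.

```python
import math


def sqrt_minor_breaks_sketch(major, n=1):
    """Minor breaks for a square-root scale; `major` is in transformed space."""
    data = [m ** 2 for m in major]  # inverse transform to data space
    minor = []
    for left, right in zip(data, data[1:]):
        for k in range(1, n + 1):
            point = left + (right - left) * k / (n + 1)  # evenly spaced in data space
            minor.append(math.sqrt(point))  # forward transform again
    return minor


minor = sqrt_minor_breaks_sketch([1, 2, 3, 4])
```

The results are not midway between the major breaks, which is how the nonlinearity of the scale shows up.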
- class mizani.breaks.breaks_date(n: int = 5, width: str | None = None)
Regularly spaced dates
- Parameters
- n
Desired number of breaks.
- width
python:str | python:None An interval specification. Must be one of [second, minute, hour, day, week, month, year]. If None, the interval is determined automatically.
Examples
>>> from datetime import datetime
>>> limits = (datetime(2010, 1, 1), datetime(2026, 1, 1))
Default breaks will be regularly spaced but the spacing is automatically determined
>>> breaks = breaks_date(9)
>>> [d.year for d in breaks(limits)]
[2010, 2012, 2014, 2016, 2018, 2020, 2022, 2024, 2026]
Breaks at 4 year intervals
>>> breaks = breaks_date('4 year')
>>> [d.year for d in breaks(limits)]
[2010, 2014, 2018, 2022, 2026]
- __call__(limits: TupleT2[datetime]) -> Sequence[datetime]
Compute breaks
- Parameters
- limits
python:tuple Minimum and maximum datetime.datetime values.
- Returns
- out
numpy:array_like Sequence of break points.
- class mizani.breaks.breaks_timedelta(n: int = 5, Q: Sequence[float] = (1, 2, 5, 10))
Timedelta breaks
- Returns
- out
python:callable() f(limits) A function that takes a sequence of two datetime.timedelta values and returns a sequence of break points.
Examples
>>> from datetime import timedelta
>>> breaks = breaks_timedelta()
>>> x = [timedelta(days=i*365) for i in range(25)]
>>> limits = min(x), max(x)
>>> major = breaks(limits)
>>> [val.total_seconds()/(365*24*60*60) for val in major]
[0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
- __call__(limits: tuple[Timedelta, Timedelta]) -> NDArrayTimedelta
Compute breaks
- Parameters
- limits
python:tuple Minimum and maximum datetime.timedelta values.
- Returns
- out
numpy:array_like Sequence of break points.
- class mizani.breaks.breaks_extended(n: int = 5, Q: Sequence[float] = (1, 5, 2, 2.5, 4, 3), only_inside: bool = False, w: Sequence[float] = (0.25, 0.2, 0.5, 0.05))
An extension of Wilkinson's tick position algorithm
- Parameters
- n
python:int Desired number of breaks
- Q
python:list List of nice numbers
- only_inside
bool If True, then all the breaks will be within the given range.
- w
python:list Weights applied to the four optimization components (simplicity, coverage, density, and legibility). They should add up to 1.
References
- Talbot, J., Lin, S., Hanrahan, P. (2010) An Extension of Wilkinson's Algorithm for Positioning Tick Labels on Axes, InfoVis 2010.
Additional Credit to Justin Talbot on whose code this implementation is almost entirely based.
Examples
>>> limits = (0, 9)
>>> breaks_extended()(limits)
array([ 0. , 2.5, 5. , 7.5, 10. ])
>>> breaks_extended(n=6)(limits)
array([ 0., 2., 4., 6., 8., 10.])
- __call__(limits: TupleFloat2) -> NDArrayFloat
Calculate the breaks
- Parameters
- limits
array Minimum and maximum values.
- Returns
- out
numpy:array_like Sequence of break points.
labels - Labelling breaks
Scales have guides, and these are what help users make sense of the data mapped onto the scale. Common examples of guides include the x-axis, the y-axis, the keyed legend and a colorbar legend. The guides have demarcations (breaks), some of which must be labelled.
The label_* functions below create functions that convert data values as understood by a specific scale and return string representations of those values. Manipulating the string representation of a value helps improve readability of the guide.
- class mizani.labels.label_comma(accuracy: float | None = None, precision: int = 0, scale: float = 1, prefix: str = '', suffix: str = '', big_mark: str = ',', decimal_mark: str = '.', fill: str = '', style_negative: Literal['-', 'hyphen', 'parens'] = '-', style_positive: Literal['', '+', ' '] = '', align: Literal['<', '>', '=', '^'] = '>', width: int | None = None)
Labels of numbers with commas as separators
- Parameters
- precision
python:int Number of digits after the decimal point.
Examples
>>> label_comma()([1000, 2, 33000, 400])
['1,000', '2', '33,000', '400']
- class mizani.labels.label_custom(fmt: str = '{}', style: Literal['old', 'new'] = 'new')
Creating a custom labelling function
- Parameters
- fmt
python:str, optional Format string. Default is the generic new style format braces, {}.
- style
'new' | 'old' Whether to use new style or old style formatting. New style uses the str.format() while old style uses %. The format string must be written accordingly.
Examples
>>> label = label_custom('{:.2f} USD')
>>> label([3.987, 2, 42.42])
['3.99 USD', '2.00 USD', '42.42 USD']
- __call__(x: FloatArrayLike) -> Sequence[str]
Format a sequence of inputs
- Parameters
- x
array Input
- Returns
- out
python:list List of strings.
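The difference between the two styles is the difference between str.format() and the % operator in plain Python. A minimal sketch of a label factory that supports both (label_custom_sketch is a hypothetical name, not the library class):

```python
def label_custom_sketch(fmt="{}", style="new"):
    """Return a function that formats each value with fmt."""
    def label(values):
        if style == "new":
            return [fmt.format(v) for v in values]  # str.format() syntax
        return [fmt % v for v in values]  # printf-style syntax
    return label


new_style = label_custom_sketch("{:.2f} USD")
old_style = label_custom_sketch("%.2f USD", style="old")
```

Both new_style and old_style produce the same strings; only the format string syntax differs.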
- class mizani.labels.label_currency(accuracy: float | None = None, precision: int | None = None, scale: float = 1, prefix: str = '$', suffix: str = '', big_mark: str = '', decimal_mark: str = '.', fill: str = '', style_negative: Literal['-', 'hyphen', 'parens'] = '-', style_positive: Literal['', '+', ' '] = '', align: Literal['<', '>', '=', '^'] = '>', width: int | None = None)
Labelling currencies
- Parameters
- prefix
python:str What to put before the value.
Examples
>>> x = [1.232, 99.2334, 4.6, 9, 4500]
>>> label_currency()(x)
['$1.23', '$99.23', '$4.60', '$9.00', '$4500.00']
>>> label_currency(prefix='C$', precision=0, big_mark=',')(x)
['C$1', 'C$99', 'C$5', 'C$9', 'C$4,500']
- mizani.labels.label_dollar
alias of label_currency
- class mizani.labels.label_percent(accuracy: float | None = None, precision: int | None = None, scale: float = 100, prefix: str = '', suffix: str = '%', big_mark: str = '', decimal_mark: str = '.', fill: str = '', style_negative: Literal['-', 'hyphen', 'parens'] = '-', style_positive: Literal['', '+', ' '] = '', align: Literal['<', '>', '=', '^'] = '>', width: int | None = None)
Labelling percentages
Multiply by one hundred and display percent sign
Examples
>>> label = label_percent()
>>> label([.45, 9.515, .01])
['45%', '952%', '1%']
>>> label([.654, .8963, .1])
['65%', '90%', '10%']
- class mizani.labels.label_scientific(digits: int = 3)
Scientific number labels
- Parameters
- digits
python:int Significant digits.
Notes
Be careful when using many digits (15+ on a 64 bit computer). Consider the machine epsilon.
Examples
>>> x = [.12, .23, .34, 45]
>>> label_scientific()(x)
['1.2e-01', '2.3e-01', '3.4e-01', '4.5e+01']
- __call__(x: FloatArrayLike) -> Sequence[str]
Call self as a function.
- class mizani.labels.label_date(fmt: str = '%Y-%m-%d', tz: tzinfo | None = None)
Datetime labels
- Parameters
- fmt
python:str Format string. See strftime.
- tz
datetime.tzinfo, optional Time zone information. If none is specified, the time zone will be that of the first date. If the first date has no time information then a time zone is chosen by other means.
Examples
>>> from datetime import datetime
>>> from zoneinfo import ZoneInfo
>>> x = [datetime(x, 1, 1) for x in [2010, 2014, 2018, 2022]]
>>> label_date()(x)
['2010-01-01', '2014-01-01', '2018-01-01', '2022-01-01']
>>> label_date('%Y')(x)
['2010', '2014', '2018', '2022']
Can format time
>>> x = [datetime(2017, 12, 1, 16, 5, 7)]
>>> label_date("%Y-%m-%d %H:%M:%S")(x)
['2017-12-01 16:05:07']
Time zones are respected
>>> UTC = ZoneInfo('UTC')
>>> UG = ZoneInfo('Africa/Kampala')
>>> x = [datetime(2010, 1, 1, i) for i in [8, 15]]
>>> x_tz = [datetime(2010, 1, 1, i, tzinfo=UG) for i in [8, 15]]
>>> label_date('%Y-%m-%d %H:%M')(x)
['2010-01-01 08:00', '2010-01-01 15:00']
>>> label_date('%Y-%m-%d %H:%M')(x_tz)
['2010-01-01 08:00', '2010-01-01 15:00']
Format with a specific time zone
>>> label_date('%Y-%m-%d %H:%M', tz=UTC)(x_tz)
['2010-01-01 05:00', '2010-01-01 12:00']
>>> label_date('%Y-%m-%d %H:%M', tz='EST')(x_tz)
['2010-01-01 00:00', '2010-01-01 07:00']
- __call__(x: Sequence[datetime]) -> Sequence[str]
Format a sequence of inputs
- Parameters
- x
array Input
- Returns
- out
python:list List of strings.
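The time zone behaviour in the examples corresponds to converting each datetime with astimezone() before calling strftime(). A minimal standard-library sketch, using a fixed UTC+3 offset as a stand-in for Africa/Kampala so that no time zone database is needed (label_date_sketch and EAT are hypothetical names):

```python
from datetime import datetime, timedelta, timezone

EAT = timezone(timedelta(hours=3))  # fixed UTC+3 offset, assumed stand-in


def label_date_sketch(fmt="%Y-%m-%d", tz=None):
    """Return a function that formats datetimes, optionally in time zone tz."""
    def label(values):
        if tz is not None:
            values = [v.astimezone(tz) for v in values]
        return [v.strftime(fmt) for v in values]
    return label


x_tz = [datetime(2010, 1, 1, hour, tzinfo=EAT) for hour in (8, 15)]
local = label_date_sketch("%Y-%m-%d %H:%M")(x_tz)
in_utc = label_date_sketch("%Y-%m-%d %H:%M", tz=timezone.utc)(x_tz)
```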
- class mizani.labels.label_number(accuracy: float | None = None, precision: int | None = None, scale: float = 1, prefix: str = '', suffix: str = '', big_mark: str = '', decimal_mark: str = '.', fill: str = '', style_negative: Literal['-', 'hyphen', 'parens'] = '-', style_positive: Literal['', '+', ' '] = '', align: Literal['<', '>', '=', '^'] = '>', width: int | None = None)
Labelling numbers
- Parameters
- precision
python:int Number of digits after the decimal point.
- suffix
python:str What to put after the value.
- big_mark
python:str The thousands separator. This is usually a comma or a dot.
- decimal_mark
python:str What to use to separate the decimals digits.
Examples
>>> label_number()([.654, .8963, .1])
['0.65', '0.90', '0.10']
>>> label_number(accuracy=0.0001)([.654, .8963, .1])
['0.6540', '0.8963', '0.1000']
>>> label_number(precision=4)([.654, .8963, .1])
['0.6540', '0.8963', '0.1000']
>>> label_number(prefix="$")([5, 24, -42])
['$5', '$24', '-$42']
>>> label_number(suffix="s")([5, 24, -42])
['5s', '24s', '-42s']
>>> label_number(big_mark="_")([1e3, 1e4, 1e5, 1e6])
['1_000', '10_000', '100_000', '1_000_000']
>>> label_number(width=3)([1, 10, 100, 1000])
['  1', ' 10', '100', '1000']
>>> label_number(align="^", width=5)([1, 10, 100, 1000])
['  1  ', ' 10  ', ' 100 ', '1000 ']
>>> label_number(style_positive=" ")([5, 24, -42])
[' 5', ' 24', '-42']
>>> label_number(style_positive="+")([5, 24, -42])
['+5', '+24', '-42']
>>> label_number(prefix="$", style_negative="parens")([5, 24, -42])
['$5', '$24', '($42)']
- __call__(x: FloatArrayLike) -> Sequence[str]
Call self as a function.
- class mizani.labels.label_log(base: float = 10, exponent_limits: TupleInt2 = (-4, 4), mathtex: bool = False)
Log number labels
- Parameters
- base
python:int Base of the logarithm. Default is 10.
- exponent_limits
python:tuple limits (int, int). If any of the powers of the numbers falls outside these limits, then the labels will be in exponent form. This only applies for base 10.
- mathtex
bool If True, return the labels in mathtex format as understood by Matplotlib.
Examples
>>> label_log()([0.001, 0.1, 100])
['0.001', '0.1', '100']
>>> label_log()([0.0001, 0.1, 10000])
['1e-4', '1e-1', '1e4']
>>> label_log(mathtex=True)([0.0001, 0.1, 10000])
['$10^{-4}$', '$10^{-1}$', '$10^{4}$']
- __call__(x: FloatArrayLike) -> Sequence[str]
Format a sequence of inputs
- Parameters
- x
array Input
- Returns
- out
python:list List of strings.
- class mizani.labels.label_timedelta(units: DurationUnit | None = None, show_units: bool = True, zero_has_units: bool = True, usetex: bool = False, space: bool = True, use_plurals: bool = True)
Timedelta labels
- Parameters
- units
python:str, optional The units in which the breaks will be computed. If None, they are decided automatically. Otherwise, the value should be one of:
'ns' # nanoseconds
'us' # microseconds
'ms' # milliseconds
's' # seconds
'min' # minute
'h' # hour
'day' # day
'week' # week
'month' # month
'year' # year
- show_units
bool Whether to append the units symbol to the values.
- zero_has_units
bool If True, a value of zero is shown with units.
- usetex
bool If True, the microseconds identifier string is rendered with the Greek letter mu. Default is False.
- space
bool If True add a space between the value and the units
- use_plurals
bool If True, when the value is not 1 and the units are one of week, month and year, the plural form of the unit is used, e.g. 2 weeks.
Examples
>>> from datetime import timedelta
>>> x = [timedelta(days=31*i) for i in range(5)]
>>> label_timedelta()(x)
['0 months', '1 month', '2 months', '3 months', '4 months']
>>> label_timedelta(use_plurals=False)(x)
['0 month', '1 month', '2 month', '3 month', '4 month']
>>> label_timedelta(units='day')(x)
['0 days', '31 days', '62 days', '93 days', '124 days']
>>> label_timedelta(units='day', zero_has_units=False)(x)
['0', '31 days', '62 days', '93 days', '124 days']
>>> label_timedelta(units='day', show_units=False)(x)
['0', '31', '62', '93', '124']
- __call__(x: NDArrayTimedelta) -> Sequence[str]
Call self as a function.
- class mizani.labels.label_pvalue(accuracy: float = 0.001, add_p: float = False)
p-values labelling
- Parameters
- accuracy
python:float Number to round to
- add_p
bool Whether to prepend "p=" or "p<" to the output
Examples
>>> x = [.90, .15, .015, .009, 0.0005]
>>> label_pvalue()(x)
['0.9', '0.15', '0.015', '0.009', '<0.001']
>>> label_pvalue(0.1)(x)
['0.9', '0.1', '<0.1', '<0.1', '<0.1']
>>> label_pvalue(0.1, True)(x)
['p=0.9', 'p=0.1', 'p<0.1', 'p<0.1', 'p<0.1']
- __call__(x: FloatArrayLike) -> Sequence[str]
Format a sequence of inputs
- Parameters
- x
array Input
- Returns
- out
python:list List of strings.
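The rule the examples follow is: values below accuracy are reported as "<accuracy", and everything else is rounded to a multiple of accuracy. A minimal sketch of that rule (label_pvalue_sketch is a hypothetical name; the use of the g format to trim trailing zeros is an assumption of this sketch):

```python
def label_pvalue_sketch(accuracy=0.001, add_p=False):
    """Return a function that formats p-values to the given accuracy."""
    def label(values):
        out = []
        for v in values:
            if v < accuracy:
                text, sign = f"{accuracy:g}", "<"
            else:
                # Round to the nearest multiple of accuracy.
                rounded = round(v / accuracy) * accuracy
                text, sign = f"{rounded:g}", "="
            if add_p:
                out.append(f"p{sign}{text}")
            else:
                out.append(f"<{text}" if sign == "<" else text)
        return out
    return label


plain = label_pvalue_sketch()([.90, .15, .015, .009, 0.0005])
with_p = label_pvalue_sketch(0.1, True)([.90, .015, 0.0005])
```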
- class mizani.labels.label_ordinal(prefix: str = '', suffix: str = '', big_mark: str = '')
Ordinal number labelling
- Parameters
- prefix
python:str What to put before the value.
- suffix
python:str What to put after the value.
- big_mark
python:str The thousands separator. This is usually a comma or a dot.
Examples
>>> label_ordinal()(range(8))
['0th', '1st', '2nd', '3rd', '4th', '5th', '6th', '7th']
>>> label_ordinal(suffix=' Number')(range(11, 15))
['11th Number', '12th Number', '13th Number', '14th Number']
- __call__(x: FloatArrayLike) -> Sequence[str]
Call self as a function.
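The English ordinal rule behind these labels fits in a few lines. The sketch below is a generic illustration, not the library code, and ordinal_sketch is a hypothetical name.

```python
def ordinal_sketch(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # Numbers whose last two digits are 10 to 20 always take "th",
    # which covers the exceptions 11th, 12th and 13th.
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


first_eight = [ordinal_sketch(i) for i in range(8)]
teens = [ordinal_sketch(i) for i in range(11, 15)]
```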
- class mizani.labels.label_bytes(symbol: Literal['auto'] | BytesSymbol = 'auto', units: Literal['binary', 'si'] = 'binary', fmt: str = '{:.0f} ')
Labelling byte numbers
- Parameters
- symbol
python:str Valid symbols are "B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", and "YB" for SI units, and the "iB" variants for binary units. Default is "auto" where the symbol to be used is determined separately for each value of x.
- units
"binary" | "si" Which unit base to use, 1024 for "binary" or 1000 for "si".
- fmt
python:str, optional Format string. Default is {:.0f}.
Examples
>>> x = [1000, 1000000, 4e5]
>>> label_bytes()(x)
['1000 B', '977 KiB', '391 KiB']
>>> label_bytes(units='si')(x)
['1 kB', '1 MB', '400 kB']
- __call__(x: FloatArrayLike) -> Sequence[str]
Call self as a function.
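The "auto" behaviour can be pictured as picking, for each value, the largest unit that the value fills at least once. The sketch below reproduces the examples above under that assumption; label_bytes_sketch is a hypothetical name and only units up to tera are listed.

```python
def label_bytes_sketch(units="binary", fmt="{:.0f} "):
    """Return a function that labels byte counts with an automatic unit."""
    if units == "binary":
        base, symbols = 1024, ["B", "KiB", "MiB", "GiB", "TiB"]
    else:
        base, symbols = 1000, ["B", "kB", "MB", "GB", "TB"]

    def label(values):
        out = []
        for v in values:
            power = 0
            # Move to a larger unit while the value fills at least one of it.
            while power < len(symbols) - 1 and v >= base ** (power + 1):
                power += 1
            out.append(fmt.format(v / base ** power) + symbols[power])
        return out
    return label


binary = label_bytes_sketch()([1000, 1000000, 4e5])
si = label_bytes_sketch(units="si")([1000, 1000000, 4e5])
```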
palettes - Mapping values onto the domain of a scale
Palettes are the link between data values and the values along the dimension of a scale. Before a collection of values can be represented on a scale, they are transformed by a palette. This transformation is known as mapping. Values are mapped onto a scale by a palette.
Scales tend to have restrictions on the magnitude of quantities that they can intelligibly represent. For example, the size of a point should be significantly smaller than the plot panel onto which it is plotted or else it would be hard to compare two or more points. Therefore palettes must be created that enforce such restrictions. This is the reason for the *_pal functions that create and return the actual palette functions.
- mizani.palettes.hls_palette(n_colors: int = 6, h: float = 0.01, l: float = 0.6, s: float = 0.65) -> Sequence[TupleFloat3]
Get a set of evenly spaced colors in HLS hue space.
h, l, and s should be between 0 and 1
- Parameters
- n_colors
python:int number of colors in the palette
- h
python:float first hue
- l
python:float lightness
- s
python:float saturation
- Returns
- palette
python:list List of colors as RGB hex strings.
SEE ALSO:
- hsluv_palette
Make a palette using evenly spaced circular hues in the HSLuv system.
Examples
>>> len(hls_palette(2))
2
>>> len(hls_palette(9))
9
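Evenly spaced hues can be generated with the standard-library colorsys module. The sketch below illustrates the idea and returns RGB tuples; it is not guaranteed to match the library's output, and hls_palette_sketch is a hypothetical name.

```python
import colorsys


def hls_palette_sketch(n_colors=6, h=0.01, l=0.6, s=0.65):
    """Return n_colors RGB tuples with evenly spaced hues."""
    # Step around the hue circle, wrapping back into [0, 1).
    hues = [(h + i / n_colors) % 1 for i in range(n_colors)]
    return [colorsys.hls_to_rgb(hue, l, s) for hue in hues]


colors = hls_palette_sketch(9)
```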
- mizani.palettes.hsluv_palette(n_colors: int = 6, h: float = 0.01, s: float = 0.9, l: float = 0.65) -> Sequence[TupleFloat3]
Get a set of evenly spaced colors in HSLuv hue space.
h, s, and l should be between 0 and 1
- Parameters
- n_colors
python:int number of colors in the palette
- h
python:float first hue
- s
python:float saturation
- l
python:float lightness
- Returns
- palette
python:list List of colors as RGB hex strings.
SEE ALSO:
- hls_palette
Make a palette using evenly spaced circular hues in the HSL system.
Examples
>>> len(hsluv_palette(3))
3
>>> len(hsluv_palette(11))
11
- class mizani.palettes.rescale_pal(range: TupleFloat2 = (0.1, 1))
Rescale the input to the specific output range.
Useful for alpha, size, and continuous position.
- Parameters
- range
python:tuple Range of the scale
- Returns
- out
function Palette function that takes a sequence of values in the range [0, 1] and returns values in the specified range.
Examples
>>> palette = rescale_pal()
>>> palette([0, .2, .4, .6, .8, 1])
array([0.1 , 0.28, 0.46, 0.64, 0.82, 1. ])
The returned palette expects inputs in the [0, 1] range. Any value outside those limits is clipped to range[0] or range[1].
>>> palette([-2, -1, 0.2, .4, .8, 2, 3])
array([0.1 , 0.1 , 0.28, 0.46, 0.82, 1. , 1. ])
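The palette-factory pattern used throughout this module, a function that returns the actual mapping function, can be sketched for this case as clip then scale. This is an illustration consistent with the examples above; rescale_pal_sketch is a hypothetical name.

```python
def rescale_pal_sketch(limits=(0.1, 1)):
    """Return a palette mapping [0, 1] onto `limits`, clipping other input."""
    lo, hi = limits

    def palette(values):
        clipped = [min(max(v, 0), 1) for v in values]  # force into [0, 1]
        return [lo + v * (hi - lo) for v in clipped]
    return palette


palette = rescale_pal_sketch()
inside = palette([0, .2, .4, .6, .8, 1])
outside = palette([-2, 3])
```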
- class mizani.palettes.area_pal(range: TupleFloat2 = (1, 6))
Point area palette (continuous).
- Parameters
- range
python:tuple Numeric vector of length two, giving range of possible sizes. Should be greater than 0.
- Returns
- out
function Palette function that takes a sequence of values in the range [0, 1] and returns values in the specified range.
Examples
>>> x = np.arange(0, .6, .1)**2
>>> palette = area_pal()
>>> palette(x)
array([1. , 1.5, 2. , 2.5, 3. , 3.5])
The results are equidistant because the input x is in area space, i.e. it is squared.
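The equidistant output follows if the palette takes the square root of its input before rescaling onto the range. A minimal sketch under that assumption (area_pal_sketch is a hypothetical name; unlike the array version, math.sqrt raises an error for negative input):

```python
import math


def area_pal_sketch(limits=(1, 6)):
    """Return a palette that maps sqrt(value) linearly onto `limits`."""
    lo, hi = limits

    def palette(values):
        return [lo + math.sqrt(v) * (hi - lo) for v in values]
    return palette


x = [(i / 10) ** 2 for i in range(6)]  # squared inputs: 0, 0.01, 0.04, ...
sizes = area_pal_sketch()(x)
```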
- class mizani.palettes.abs_area(max: float)
Point area palette (continuous), with area proportional to value.
- Parameters
- max
python:float A number representing the maximum size
- Returns
- out
function Palette function that takes a sequence of values in the range [0, 1] and returns values in the range [0, max].
Examples
>>> x = np.arange(0, .8, .1)**2
>>> palette = abs_area(5)
>>> palette(x)
array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5])
Compared to area_pal(), abs_area() will handle values in the range [-1, 0] without returning np.nan. And values whose absolute value is greater than 1 will be clipped to the maximum.
- class mizani.palettes.grey_pal(start: float = 0.2, end: float = 0.8)
Utility for creating continuous grey scale palette
- Parameters
- start
python:float grey value at low end of palette
- end
python:float grey value at high end of palette
- Returns
- out
function Continuous color palette that takes a single int parameter n and returns n equally spaced colors.
Examples
>>> palette = grey_pal()
>>> palette(5)
['#333333', '#737373', '#989898', '#b4b4b4', '#cccccc']
- class mizani.palettes.hue_pal(h: float = 0.01, l: float = 0.6, s: float = 0.65, color_space: Literal['hls', 'hsluv'] = 'hls')
Utility for making hue palettes for color schemes.
- Parameters
- h
python:float first hue. In the [0, 1] range
- l
python:float lightness. In the [0, 1] range
- s
python:float saturation. In the [0, 1] range
- color_space
'hls' | 'hsluv' Color space to use for the palette. hls for https://en.wikipedia.org/wiki/HSL_and_HSV or hsluv for https://www.hsluv.org/.
- Returns
- out
function A discrete color palette that takes a single int parameter n and returns n equally spaced colors. Though the palette is continuous, since it varies the hue it is good for categorical data. However, if n is large enough the colors show continuity.
Examples
>>> hue_pal()(5)
['#db5f57', '#b9db57', '#57db94', '#5784db', '#c957db']
>>> hue_pal(color_space='hsluv')(5)
['#e0697e', '#9b9054', '#569d79', '#5b98ab', '#b675d7']
- class mizani.palettes.brewer_pal(type: ColorScheme | ColorSchemeShort = 'seq', palette: int | str = 1, direction: Literal[1, -1] = 1)
Utility for making a brewer palette
- Parameters
- type
'sequential' | 'qualitative' | 'diverging' Type of palette. Sequential, Qualitative or Diverging. The following abbreviations may be used, seq, qual or div.
- palette
python:int | python:str Which palette to choose from. If it is an integer, it must be in the range [0, m], where m depends on the number of sequential, qualitative or diverging palettes. If it is a string, then it is the name of the palette.
- direction
python:int The order of colours in the scale. If -1 the order of colors is reversed. The default is 1.
- Returns
- out
function A color palette that takes a single int parameter n and returns n colors. The maximum value of n varies depending on the parameters.
Examples
>>> brewer_pal()(5)
['#EFF3FF', '#BDD7E7', '#6BAED6', '#3182BD', '#08519C']
>>> brewer_pal('qual')(5)
['#7FC97F', '#BEAED4', '#FDC086', '#FFFF99', '#386CB0']
>>> brewer_pal('qual', 2)(5)
['#1B9E77', '#D95F02', '#7570B3', '#E7298A', '#66A61E']
>>> brewer_pal('seq', 'PuBuGn')(5)
['#F6EFF7', '#BDC9E1', '#67A9CF', '#1C9099', '#016C59']
The available color names for each palette type can be obtained using the following code:
from mizani._colors.brewer import get_palette_names
print(get_palette_names("sequential"))
print(get_palette_names("qualitative"))
print(get_palette_names("diverging"))
- class mizani.palettes.gradient_n_pal(colors: Sequence[str], values: Sequence[float] | None = None)
Create an n color gradient palette
- Parameters
- colors
python:list list of colors
- values
python:list, optional list of points in the range [0, 1] at which to place each color. Must be the same size as colors. Defaults to evenly spaced colors.
- Returns
- out
function Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps those value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].
Examples
>>> palette = gradient_n_pal(['red', 'blue'])
>>> palette([0, .25, .5, .75, 1])
['#ff0000', '#bf0040', '#7f0080', '#4000bf', '#0000ff']
>>> palette([-np.inf, 0, np.nan, 1, np.inf])
[None, '#ff0000', None, '#0000ff', None]
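The idea behind a gradient palette can be illustrated with straightforward linear interpolation between RGB colors. The sketch below is an assumption about the general technique, not mizani's code: gradient_sketch is a hypothetical name, colors are given as RGB tuples rather than names, and intermediate colors may round differently from the library's output.

```python
def gradient_sketch(colors):
    """Build a palette that linearly interpolates between RGB tuples.

    `colors` holds at least two (r, g, b) tuples with channels in 0-255,
    placed at evenly spaced points over [0, 1]. Hypothetical helper.
    """
    def to_hex(rgb):
        return "#{:02x}{:02x}{:02x}".format(*(int(round(c)) for c in rgb))

    def palette(values):
        last = len(colors) - 1
        out = []
        for v in values:
            # Out-of-range values fail this test, and so does NaN.
            if not (0 <= v <= 1):
                out.append(None)
                continue
            pos = v * last                  # position along the color list
            i = min(int(pos), last - 1)     # index of the left-hand color
            t = pos - i                     # fraction between left and right
            lo, hi = colors[i], colors[i + 1]
            out.append(to_hex(tuple(a + (b - a) * t for a, b in zip(lo, hi))))
        return out

    return palette


palette = gradient_sketch([(255, 0, 0), (0, 0, 255)])
result = palette([0, 1, float("nan"), float("inf")])
print(result)
```

As in the documented example, values outside [0, 1] and NaN map to None, while the end points return the first and last colors.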
- class mizani.palettes.cmap_pal(name: str)
Create a continuous palette using a colormap
- Parameters
- name
python:str Name of colormap
- Returns
- out
function Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps those value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].
Examples
>>> palette = cmap_pal('viridis')
>>> palette([.1, .2, .3, .4, .5])
['#482475', '#414487', '#355f8d', '#2a788e', '#21918c']
- class mizani.palettes.cmap_d_pal(name: str)
Create a discrete palette from a colormap
- Parameters
- name
python:str Name of colormap
- Returns
- out
function A discrete color palette that takes a single int parameter n and returns n colors. The maximum value of n varies depending on the parameters.
Examples
>>> palette = cmap_d_pal('viridis')
>>> palette(5)
['#440154', '#3b528b', '#21918c', '#5ec962', '#fde725']
- class mizani.palettes.desaturate_pal(color: str, prop: float, reverse: bool = False)
Create a palette that desaturates a color by some proportion
- Parameters
- color
color html color name, hex, rgb-tuple
- prop
python:float saturation channel of color will be multiplied by this value
- reverse
bool Whether to reverse the palette.
- Returns
- out
function Continuous color palette that takes a single parameter, either a float or a sequence of floats, maps those value(s) onto the palette and returns color(s). The float(s) must be in the range [0, 1].
Examples
>>> palette = desaturate_pal('red', .1)
>>> palette([0, .25, .5, .75, 1])
['#ff0000', '#e21d1d', '#c53a3a', '#a95656', '#8c7373']
- class mizani.palettes.manual_pal(values: Sequence[Any])
Create a palette from a list of values
- Parameters
- values
python:sequence Values that will be returned by the palette function.
- Returns
- out
function A function palette that takes a single int parameter n and returns n values.
Examples
>>> palette = manual_pal(['a', 'b', 'c', 'd', 'e'])
>>> palette(3)
['a', 'b', 'c']
- mizani.palettes.xkcd_palette(colors: Sequence[str]) -> Sequence[RGBHexColor]
Make a palette with color names from the xkcd color survey.
See xkcd for the full list of colors: http://xkcd.com/color/rgb/
- Parameters
- colors
python:list of strings List of keys in the mizani.colors.xkcd_rgb dictionary.
- Returns
- palette
python:list List of colors as RGB hex strings.
Examples
>>> palette = xkcd_palette(['red', 'green', 'blue'])
>>> palette
['#E50000', '#15B01A', '#0343DF']
>>> from mizani._colors.named_colors import XKCD
>>> list(sorted(XKCD.keys()))[:4]
['xkcd:acid green', 'xkcd:adobe', 'xkcd:algae', 'xkcd:algae green']
- mizani.palettes.crayon_palette(colors: Sequence[str]) -> Sequence[RGBHexColor]
Make a palette with color names from Crayola crayons.
The colors come from http://en.wikipedia.org/wiki/List_of_Crayola_crayon_colors
- Parameters
- colors
python:list of strings List of keys in the mizani.colors.crayloax_rgb dictionary.
- Returns
- palette
python:list List of colors as RGB hex strings.
Examples
>>> palette = crayon_palette(['almond', 'silver', 'yellow'])
>>> palette
['#EED9C4', '#C9C0BB', '#FBE870']
>>> from mizani._colors.named_colors import CRAYON
>>> list(sorted(CRAYON.keys()))[:3]
['crayon:almond', 'crayon:antique brass', 'crayon:apricot']
- class mizani.palettes.cubehelix_pal(start: int = 0, rotation: float = 0.4, gamma: float = 1.0, hue: float = 0.8, light: float = 0.85, dark: float = 0.15, reverse: bool = False)
Utility for creating a discrete palette from the cubehelix system.
This produces a colormap with linearly-decreasing (or increasing) brightness. That means that information will be preserved if printed to black and white or viewed by someone who is colorblind.
- Parameters
- start
python:float (0 <= start <= 3) The hue at the start of the helix.
- rotation
python:float Rotations around the hue wheel over the range of the palette.
- gamma
python:float (0 <= gamma) Gamma factor to emphasize darker (gamma < 1) or lighter (gamma > 1) colors.
- hue
python:float (0 <= hue <= 1) Saturation of the colors.
- dark
python:float (0 <= dark <= 1) Intensity of the darkest color in the palette.
- light
python:float (0 <= light <= 1) Intensity of the lightest color in the palette.
- reverse
bool If True, the palette will go from dark to light.
- Returns
- out
function Continuous color palette that takes a single int parameter n and returns n equally spaced colors.
References
Green, D. A. (2011). "A colour scheme for the display of astronomical intensity images". Bulletin of the Astronomical Society of India, Vol. 39, p. 289-295.
Examples
>>> palette = cubehelix_pal()
>>> palette(5)
['#edd1cb', '#d499a7', '#aa678f', '#6e4071', '#2d1e3e']
- mizani.palettes.identity_pal() -> Callable[[T], T]
Create a palette that maps values onto themselves
- Returns
- out
function Palette function that takes a value or sequence of values and returns the same values.
Examples
>>> palette = identity_pal()
>>> palette(9)
9
>>> palette([2, 4, 6])
[2, 4, 6]
- class mizani.palettes.none_pal
Discrete palette that returns only None values
transforms - Transforming variables, scales and coordinates
"The Grammar of Graphics (2005)" by Wilkinson, Anand and Grossman describes three types of transformations.
- Variable transformations - Used to make statistical operations on variables appropriate and meaningful. They are also used to create new variables.
- Scale transformations - Used to make statistical objects displayed on dimensions appropriate and meaningful.
- Coordinate transformations - Used to manipulate the geometry of graphics to help perceive relationships and find meaningful structures for representing variations.
Variable and scale transformations are similar in that they lead to plotted objects that are indistinguishable. Typically, variable transformation is done outside the graphics system and so the system cannot provide transformation-specific guides and decorations for the plot. The trans class is aimed at being useful for scale and coordinate transformations.
- class mizani.transforms.asn_trans(**kwargs: Any)
Arc-sin square-root Transformation
- static transform(x: FloatArrayLike) -> NDArrayFloat
Transform of x
- static inverse(x: FloatArrayLike) -> NDArrayFloat
Inverse of x
- class mizani.transforms.atanh_trans(**kwargs: Any)
Inverse hyperbolic tangent (arctanh) Transformation
transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'arctanh'>
inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'tanh'>
- mizani.transforms.boxcox_trans(p, offset=0, **kwargs)
Boxcox Transformation
The Box-Cox transformation is a flexible transformation, often used to transform data towards normality.
The Box-Cox power transformation (type 1) requires strictly positive values and takes the following form for y \gt 0 and \lambda \neq 0:
y^{(\lambda)} = \frac{y^\lambda - 1}{\lambda}
When \lambda = 0, the natural log transform is used.
- Parameters
- p
python:float Transformation exponent \lambda.
- offset
python:int Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 0. modulus_trans() sets the default to 1.
- kwargs
python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- SEE ALSO:
modulus_trans()
References
- Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 211-252. https://www.jstor.org/stable/2984418
- John, J. A., & Draper, N. R. (1980). An alternative family of transformations. Applied Statistics, 190-197. http://www.jstor.org/stable/2986305
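As an illustration of the Box-Cox formula given for boxcox_trans above, the following plain-Python sketch applies the power form for a non-zero exponent and the natural log when the exponent is 0. The function names are hypothetical, and adding the offset to the value before transforming is an assumption based on the usual definition of Box-Cox type 2.

```python
import math


def boxcox_sketch(y, p, offset=0):
    """Box-Cox transform of y with exponent p (hypothetical helper)."""
    y = y + offset
    if y <= 0:
        raise ValueError("Box-Cox requires y + offset to be strictly positive")
    if p == 0:
        # Limiting case of the power form as the exponent goes to zero.
        return math.log(y)
    return (y ** p - 1) / p


def boxcox_inverse_sketch(z, p, offset=0):
    """Undo boxcox_sketch for the same p and offset."""
    if p == 0:
        return math.exp(z) - offset
    return (z * p + 1) ** (1 / p) - offset


z = boxcox_sketch(4, 0.5)
print(z, boxcox_inverse_sketch(z, 0.5))
```

Transforming and then inverting with the same exponent and offset recovers the original value, which is the property a transform and its inverse need to satisfy.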
- mizani.transforms.modulus_trans(p, offset=1, **kwargs)
Modulus Transformation
The modulus transformation generalises Box-Cox to work with both positive and negative values.
When \lambda \neq 0
y^{(\lambda)} = sign(y) * \frac{(|y| + 1)^\lambda - 1}{\lambda}
and when \lambda = 0
y^{(\lambda)} = sign(y) * \ln{(|y| + 1)}
- Parameters
- p
python:float Transformation exponent \lambda.
- offset
python:int Constant offset. 0 for Box-Cox type 1, otherwise any non-negative constant (Box-Cox type 2). The default is 1. boxcox_trans() sets the default to 0.
- kwargs
python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- SEE ALSO:
boxcox_trans()
References
- Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 211-252. https://www.jstor.org/stable/2984418
- John, J. A., & Draper, N. R. (1980). An alternative family of transformations. Applied Statistics, 190-197. http://www.jstor.org/stable/2986305
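The modulus formula given for modulus_trans above can be sketched in plain Python as follows. The name modulus_sketch is hypothetical, and replacing the constant 1 in the formula with the offset parameter is an assumption.

```python
import math


def modulus_sketch(y, p, offset=1):
    """Modulus transform of y with exponent p (hypothetical helper)."""
    # sign(y) is 0 at y == 0, so zero always maps to zero.
    sign = math.copysign(1.0, y) if y != 0 else 0.0
    if p == 0:
        return sign * math.log(abs(y) + offset)
    return sign * ((abs(y) + offset) ** p - 1) / p


pos = modulus_sketch(3, 0.5)
neg = modulus_sketch(-3, 0.5)
print(pos, neg)
```

Unlike the Box-Cox sketch, negative inputs are accepted and the result is antisymmetric about zero.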
- class mizani.transforms.datetime_trans(tz=None, **kwargs)
Datetime Transformation
- Parameters
- tz
python:str | ZoneInfo Timezone information
Examples
>>> from datetime import datetime
>>> from zoneinfo import ZoneInfo
>>> UTC = ZoneInfo("UTC")
>>> EST = ZoneInfo("EST")
>>> t = datetime_trans(EST)
>>> x = [datetime(2022, 1, 20, tzinfo=UTC)]
>>> x2 = t.inverse(t.transform(x))
>>> list(x) == list(x2)
True
>>> x[0].tzinfo == x2[0].tzinfo
False
>>> x[0].tzinfo.key
'UTC'
>>> x2[0].tzinfo.key
'EST'
- breaks_: BreaksFunction = <mizani.breaks.breaks_date object>
Callable to calculate breaks
- format: FormatFunction = label_date(fmt='%Y-%m-%d', tz=None)
Function to format breaks
- transform(x: DatetimeArrayLike) -> NDArrayFloat
Transform from date to a numerical format
- inverse(x: FloatArrayLike) -> NDArrayDatetime
Transform to date from numerical format
- property tzinfo
Alias of tz
- mizani.transforms.exp_trans(base: float | None = None, **kwargs: Any)
Create an exponential transform class for base
This is inverse of the log transform.
- Parameters
- base
python:float Base of the logarithm
- kwargs
python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- Returns
- out
type Exponential transform class
- class mizani.transforms.identity_trans(**kwargs: Any)
Identity Transformation
Examples
The default trans returns one minor break between every pair of major breaks
>>> major = [0, 1, 2]
>>> t = identity_trans()
>>> t.minor_breaks(major)
array([0.5, 1.5])
Create a trans that returns 4 minor breaks
>>> t = identity_trans(minor_breaks=minor_breaks(4))
>>> t.minor_breaks(major)
array([0.2, 0.4, 0.6, 0.8, 1.2, 1.4, 1.6, 1.8])
- transform_is_linear: bool = True
Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.
- static transform(param: T) -> T
Return whatever is passed in
- static inverse(param: T) -> T
Return whatever is passed in
- class mizani.transforms.log10_trans(**kwargs: Any)
Log 10 Transformation
- breaks_: BreaksFunction = <mizani.breaks.breaks_log object>
Callable to calculate breaks
- format: FormatFunction = label_log(base=10, exponent_limits=(-4, 4), mathtex=False)
Function to format breaks
- static inverse(x)
Inverse of x
transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log10'>
- class mizani.transforms.log1p_trans(**kwargs: Any)
Log plus one Transformation
transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log1p'>
inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'expm1'>
- class mizani.transforms.log2_trans(**kwargs: Any)
Log 2 Transformation
- breaks_: BreaksFunction = <mizani.breaks.breaks_log object>
Callable to calculate breaks
- format: FormatFunction = label_log(base=2, exponent_limits=(-4, 4), mathtex=False)
Function to format breaks
- static inverse(x)
Inverse of x
transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'log2'>
- mizani.transforms.log_trans(base: float | None = None, **kwargs: Any) -> trans
Create a log transform class for base
- Parameters
- base
python:float Base for the logarithm. If None, then the natural log is used.
- kwargs
python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- Returns
- out
type Log transform class
- class mizani.transforms.logit_trans(**kwargs: Any)
Logit Transformation
- static inverse(x: FloatArrayLike) -> NDArrayFloat
Inverse of x
- static transform(x: FloatArrayLike) -> NDArrayFloat
Transform of x
- mizani.transforms.probability_trans(distribution: str, *args, **kwargs) -> trans
Probability Transformation
- Parameters
- distribution
python:str Name of the distribution. Valid distributions are listed at scipy.stats. Any of the continuous or discrete distributions.
- args
python:tuple Arguments passed to the distribution functions.
- kwargs
python:dict Keyword arguments passed to the distribution functions.
Notes
Make sure that the distribution is a good enough approximation for the data. When this is not the case, computations may run into errors. Absence of any errors does not imply that the distribution fits the data.
- mizani.transforms.probit_trans
alias of norm_trans
- class mizani.transforms.reverse_trans(**kwargs: Any)
Reverse Transformation
- transform_is_linear: bool = True
Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.
transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'negative'>
inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'negative'>
- class mizani.transforms.sqrt_trans(**kwargs: Any)
Square-root Transformation
transform(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'sqrt'>
inverse(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'square'>
- class mizani.transforms.symlog_trans(**kwargs: Any)
Symmetric Log Transformation
The symmetric logarithmic transformation is defined as
f(x) = log(x+1) for x >= 0
f(x) = -log(-x+1) for x < 0
It can be useful for data that has a wide range of both positive and negative values (including zero).
- breaks_: BreaksFunction = <mizani.breaks.breaks_symlog object>
Callable to calculate breaks
- static transform(x: FloatArrayLike) -> NDArrayFloat
Transform of x
- static inverse(x: FloatArrayLike) -> NDArrayFloat
Inverse of x
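The piecewise definition given for symlog_trans can be written compactly with the sign of x. This is a minimal sketch in plain Python that assumes log means the natural logarithm; the function names are hypothetical.

```python
import math


def symlog_sketch(x):
    """Symmetric log: log(1 + |x|) carrying the sign of x."""
    return math.copysign(math.log1p(abs(x)), x)


def symlog_inverse_sketch(y):
    """Undo symlog_sketch: exp(|y|) - 1 carrying the sign of y."""
    return math.copysign(math.expm1(abs(y)), y)


values = [-100.0, -1.0, 0.0, 1.0, 100.0]
transformed = [symlog_sketch(v) for v in values]
recovered = [symlog_inverse_sketch(t) for t in transformed]
print(transformed)
```

Zero maps to zero and the function is antisymmetric, which is why the transform suits data spanning both signs.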
- class mizani.transforms.timedelta_trans(**kwargs: Any)
Timedelta Transformation
- breaks_: BreaksFunction = <mizani.breaks.breaks_timedelta object>
Callable to calculate breaks
- format: FormatFunction = label_timedelta(units=None, show_units=True, zero_has_units=True, usetex=False, space=True, use_plurals=True)
Function to format breaks
- static transform(x: NDArrayTimedelta | Sequence[timedelta]) -> NDArrayFloat
Transform from Timedelta to numerical format
- static inverse(x: FloatArrayLike) -> NDArrayTimedelta
Transform to Timedelta from numerical format
- class mizani.transforms.pd_timedelta_trans(**kwargs: Any)
Pandas timedelta Transformation
- breaks_: BreaksFunction = <mizani.breaks.breaks_timedelta object>
Callable to calculate breaks
- format: FormatFunction = label_timedelta(units=None, show_units=True, zero_has_units=True, usetex=False, space=True, use_plurals=True)
Function to format breaks
- static transform(x: TimedeltaSeries) -> NDArrayFloat
Transform from Timedelta to numerical format
- static inverse(x: FloatArrayLike) -> NDArrayTimedelta
Transform to Timedelta from numerical format
- class mizani.transforms.pseudo_log_trans(sigma=1, base=None, **kwargs)
Pseudo-log transformation
A transformation mapping numbers to a signed logarithmic scale with a smooth transition to linear scale around 0.
- Parameters
- sigma
python:float Scaling factor for the linear part.
- base
python:int Approximate logarithm used. If None, then the natural log is used.
- kwargs
python:dict Keyword arguments passed onto trans_new(). Should not include the transform or inverse.
- transform(x: FloatArrayLike) -> NDArrayFloat
Transform of x
- inverse(x: FloatArrayLike) -> NDArrayFloat
Inverse of x
- minor_breaks(major: FloatArrayLike, limits: TupleFloat2 | None = None, n: int | None = None) -> NDArrayFloat
Calculate minor_breaks
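One common way to obtain the signed logarithmic scale with a linear region around 0 described for pseudo_log_trans is the inverse hyperbolic sine. The sketch below assumes the definition asinh(x / (2 * sigma)) / ln(base), as used by the R scales package; whether mizani computes it in exactly this way is an assumption, and the function names are hypothetical.

```python
import math


def pseudo_log_sketch(x, sigma=1, base=None):
    """Pseudo-log of x; base=None means the natural log."""
    log_base = 1.0 if base is None else math.log(base)
    return math.asinh(x / (2 * sigma)) / log_base


def pseudo_log_inverse_sketch(y, sigma=1, base=None):
    """Undo pseudo_log_sketch for the same sigma and base."""
    log_base = 1.0 if base is None else math.log(base)
    return 2 * sigma * math.sinh(y * log_base)


near_zero = pseudo_log_sketch(0.001)
far_out = pseudo_log_sketch(1e6, base=10)
print(near_zero, far_out)
```

Close to zero the result is roughly x / (2 * sigma), while for large magnitudes it approaches the logarithm of x in the chosen base.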
- class mizani.transforms.reciprocal_trans(**kwargs: Any)
Reciprocal Transformation
- static transform(x: FloatArrayLike) -> NDArrayFloat
Transform of x
- static inverse(x: FloatArrayLike) -> NDArrayFloat
Inverse of x
- class mizani.transforms.trans(**kwargs: Any)
Base class for all transforms
This class is used to transform data and also tell the x and y axes how to create and label the tick locations.
The key methods to override are trans.transform() and trans.inverse(). Alternatively, you can quickly create a transform class using the trans_new() function.
- Parameters
- kwargs
python:dict Attributes of the class to set/override
- transform_is_linear: bool = False
Whether the transformation over the whole domain is linear. e.g. 2x is linear while 1/x and log(x) are not.
- breaks_: BreaksFunction = <mizani.breaks.breaks_extended object>
Callable to calculate breaks
- format: FormatFunction = label_number(accuracy=None, precision=None, scale=1, prefix='', suffix='', big_mark='', decimal_mark='.', fill='', style_negative='-', style_positive='', align='>', width=None)
Function to format breaks
- property domain_is_numerical: bool
Return True if transformation acts on numerical data. e.g. int, float, and imag are numerical but datetime is not.
- minor_breaks(major: FloatArrayLike, limits: TupleFloat2 | None = None, n: int | None = None) -> NDArrayFloat
Calculate minor_breaks
- abstract static transform(x: TFloatArrayLike) -> TFloatArrayLike
Transform of x
- abstract static inverse(x: TFloatArrayLike) -> TFloatArrayLike
Inverse of x
- breaks(limits: DomainType) -> NDArrayFloat
Calculate breaks in data space and return them in transformed space.
Expects limits to be in transform space; this is the same space as that in which the domain is specified.
This method wraps around breaks_() to ensure that the calculated breaks are within the domain of the transform. This is helpful in cases where an aesthetic requests breaks with limits expanded for some padding, yet the expansion goes beyond the domain of the transform. e.g. for a probability transform the breaks will be in the domain [0, 1] despite any outward limits.
- Parameters
- limits
python:tuple The scale limits. Size 2.
- Returns
- out
numpy:array_like Major breaks
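The domain handling described for trans.breaks() can be pictured with a small plain-Python sketch. It assumes the wrapper first shrinks the limits to the domain and then drops any break that still falls outside it; both helper names are hypothetical, and even_breaks is only a stand-in for a real breaks_ callable.

```python
def even_breaks(limits, n=5):
    """Hypothetical stand-in for a breaks_ callable: n evenly spaced values."""
    lo, hi = limits
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def breaks_in_domain(limits, domain, breaks_func=even_breaks):
    """Compute breaks for limits, keeping only those inside domain."""
    # Shrink the requested limits so they do not extend past the domain.
    lo = max(limits[0], domain[0])
    hi = min(limits[1], domain[1])
    # Filter again, since a breaks function may treat limits as a guide only.
    return [b for b in breaks_func((lo, hi)) if domain[0] <= b <= domain[1]]


padded_limits = (-0.1, 1.1)  # limits expanded for padding
breaks = breaks_in_domain(padded_limits, domain=(0, 1))
print(breaks)
```

With a probability-like domain of [0, 1], the padded limits no longer produce breaks below 0 or above 1.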
- mizani.transforms.trans_new(name: str, transform: TransformFunction, inverse: InverseFunction, breaks: BreaksFunction | None = None, minor_breaks: MinorBreaksFunction | None = None, _format: FormatFunction | None = None, domain=(-inf, inf), doc: str = '', **kwargs) -> trans
Create a transformation class object
- Parameters
- name
python:str Name of the transformation
- transform
python:callable() f(x) A function (preferably a ufunc) that computes the transformation.
- inverse
python:callable() f(x) A function (preferably a ufunc) that computes the inverse of the transformation.
- breaks
python:callable() f(limits) Function to compute the breaks for this transform. If None, then a default good enough for a linear domain is used.
- minor_breaks
python:callable() f(major, limits) Function to compute the minor breaks for this transform. If None, then a default good enough for a linear domain is used.
- _format
python:callable() f(breaks) Function to format the generated breaks.
- domain
numpy:array_like Domain over which the transformation is valid. It should be of length 2.
- doc
python:str Docstring for the class.
- **kwargs
python:dict Attributes of the transform, e.g. if base is passed in kwargs, then t.base would be a valid attribute.
- Returns
- out
trans Transform class
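To show how a factory like trans_new() can turn a pair of functions into a class, here is a simplified plain-Python sketch built on type(). It is not mizani's implementation: trans_new_sketch is a hypothetical name, the "<name>_trans" class naming is an assumption, and breaks, formatting and domain handling are left out.

```python
import math


def trans_new_sketch(name, transform, inverse, **kwargs):
    """Create a class with static transform/inverse methods and extra attributes."""
    attrs = {
        "transform": staticmethod(transform),
        "inverse": staticmethod(inverse),
        **kwargs,  # e.g. base=3 becomes the class attribute `base`
    }
    return type(f"{name}_trans", (object,), attrs)


cube_trans = trans_new_sketch(
    "cube",
    transform=lambda x: x ** 3,
    inverse=lambda x: math.copysign(abs(x) ** (1 / 3), x),
    base=3,
)
t = cube_trans()
print(cube_trans.__name__, t.transform(2), t.inverse(8), t.base)
```

Extra keyword arguments end up as attributes of the class, which mirrors the note that t.base becomes available when base is passed in kwargs.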
- mizani.transforms.gettrans(t: str | Callable[[], Type[trans]] | Type[trans] | trans | None = None)
Return a trans object
- Parameters
- t
python:str | python:callable() | type | trans Name of transformation function. If None, returns an identity transform.
- Returns
- out
trans
scale - Implementing a scale
According to "On the theory of scales of measurement" by S.S. Stevens, scales can be classified in four ways -- nominal, ordinal, interval and ratio. Using current (2016) terminology, nominal data is made up of unordered categories, ordinal data is made up of ordered categories, and the two can be classified as discrete. On the other hand, both interval and ratio data are continuous.
The scale classes below show how the rest of the Mizani package can be used to implement the two categories of scales. The key tasks are training and mapping and these correspond to the train and map methods.
To train a scale on data means to make the scale learn the limits of the data. This is elaborate (or worthy of a dedicated method) for two reasons:
- Practical -- data may be split up across more than one object, yet all will be represented by a single scale.
- Conceptual -- training is a key action that may need to be inserted into multiple locations of the data processing pipeline before a graphic can be created.
To map data onto a scale means to associate data values with values (potential readings) on a scale. This is perhaps the most important concept underpinning a scale.
The apply methods are simple examples of how to put it all together.
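The training and mapping steps described above can be sketched for a continuous scale in plain Python. This is an illustration under simplifying assumptions, not mizani's code: the function names are hypothetical, the palette is called with one rescaled value at a time, and out-of-bounds values are simply replaced by na_value.

```python
import math


def train_continuous(new_data, old=None):
    """Return (min, max) covering the old range and the finite new values."""
    values = [v for v in new_data if math.isfinite(v)]
    if old is not None:
        values.extend(old)
    return (min(values), max(values))


def map_continuous(x, palette, limits, na_value=None):
    """Rescale x onto [0, 1] using limits, then apply the palette."""
    lo, hi = limits  # assumes lo < hi
    out = []
    for v in x:
        if math.isnan(v) or not (lo <= v <= hi):
            out.append(na_value)  # missing or out of bounds
        else:
            out.append(palette((v - lo) / (hi - lo)))
    return out


# The data arrives in two pieces, yet a single scale covers both.
limits = train_continuous([2, 4])
limits = train_continuous([10, 6], old=limits)
sizes = map_continuous([2, 6, 10, 99], lambda p: 1 + 4 * p, limits)
print(limits, sizes)
```

Training in two steps reflects the practical reason given above: the limits accumulate across separate pieces of data before any mapping is done.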
- class mizani.scale.scale_continuous
Continuous scale
- classmethod apply(x: FloatArrayLike, palette: ContinuousPalette, na_value: Any = None, trans: Trans | None = None) -> NDArrayFloat
Scale data continuously
- Parameters
- x
numpy:array_like Continuous values to scale
- palette
python:callable() f(x) Palette to use
- na_value
object Value to use for missing values.
- trans
trans How to transform the data before scaling. If None, no transformation is done.
- Returns
- out
numpy:array_like Scaled values
- classmethod train(new_data: FloatArrayLike, old: TupleFloat2 | None = None) -> TupleFloat2
Train a continuous scale
- Parameters
- new_data
numpy:array_like New values
- old
numpy:array_like Old range
- Returns
- out
python:tuple Limits (range) of the scale
- classmethod map(x: FloatArrayLike, palette: ContinuousPalette, limits: TupleFloat2, na_value: Any = None, oob: Callable[[TVector], TVector] = <function censor>) -> NDArrayFloat
Map values to a continuous palette
- Parameters
- x
numpy:array_like Continuous values to scale
- palette
python:callable() f(x) palette to use
- na_value
object Value to use for missing values.
- oob
python:callable() f(x) Function to deal with values that are beyond the limits
- Returns
- out
numpy:array_like Values mapped onto a palette
- class mizani.scale.scale_discrete
Discrete scale
- classmethod apply(x: AnyArrayLike, palette: DiscretePalette, na_value: Any = None)
Scale data discretely
- Parameters
- x
numpy:array_like Discrete values to scale
- palette
python:callable() f(x) Palette to use
- na_value
object Value to use for missing values.
- Returns
- out
numpy:array_like Scaled values
- classmethod train(new_data: AnyArrayLike, old: Sequence[Any] | None = None, drop: bool = False, na_rm: bool = False) -> Sequence[Any]
Train a discrete scale
- Parameters
- new_data
numpy:array_like New values
- old
numpy:array_like Old range. List of values known to the scale.
- drop
bool Whether to drop (not include) unused categories
- na_rm
bool If True, remove missing values. Missing values are either NaN or None.
- Returns
- out
python:list Values covered by the scale
- classmethod map(x: AnyArrayLike, palette: DiscretePalette, limits: Sequence[Any], na_value: Any = None) -> AnyArrayLike
Map values to a discrete palette
- Parameters
- palette
python:callable() f(x) palette to use
- x
numpy:array_like Discrete values to scale
- na_value
object Value to use for missing values.
- Returns
- out
numpy:array_like Values mapped onto a palette
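For the discrete scale documented above, training collects the distinct values and mapping looks each value up among the palette's outputs. The sketch below is a simplified plain-Python illustration with hypothetical function names; it keeps values in first-seen order and requires hashable values, and the ordering rules used by mizani may differ.

```python
def _is_missing(v):
    # None is missing; NaN is the only float that is not equal to itself.
    return v is None or v != v


def train_discrete(new_data, old=None, na_rm=False):
    """Return the values known to the scale after seeing new_data."""
    limits = list(old) if old is not None else []
    for v in new_data:
        if _is_missing(v):
            # Keep at most one missing value, unless asked to remove them.
            if not na_rm and not any(_is_missing(k) for k in limits):
                limits.append(v)
        elif v not in limits:
            limits.append(v)
    return limits


def map_discrete(x, palette, limits, na_value=None):
    """Map each value in x to the palette entry of its category."""
    # The palette is asked for one output value per known category.
    lookup = dict(zip(limits, palette(len(limits))))
    return [lookup.get(v, na_value) for v in x]


limits = train_discrete(["b", "a", "b"])
limits = train_discrete(["c", "a", None], old=limits, na_rm=True)
colors = map_discrete(["a", "c", "z"], lambda n: ["red", "green", "blue"][:n], limits)
print(limits, colors)
```

A value that the scale was never trained on, such as "z" here, falls back to na_value.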
Installation
mizani can be installed in a couple of ways depending on purpose.
Official release installation
For a normal user, it is recommended to install the official release.
$ pip install mizani
Development installation
To do any development you have to clone the mizani source repository and install the package in development mode. These commands do all of that:
$ git clone https://github.com/has2k1/mizani.git
$ cd mizani
$ pip install -e .
If you only want to use the latest development sources and do not care about having a cloned repository, e.g. if a bug you care about has been fixed but an official release has not come out yet, then use this command:
$ pip install git+https://github.com/has2k1/mizani.git
Changelog
v0.12.2
2024-09-04
Bug Fixes
- Fixed squish and squish_infinite to work for non-writeable pandas series. This was broken with numpy 2.1.0.
v0.12.1
2024-08-19
Enhancements
- Renamed "husl" color palette type to "hsluv". "husl" is the old name; it still works although it is not part of the API.
v0.12.0
2024-07-30
API Changes
- mizani now requires python 3.9 and above.
Bug Fixes
- Fixed bug where a date with a timezone could lose the timezone. #45.
v0.11.4
2024-05-24
Bug Fixes
- Fixed squish and squish_infinite so that they do not reuse numpy arrays. The user's object is not modified.
This also prevents exceptions where the numpy array backs a pandas object and it is protected by copy-on-write.
v0.11.3
2024-05-09
Bug Fixes
- Fixed bug in calculating monthly breaks where no dates were returned when the limits are narrow and do not align with the start and end of the month. (#42)
v0.11.2
2024-04-26
Bug Fixes
- Added the ability to create reversed colormap for cmap_pal and cmap_d_pal using the matplotlib convention of name_r.
v0.11.1
2024-03-27
Bug Fixes
- Fix mizani.palettes.brewer_pal to return exact colors when the number of requested colors is less than or equal to the number in the palette.
- Added all matplotlib colormaps and made them available from cmap_pal and cmap_d_pal (#39).
New
- Added breaks_symlog to calculate breaks for the symmetric logarithm transformation.
Changes
- The default big_mark for label_number has been changed from a comma to nothing.
v0.11.0
2024-02-12
Enhancements
- Removed FutureWarnings when using pandas 2.1.0
New
- Added breaks_symlog to calculate breaks for the symmetric logarithm transformation.
Changes
- The default big_mark for label_number has been changed from a comma to nothing.
v0.10.0
2023-07-28
API Changes
- mpl_format has been removed, number_format takes its place.
- mpl_breaks has been removed, extended_breaks has always been the default and it is sufficient.
- matplotlib has been removed as a dependency of mizani.
- mizani now requires python 3.9 and above.
- The units parameter of timedelta_format now accepts the values "min", "day", "week", "month", instead of "m", "d", "w", "M".
The naming convention for break formatting methods has changed from *_format to label_*. Specifically these methods have been renamed.
- comma_format is now label_comma
- custom_format is now label_custom
- currency_format is now label_currency
- dollar_format is now label_dollar
- percent_format is now label_percent
- scientific_format is now label_scientific
- date_format is now label_date
- number_format is now label_number
- log_format is now label_log
- timedelta_format is now label_timedelta
- pvalue_format is now label_pvalue
- ordinal_format is now label_ordinal
- number_bytes_format is now label_bytes
The naming convention for break calculating methods has changed from *_breaks to breaks_*. Specifically these methods have been renamed.
- log_breaks is now breaks_log
- trans_minor_breaks is now minor_breaks_trans
- date_breaks is now breaks_date
- timedelta_breaks is now breaks_timedelta
- extended_breaks is now breaks_extended
- dataspace_is_numerical has changed to domain_is_numerical and it is now determined dynamically.
- The default minor_breaks for all transforms that are not linear are now calculated in dataspace. But only if the dataspace is numerical.
New
- symlog_trans for symmetric log transformation
v0.9.2
2023-05-25
Bug Fixes
- Fixed regression bug in date_format where it could not deal with the UTC timezone from timezone #30.
v0.9.1
2023-05-19
Bug Fixes
- Fixed bug in date_format to handle datetime sequences within the same timezone but a mixed daylight saving state. (plotnine #687)
v0.9.0
2023-04-15
API Changes
- palettable dropped as a dependency.
Bug Fixes
- Fixed bug in datetime_trans where a pandas series with an index that did not start at 0 could not be transformed.
- Install tzdata on pyodide/emscripten. #27
v0.8.1
2022-09-28
Bug Fixes
- Fixed regression bug in log_format where formatting for bases 2, 8 and 16 would fail if the values were float-integers.
Enhancements
- log_format now uses exponent notation for bases other than base 10.
v0.8.0
2022-09-26
API Changes
- The lut parameter of cmap_pal and cmap_d_pal has been deprecated and will be removed in a future version.
- datetime_trans gained parameter tz that controls the timezone of the transformation.
- log_format gained boolean parameter mathtex for TeX values as understood by matplotlib instead of values in scientific notation.
Bug Fixes
- Fixed bug in zero_range where uint64 values would cause a RuntimeError.
v0.7.4
2022-04-02
API Changes
- comma_format is now imported automatically when using *.
- Fixed issue with scale_discrete so that if you train on data with NaN and specify an old range that also has NaN, the result range does not include two NaN values.
v0.7.3
(2020-10-29)
Bug Fixes
- Fixed log_breaks for narrow range if base=2 (#76).
v0.7.2
(2020-10-29)
Bug Fixes
- Fixed bug in rescale_max() to properly handle values whose maximum is zero (#16).
v0.7.1
(2020-06-05)
Bug Fixes
- Fixed regression in mizani.scales.scale_discrete.train() when training on values with some categoricals that have common elements.
v0.7.0
(2020-06-04)
Bug Fixes
- Fixed issue with mizani.formatters.log_breaks where non-linear breaks could not be generated if the limits were greater than the largest integer sys.maxsize.
- Fixed mizani.palettes.gradient_n_pal() to return nan for nan values.
- Fixed mizani.scales.scale_discrete.train() when training categoricals to maintain the order. (plotnine #381)
v0.6.0
(2019-08-15)
New
- Added pvalue_format
- Added ordinal_format
- Added number_bytes_format
- Added pseudo_log_trans()
- Added reciprocal_trans
- Added modulus_trans()
Enhancements
- mizani.breaks.date_breaks now supports intervals in the order of seconds.
- mizani.palettes.brewer_pal now supports a direction argument to control the order of the returned colors.
API Changes
- boxcox_trans() now only accepts positive values. For both positive and negative values, modulus_trans() has been added.
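The restriction to positive values follows from the usual Box-Cox formula, (x^p - 1) / p, whose fractional powers are not real for negative x. The modulus transformation is usually written as sign(x) * ((|x| + 1)^p - 1) / p, which is defined on the whole real line. The sketch below uses these textbook definitions; the function names are hypothetical, rejecting zero is an assumption, and the library's own implementations may differ (for example in extra parameters).

```python
import math


def boxcox(x: float, p: float) -> float:
    """Textbook Box-Cox transform; assumed to reject non-positive input."""
    if x <= 0:
        raise ValueError("Box-Cox is only defined here for positive values")
    if p == 0:
        return math.log(x)
    return (x**p - 1) / p


def modulus(x: float, p: float) -> float:
    """Textbook modulus transform; accepts positive and negative values."""
    sign = math.copysign(1.0, x)
    if p == 0:
        return sign * math.log1p(abs(x))
    return sign * ((abs(x) + 1) ** p - 1) / p


positive = boxcox(4.0, 0.5)      # (sqrt(4) - 1) / 0.5
mirrored = modulus(-3.0, 0.5)    # -(sqrt(4) - 1) / 0.5
```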
v0.5.4
(2019-03-26)
Enhancements
- mizani.formatters.log_format now does a better job of approximating labels for numbers like 3.000000000000001e-05.
API Changes
- exponent_threshold parameter of mizani.formatters.log_format has been deprecated.
v0.5.3
(2018-12-24)
API Changes
- Log transforms now default to base - 2 minor breaks. So base 10 has 8 minor breaks and 9 partitions, base 8 has 6 minor breaks and 7 partitions, ..., base 2 has 0 minor breaks and a single partition.
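The arithmetic above can be checked with a short sketch. It assumes the minor breaks sit at integer multiples of the lower major break, which is a common convention for log axes but is an assumption here, not something the changelog states; the function name minor_breaks_between is hypothetical.

```python
def minor_breaks_between(base: int, exponent: int) -> list:
    """Minor breaks between base**exponent and base**(exponent + 1).

    Assumes breaks at integer multiples of the lower major break,
    which yields base - 2 values.
    """
    low = base**exponent
    return [m * low for m in range(2, base)]


for base in (10, 8, 2):
    minors = minor_breaks_between(base, 0)
    partitions = len(minors) + 1
    print(base, len(minors), partitions)
```

For base 10 and exponent 0 this gives the eight values 2 through 9 between the major breaks 1 and 10, hence nine partitions. For base 2 the list is empty, leaving a single partition.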
v0.5.2
(2018-10-17)
Bug Fixes
- Fixed issue where some functions that took pandas series would return output where the index did not match that of the input.
v0.5.1
(2018-10-15)
Bug Fixes
- Fixed issue with log_breaks, so that it does not fail needlessly when the limits are in the (0, 1) range.
Enhancements
- Changed log_format to return better formatted breaks.
v0.5.0
(2018-11-10)
API Changes
- Support for python 2 has been removed.
- __call__() and mizani.breaks.trans_minor_breaks.__call__ now accept an optional parameter n, which is the number of minor breaks between any two major breaks.
- The parameter nan_value has been renamed to na_value.
- The parameter nan_rm has been renamed to na_rm.
Enhancements
- Better support for handling missing values when training discrete scales.
- Changed the algorithm for log_breaks; it can now return breaks that do not fall on the integer powers of the base.
v0.4.6
(2018-03-20)
- Added squish
v0.4.5
(2018-03-09)
- Added identity_pal
- Added cmap_d_pal
v0.4.4
(2017-12-13)
- Fixed date_format to respect the timezones of the dates (#8).
v0.4.3
(2017-12-01)
- Changed date_breaks to have more variety in the spacing between the breaks.
- Fixed date_format to respect the time part of the date (#7).
v0.4.2
(2017-11-06)
- Fixed (regression) break calculation for the non ordinal transforms.
v0.4.1
(2017-11-04)
- trans objects can now be instantiated with parameters that override attributes of the instance. The default methods for computing breaks and minor breaks on the transform instance are not class attributes, so they can be modified without global repercussions.
v0.4.0
(2017-10-24)
API Changes
- Breaks and formatter generating functions have been converted to classes, with a __call__ method. How they are used has not changed, but this makes them more flexible.
- ExtendedWilkson class has been removed. extended_breaks() now contains the implementation of the break calculating algorithm.
v0.3.4
(2017-09-12)
- Fixed issue where some formatter methods failed if passed an empty breaks argument.
- Fixed issue with log_breaks() where, if the limits were within the same order of magnitude, the calculated breaks were always the ends of the order of magnitude. Now log_breaks()((35, 50)) returns [35, 40, 45, 50] as breaks instead of [1, 100].
v0.3.3
(2017-08-30)
- Fixed SettingWithCopyWarnings in squish_infinite().
- Added log_format().
API Changes
- log_trans now uses log_format() as the formatting method.
v0.3.2
(2017-07-14)
- Added expand_range_distinct()
v0.3.1
(2017-06-22)
- Fixed bug where using log_breaks() with Numpy 1.13.0 led to a ValueError.
v0.3.0
(2017-04-24)
- Added xkcd_palette(), a palette that selects from 954 named colors.
- Added crayon_palette(), a palette that selects from 163 named colors.
- Added cubehelix_pal(), a function that creates a continuous palette from the cubehelix system.
- Fixed bug where a color palette would raise an exception when passed a single scalar value instead of a list-like.
- extended_breaks() and mpl_breaks() now return a single break if the limits are equal. Previously, one ran into an Overflow and the other returned a sequence filled with n copies of the same limit.
API Changes
- mpl_breaks() now returns a function that (strictly) expects a tuple with the minimum and maximum values.
v0.2.0
(2017-01-27)
- Fixed bug in censor() where a sequence of values with an irregular index would lead to an exception.
- Fixed boundary issues due to internal loss of precision in ported function seq().
- Added mizani.breaks.extended_breaks() which computes breaks using a modified version of Wilkinson's tick algorithm.
- Changed the default function mizani.transforms.trans.breaks_() used by mizani.transforms.trans to compute breaks from mizani.breaks.mpl_breaks() to mizani.breaks.extended_breaks().
- mizani.breaks.timedelta_breaks() now uses mizani.breaks.extended_breaks() internally instead of mizani.breaks.mpl_breaks().
- Added manual palette function mizani.palettes.manual_pal().
- Requires pandas version 0.19.0 or higher.
v0.1.0
(2016-06-30)
First public release
Author
Hassan Kibirige
Copyright
2024, Hassan Kibirige