coniii.solvers module¶
-
class
coniii.solvers.
ClusterExpansion
(sample, model=None, calc_observables=None, sample_size=1000, iprint=True, **default_model_kwargs)¶ Bases:
coniii.solvers.Solver
Implementation of Adaptive Cluster Expansion for solving the inverse Ising problem, as described in John Barton and Simona Cocco, J. of Stat. Mech. P03002 (2013).
Specific to pairwise Ising constraints.
-
S
(cluster, coocMat, deltaJdict={}, useAnalyticResults=False, priorLmbda=0.0, numSamples=None)¶ Calculate pairwise entropy of cluster. (First fits pairwise Ising model.)
- cluster : list
- List of indices belonging to each cluster.
- coocMat : ndarray
- Pairwise correlations.
deltaJdict : dict, {} useAnalyticResults : bool, False
Probably want False until analytic formulas are changed to include prior on Jentropy : float Jfull : ndarray
Matrix of couplings.
-
Sindependent
(cluster, coocMat)¶ Entropy approximation assuming that each cluster appears independently of the others.
cluster : list coocMat : ndarray
Pairwise correlations.- float
- Sind, independent entropy.
- ndarray
- Pairwise couplings.
-
clusterID
(cluster)¶
-
deltaS
(cluster, coocMat, deltaSdict=None, deltaJdict=None, iprint=True, meanFieldRef=False, priorLmbda=0.0, numSamples=None, independentRef=False, meanFieldPriorLmbda=None)¶ - cluster : list
- List of indices in cluster
coocMat : ndarray deltaSdict : dict, None deltaJdict : dict, None iprint : bool, True meanFieldRef : bool, False numSamples : int, None independentRef : bool, False
If True, expand about independent entropy- meanFieldRef : bool, False
- If True, expand about mean field entropy
- float
- deltaScluster
- float
- deltaJcluster
-
solve
(threshold, cluster=None, deltaSdict=None, deltaJdict=None, iprint=True, priorLmbda=0.0, numSamples=None, meanFieldRef=False, independentRef=True, veryVerbose=False, meanFieldPriorLmbda=None, full_output=False)¶ threshold : float meanFieldRef : bool, False
Expand about mean-field reference.- independentRef : bool, True
- Expand about independent reference.
- priorLmbda : float, 0.
- Strength of non-interacting prior.
- meanFieldPriorLmbda : float, None
- Strength of non-interacting prior in mean field calculation (defaults to priorLmbda).
- ndarray
- Solved multipliers (parameters). For Ising problem, these can be converted into matrix format using utils.vec2mat.
- float (optional, only if full_output=True)
- Estimated entropy.
- ndarray
- Solved multipliers (parameters). For Ising problem, these can be converted into matrix format using utils.vec2mat.
- list (optional, only if full_output=True)
- List of clusters.
- dict (optional, only if full_output=True)
- deltaSdict
- dict (optional, only if full_output=True)
- deltaJdict
-
subsets
(thisSet, size, sort=False)¶ Given a list, returns a list of all unique subsets of that list with given size.
thisSet : list size : int sort : bool, False
- list
- All subsets of given size.
-
-
class
coniii.solvers.
Enumerate
(sample=None, model=None, calc_observables=None, iprint=True, **default_model_kwargs)¶ Bases:
coniii.solvers.Solver
Class for solving fully-connected inverse Ising model problem by enumeration of the partition function and then using gradient descent.
-
solve
(initial_guess=None, constraints=None, max_param_value=50, full_output=False, use_root=True, scipy_solver_kwargs={'method': 'krylov', 'options': {'fatol': 1e-13, 'xatol': 1e-13}})¶ Must specify either constraints (the correlations) or samples from which the correlations will be calculated using self.calc_observables. This routine by default uses scipy.optimize.root to find the solution. This is MUCH faster than the scipy.optimize.minimize routine which can be used instead.
If still too slow, try adjusting the accuracy.
If not converging, try increasing the max number of iterations.
If receiving Jacobian error (or some other numerical estimation error), parameter values may be too large for faithful evaluation. Try decreasing max_param_value.
- initial_guess : ndarray, None
- Initial starting guess for parameters. By default, this will start with all zeros if left unspecified.
- constraints : ndarray, None
- Can specify constraints directly instead of using the ones calculated from the sample. This can be useful when the pairwise correlations are known exactly. This will override the self.constraints data member.
- max_param_value : float, 50
- Absolute value of max parameter value. Bounds can also be set in the kwargs passed to the minimizer, in which case this should be set to None.
- full_output : bool, False
- If True, return output from scipy.optimize.minimize.
- use_root : bool, True
- If False, use scipy.optimize.minimize instead. This is typically much slower.
- scipy_solver_kwargs : dict, {‘method’:’krylov’, ‘options’:{‘fatol’:1e-13,’xatol’:1e-13}}
- High accuracy is slower. Although default accuracy may not be so good, lowering these custom presets will speed things up. Choice of the root finding method can also change runtime and whether a solution is found or not. Recommend playing around with different solvers and tolerances or getting a close approximation using a different method if solution is hard to find.
- ndarray
- Solved multipliers (parameters). For Ising problem, these can be converted into matrix format using utils.vec2mat.
- dict, optional
- Output from scipy.optimize.root.
-
-
class
coniii.solvers.
MCH
(sample, model=None, calc_observables=None, sample_size=1000, sample_method='metropolis', mch_approximation=None, iprint=True, sampler_kw={}, **default_model_kwargs)¶ Bases:
coniii.solvers.Solver
Class for solving maxent problems using the Monte Carlo Histogram method.
Broderick, T., Dudik, M., Tkacik, G., Schapire, R. E. & Bialek, W. Faster solutions of the inverse pairwise Ising problem. arXiv 1-8 (2007).
-
estimate_jac
(eps=0.001)¶ Approximation Jacobian using the MCH approximation.
eps : float, 1e-3
- jac : ndarray
- Jacobian is an n x n matrix where each row corresponds to the behavior of fvec wrt to a single parameter.
-
learn_parameters_mch
(estConstraints, constraints, maxdlamda=1, maxdlamdaNorm=1, maxLearningSteps=50, eta=1)¶ - estConstraints : ndarray
- Constraints estimated from MCH approximation.
constraints : ndarray maxdlamda : float, 1
Max allowed magnitude for any element of dlamda vector before exiting.- maxdlamdaNorm : float, 1
- Max allowed norm of dlamda vector before exiting.
- maxLearningSteps : int
- max learning steps before ending MCH
- eta : float, 1
- factor for changing dlamda
- ndarray
- MCH estimate for constraints from parameters lamda+dlamda.
-
solve
(initial_guess=None, constraints=None, tol=None, tolNorm=None, n_iters=30, burn_in=30, maxiter=10, custom_convergence_f=None, iprint=False, full_output=False, learn_params_kwargs={'eta': 1, 'maxdlamda': 1}, generate_kwargs={})¶ Solve for maxent model parameters using MCH routine.
- initial_guess : ndarray, None
- Initial starting point.
- constraints : ndarray, None
- For debugging! Vector of correlations to fit.
- tol : float, None
- Maximum error allowed in any observable.
- tolNorm : float, None
- Norm error allowed in found solution.
- n_iters : int, 30
- Number of iterations to make between samples in MCMC sampling.
- burn_in : int, 30
- Initial burn in from random sample when MC sampling.
- max_iter : int, 10
- Max number of iterations of MC sampling and MCH approximation.
- custom_convergence_f : function, None
Function for determining convergence criterion. At each iteration, this function should return the next set of learn_params_kwargs and optionally the sample size.
As an example: def learn_settings(i):
‘’’ Take in the iteration counter and set the maximum change allowed in any given parameter (maxdlamda) and the multiplicative factor eta, where d(parameter) = (error in observable) * eta.
Additional option is to also return the sample size for that step by returning a tuple. Larger sample sizes are necessary for higher accuracy. ‘’’ if i<10:
return {‘maxdlamda’:1,’eta’:1}- else:
- return {‘maxdlamda’:.05,’eta’:.05}
iprint : bool, False full_output : bool, False
If True, also return the errflag and error history.learn_parameters_kwargs : dict, {‘maxdlamda’:1,’eta’:1} generate_kwargs : dict, {}
- ndarray
- Solved multipliers (parameters). For Ising problem, these can be converted into matrix format using utils.vec2mat.
- int
- Error flag. 0, converged within given criterion 1, max iterations reached
- ndarray
- Log of errors in matching constraints at each step of iteration.
-
-
class
coniii.solvers.
MCHIncompleteData
(*args, **kwargs)¶ Bases:
coniii.solvers.MCH
Class for solving maxent problems using the Monte Carlo Histogram method on incomplete data where some spins may not be visible.
Broderick, T., Dudik, M., Tkacik, G., Schapire, R. E. & Bialek, W. Faster solutions of the inverse pairwise Ising problem. arXiv 1-8 (2007).
- NOTE: This only works for Ising model.
- Not ready for release.
-
generate_samples
(n_iters, burn_in, uIncompleteStates=None, f_cond_sample_size=None, f_cond_sample_iters=None, sample_size=None, sample_method=None, initial_sample=None, run_regular_sampler=True, run_cond_sampler=True, disp=0, generate_kwargs={})¶ Wrapper around generate_samples_parallel() from available samplers.
n_iters : int burn_in : int
I think burn in is handled automatically in REMC.uIncompleteStates : list of unique states f_cond_sample_size : lambda function
Given the number of hidden spins, return the number of samples to take.- f_cond_sample_iters : lambda function
- Given the number of hidden spins, return the number of MC iterations to make.
sample_size : int sample_method : str initial_sample : ndarray generate_kwargs : dict
-
learn_parameters_mch
(estConstraints, fullFraction, uIncompleteStates, uIncompleteStatesCount, maxdlamda=1, maxdlamdaNorm=1, maxLearningSteps=50, eta=1)¶ Update parameters with MCH step. Update is proportional to the difference between the observables and the predicted observables after a small change to the parameters. This is calculated from likelihood maximization, and for the incomplete data points this corresponds to the marginal probability distribution weighted with the number of corresponding data points.
estConstraints : ndarray fullFraction : float
Fraction of data points that are complete.- uIncompleteStates : list-like
- Unique incomplete states in data.
- uIncompleteStatesCount : list-like
- Frequency of each unique data point.
maxdlamda : float,1 maxdlamdaNorm : float,1 maxLearningSteps : int
max learning steps before ending MCH- eta : float,1
- factor for changing dlamda
estimatedConstraints : ndarray
-
solve
(X=None, constraints=None, initial_guess=None, cond_sample_size=100, cond_sample_iters=100, tol=None, tolNorm=None, n_iters=30, burn_in=30, maxiter=10, disp=False, full_output=False, learn_params_kwargs={}, generate_kwargs={})¶ Solve for parameters using MCH routine.
X : ndarray constraints : ndarray
Constraints calculated from the incomplete data (accounting for missing data points).- initial_guess : ndarray=None
- initial starting point
- cond_sample_size : int or function
- Number of samples to make for conditional distribution. If function is passed in, it will be passed number of missing spins and must return an int.
- cond_sample_iters : int or function
- Number of MC iterations to make between samples.
- tol : float=None
- maximum error allowed in any observable
- tolNorm : float
- norm error allowed in found solution
- n_iters : int=30
- Number of iterations to make between samples in MCMC sampling.
burn_in (int=30) disp : int=0
0, no output 1, some detail 2, most detail- full_output : bool,False
- Return errflag and errors at each iteration if True.
learn_parameters_kwargs : dict generate_kwargs : dict
- parameters : ndarray
- Found solution.
errflag : int errors : ndarray
Errors in matching constraints at each step of iteration.
-
class
coniii.solvers.
MPF
(sample, model=None, calc_observables=None, calc_de=None, adj=None, iprint=True, **default_model_kwargs)¶ Bases:
coniii.solvers.Solver
-
K
(Xuniq, Xcount, adjacentStates, params)¶ Compute objective function.
- Xuniq : ndarray
- (ndata x ndims) unique states that appear in the data
- Xcount : ndarray of int
- number of times that each unique state appears in the data
- adjacentStates : list of ndarray
- list of adjacent states for each given unique state
- params : ndarray
- parameters for computation of energy
K : float
-
list_adjacent_states
(Xuniq, all_connected)¶ Use self.adj to evaluate all adjacent states in Xuniq.
Xuniq : ndarray all_connected : bool
adjacentStates
-
logK
(Xuniq, Xcount, adjacentStates, params)¶ Compute log of objective function.
- Xuniq : ndarray
- (n_samples, n_dim) unique states that appear in the data
- Xcount : ndarray of int
- number of times that each unique state appears in the data
- adjacentStates : list of ndarray
- list of adjacent states for each given unique state
- params : ndarray
- parameters for computation of energy
logK : float
-
solve
(initial_guess=None, method='L-BFGS-B', full_output=False, all_connected=True, parameter_limits=100, solver_kwargs={'disp': False, 'ftol': 1e-15, 'maxiter': 100}, uselog=True)¶ Minimize MPF objective function using scipy.optimize.minimize.
initial_guess : ndarray, None method : str, ‘L-BFGS-B’
Option for scipy.optimize.minimize.full_output : bool, False all_connected : bool, True
Switch for summing over all states that data sets could be connected to or just summing over non-data states (second summation in Eq 10 in Sohl-Dickstein 2011).- parameter_limits : float, 100
- Maximum allowed magnitude of any single parameter.
- solver_kwargs : dict, {‘maxiter’:100,’disp’:False,’ftol’:1e-15}
- For scipy.optimize.minimize.
- uselog : bool, True
- If True, calculate log of the objective function. This can help with numerical precision errors.
- ndarray
- Solved multipliers (parameters). For Ising problem, these can be converted into matrix format using utils.vec2mat.
- dict (optional)
- Output from scipy.optimize.minimize returned if full_output is True.
-
static
worker_objective_task
(s, Xcount, adjacentStates, params, calc_e)¶
-
-
coniii.solvers.
MonteCarloHistogram
¶ alias of
coniii.solvers.MCH
-
class
coniii.solvers.
Pseudo
(sample, model=None, calc_observables=None, get_multipliers_r=None, calc_observables_r=None, k=2, iprint=True, **default_model_kwargs)¶ Bases:
coniii.solvers.Solver
Pseudolikelihood approximation to solving the inverse Ising problem as described in Aurell and Ekeberg, PRL 108, 090201 (2012).
-
cond_hess
(r, X, Jr, pairCoocRhat=None)¶ Returns d^2 cond_log_likelihood / d Jri d Jrj, with shape (dimension of system)x(dimension of system)
Current implementation uses more memory for speed. For large sample size, it may make sense to break up differently if too much memory is being used.
Deprecated.
- pairCooc : ndarray, None
- Pass pair_cooc_mat(X) to speed calculation.
-
cond_jac
(r, X, Jr)¶ Returns d cond_log_likelihood / d Jr, with shape (dimension of system)
Deprecated.
-
cond_log_likelihood
(r, X, Jr)¶ Equals the conditional log likelihood -L_r.
Deprecated.
- r : int
- individual index
- X : ndarray
- binary matrix, (# X) x (dimension of system)
- Jr : ndarray
- (dimension of system) x (1)
float
-
pair_cooc_mat
(X)¶ Returns matrix of shape (self.n)x(# X)x(self.n).
For use with cond_hess.
Slow because I haven’t thought of a better way of doing it yet.
Deprecated.
-
pseudo_log_likelihood
(X, J)¶ TODO: Could probably be made more efficient.
Deprecated.
- X : ndarray
- binary matrix, (# of samples) x (dimension of system)
- J : ndarray
- (dimension of system) x (dimension of system) J should be symmetric
-
solve
(force_general=False, **kwargs)¶ Uses a general all-purpose optimization to solve the problem using functions defined in self.get_multipliers_r and self.calc_observables_r.
- force_general : bool, False
- If True, force use of “general” algorithm.
- initial_guess : ndarray, None
- Initial guess for the parameter values.
- solver_kwargs : dict, {}
- kwargs for scipy.minimize().
- ndarray
- Solved multipliers (parameters). For Ising problem, these can be converted into matrix format using utils.vec2mat.
-
-
class
coniii.solvers.
RegularizedMeanField
(sample, model=None, calc_observables=None, sample_size=1000, iprint=False, **default_model_kwargs)¶ Bases:
coniii.solvers.Solver
Implementation of regularized mean field method for solving the inverse Ising problem, as described in Daniels, Bryan C., David C. Krakauer, and Jessica C. Flack. ``Control of Finite Critical Behaviour in a Small-Scale Social System.’’ Nature Communications 8 (2017): 14301. doi:10.1038/ncomms14301
Specific to pairwise Ising constraints.
-
bracket1d
(xList, funcList)¶ Assumes xList is monotonically increasing
Get bracketed interval (a,b,c) with a < b < c, and f(b) < f(a) and f(c). (Choose b and c to make f(b) and f(c) as small as possible.)
If minimum is at one end, raise error.
-
solve
(n_grid_points=200, min_size=0, reset_rng=True, min_covariance=False, min_independent=True, cooc_cov=None, priorLmbda=0.0, bracket=None)¶ Varies the strength of regularization on the mean field J to best fit given cooccurrence data.
- n_grid_points : int, 200
- If bracket is given, first test at n_grid_points points evenly spaced in the bracket interval, then give the lowest three points to scipy.optimize.minimize_scalar
- min_size : int, 0
- Use a modified model in which samples with fewer ones than min_size are not allowed.
- reset_rng: bool, True
- Reset random number generator seed before sampling to ensure that objective function does not depend on generator state.
- min_covariance : bool, False
- ** As of v1.0.3, not currently supported ** Minimize covariance from emperical frequencies (see notes); trying to avoid biases, as inspired by footnote 12 in TkaSchBer06
- min_independent : bool, True
- ** As of v1.0.3, min_independent is the only mode currently supported ** Each <xi> and <xi xj> residual is treated as independent
- cooc_cov : ndarray, None
- ** As of v1.0.3, not currently supported ** Provide a covariance matrix for residuals. Should typically be coocSampleCovariance(samples). Only used if min_covariance and min_independent are False.
- priorLmbda : float,0.
- ** As of v1.0.3, not currently implemented ** Strength of noninteracting prior.
- ndarray
- Solved multipliers (parameters). For Ising problem, these can be converted into matrix format using utils.vec2mat.
-
-
class
coniii.solvers.
Solver
¶ Bases:
object
Base class for declaring common methods and attributes for inverse maxent algorithms.
-
basic_setup
(sample_or_n=None, model=None, calc_observables=None, iprint=True, model_kwargs={})¶ General routine for setting up a Solver instance.
- sample_or_n : ndarray or int, None
If ndarray, of dimensions (samples, dimension).
If int, specifies system size.
If None, many of the default class members cannot be set and then must be set manually.
- model : class like one from models.py, None
- By default, will be set to solve Ising model.
- calc_observables : function, None
- For calculating observables from a set of samples.
- iprint : str, True
- If empty, do not display warning messages.
- model_kwargs : dict, {}
- Additional arguments that will be passed to Ising class. These only matter if model is None. Important ones include “n_cpus” and “rng”.
-
fill_in
(x, fill_value=0)¶ Helper function for filling in missing parameter values.
x : ndarray fill_value : float, 0
- ndarray
- With missing entries filled in.
-
set_insertion_ix
()¶ Calculate indices to fill in with zeros to “fool” code that takes full set of params.
-
solve
()¶ To be defined in derivative classes.
-
-
class
coniii.solvers.
SparseEnumerate
(sample=None, model=None, calc_observables=None, parameter_ix=None, iprint=True, **default_model_kwargs)¶ Bases:
coniii.solvers.Solver
Class for solving Ising model with a sparse parameter set by enumeration of the partition function and then using gradient descent. Unspecified parameters are implicitly fixed to be zero, which corresponds to leaving the corresponding correlation function unconstrained.
-
solve
(initial_guess=None, constraints=None, max_param_value=50, full_output=False, use_root=True, scipy_solver_kwargs={'method': 'krylov', 'options': {'fatol': 1e-13, 'xatol': 1e-13}})¶ Must specify either constraints (the correlations) or samples from which the correlations will be calculated using self.calc_observables. This routine by default uses scipy.optimize.root to find the solution. This is MUCH faster than the scipy.optimize.minimize routine which can be used instead.
If still too slow, try adjusting the accuracy.
If not converging, try increasing the max number of iterations.
If receiving Jacobian error (or some other numerical estimation error), parameter values may be too large for faithful evaluation. Try decreasing max_param_value.
- initial_guess : ndarray, None
- Initial starting guess for parameters. By default, this will start with all zeros if left unspecified.
- constraints : ndarray, None
- Can specify constraints directly instead of using the ones calculated from the sample. This can be useful when the pairwise correlations are known exactly. This will override the self.constraints data member.
- max_param_value : float, 50
- Absolute value of max parameter value. Bounds can also be set in the kwargs passed to the minimizer, in which case this should be set to None.
- full_output : bool, False
- If True, return output from scipy.optimize.minimize.
- use_root : bool, True
- If False, use scipy.optimize.minimize instead. This is typically much slower.
- scipy_solver_kwargs : dict, {‘method’:’krylov’, ‘options’:{‘fatol’:1e-13,’xatol’:1e-13}}
- High accuracy is slower. Although default accuracy may not be so good, lowering these custom presets will speed things up. Choice of the root finding method can also change runtime and whether a solution is found or not. Recommend playing around with different solvers and tolerances or getting a close approximation using a different method if solution is hard to find.
- ndarray
- Solved multipliers (parameters).
- dict, optional
- Output from scipy.optimize.root.
-
-
class
coniii.solvers.
SparseMCH
(sample, model=None, calc_observables=None, sample_size=1000, sample_method='metropolis', mch_approximation=None, parameter_ix=None, iprint=True, sampler_kw={}, **default_model_kwargs)¶ Bases:
coniii.solvers.Solver
Class for solving maxent problems on sparse constraints using the Monte Carlo Histogram method.
See MCH class.
-
learn_parameters_mch
(estConstraints, constraints, maxdlamda=1, maxdlamdaNorm=1, maxLearningSteps=50, eta=1)¶ - estConstraints : ndarray
- Constraints estimated from MCH approximation.
constraints : ndarray maxdlamda : float, 1
Max allowed magnitude for any element of dlamda vector before exiting.- maxdlamdaNorm : float, 1
- Max allowed norm of dlamda vector before exiting.
- maxLearningSteps : int
- max learning steps before ending MCH
- eta : float, 1
- factor for changing dlamda
- ndarray
- MCH estimate for constraints from parameters lamda+dlamda.
-
solve
(initial_guess=None, constraints=None, tol=None, tolNorm=None, n_iters=30, burn_in=30, maxiter=10, custom_convergence_f=None, iprint=False, full_output=False, learn_params_kwargs={'eta': 1, 'maxdlamda': 1}, generate_kwargs={})¶ Solve for maxent model parameters using MCH routine.
- initial_guess : ndarray, None
- Initial starting point.
- constraints : ndarray, None
- For debugging! Vector of correlations to fit.
- tol : float, None
- Maximum error allowed in any observable.
- tolNorm : float, None
- Norm error allowed in found solution.
- n_iters : int, 30
- Number of iterations to make between samples in MCMC sampling.
- burn_in : int, 30
- Initial burn in from random sample when MC sampling.
- max_iter : int, 10
- Max number of iterations of MC sampling and MCH approximation.
- custom_convergence_f : function, None
Function for determining convergence criterion. At each iteration, this function should return the next set of learn_params_kwargs and optionally the sample size.
As an example: def learn_settings(i):
‘’’ Take in the iteration counter and set the maximum change allowed in any given parameter (maxdlamda) and the multiplicative factor eta, where d(parameter) = (error in observable) * eta.
Additional option is to also return the sample size for that step by returning a tuple. Larger sample sizes are necessary for higher accuracy. ‘’’ if i<10:
return {‘maxdlamda’:1,’eta’:1}- else:
- return {‘maxdlamda’:.05,’eta’:.05}
iprint : bool, False full_output : bool, False
If True, also return the errflag and error history.learn_parameters_kwargs : dict, {‘maxdlamda’:1,’eta’:1} generate_kwargs : dict, {}
- ndarray
- Solved multipliers (parameters). For Ising problem, these can be converted into matrix format using utils.vec2mat.
- int
- Error flag. 0, converged within given criterion 1, max iterations reached
- ndarray
- Log of errors in matching constraints at each step of iteration.
-
-
coniii.solvers.
unwrap_self_worker_obj
(arg, **kwarg)¶