To Jaynes, in his original paper [1], maxent is ‘a method of reasoning which ensures that no unconscious arbitrary assumptions have been introduced’, while fifty years later, the MAXENT conference home page suggests that the method ‘is not yet fully available to the statistics community at large’. Yet generalized maxent problems, often in disguise, play a significant role in machine learning and statistics. Deviations from the classic form of the problem typically serve to incorporate prior knowledge, knowledge that would sometimes be difficult or impossible to represent with only linear constraints or an initial guess for the density.
To clarify these connections, a good place to start is the classic maxent problem, which can then be generalized until it encompasses a large class of problems studied in machine learning. Relaxed constraints, generalizations of Shannon-Boltzmann-Gibbs (SBG) entropy, and a few tools from convex analysis make the task relatively straightforward. In each of the examples discussed, the original maxent problem remains embedded as a special case; keeping this trail back to the original problem visible highlights the potential for cross-fertilization between the maxent and machine learning communities.
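For reference, the classic problem can be written in one standard form; the notation below (a finite sample space $x_1,\dots,x_n$, feature functions $f_j$, and observed moments $b_j$) is illustrative rather than fixed by the discussion above:
\begin{align*}
\max_{p \in \mathbb{R}^n} \quad & H(p) = -\sum_{i=1}^{n} p_i \log p_i \\
\text{subject to} \quad & \sum_{i=1}^{n} p_i f_j(x_i) = b_j, \qquad j = 1,\dots,m, \\
& \sum_{i=1}^{n} p_i = 1, \qquad p_i \ge 0.
\end{align*}
Its solution takes the familiar Gibbs form $p_i \propto \exp\bigl(\sum_{j} \lambda_j f_j(x_i)\bigr)$ for Lagrange multipliers $\lambda_j$, i.e. an exponential-family model, which already hints at the connection to machine learning: relaxing the constraints or replacing $H$ with a generalized entropy perturbs exactly this form.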