Information Physics: The Next Frontier


Kevin H. Knuth Departments of Physics and Informatics University at Albany

19th Century

George Boole

George Boole was the inventor of Boolean logic. In 1854 he published: “An Investigation of the Laws of Thought, on Which are Founded the Mathematical Theories of Logic and Probabilities”

George Boole (1815 – 1874)

Since that time, HUNDREDS OF THOUSANDS of papers have been published on this topic.

20th Century

Claude E. Shannon

Claude Shannon realized that Boolean logic could be used to optimize arrays of electromechanical relays used in switching telephone systems. In 1948 he published information theory, which enabled him to quantify the effectiveness of a communication channel using the information entropy (Shannon entropy).

Claude E. Shannon (1916 – 2001)

Richard T. Cox

In 1946, Cox showed that Bayesian probability theory is the unique generalization of Boolean algebra to degrees of plausibility. It is the first example of such a generalization of an algebra to a calculus.

Richard T. Cox (1898 – 1991)

Edwin T. Jaynes

Around 1956, Jaynes realized that the connection between Shannon entropy and thermodynamic entropy is not a mere analogy, but is due to the fact that they derive from similar underlying ideas.

Edwin T. Jaynes (1922 – 1998)

21st Century

Symmetry: Equivalence Relations among Partitioned Sets → Group Theory

Order: Order Relations among Ordered Sets → Lattice Theory

Order

In the beginning…

This caveman finds it easy to order these rocks in terms of how heavy they are to lift: he uses a binary “heavier” comparison to order his rocks.

Sophisticated Rock Hunting

Totally ordered elements (here, rocks ordered by “heavier”) form a CHAIN.

Isomorphisms

The chain of rocks ordered by “heavier” is isomorphic to the chain of integers 1–5 ordered by “is less than or equal to”.

Incomparable elements form an ANTICHAIN

Partitions exhibit both chain-like and antichain-like properties

Two Posets with Integers

The same integers can carry two different orders: “is less than or equal to” gives a chain, while “divides” gives only a partial order, in which elements such as 6 and 9 are incomparable.

The Powerset of {a, b, c}

P = ({ ∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c} }, ⊆ )

Hasse diagram, ordered by “is a subset of”: ∅ at the bottom; {a}, {b}, {c} above it; then {a, b}, {a, c}, {b, c}; and {a, b, c} at the top.



Upper Bound

In the powerset of {a, b, c}, both {b, c} and {a, b, c} are upper bounds of {b} and {c}.

Join: Least Upper Bound

The join of two elements is their least upper bound:

{b} ∨ {c} = {b, c}

Meet: Greatest Lower Bound

The meet of two elements is their greatest lower bound:

{a, b} ∧ {a, c} = {a}
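In the powerset lattice these operations are concrete set operations: the join is union and the meet is intersection. A minimal sketch in Python, using the slide's own examples:

```python
# Powerset lattice of {a, b, c}, ordered by subset inclusion.
b, c = frozenset({"b"}), frozenset({"c"})
ab, ac = frozenset({"a", "b"}), frozenset({"a", "c"})

join = b | c    # {b} ∨ {c}: least upper bound = union
meet = ab & ac  # {a, b} ∧ {a, c}: greatest lower bound = intersection

assert join == frozenset({"b", "c"})
assert meet == frozenset({"a"})
```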

Posets versus Lattices

Lattices are posets in which every pair of elements has a unique join (x ∨ y) and a unique meet (x ∧ y).

Lattice Identities

The Lattice Identities:

L1. x ∧ x = x,  x ∨ x = x   (Idempotent)
L2. x ∧ y = y ∧ x,  x ∨ y = y ∨ x   (Commutative)
L3. x ∧ (y ∧ z) = (x ∧ y) ∧ z,  x ∨ (y ∨ z) = (x ∨ y) ∨ z   (Associative)
L4. x ∧ (x ∨ y) = x ∨ (x ∧ y) = x   (Absorption)

If x ≤ y, the meet and join follow the Consistency Relations:

C1. x ∧ y = x   (x is the greatest lower bound of x and y)
C2. x ∨ y = y   (y is the least upper bound of x and y)
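All of these identities can be spot-checked exhaustively on the powerset lattice of {a, b, c}, where ∧ is intersection and ∨ is union; a short Python sweep:

```python
from itertools import combinations, product

universe = "abc"
# Every subset of {a, b, c}: the powerset lattice, with ∧ = ∩ and ∨ = ∪.
elements = [frozenset(s) for r in range(4) for s in combinations(universe, r)]

for x, y, z in product(elements, repeat=3):
    assert x & x == x and x | x == x                # L1 Idempotent
    assert x & y == y & x and x | y == y | x        # L2 Commutative
    assert x & (y & z) == (x & y) & z               # L3 Associative (meet)
    assert x | (y | z) == (x | y) | z               # L3 Associative (join)
    assert x & (x | y) == x and x | (x & y) == x    # L4 Absorption
    if x <= y:                                      # C1, C2 Consistency
        assert x & y == x and x | y == y

print("all identities hold on the powerset of {a, b, c}")
```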

Lattices are Algebras

Structural Viewpoint ⇔ Operational Viewpoint:

Integers, “is less than or equal to”:  a ≤ b ⇔ max(a, b) = b and min(a, b) = a
Positive Integers, “divides”:  a | b ⇔ lcm(a, b) = b and gcd(a, b) = a
Sets, “is a subset of”:  a ⊆ b ⇔ a ∪ b = b and a ∩ b = a
Assertions, “implies”:  a → b ⇔ a ∨ b = b and a ∧ b = a
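The structural/operational equivalences can be verified directly; a small Python sweep over the first dozen positive integers (the range is an arbitrary illustrative choice):

```python
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

for a in range(1, 13):
    for b in range(1, 13):
        # Integers under "is less than or equal to": join = max, meet = min.
        assert (a <= b) == (max(a, b) == b and min(a, b) == a)
        # Positive integers under "divides": join = lcm, meet = gcd.
        assert (b % a == 0) == (lcm(a, b) == b and gcd(a, b) == a)

print("structural and operational viewpoints agree")
```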

Quantification Via Valuations

Algebra to Calculus

Quantification: quantify the partial order by assigning real numbers to the elements of the lattice. Any quantification must be consistent with the lattice structure; otherwise, information about the partial order is lost.

Local Consistency

Any general rule must hold for special cases, so we look at special cases to constrain the general rule and enforce local consistency.

For a valuation f : x ∈ L → ℝ, the value assigned to a join must be determined by the values of its arguments, f(x ∨ y) ↔ f(x) and f(y). This implies that

f(x ∨ y) = S[f(x), f(y)]

where S is an unknown function to be determined.

Associativity of Join

Write the same element two different ways:

x ∨ (y ∨ z) = (x ∨ y) ∨ z

which implies

S[f(x), S[f(y), f(z)]] = S[S[f(x), f(y)], f(z)]

Note that the unknown function S is nested in two distinct ways, which reflects associativity.

Associativity Equation

S[f(x), S[f(y), f(z)]] = S[S[f(x), f(y)], f(z)]

The general solution (Aczél 1966) is:

F(S[f(x), f(y)]) = F(f(x)) + F(f(y))

where F is an arbitrary function. Define v(x) = F(f(x)) so that we have straightforward summation:

v(x ∨ y) = v(x) + v(y)

This is a derivation of the summation axiom in measure theory!

Valuation

A VALUATION is a function v : x ∈ L → ℝ. If y ≥ x then v(y) ≥ v(x), and the join satisfies

v(x ∨ y) = v(x) + v(y)

General Case

In the general case x and y overlap. Introduce z so that y = (x ∧ y) ∨ z, with z disjoint from x ∧ y. Then

v(y) = v(x ∧ y) + v(z)
v(x ∨ y) = v(x) + v(z)

and eliminating v(z) gives

v(x ∨ y) = v(x) + v(y) − v(x ∧ y)

Sum Rule

v(x ∨ y) = v(x) + v(y) − v(x ∧ y)

v(x) + v(y) = v(x ∨ y) + v(x ∧ y)   (symmetric form, self-dual)

Instances of the sum rule:

p(x ∨ y | i) = p(x | i) + p(y | i) − p(x ∧ y | i)
I(X; Y) = H(X) + H(Y) − H(X, Y)
max(x, y) = x + y − min(x, y)
log(gcd(x, y)) = log(x) + log(y) − log(lcm(x, y))
χ = V − E + F
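Several of these instances can be verified numerically. A quick Python check of the max/min, gcd/lcm, and set-cardinality cases (in the last case the valuation is simply the number of elements):

```python
from itertools import product
from math import gcd, log

def lcm(a, b):
    return a * b // gcd(a, b)

for x, y in product(range(1, 30), repeat=2):
    # Sum rule on the totally ordered integers: join = max, meet = min.
    assert max(x, y) == x + y - min(x, y)
    # Sum rule on the divides lattice, with log as the valuation.
    assert abs(log(gcd(x, y)) - (log(x) + log(y) - log(lcm(x, y)))) < 1e-9

# Sum rule on sets, with cardinality as the valuation (inclusion-exclusion).
a, b = {1, 2, 3}, {3, 4}
assert len(a | b) == len(a) + len(b) - len(a & b)
print("sum rule holds in all three lattices")
```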

Lattice Products

Direct (Cartesian) product of two spaces: elements of the product lattice are pairs (a, b).

Direct Product Rule

The lattice product is also associative:

A × (B × C) = (A × B) × C

After the sum rule, the only freedom left is rescaling:

v((a, b)) = v(a) v(b)

which is again summation (after taking the logarithm).

Constraints on Valuations

Sum Rule: v(x ∨ y) = v(x) + v(y) − v(x ∧ y)
Direct Product Rule: v((a, b)) = v(a) v(b)

Quantification Via Bi-Valuations

Context and Bi-Valuations

A BI-VALUATION is a function w : x, i ∈ L → ℝ.

Valuation v(x): the context i is implicit.
Bi-valuation w(x | i): the measure of x with respect to the context i, which is made explicit.

Bi-valuations generalize lattice inclusion to degrees of inclusion.

Inherited Constraints

Sum Rule: w(x | i) + w(y | i) = w(x ∨ y | i) + w(x ∧ y | i)
Direct Product Rule: w((a, b) | (i, j)) = w(a | i) w(b | j)

Both are inherited from valuations.

Associativity of Context

Changing context is associative, which gives the Chain Rule: for a chain a ≤ b ≤ c,

w(a | c) = w(a | b) w(b | c)

Extending the Chain Rule

Apply the sum rule within the context x:

w(x | x) + w(y | x) = w(x ∨ y | x) + w(x ∧ y | x)

Since x ≤ x and x ≤ x ∨ y, we have w(x | x) = 1 and w(x ∨ y | x) = 1, so

w(y | x) = w(x ∧ y | x)

Extending the Chain Rule

Applying the chain rule to the chain x ≥ x ∧ y ≥ x ∧ y ∧ z gives

w(x ∧ y ∧ z | x) = w(x ∧ y | x) w(x ∧ y ∧ z | x ∧ y)

and using w(y | x) = w(x ∧ y | x) to rewrite each factor yields

w(y ∧ z | x) = w(y | x) w(z | x ∧ y)

Constraint Equations

Sum Rule: w(x | i) + w(y | i) = w(x ∨ y | i) + w(x ∧ y | i)
Direct Product Rule: w((a, b) | (i, j)) = w(a | i) w(b | j)
Product Rule: w(y ∧ z | x) = w(y | x) w(z | x ∧ y)

Probability Theory

States

apple, banana, cherry: the states of a piece of fruit picked from my grocery basket.

Statements (States of Knowledge)

The statements form the powerset of the states, ordered by subset inclusion: {a, b, c}; {a, b}, {a, c}, {b, c}; {a}, {b}, {c}. Statements about a piece of fruit describe its potential states.

Implication

The ordering of the statements about a piece of fruit encodes implication: e.g. {a} implies {a, b}, which implies {a, b, c}.

Inference

Quantify to what degree knowing that the system is in one of the three states {a, b, c} implies knowing that it is in some other set of states. Inference works backwards along the ordering.

Change of notation (set notation → logic notation):

{a, b, c} → a ∨ b ∨ c
{a, b}, {a, c}, {b, c} → a ∨ b, a ∨ c, b ∨ c
{a}, {b}, {c} → a, b, c

Inference

Quantify, e.g. as p(c | a ∨ b ∨ c), to what degree knowing that the system is in one of the three states a ∨ b ∨ c implies knowing that it is in some other set of states.

Constraint Equations

Sum Rule: p(x | i) + p(y | i) = p(x ∨ y | i) + p(x ∧ y | i)
Direct Product Rule: p((a, b) | (i, j)) = p(a | i) p(b | j)
Product Rule: p(y ∧ z | x) = p(y | x) p(z | x ∧ y)
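These rules can be exercised on a toy model of the fruit example. The probabilities below are assumed purely for illustration; `p(x, i)` is the conditional probability that the state lies in the set x given that it lies in the set i:

```python
from fractions import Fraction

# Assumed (illustrative) probabilities for the three fruit states.
weight = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

def p(x, i):
    """Bi-valuation p(x | i): degree to which context i implies statement x."""
    x = set(x) & set(i)
    return sum(weight[s] for s in x) / sum(weight[s] for s in i)

x, y, z, i = {"a", "b"}, {"b", "c"}, {"b"}, {"a", "b", "c"}

# Sum rule: p(x | i) + p(y | i) = p(x ∨ y | i) + p(x ∧ y | i)
assert p(x, i) + p(y, i) == p(x | y, i) + p(x & y, i)
# Product rule: p(y ∧ z | x) = p(y | x) p(z | x ∧ y)
assert p(y & z, x) == p(y, x) * p(z, x & y)
print("sum and product rules hold")
```

Exact `Fraction` arithmetic makes the equalities hold with no floating-point slack.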

Commutativity

Commutativity, x ∧ y = y ∧ x, leads to Bayes' Theorem. Applying the product rule to x ∧ y in both orders,

p(y | i) p(x | y ∧ i) = p(x | i) p(y | x ∧ i)

so that

p(x | y ∧ i) = p(x | i) p(y | x ∧ i) / p(y | i)

Bayes' Theorem involves a change of context.
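As a numerical sketch (all numbers assumed for illustration), Bayes' theorem updates a prior through a change of context:

```python
# Assumed numbers: prior p(x | i), likelihoods p(y | x ∧ i) and p(y | ¬x ∧ i).
prior = 0.01
likelihood = 0.95
false_positive = 0.05

# Sum rule over the exhaustive, exclusive cases gives the evidence p(y | i).
evidence = likelihood * prior + false_positive * (1 - prior)

# Bayes' theorem: p(x | y ∧ i) = p(x | i) p(y | x ∧ i) / p(y | i)
posterior = prior * likelihood / evidence
print(round(posterior, 3))  # ≈ 0.161: the new context y ∧ i revises p(x)
```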

Bayesian Probability Theory

Sum Rule: p(x | i) + p(y | i) = p(x ∨ y | i) + p(x ∧ y | i)
Direct Product Rule: p((a, b) | (i, j)) = p(a | i) p(b | j)
Product Rule: p(y ∧ z | x) = p(y | x) p(z | x ∧ y)
Bayes' Theorem: p(x | y ∧ i) = p(x | i) p(y | x ∧ i) / p(y | i)

Inquiry (Information Theory)

Three Spaces

States: the N states of the system (a, b, c).
Statements: sets of states (potential states) — the powerset, with 2^N elements.
Questions: sets of statements (potential statements), which statements answer — the free distributive lattice FD(N).

Inquiry Space

Questions are sets of statements, forming a Free Distributive Lattice.

Relevance

Questions are ordered by relevance. The Central Issue, “Is it an Apple, Banana, or Cherry?”, is the most relevant question; relevance decreases for questions such as “Is it an Apple?” or “Is it an Apple or Cherry, or is it a Banana or Cherry?”.

Relevance and Entropy

The relevance d(I | Q) of a question is quantified by a Shannon entropy. For “Is it an Apple?” the relevant entropy is H(pa, pb∨c), with terms of the form −pa log2 pa, while the Central Issue has

H(I) = −pa log2 pa − pb log2 pb − pc log2 pc
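These relevances can be computed directly as Shannon entropies. A sketch, with assumed probabilities for the three fruits:

```python
from math import log2

pa, pb, pc = 0.5, 0.3, 0.2   # assumed probabilities for apple, banana, cherry

def H(*ps):
    """Shannon entropy (bits) of the distribution given by the arguments."""
    return -sum(p * log2(p) for p in ps if p > 0)

central_issue = H(pa, pb, pc)   # "Is it an Apple, Banana, or Cherry?"
is_it_apple = H(pa, pb + pc)    # "Is it an Apple?" lumps banana with cherry

# A coarser question can never be more relevant than the Central Issue.
assert is_it_apple <= central_issue
print(round(central_issue, 3), round(is_it_apple, 3))
```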

Inquiry

Sum Rule: d(X | I) + d(Y | I) = d(X ∨ Y | I) + d(X ∧ Y | I)
Direct Product Rule: d((A, B) | (I, J)) = d(A | I) d(B | J)
Product Rule: d(Y ∧ Z | X) = d(Y | X) d(Z | X ∧ Y)
Bayes' Theorem: d(X | Y ∧ I) = d(X | I) d(Y | X ∧ I) / d(Y | I)

Inquiry and Information Theory

The Sum Rule, d(X ∨ Y | I) = d(X | I) + d(Y | I) − d(X ∧ Y | I), has exactly the form of the mutual information identity:

I(X; Y) = H(X) + H(Y) − H(X, Y)
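The correspondence can be checked on a small joint distribution (the numbers are assumed for illustration):

```python
from math import log2

# Assumed joint distribution over two binary variables X and Y.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def H(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

# Marginal distributions of X and Y.
px = [sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)]
py = [sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)]

# I(X; Y) = H(X) + H(Y) − H(X, Y): the sum rule with entropy as the valuation.
mi = H(px) + H(py) - H(joint.values())
assert mi >= 0
print(round(mi, 4))
```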

MaxEnt

One way to assign priors is to note that we have chosen the atomic hypotheses for a reason: they are most relevant. We can then assign probabilities so that the relevance of the Central Issue is maximized.

Quantum Theory

Measurement Sequences

Quantify a quantum mechanical measurement sequence [m1, m2, m3] with a pair of real numbers. Sequences can be combined in parallel and in series.

Quantification Via Distinguished Sets

Quantification of a Poset

Consider a poset comprising an enormous number of elements.

Distinguished Chains

We will distinguish one or more subsets of elements and use them for quantification. The method discussed here relies on identifying chains.

Simplicity

Chains will be quantified so that they look simple.

Projection of an Element onto a Chain

The projection of an event x onto a chain P is given by the least event px on the chain that can be informed about x.

Quantifying a Projection

Projections can be quantified by selecting particular elements on the chain and assigning a numeric value to each event on the chain (e.g. px = 5 on a chain labeled 1 through 9).

Two Observer Chains

An event x projects onto two chains P and Q, yielding px and qx.

Observers must be Coordinated

Chains are coordinated by carefully selecting which events along the chain are used for quantification. Events are selected so that successive events on one chain project to successive events on the other.

Quantification with Pairs

Event x can be quantified by a pair of numbers derived from the labels of the events on each chain that are first informed about x. The pair (px, qx) quantifies event x. Technically this pair represents the direct product of the measures of the two chains.

Intervals

We consider two events and quantify the interval between them. This results in a pair of differences, or two degrees of freedom, since the origin of the labels on each chain is arbitrary:

Δp = p2 − p1
Δq = q2 − q1

(Δp, Δq) = (p2 − p1, q2 − q1)

Two Fundamental Configurations

Two events can be related in two fundamental configurations: symmetric (chain-like) and antisymmetric (antichain-like).

Decomposition

With Δp = p2 − p1 and Δq = q2 − q1, the interval pair decomposes into symmetric and antisymmetric parts:

(Δp, Δq) = ( (Δp + Δq)/2, (Δp + Δq)/2 ) + ( (Δp − Δq)/2, −(Δp − Δq)/2 )

Coordinates

We can define coordinates by

Δt = (Δp + Δq)/2,  Δx = (Δp − Δq)/2

so the decomposition

(Δp, Δq) = ( (Δp + Δq)/2, (Δp + Δq)/2 ) + ( (Δp − Δq)/2, −(Δp − Δq)/2 )

can be rewritten as

(Δt + Δx, Δt − Δx) = (Δt, Δt) + (Δx, −Δx)

Measuring Intervals

Given an interval quantified by a pair

(Δt + Δx, Δt − Δx) = (Δt, Δt) + (Δx, −Δx)

there are two scalar measures:

Pair component sum: (Δt + Δx) + (Δt − Δx) = 2Δt
Scalar interval: (Δt + Δx)(Δt − Δx) = Δt² − Δx²

which is the MINKOWSKI METRIC: Δs² = Δt² − Δx²

Special Relativity (see my poster)
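The whole construction can be traced numerically. With assumed chain labels for two events, the pair components multiply to give the Minkowski interval:

```python
# Assumed (illustrative) chain labels for two events on observer chains P and Q.
p1, q1 = 3.0, 5.0
p2, q2 = 9.0, 7.0

dp, dq = p2 - p1, q2 - q1   # the interval pair (Δp, Δq)
dt = (dp + dq) / 2          # symmetric coordinate Δt
dx = (dp - dq) / 2          # antisymmetric coordinate Δx

# (Δt + Δx)(Δt − Δx) = Δt² − Δx²: the Minkowski metric, equal to Δp·Δq.
assert (dt + dx) * (dt - dx) == dt**2 - dx**2 == dp * dq
print(dt, dx, dp * dq)
```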

CONCLUSIONS

Order remains an untapped resource in physics. Quantification of partially ordered sets leads to constraint equations which play a significant role in determining many physical laws. Inspired by the ideas of Cox and Jaynes, we derive:

measure theory
probability theory
information theory
quantum mechanics
special relativity

Information Physics views the laws of physics as arising from our descriptions of the universe, not the universe itself.

Special thanks to: Newshaw Bahreyni, Ariel Caticha, Seth Chaiken, Keith Earle, Adom Giffin, Philip Goyal, Carlos Rodriguez, John Skilling.