Inferring DFA without Negative Examples
Florent Avellaneda & Alexandre Petrenko
Computer Research Institute of Montréal (Canada)
ICGI 2018 - 6 September 2018
Florent Avellaneda
Inferring DFA without Negative Examples
ICGI 2018 - 6 September 2018
Inference problem
Input: Positive observations S+ and negative observations S−
Output: The "best" model consistent with observations
What does "best" mean?
"Best": the model most likely to be the real one
Law of parsimony: among competing hypotheses, the one with the fewest assumptions should be selected
The law implies that the simplest conjecture should be the best. For DFAs, the number of states is generally used as the measure of complexity
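Problem to Solve
Input: Positive observations S+ only (no negative observations S−)
Output: The "best" model consistent with observations
Constraint: It makes no sense to look for a DFA with a minimal number of states: the one-state DFA whose state q0 has a self-loop for every a ∈ Σ accepts every string, so it is consistent with any S+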
Ideas for defining "simplest"
Idea 1: Try to minimize the recognized language: A ≤ A' if and only if L(A) ⊆ L(A')
Idea 2: Set a maximum number of states smaller than that in the given positive examples
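Idea 1 orders DFAs by language inclusion. A minimal sketch of how L(A) ⊆ L(A') could be tested is a breadth-first search over pairs of states. The DFA encoding (a pair of a partial transition dictionary and a set of accepting states, with initial state 0) and the names `included`, `a_star_b_star` and `sigma_star` are assumptions made for this illustration and do not come from the slides.

```python
from collections import deque

def included(a, b, alphabet):
    """Return True if L(a) is a subset of L(b).

    A DFA is a pair (delta, accepting): delta maps (state, symbol) to a
    state, a missing entry rejects, and the initial state is always 0.
    """
    (delta_a, acc_a), (delta_b, acc_b) = a, b
    queue, seen = deque([(0, 0)]), {(0, 0)}
    while queue:
        p, q = queue.popleft()
        # q is None once b has no transition left for the current prefix.
        if p in acc_a and q not in acc_b:
            return False
        for symbol in alphabet:
            p_next = delta_a.get((p, symbol))
            if p_next is None:
                continue  # a rejects every extension of this prefix
            q_next = delta_b.get((q, symbol)) if q is not None else None
            if (p_next, q_next) not in seen:
                seen.add((p_next, q_next))
                queue.append((p_next, q_next))
    return True

# Two hypothetical automata: one for a*b*, one accepting every string.
a_star_b_star = ({(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}, {0, 1})
sigma_star = ({(0, "a"): 0, (0, "b"): 0}, {0})

print(included(a_star_b_star, sigma_star, "ab"))  # a*b* is inside Sigma*
print(included(sigma_star, a_star_b_star, "ab"))  # the converse fails on "ba"
```

Under this ordering the automaton accepting every string is the largest element, which is why minimizing the language pulls in the opposite direction from minimizing the number of states.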
Overview
1. Inferring Simplest DFA from Positive Examples
2. Checking the Uniqueness of a Solution
3. Finding Characteristic Positive Examples
4. Case Study
5. Conclusion & Perspectives
Definitions
Definition (n-conjecture): A DFA A is an n-conjecture for S+ if S+ ⊆ L(A) and A has at most n states
Definition (simplest): A minimal DFA A is a simplest n-conjecture for S+ if for each n-conjecture A' for S+, we have L(A') ⊄ L(A)
Problem statement: Given an integer n and a set of positive examples S+, find a simplest n-conjecture for S+
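The n-conjecture definition can be checked directly by running each positive example through the automaton. The sketch below assumes a DFA encoded as a triple (number of states, partial transition dictionary, accepting states) with initial state 0; this encoding and the names `accepts`, `is_n_conjecture`, `chaos` and `only_a` are hypothetical. It also assumes that the "chaos DFA" is the one-state automaton accepting every string.

```python
def accepts(dfa, word):
    """Run word through the DFA; a missing transition rejects."""
    _, delta, accepting = dfa
    state = 0
    for symbol in word:
        state = delta.get((state, symbol))
        if state is None:
            return False
    return state in accepting

def is_n_conjecture(dfa, s_plus, n):
    """S+ must be inside L(dfa) and the DFA must have at most n states."""
    num_states = dfa[0]
    return num_states <= n and all(accepts(dfa, w) for w in s_plus)

S_PLUS = ["a", "aa", "aaa", "b", "bb", "bbb"]
chaos = (1, {(0, "a"): 0, (0, "b"): 0}, {0})   # accepts every string
only_a = (1, {(0, "a"): 0}, {0})               # accepts a* only

print(is_n_conjecture(chaos, S_PLUS, 1))   # a 1-conjecture for any S+
print(is_n_conjecture(only_a, S_PLUS, 1))  # rejects "b", so not a conjecture
```

The first check succeeds for every S+, which illustrates why a bound on the number of states alone does not single out a useful model.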
Method
Let A be the chaos DFA and A' the empty DFA
1. If S+ ⊈ L(A'): let w ∈ S+ \ L(A') and add the constraint "w has to be accepted"
2. If S+ ⊆ L(A') and L(A') ⊈ L(A): let w ∈ L(A') \ L(A) and add the constraint "w must not be accepted"
3. If S+ ⊆ L(A') and L(A') ⊆ L(A): replace A by A' and add the constraint "excluding solution A'"
Illustration: S+ = {a, aa, aaa, b, bb, bbb}
[Figure: step-by-step animation of the method on this S+, showing the conjectures A and A' together with the strings a, aa, aaa, aaaa, b, bb, ab, aab, ba and the language a*]
Input: Positive examples S+ and an integer n
Output: A simplest n-conjecture for S+ and negative examples S−
Initialize C to ∅, S− to ∅ and A to AChaos
while C is satisfiable do
    Let A' be a DFA of a solution of C.
    if S+ ⊈ L(A') then
        Let w be a shortest string in S+ \ L(A').
        C ← C ∧ Cw, where Cw is clauses encoding the requirement that w must be in the conjecture.
    else
        if L(A') ⊆ L(A) then
            C ← C ∧ CA, where CA is a clause to further exclude the current solution.
            if L(A') ⊂ L(A) then
                Let w be a shortest string in L(A) \ L(A').
                C ← C ∧ Cw, where Cw is clauses encoding the requirement that w must not be in the conjecture.
                S− ← S− ∪ {w}
                A ← A'
            end
        else
            Let w be a shortest string in L(A') \ L(A).
            C ← C ∧ Cw, where Cw is clauses encoding the requirement that w must not be in the conjecture.
            S− ← S− ∪ {w}
        end
    end
end
return min(A), S−
Definition (characteristic sample): We say that S = (S+, S−) is a characteristic sample for a minimal DFA A if A is consistent with S and if each A' consistent with S such that |A'| ≤ |A| is isomorphic to A
Theorem: The algorithm returns a simplest n-conjecture A for a given S+, and (S+, S−) is a characteristic sample for A
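For very small n, the notion of a simplest n-conjecture can be explored without a SAT solver. The sketch below is not the SAT-based loop described above: it swaps in a brute-force enumeration of every partial DFA with n states and keeps the inclusion-minimal languages. The alphabet {a, b}, the encoding of a DFA as a pair (transition dictionary, accepting states) with initial state 0, and all function names are assumptions made for this illustration.

```python
from collections import deque
from itertools import product

SIGMA = "ab"  # assumed alphabet

def accepts(dfa, word):
    delta, accepting = dfa
    state = 0
    for symbol in word:
        state = delta.get((state, symbol))
        if state is None:
            return False
    return state in accepting

def included(a, b):
    """True if L(a) is a subset of L(b), by search over pairs of states."""
    (delta_a, acc_a), (delta_b, acc_b) = a, b
    queue, seen = deque([(0, 0)]), {(0, 0)}
    while queue:
        p, q = queue.popleft()
        if p in acc_a and q not in acc_b:
            return False
        for symbol in SIGMA:
            p_next = delta_a.get((p, symbol))
            if p_next is None:
                continue
            q_next = delta_b.get((q, symbol)) if q is not None else None
            if (p_next, q_next) not in seen:
                seen.add((p_next, q_next))
                queue.append((p_next, q_next))
    return True

def all_dfas(n):
    """Every partial DFA over SIGMA with states 0..n-1 and initial state 0."""
    keys = [(q, s) for q in range(n) for s in SIGMA]
    for targets in product([None, *range(n)], repeat=len(keys)):
        delta = {k: t for k, t in zip(keys, targets) if t is not None}
        for bits in product([False, True], repeat=n):
            yield delta, frozenset(q for q in range(n) if bits[q])

def simplest_conjectures(s_plus, n):
    """One representative per inclusion-minimal language containing s_plus."""
    conjectures = [d for d in all_dfas(n) if all(accepts(d, w) for w in s_plus)]
    representatives = []
    for d in conjectures:
        if any(included(e, d) and not included(d, e) for e in conjectures):
            continue  # some conjecture accepts strictly less
        if not any(included(d, r) and included(r, d) for r in representatives):
            representatives.append(d)
    return representatives

S_PLUS = ["a", "aa", "aaa", "b", "bb", "bbb"]
for delta, accepting in simplest_conjectures(S_PLUS, 2):
    print(sorted(delta.items()), sorted(accepting))
```

Unreachable states are allowed in the enumeration, so automata with fewer than n states are covered as well. The enumeration grows exponentially with n and the alphabet size, so it is only meant to make the definition concrete on toy inputs.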
Warning: Many simplest n-conjectures may exist
Remark: Knowing that there is only one solution will guarantee the best quality of the inferred model
Question: Can we check the uniqueness of the solution?
Algorithm
Input: An n-conjecture A and a characteristic sample (S+, S−) for A
Output: Return True if A is the only simplest n-conjecture for S+ and return a distinguishing string otherwise
Function CheckUniqueness(A, (S+, S−)):
    foreach w ∈ S− do
        (A', S'−) ← infer(S+ ∪ {w}, |A|)
        if L(A) ⊄ L(A') then
            return w
    end
    return True
Theorem: If there exists a single simplest n-conjecture for S+, this algorithm determines its uniqueness; otherwise it returns a distinguishing string
Illustration
If we consider S+ = {a, aa, aaa, b, bb, bbb} ∪ {ab}, the algorithm infer will find a second solution
[Figure: two 2-state DFAs over states q0 and q1 with transitions labeled a and b, the first solution and the second solution, related by ⊄]
Solution is not unique and ab is a distinguishing string
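A distinguishing string between two candidate solutions can be found with a breadth-first search that returns the shortest word accepted by one automaton and rejected by the other. The sketch below assumes the two solutions recognize b*a* and a*b*, which is one reading of the figure; the encoding (partial transition dictionary plus accepting set, initial state 0) and the names `shortest_witness`, `first` and `second` are hypothetical.

```python
from collections import deque

def shortest_witness(a, b, alphabet):
    """Shortest string in L(a) but not in L(b), or None if L(a) is inside L(b)."""
    (delta_a, acc_a), (delta_b, acc_b) = a, b
    queue, seen = deque([(0, 0, "")]), {(0, 0)}
    while queue:
        p, q, word = queue.popleft()
        # q is None once b has no transition left for the current prefix.
        if p in acc_a and q not in acc_b:
            return word
        for symbol in alphabet:
            p_next = delta_a.get((p, symbol))
            if p_next is None:
                continue
            q_next = delta_b.get((q, symbol)) if q is not None else None
            if (p_next, q_next) not in seen:
                seen.add((p_next, q_next))
                queue.append((p_next, q_next, word + symbol))
    return None

# Assumed reading of the two solutions: b*a* and a*b*, all states accepting.
first = ({(0, "b"): 0, (0, "a"): 1, (1, "a"): 1}, {0, 1})
second = ({(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}, {0, 1})

print(shortest_witness(second, first, "ab"))  # ab
print(shortest_witness(first, second, "ab"))  # ba
```

Each language contains a word the other rejects, so neither solution is included in the other, which matches the conclusion that the solution is not unique.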
Definition (characteristic positive examples): Positive examples S+ are characteristic positive examples for A if the simplest |A|-conjecture for S+ is A and it is unique
Question: Can we find characteristic positive examples for each DFA?
Algorithm
Input: A DFA A
Output: Characteristic positive examples for A
Function GenerateCharacteristicPositiveExamples(A):
    S+ ← ∅
    while S+ is not a set of characteristic positive examples for A do
        Let A' be a simplest |A|-conjecture for S+ for which there exists w ∈ L(A) such that w ∉ L(A').
        S+ ← S+ ∪ {w}
    end
    return S+
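The generation loop can be tried on toy automata with a brute-force stand-in for the inference step. The sketch below enumerates every partial DFA with n states rather than calling a SAT solver, assumes the alphabet {a, b}, and encodes a DFA as a pair (transition dictionary, accepting states) with initial state 0; all names are hypothetical. It stops once every n-state DFA accepting S+ already contains L(A), which over a finite set of candidates is an equivalent way of saying that A is the unique simplest conjecture.

```python
from collections import deque
from itertools import product

SIGMA = "ab"  # assumed alphabet

def accepts(dfa, word):
    delta, accepting = dfa
    state = 0
    for symbol in word:
        state = delta.get((state, symbol))
        if state is None:
            return False
    return state in accepting

def witness(a, b):
    """Shortest string in L(a) but not in L(b), or None if L(a) is inside L(b)."""
    (delta_a, acc_a), (delta_b, acc_b) = a, b
    queue, seen = deque([(0, 0, "")]), {(0, 0)}
    while queue:
        p, q, word = queue.popleft()
        if p in acc_a and q not in acc_b:
            return word
        for s in SIGMA:
            p2 = delta_a.get((p, s))
            if p2 is None:
                continue
            q2 = delta_b.get((q, s)) if q is not None else None
            if (p2, q2) not in seen:
                seen.add((p2, q2))
                queue.append((p2, q2, word + s))
    return None

def all_dfas(n):
    """Every partial DFA over SIGMA with states 0..n-1 and initial state 0."""
    keys = [(q, s) for q in range(n) for s in SIGMA]
    for targets in product([None, *range(n)], repeat=len(keys)):
        delta = {k: t for k, t in zip(keys, targets) if t is not None}
        for bits in product([False, True], repeat=n):
            yield delta, frozenset(q for q in range(n) if bits[q])

def generate_characteristic_positive_examples(target, n):
    universe = list(all_dfas(n))
    s_plus = []
    while True:
        conjectures = [d for d in universe if all(accepts(d, w) for w in s_plus)]
        # Conjectures that still miss some word of the target language.
        missing = [d for d in conjectures if witness(target, d) is not None]
        if not missing:
            return s_plus
        current = missing[0]
        for d in missing[1:]:
            # Move to d when its language is strictly smaller.
            if witness(d, current) is None and witness(current, d) is not None:
                current = d
        s_plus.append(witness(target, current))

target = ({(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}, frozenset({0, 1}))  # a*b*
print(generate_characteristic_positive_examples(target, 2))
```

Each round removes the chosen conjecture from the candidates, so the loop ends after finitely many rounds. Picking the shortest missing word is a choice made here and is not prescribed by the algorithm above.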
Theorem: For each DFA A, the algorithm GenerateCharacteristicPositiveExamples returns characteristic positive examples
Theorem: If S+ is a set of characteristic positive examples for A, then each S'+ such that S+ ⊆ S'+ ⊆ L(A) is also a set of characteristic positive examples for A
Corollary: The languages generated by DFAs with n states are identifiable in the limit from positive examples by searching for the simplest n-conjectures
[Figure: the two parties of the case study, A and B]
Communication protocol used:
[Figure: protocol DFA with states q0 to q6 and transitions labeled A:request, A:justify, A:retract, B:accept, B:refuse, B:challenge, B:veto]
Input: 50 traces from this protocol generated with a random walk
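Traces of this kind can be produced by walking an automaton at random and recording the labels that are followed. The sketch below is only an illustration: `TOY_DELTA` is a hypothetical three-transition fragment that reuses some event names, not the protocol of the case study, and the function name, the length bound and the seed are assumptions.

```python
import random

def random_walk_traces(delta, initial, count, max_len, seed=0):
    """Generate `count` traces by following random transitions from `initial`."""
    rng = random.Random(seed)
    traces = []
    for _ in range(count):
        state, trace = initial, []
        for _ in range(rng.randint(1, max_len)):
            moves = [(sym, dst) for (src, sym), dst in delta.items() if src == state]
            if not moves:
                break  # dead end: no outgoing transition
            symbol, state = rng.choice(moves)
            trace.append(symbol)
        traces.append(tuple(trace))
    return traces

# Hypothetical toy automaton, NOT the protocol shown in the figure.
TOY_DELTA = {
    ("q0", "A:request"): "q1",
    ("q1", "B:accept"): "q0",
    ("q1", "B:refuse"): "q0",
}

traces = random_walk_traces(TOY_DELTA, "q0", count=50, max_len=8)
print(len(traces), traces[0])
```

Every prefix of a walk is itself a valid trace, so a sample built this way contains positive examples only, which is the setting addressed by the inference method.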
n=1
[Figure: inferred DFA with the single state q0 and a self-loop labeled A:justify, A:request, A:retract, B:accept, B:challenge, B:refuse, B:veto]
n=2
[Figure: inferred DFA with states q0 and q1]
n=3
[Figure: inferred DFA with states q0, q1 and q2]
n=4
[Figure: inferred DFA with states q0 to q3]
n=5
[Figure: inferred DFA with states q0 to q3]
n=6
[Figure: inferred DFA with states q0 to q5]
n=7
[Figure: inferred DFA with states q0 to q6 and transitions labeled A:request, A:justify, A:retract, B:accept, B:refuse, B:challenge, B:veto]
This solution is unique
Conclusion
A new approach to solving the DFA inference problem without negative examples
Applicable in practice for small models
Results that make sense
Perspectives
Improving the SAT formulas
Searching for heuristics
Applying this approach to more specific models
Link to probabilistic approaches?
Thank you