Introduction OLS : Original method and modifications Application

Using the OLS algorithm to build interpretable rule bases: an application to a depollution problem

S. Destercke (1), S. Guillaume (2) and B. Charnomordic (3)

(1) IRSN (Institute of Radioprotection and Nuclear Safety), DPAM, SEMIC, LIMSI, Cadarache, France
(2) Cemagref (Agricultural and Environmental Engineering Research), TEMO, Montpellier, France
(3) INRA (French National Institute for Agricultural Research), LASB, Montpellier, France

FUZZ’IEEE 2007 S. Destercke, S. Guillaume, B. Charnomordic

OLS and interpretability : application to depoll. problem


Why and how?

Why? (motivations)
Among fuzzy learning methods, it is difficult to find one that, at the same time:
- treats regression problems,
- is numerically efficient (good predictive ability without requiring too many resources),
- builds an interpretable rule base (RB).
(Most efficient algorithms giving interpretable RBs are designed for classification problems.)

How? (our proposition)
Take a numerically efficient algorithm designed for regression problems, OLS (Orthogonal Least Squares), and make it interpretable.


Interpretability criteria: a reminder

Retained interpretability criteria:
- Interpretable input fuzzy partitions (domain coverage, reasonable number of fuzzy sets, distinguishable sets); here, we take standardized fuzzy partitions with triangular membership functions.
- Reasonable number of rules in the RB
- Limited number of distinct rule conclusions
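A standardized triangular partition can be sketched as follows. This is only an illustration, under the assumptions that "standardized" means the membership degrees sum to 1 at every point, that the fuzzy sets are defined by their sorted centers, and that values outside the range are clipped to the end sets. The function name `triangular_partition` and the example centers are hypothetical, not taken from the slides.

```python
import numpy as np

def triangular_partition(x, centers):
    """Membership degrees of x in a triangular partition whose degrees sum to 1.

    centers: sorted 1-D sequence of fuzzy set centers (hypothetical values).
    Returns an array of shape (len(x), len(centers)).
    """
    centers = np.asarray(centers, dtype=float)
    # Assumption: values outside the domain fully belong to the nearest end set.
    x = np.clip(np.atleast_1d(np.asarray(x, dtype=float)), centers[0], centers[-1])
    mu = np.zeros((x.size, centers.size))
    for j in range(centers.size - 1):
        left, right = centers[j], centers[j + 1]
        inside = (x >= left) & (x <= right)
        t = (x[inside] - left) / (right - left)
        mu[inside, j] = 1.0 - t      # decreasing side of set j
        mu[inside, j + 1] = t        # increasing side of set j + 1
    return mu

# Hypothetical three-set partition of a pH-like variable.
mu = triangular_partition([5.0, 5.75, 6.5, 9.0], centers=[5.0, 6.5, 8.0])
```

Each row of `mu` sums to 1, which gives the domain coverage and distinguishability mentioned in the first criterion.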


Building a fuzzy system

Given N samples, optimizing a zero-order Sugeno fuzzy system by least squares comes down to the problem

$$\min \sum_{k=1}^{N} (\hat{y}_k - y_k)^2 \;\equiv\; \min \sum_{k=1}^{N} \left( \sum_{i=1}^{r} \frac{\left( \bigwedge^{p} \mu(x_{ik}) \right) \theta_i}{\sum_{i=1}^{r} \bigwedge^{p} \mu(x_{ik})} - y_k \right)^2$$

where p is the number of premises (input space dimension).

Solving it requires optimizing r (the rule base), µ(x_ik) (the membership functions) and θ_i (the rule conclusions) → a difficult, non-linear problem!
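The inference of a zero-order Sugeno system (the inner term of the least-squares problem) can be sketched as below. This is a minimal illustration, assuming the conjunction ∧ is implemented as the minimum (the slides do not say which operator is used); the function name `sugeno_zero_order`, the array layout and the toy values are hypothetical.

```python
import numpy as np

def sugeno_zero_order(memberships, rules, conclusions):
    """Zero-order Sugeno inference.

    memberships: list of p arrays; memberships[j] has shape (N, m_j) and holds
                 the membership degrees of the N samples in the sets of input j.
    rules:       (r, p) integer array; rules[i, j] is the set used by rule i on input j.
    conclusions: (r,) array of rule conclusions theta_i.
    Returns (y_hat, firing) with shapes (N,) and (N, r).
    """
    rules = np.asarray(rules)
    conclusions = np.asarray(conclusions, dtype=float)
    n = memberships[0].shape[0]
    firing = np.ones((n, rules.shape[0]))
    for j, mu_j in enumerate(memberships):
        # Assumption: the premise conjunction is the minimum.
        firing = np.minimum(firing, np.asarray(mu_j)[:, rules[:, j]])
    total = firing.sum(axis=1)
    y_hat = np.full(n, np.nan)          # samples firing no rule get no output
    active = total > 0
    y_hat[active] = firing[active] @ conclusions / total[active]
    return y_hat, firing

# Toy example: 2 inputs with 2 sets each, 2 rules, 2 samples.
mu1 = np.array([[1.0, 0.0], [0.5, 0.5]])
mu2 = np.array([[0.0, 1.0], [0.5, 0.5]])
y_hat, firing = sugeno_zero_order([mu1, mu2], rules=[[0, 1], [1, 0]], conclusions=[0.0, 1.0])
```

Once the membership functions and the rules are fixed, `y_hat` is linear in `conclusions`, which is why fixing them turns the problem into an ordinary least-squares fit.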


OLS: how it works

Both algorithms follow the same three steps:
1. Linearize by fixing the membership functions (µ(x_ik))
2. Select the most important rules by orthogonal variance decomposition (r)
3. Optimize the conclusions by least-squares fitting (θ_i)
Result: a rule base with optimized conclusions.

Original algorithm
1. One Gaussian MF per sample (not interpretable)
2. Select the rules that explain the most variance by Gram-Schmidt decomposition (few rules)
3. By LS optimization, each rule has a distinct conclusion (poorly interpretable)

Modified algorithm
1. Build interpretable partitions fitting the data by a hierarchical process
2. Optionally restrict the number of selected rules
3. After LS optimization, reduce the number of distinct conclusions with the k-means algorithm
Result: an interpretable rule base.

[Figure: Gaussian membership functions, one per sample, as used by the original algorithm.]
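The selection and fitting steps can be sketched as follows. This is a minimal illustration of greedy forward selection by Gram-Schmidt orthogonalization, ranking candidates by the share of output energy each orthogonalized column explains (the classical error reduction ratio form of OLS, assumed here since the slides only mention "orthogonal variance decomposition"). The matrix `P` of rule firing strengths, the function name `ols_select` and the toy data are hypothetical.

```python
import numpy as np

def ols_select(P, y, n_rules, tol=1e-12):
    """Greedy OLS selection of columns of P (N samples x M candidate rules).

    Returns (selected column indices, explained-energy ratios, conclusions).
    """
    P = np.asarray(P, dtype=float)
    y = np.asarray(y, dtype=float)
    yy = y @ y
    selected, ratios, basis = [], [], []
    for _ in range(min(n_rules, P.shape[1])):
        best_err, best_m, best_w = -1.0, None, None
        for m in range(P.shape[1]):
            if m in selected:
                continue
            # Gram-Schmidt: remove what already selected rules explain.
            w = P[:, m].copy()
            for q in basis:
                w -= (q @ w) / (q @ q) * q
            ww = w @ w
            if ww < tol:                 # candidate is redundant
                continue
            err = (w @ y) ** 2 / (ww * yy)
            if err > best_err:
                best_err, best_m, best_w = err, m, w
        if best_m is None:
            break
        selected.append(best_m)
        ratios.append(best_err)
        basis.append(best_w)
    # Least-squares fit of the conclusions of the selected rules.
    theta, *_ = np.linalg.lstsq(P[:, selected], y, rcond=None)
    return selected, ratios, theta

# Toy data: the third candidate explains almost all of the output.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])
y = np.array([0.1, 0.0, 2.0, 2.0])
selected, ratios, theta = ols_select(P, y, n_rules=2)
```

In this toy case the rule covering the two high-output samples is picked first, which mirrors the tendency of variance-based selection to favor samples with large outputs.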


How do we evaluate the final fuzzy system?

Coverage Index: CI_α = (# active samples) / (# samples), a sample being active if it fires at least one rule over threshold α.

[Figure: 100 samples x1, ..., x100 plotted in the (Input 1, Input 2) plane, shown without threshold (MR) and with a 0.1 threshold (MR0.1).]

Example rules:
IF input 1 IS 2 AND input 2 IS 1
IF input 1 IS 1 AND input 2 IS 2

No threshold: CI_0 = 0.99
0.1 threshold: CI_0.1 = 0.02

Numerical accuracy is measured by the classical RMSE on active data:

$$PI = \sqrt{\frac{1}{n} \sum_{k=1}^{n} (\hat{y}_k - y_k)^2}$$

(n = number of active samples for the given α)
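Both indices can be computed from the matrix of rule firing strengths, as in the sketch below. It assumes that "over threshold α" means strictly greater than α; the function names and the toy arrays are hypothetical.

```python
import numpy as np

def active_mask(firing, alpha=0.0):
    """True for samples firing at least one rule strictly above alpha."""
    return np.asarray(firing, dtype=float).max(axis=1) > alpha

def coverage_index(firing, alpha=0.0):
    """CI_alpha: fraction of active samples."""
    return active_mask(firing, alpha).mean()

def performance_index(y_hat, y, firing, alpha=0.0):
    """PI: RMSE restricted to the active samples (NaN if none is active)."""
    mask = active_mask(firing, alpha)
    err = np.asarray(y_hat, dtype=float)[mask] - np.asarray(y, dtype=float)[mask]
    return np.sqrt(np.mean(err ** 2))

# Toy example: 4 samples, 2 rules.
firing = np.array([[0.90, 0.00],
                   [0.05, 0.02],
                   [0.00, 0.00],
                   [0.20, 0.50]])
y_hat = np.array([1.0, 0.5, 0.0, 0.0])
y_obs = np.array([1.0, 0.0, 0.0, 1.0])

ci_0 = coverage_index(firing, alpha=0.0)
ci_01 = coverage_index(firing, alpha=0.1)
pi_01 = performance_index(y_hat, y_obs, firing, alpha=0.1)
```

Raising α discards weakly covered samples, so the coverage index can only decrease while PI is computed on a smaller, better covered subset.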


Application: what and why?

What?
- Water depollution process by anaerobic digestion
- 589 samples coming from a 1 m³ fixed-bed reactor
- Input: 7 variables
- Output: expert value between 0 and 1, characterizing the current state as acidogenic or not

Why?
- The process requires little energy and produces renewable energy
- But the bacteria population grows slowly and is sensitive to environmental changes
- Need to build a system that quickly detects unstable states threatening the population (i.e. fault detection)
- The acidogenic state is particularly critical
- Using OLS to analyze data with expert help and improve detection systems


Application: building input partitions

Applying the hierarchical process to the data gives partitions for four variables (pH, volatile fatty acids, input flow rate Qin, CH4 concentration).

[Figure: fuzzy partitions obtained for pH, vfa, input flow rate (Qin) and CH4, with fuzzy sets labelled A1, A2, ...]


Application: analysis of first results

OLS on 589 samples: the final RB has 53 rules and PI = 0.046.

Remarks on these first results:
- Rule ordering: of the 589 samples, only 35 have an output > 0.5, while among the first 10 rules, 8 have an output > 0.5 (with 6 close to 1) → selecting rules by variance tends to favor "faulty" samples, which contribute more to the variance.
- Out-of-range conclusions: some computed rule conclusions are outside [0, 1], due to the unconstrained least-squares optimization.
- Detection and treatment of outliers: two of the first rules,
    If pH is "high" (A3) and . . . , then output is 0.999
    If pH is "very high" (A4) and . . . , then output is 1
  were inconsistent with expert knowledge (an acidogenic state is incoherent with a basic pH). The 2 samples corresponding to these rules were removed.


Application: distinct conclusion reduction

In the final system, the number of distinct conclusions was brought from 49 to 6.

[Figure: two histograms of the number of rules against the conclusion value: rule conclusion distribution with non-reduced vocabulary, and rule conclusion distribution with reduced vocabulary.]
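Reducing the conclusion vocabulary can be sketched as a one-dimensional k-means over the fitted conclusions, each conclusion being replaced by its cluster center. This is only an illustration: the quantile-based initialization, the function name `reduce_conclusions` and the toy values are assumptions, and the slides do not say whether any refitting follows the clustering.

```python
import numpy as np

def reduce_conclusions(theta, k, n_iter=100):
    """Replace each rule conclusion by the center of its k-means cluster (1-D)."""
    theta = np.asarray(theta, dtype=float)
    # Assumption: deterministic initialization on evenly spaced quantiles.
    centers = np.quantile(theta, np.linspace(0.0, 1.0, k))
    for _ in range(n_iter):
        # Assign each conclusion to its nearest center.
        labels = np.argmin(np.abs(theta[:, None] - centers[None, :]), axis=1)
        new_centers = centers.copy()
        for c in range(k):
            if np.any(labels == c):
                new_centers[c] = theta[labels == c].mean()
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    labels = np.argmin(np.abs(theta[:, None] - centers[None, :]), axis=1)
    return centers[labels], centers

# Toy example: six distinct conclusions reduced to three.
theta = [0.0, 0.02, 0.5, 0.52, 0.98, 1.0]
reduced, centers = reduce_conclusions(theta, k=3)
```

After reduction, many rules share the same conclusion value, which is what makes the rule base easier to read at the cost of a small loss in numerical accuracy.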


Application: final system summary

               Rules   PI(α=0)   CI_0    CI_0.1
Modified OLS   51      0.054     100 %   100 %
Original OLS   51      0.074     100 %   30 %

[Figure: inferred value against observed value (both from 0 to 1), with regions labelled "Very low risk", "Needs further investigation", "Non-negligible risk" and "High risk".]


Application: summary

Applying the modified OLS algorithm allowed us to:
- Remove erroneous data from the sample base
- Extract rules corresponding to critical situations
- Point out interesting experimental points for experts
- Build a final interpretable system with good qualitative predictive quality (and whose numerical efficiency competes with that of the original method)

Moreover, the OLS algorithm (by its principle) seems particularly well suited to problems where important samples are also rare, such as fault detection problems.


Modified OLS: advantages/disadvantages/perspectives

Advantages
- Provides robust and interpretable rule bases for regression problems
- Focuses on rare samples and on the most important rules
- Can be used for knowledge extraction as well as for system modeling

Disadvantages
- Computational cost in high-dimensional problems
- Variables have to be selected before applying OLS
- Learned rules are complete (i.e. contain all inputs)

Perspectives
- Robustness study; remedies to the disadvantages cited above
- Extend to other methods based on orthogonalization (e.g. TLS)
- Refine rule selection, e.g. by using backward-forward regression techniques
