这篇文章的代码所描述的算法在这里有详细的说明。
Epsilon-NFA到NFA的目标主要是产生一个没有Epsilon边的,跟原状态图等价的新状态图。过程不复杂,首先从起始状态开始,寻找所有Epsilons边到达的对象的集合,然后复制这个集合的所有状态包含的非Epsilon状态。其实状态做完之后,寻找所有能够产生非Epsilon边的状态然后重复这个过程,最后NFA就出来了。代码如下:
1 Automaton::Ref EpsilonNfaToNfa(Automaton::Ref source, bool(*epsilonChecker)(Transition*))
2 {
3 Automaton::Ref target=new Automaton;
4 Dictionary<State*, State*> stateMap;
5 Dictionary<State*, State*> reverseStateMap;
6 List<State*> epsilonStates;
7 stateMap.Add(source->startState, target->NewState());
8 reverseStateMap.Add(stateMap[source->startState], source->startState);
9 target->startState=target->states[0].Obj();
10 CopyFrom(target->captureNames.Wrap(), source->captureNames.Wrap());
11
12 for(int i=0;i<target->states.Count();i++)
13 {
14 State* targetState=target->states[i].Obj();
15 State* sourceState=reverseStateMap[targetState];
16 if(sourceState->finalState)
17 {
18 targetState->finalState=true;
19 }
20 epsilonStates.Clear();
21 epsilonStates.Add(sourceState);
22
23 for(int j=0;j<epsilonStates.Count();j++)
24 {
25 State* currentState=epsilonStates[j];
26 for(int k=0;k<currentState->transitions.Count();k++)
27 {
28 Transition* transition=currentState->transitions[k];
29 if(epsilonChecker(transition) && !epsilonStates.Contains(transition->target))
30 {
31 if(transition->target->finalState)
32 {
33 targetState->finalState=true;
34 }
35 epsilonStates.Add(transition->target);
36 }
37 }
38 }
39
40 for(int j=0;j<epsilonStates.Count();j++)
41 {
42 State* epsilonState=epsilonStates[j];
43 for(int k=0;k<epsilonState->transitions.Count();k++)
44 {
45 Transition* nonEpsilonTransition=epsilonState->transitions[k];
46 if(!epsilonChecker(nonEpsilonTransition))
47 {
48 if(!stateMap.Keys().Contains(nonEpsilonTransition->target))
49 {
50 stateMap.Add(nonEpsilonTransition->target, target->NewState());
51 reverseStateMap.Add(stateMap[nonEpsilonTransition->target], nonEpsilonTransition->target);
52 }
53 Transition* newTransition=target->NewTransition(targetState, stateMap[nonEpsilonTransition->target]);
54 newTransition->capture=nonEpsilonTransition->capture;
55 newTransition->index=nonEpsilonTransition->index;
56 newTransition->range=nonEpsilonTransition->range;
57 newTransition->type=nonEpsilonTransition->type;
58 }
59 }
60 }
61 }
62 return target;
63 }
我们来看一个例子:
(/+|-)?/d+(./d+)?
这是一个判断一个字符串是不是小数的正则表达式,首先是Epsilon-NFA:
1 [START]State<0>
2 To State<2> : <Epsilon>
3 To State<4> : <Epsilon>
4 To State<1> : <Epsilon>
5 State<1>
6 To State<6> : <Epsilon>
7 State<2>
8 To State<3> : <Char : 43[+] - 43[+]>
9 State<3>
10 To State<1> : <Epsilon>
11 State<4>
12 To State<5> : <Char : 45[-] - 45[-]>
13 State<5>
14 To State<1> : <Epsilon>
15 State<6>
16 To State<7> : <Char : 48[0] - 57[9]>
17 State<7>
18 To State<8> : <Epsilon>
19 To State<10> : <Epsilon>
20 State<8>
21 To State<9> : <Char : 48[0] - 57[9]>
22 State<9>
23 To State<7> : <Epsilon>
24 State<10>
25 To State<11> : <Char : 46[.] - 46[.]>
26 To State<13> : <Epsilon>
27 State<11>
28 To State<12> : <Epsilon>
29 State<12>
30 To State<13> : <Char : 48[0] - 57[9]>
31 [FINISH]State<13>
32 To State<14> : <Epsilon>
33 State<14>
34 To State<15> : <Char : 48[0] - 57[9]>
35 State<15>
36 To State<13> : <Epsilon>
37
然后是NFA,也就是今天这个算法的结果:
1 [START]State<0>
2 To State<1> : <Char : 43[+] - 43[+]>
3 To State<2> : <Char : 45[-] - 45[-]>
4 To State<3> : <Char : 48[0] - 57[9]>
5 State<1>
6 To State<3> : <Char : 48[0] - 57[9]>
7 State<2>
8 To State<3> : <Char : 48[0] - 57[9]>
9 [FINISH]State<3>
10 To State<4> : <Char : 48[0] - 57[9]>
11 To State<5> : <Char : 46[.] - 46[.]>
12 To State<6> : <Char : 48[0] - 57[9]>
13 [FINISH]State<4>
14 To State<4> : <Char : 48[0] - 57[9]>
15 To State<5> : <Char : 46[.] - 46[.]>
16 To State<6> : <Char : 48[0] - 57[9]>
17 State<5>
18 To State<7> : <Char : 48[0] - 57[9]>
19 [FINISH]State<6>
20 To State<6> : <Char : 48[0] - 57[9]>
21 [FINISH]State<7>
22 To State<6> : <Char : 48[0] - 57[9]>
23
接下来就是NFA到DFA的生成了。
posted on 2009-10-28 04:34
陈梓瀚(vczh) 阅读(2372)
评论(7) 编辑 收藏 引用 所属分类:
VL++3.0开发纪事