Commit: r1488: RuleFileReader.optimiseAndSaveBinaryOp(): Try to be symmetric.

Version 1488 was commited by marko on 2010-01-18 08:13:28 +0000 (Mon, 18 Jan 2010) RuleFileReader.optimiseAndSaveBinaryOp(): Try to be symmetric. optimiseAndSaveAndOr(): Refactored from optimiseAndSaveBinaryOp().

On 18/01/10 08:13, svn commit wrote:
RuleFileReader.optimiseAndSaveBinaryOp(): Try to be symmetric. optimiseAndSaveAndOr(): Refactored from optimiseAndSaveBinaryOp().
So you are trying to transform a=b & (c=d|e=f) to a=b&c=d | a=b&e=f The second.isType(OR) will not be executed because first.isType(EQUALS) will have matched first. Also it is not so clear that it is an optimisation as both the new rules will be under the same search key and so you will end up doing pretty much the same thing in either case. With the OR first they will be under different search keys and so will reduce the work in some cases. But more importantly it is *necessary* to deal with the case where the OR comes first because the first term has to be an EQUALS or EXISTS. Symmetry does not apply as the situation is asymmetrical to begin with, and there is no such requirement on the second term. ..Steve

On Mon, Jan 18, 2010 at 09:10:49AM +0000, Steve Ratcliffe wrote:
On 18/01/10 08:13, svn commit wrote:
RuleFileReader.optimiseAndSaveBinaryOp(): Try to be symmetric. optimiseAndSaveAndOr(): Refactored from optimiseAndSaveBinaryOp().
So you are trying to transform
a=b & (c=d|e=f)
to
a=b&c=d | a=b&e=f
The second.isType(OR) will not be executed because first.isType(EQUALS) will have matched first.
Yes, as I wrote in the other message, this comment is somewhat bogus (it already was before the patch). I corrected the comment in r1490. The following would be more appropriate: a~b & (c=d|e=f) to a~b & c=d | a~b & e=f and further (in optimiseAndSaveBinaryOp() on and1 and and2) to c=d & a~b | e=f & a~b In fact, I tested this style definition: a~b & (c=d|e=f) { set ab=1; } It passes with my patch and fails without the second.isType(OR) check: Error in style: Error: (lines:58): Invalid rule file (expr &)
Also it is not so clear that it is an optimisation as both the new rules will be under the same search key and so you will end up doing pretty much the same thing in either case. With the OR first they will be under different search keys and so will reduce the work in some cases.
I see, there can be at most one search key per rule? So, it would seem to make sense to put the most selective condition first. And I guess then it would not make sense to move the isType(OR) checks before EQUALS or EXISTS.
But more importantly it is *necessary* to deal with the case where the OR comes first because the first term has to be an EQUALS or EXISTS. Symmetry does not apply as the situation is asymmetrical to begin with, and there is no such requirement on the second term.
Does my above example convince you that the symmetry might be useful? Or do you think that we should throw syntax error for "stupid" style definitions? Marko

Hi Marko
a~b& (c=d|e=f)
OK cool, that is an example where it would be useful.
I see, there can be at most one search key per rule? So, it would seem to make sense to put the most selective condition first. And I guess then it would not make sense to move the isType(OR) checks before EQUALS or EXISTS.
Yes currently each rule is indexed by one search key. Indexing on further terms might help in the case where the key tag is changed by a 'set', so it is something I've thought about.
Does my above example convince you that the symmetry might be useful? Or do you think that we should throw syntax error for "stupid" style definitions?
Your example is good. I would not call it symmetrical though :) There are lots more things that do not work, here are a few that I've just tried out: 1. a~b # doesn't work 2. a~b & c=d # works 3. a~b & c~d & e=f # doesn't work. 4. (a~b | c~d) & e=f # works 5a (a~b | c~d) & e=f & g=h # doesn't work 5b ((a~b | c~d) & e=f) & g=h # works! 6a e=f & g=h & (a~b | c~'d.*') # works 6b (e=f & g=h) & (a~b | c~'d.*') # doesn't work Just looking at first and second doesn't work at all in general. Of course no-one uses anything that complex in their styles. ..Steve

Hi Steve,
There are lots more things that do not work, here are a few that I've just tried out:
1. a~b # doesn't work 2. a~b & c=d # works 3. a~b & c~d & e=f # doesn't work. 4. (a~b | c~d) & e=f # works 5a (a~b | c~d) & e=f & g=h # doesn't work 5b ((a~b | c~d) & e=f) & g=h # works! 6a e=f & g=h & (a~b | c~'d.*') # works 6b (e=f & g=h) & (a~b | c~'d.*') # doesn't work but 6c would work, already before my patch: (e=f | g=h) & (a~b | c~d)
Just looking at first and second doesn't work at all in general.
We could get much further by shifting the "simplest" or "most selective" operations (EQUALS, EXISTS) to the left branches of AndOp and OrOp trees. Some simple left-recursivefying tree transformations should do the trick, e.g., (a&b)&c would be transformed into A&(B&C) with some mapping between a,b,c and A,B,C. The A,B,C would be sorted by getType(), following the order EQUALS < EXISTS < REGEX < NOT_EXISTS < ....
Of course no-one uses anything that complex in their styles.
Right, it would probably be better to write simple rules than to figure out what the parser did behind our backs. That reminds me of the compiler course that I took in 1998. I wrote a compiler from a Pascal-like language to Digital Alpha assembler. We were given the code generator, in Java. I implemented some expression simplification (more than just constant folding). I think it was mostly about arithmetic operations, not logic. Marko

Hi Marko I produced the attached patch which makes all the previous examples compile (apart from the plain a~b example). ..Steve

Hi Steve, On Mon, Jan 18, 2010 at 05:09:05PM +0000, Steve Ratcliffe wrote:
I produced the attached patch which makes all the previous examples compile (apart from the plain a~b example).
Great! The algorithm looks OK to me. I have some remarks about it, using stricter type casts, adding missing static qualifiers and adding or updating some comments, and replacing an "assert false" with an exception (in case someone runs with assertions disabled and the code is rearranged so that the assertion would be reached). Your patch with my revisions is attached. Unfortunately, I was unable to test on today's dump because of the NullPointerException that I mentioned earlier. Marko
participants (3)
-
Marko Mäkelä
-
Steve Ratcliffe
-
svn commit