jpaul.RegExps
Class RegExp<A>

java.lang.Object
  extended by jpaul.RegExps.RegExp<A>
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable
Direct Known Subclasses:
RegExp.Atomic, RegExp.Concat, RegExp.EmptyStr, RegExp.None, RegExp.Star, RegExp.Union

public abstract class RegExp<A>
extends java.lang.Object
implements java.lang.Cloneable, java.io.Serializable

RegExp models a regular expression. A regular expression describes a set of strings (=lists) made of symbols of some type A.

Several inner classes model the different kinds of regular expressions. Given a regular expression, to do processing on a case by case basis, one should use the visitor pattern; see RegExp.Visitor. Alternatives (e.g., instanceof tests) are not elegant and should be avoided.

Version:
$Id: RegExp.java,v 1.13 2006/03/14 02:29:31 salcianu Exp $
Author:
Suhabe Bugrara - suhabe@alum.mit.edu, Alexandru Salcianu - salcianu@alum.mit.edu
See Also:
Serialized Form

Nested Class Summary
static class RegExp.Atomic<A>
          Regular expression that matches only the 1-length string consisting of exactly one specific symbol.
static class RegExp.Concat<A>
          Regular expression produced by concatenating two regular expressions.
static class RegExp.EmptyStr<A>
          Regular expressions that matches only the empty string.
static class RegExp.None<A>
          Regular expression that does not match any string.
static class RegExp.Star<A>
          The star regular expression.
static class RegExp.Union<A>
          The regular expression that matches any string matched by at least one of two specific regular expression.
static class RegExp.Visitor<A,Res>
          Instance of the visitor pattern for RegExps.
 
Constructor Summary
RegExp()
           
 
Method Summary
protected abstract  int _hashCode()
          Does the real work behing hashCode.
<Res> Res
accept(RegExp.Visitor<A,Res> visitor)
          Method for the visitor pattern.
static
<A> RegExp<A>
buildConcat(java.util.List<RegExp<A>> concatTerms)
          Builds a regular expression equivalent to the concatenation of several (possibly more than 2) terms.
static
<A> RegExp<A>
buildUnion(java.util.Set<RegExp<A>> unionTerms)
          Builds a regular expression equivalent to the union of several (possibly more than 2) terms.
 int hashCode()
          Computes the hashcode for this object.
 RegExp<A> simplify()
          Returns a simplified, equivalent version of this regular expression.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RegExp

public RegExp()
Method Detail

accept

public <Res> Res accept(RegExp.Visitor<A,Res> visitor)
Method for the visitor pattern. This method will dynamically select and execute the visit method of the visitor that corresponds to the dynamic type of the regular expression. E.g., in the case of a regular expression that is a RegExp.Concat object, accept executes the code of visit(RegExp.Concat).


hashCode

public int hashCode()
Computes the hashcode for this object. Dummy recursive computation of the hashcode can be TREMENDOUSLY inneficient: several transformations (ex: the conversion from NFA to regular expressions) can produce in polynomial time regular expressions of implicit exponential size (due to sharing: e.g., both left and right parts of a RegExp.Concat are the same object). To prevent this efficiency problems, hashCode uses caching (perfectly safe for immutable objects like the current RegExps). The hashcode is computed only the first time, by calling _hashCode that performs the real computation. Normally, subclasses should implement _hashCode without touching hashCode.

Overrides:
hashCode in class java.lang.Object

_hashCode

protected abstract int _hashCode()
Does the real work behing hashCode.

See Also:
hashCode()

buildConcat

public static <A> RegExp<A> buildConcat(java.util.List<RegExp<A>> concatTerms)
Builds a regular expression equivalent to the concatenation of several (possibly more than 2) terms. In the normal case, this method applies the binary Concat constructor repeatedly. If the list of terms contains a single term, the method returns that term (so, the resulting RegExp is not always an instance of Concat). If the list is empty, the method returns the EmptyStr regexp.


buildUnion

public static <A> RegExp<A> buildUnion(java.util.Set<RegExp<A>> unionTerms)
Builds a regular expression equivalent to the union of several (possibly more than 2) terms. In the normal case, this method applies the binary Union constructor repeatedly. If the set of terms contains a single distinct term, the method returns that term (so, the resulting RegExp is not always an instance of Union). If the set of terms is empty, the method returns None.


simplify

public RegExp<A> simplify()
Returns a simplified, equivalent version of this regular expression. Applies some simple simplifications for regular expression. E.g., concatenation with EmptyStr and union with None can be ignored. This method does not necessarily return the smallest equivalent regexp; regexp minimization is a hard problem, PSPACE-complete.



Copyright 2005 Alexandru Salcianu - salcianu@alum.mit.edu