Learning Deterministic Finite Automata from Smallest Counterexamples
Abstract. We show that deterministic finite automata (DFAs) with n states and input alphabet Sigma can efficiently be learned from less than |Sigma|n^2 smallest counterexamples. This improves on an earlier result by Ibarra and Jiang who required |Sigma|n^3 smallest counterexamples. We show furthermore that the special DFAs operating on input words of an arbitrary but fixed length (called leveled DFAs) are efficiently learnable from |Sigma| n log(n) (1+o(1)) smallest counterexamples. This improves on an earlier result by Ibarra and Jiang who required |Sigma|n^2 smallest counterexamples. Finally, we show that our algorithm for leveled DFAs cannot be substantially improved. For this purpose, we present a general lower bound on the number of smallest counterexamples (required by any learning algorithm). This bound can be stated in terms of a (new) combinatorial dimension associated with the target class. A computation of this dimension for leveled DFAs leads to a lower bound of the form (1/4)|Sigma| n log(n) (1+o(1)). This coincides with our upper bound modulo a factor of approximately 4.