%mm6/8/99
\documentstyle[12pt]{article}

%au queries
\newcommand{\query}[1]{\marginpar{\raggedright\scriptsize #1}}

\newtheorem{lemma}{Lemma}
\newtheorem{theorem}{Theorem}
\newtheorem{corollary}{Corollary}
\newtheorem{proposition}{Proposition}
\newtheorem{definition}{Definition}
\newcommand{\qed}{$\Box$}

%MM added
\newenvironment{proof}
{\begin{trivlist}\item[\hskip\labelsep{\textbf{Proof}}]}
{\leavevmode\unskip\nobreak
\quad\qed
\end{trivlist}}


\title{The Permanent Requires Large Uniform Threshold 
Circuits\footnotetext{A 
preliminary version of this work appeared in 
\cite{cj99-07-30}.}}

\author{Eric Allender\\
Department of Computer Science\\
Rutgers University\\
Piscataway, NJ  08855-1179\\
{\tt allender@cs.rutgers.edu}\\
{\tt http://www.cs.rutgers.edu/$\tilde{~}$allender/}
}

\date{6 August 1999}

\begin{document}

%%
%% Generic Titlepage
%%
\def\registered{\({}^{\ooalign%
    {\hfil\raise.07ex\hbox{{\tiny\textup{R}}}%
    \hfil\crcr\scriptsize\mathhexbox20D\cr}}\)}
\begin{titlepage}
\pagenumbering{roman}
\null\vfil\vskip 60pt
\begin{center}
{\Huge\bf Chicago\nolinebreak[1] Journal of Theoretical Computer\nolinebreak[1] Science\par}
\vskip 3em
{\huge\it The MIT Press\par}
 \end{center}
\par\vskip 1.5em
{\Large
  \begin{center}
    Volume 1999, Article 7\\
    \textit{The Permanent Requires Large Uniform Threshold Circuits}
  \end{center}
}
\par\noindent
ISSN 1073--0486\@. MIT Press Journals, Five Cambridge Center, Cambridge, MA
02142-1493 USA; (617)253-2889;
\emph{journals-orders@mit.edu, journals-info@mit.edu}\@.
Published one article at a time in \LaTeX\ source form on the
Internet\@. Pagination varies from copy to copy\@. For more
information and other articles see:
\begin{itemize}
  \item \emph{http://mitpress.mit.edu/CJTCS/}
  \item \emph{http://www.cs.uchicago.edu/publications/cjtcs/}
  \item \emph{ftp://mitpress.mit.edu/pub/CJTCS}
  \item \emph{ftp://cs.uchicago.edu/pub/publications/cjtcs}
\end{itemize}

\par\noindent
The \emph{Chicago Journal of Theoretical Computer Science} is
abstracted or indexed in \emph{
Research Alert,\registered
SciSearch, \registered
Current Contents\registered/Engineering
Computing \& Technology}, and \emph{CompuMath Citation Index\@.\registered}

\par\noindent\copyright 1999 The Massachusetts Institute of Technology\@.
Subscribers are licensed to use journal articles in a variety of
ways, limited only as required to ensure fair attribution to authors
and the journal, and to prohibit use in a competing commercial
product\@. See the journal's World Wide Web site for further
details\@. Address inquiries to the Subsidiary Rights Manager, MIT
Press Journals; (617)253-2864;
\emph{journals-rights@mit.edu}\@.\pagebreak[2]

\par\noindent The \emph{Chicago Journal of Theoretical Computer Science}
is a peer-reviewed scholarly journal in theoretical computer
science\@. The journal is committed to providing a forum for
significant results on theoretical aspects of all topics in computer
science\@.
\par
\begin{trivlist}
  \item[\emph{Editor-in-Chief:}] Janos Simon
  \item[\emph{Consulting Editors:}]
    Joseph Halpern, Eric Allender, Raimund Seidel
  \item[\emph{Editors:}]
    \begin{tabular}[t]{lll}
	Martin Abadi & Greg Frederickson & John Mitchell \\
	Pankaj Agarwal & Andrew Goldberg & Ketan Mulmuley \\
	Georg Gottlob & Gil Neiger & Tetsuo Asano  \\
	Vassos Hadzilacos & David Peleg & Laszl\'o Babai \\
	Juris Hartmanis & Andrew Pitts & Eric Bach \\
	Maurice Herlihy & James Royer & Stephen Brookes \\
	Ted Herman & Alan Selman & Jin-Yi Cai \\
	Stephen Homer & Nir Shavit & Anne Condon \\
	Neil Immerman & Eva Tardos & Cynthia Dwork \\
	Howard Karloff & Sam Toueg & David Eppstein \\
	Philip Klein & Moshe Vardi & Ronald Fagin \\
	Phokion Kolaitis & Stuart Kurtz & Jennifer Welch \\
	Lance Fortnow & Stephen Mahaney & Pierre Wolper \\
	Steven Fortune & Michael Merritt
\end{tabular}
  \vspace{1ex}
  \item[\emph{Managing Editor:}] Michael J.\ O'Donnell
  \vspace{1ex}
  \item[\emph{Electronic Mail:}] \emph{chicago-journal@cs.uchicago.edu}
\end{trivlist}
\vfil\null
\end{titlepage}
\maketitle
\pagenumbering{arabic}
%% 
%% End Titlepage code
%%


\begin{abstract}
We show that the permanent cannot be computed by uniform constant-depth
threshold circuits of size $T(n)$ for any function $T$ such that
for all $k$, $T^{(k)}(n) = o(2^n)$.  More generally, we show that
any problem that is hard for the complexity class C$_=$P requires
circuits of this size (on the uniform constant-depth threshold circuit
model).
In particular, this lower bound applies to any problem that is hard
for the complexity classes PP or \#P.

This extends a
recent result by Caussinus, McKenzie, Th\'{e}rien, and Vollmer \cite{cj99-07-21},
showing that there are problems in
the counting hierarchy that require superpolynomial-size uniform TC$^0$ 
circuits.  The
proof in \cite{cj99-07-21} uses ``leaf languages'' as a tool in obtaining their
separations.  Their proof does not immediately yield larger lower bounds
for the complexity of these problems, and it also does not yield a lower
bound for any particular problem at any fixed level of the counting 
hierarchy.  (It only shows that hard problems must exist at  \emph{some} level
of the counting hierarchy.)
We also present related and somewhat weaker lower bounds,
extending the theorem of \cite{cj99-07-21} showing that ACC$^0$ is properly 
contained in ModPH.
\end{abstract}

\section{Introduction}
\subsection{Motivation and background}
The central problem in complexity theory is the task of proving lower bounds
on the complexity of specific problems.  Circuit complexity, in particular,
the study of constant-depth circuits, is one of the few areas where
complexity theory has succeeded in actually providing lower bounds.
Yet even in the study of constant-depth circuits, one quickly arrives at the
limits of current lower-bound technology.  It is known that constant-depth
circuits of AND, OR, and NOT gates (so-called AC$^0$ circuits) 
require exponential size, even to compute
the parity of $n$ input bits (see \cite{cj99-07-25,cj99-07-24}), 
and similar lower bounds are 
known for constant-depth circuits of AND, OR, NOT, and MOD$p$ gates where
$p$ is prime (refer to \cite{cj99-07-11,cj99-07-23}).  
When MOD$m$ gates are allowed for
composite $m$, however, almost nothing is known.  It remains an open 
question if there is any problem
in NTIME($2^{n^{O(1)}}$) that cannot be done with polynomial size and 
constant depth with AND and MOD6 gates.

There is considerable reason to be interested in circuits with AND, OR, and
MOD$m$ gates; circuits of this sort are called ACC$^0$ circuits (for ``alternating
circuits with counters''; the superscript $0$ refers to the
circuit depth of $O(\log^0 n)$).  The lovely result of 
\cite{cj99-07-03} 
characterizing NC$^1$ (log-depth fan-in two circuits) in terms of 
constant-width branching programs relies heavily on algebraic techniques
and shows that NC$^1$ corresponds to computation over  \emph{nonsolvable}
algebras.  Barrington also defined the corresponding notion of computation
over  \emph{solvable} algebras, and it is shown in \cite{cj99-07-05} that
this notion corresponds exactly to ACC$^0$ circuits.  To restate these two
points:
\begin{enumerate}
\item The results of \cite{cj99-07-03} establish intimate connections between
circuit complexity and algebraic structure.
\item In this algebraic setting, ACC$^0$ is the most important subclass of
NC$^1$.
\end{enumerate}

Although, as mentioned
above, it is unknown if small ACC$^0$ circuits suffice to compute all problems
in NTIME$(2^{n^{O(1)}})$, lower bounds for  \emph{uniform} ACC$^0$ 
circuits can be found 
in \cite{cj99-07-02}.  
The techniques of \cite{cj99-07-02} (see also \cite{cj99-07-27}) 
employ diagonalization,
which is less useful in the nonuniform setting.
Since our results, like those of \cite{cj99-07-21} and 
\cite{cj99-07-02}, concern uniform circuits, it is necessary to briefly discuss 
uniformity.

A circuit family $\{C_n\}$ consists of a circuit for each input length $n$.
If $C_n$ is ``sufficiently easy'' to construct from $n$, then the family
$\{C_n\}$ is said to be  \emph{uniform}.  Different notions of ``sufficiently
easy'' give rise to different notions of uniformity; the question of
which is the ``right'' one to use when studying classes
of circuits is not always clear.  

For the circuit classes considered here,
convincing arguments are presented in \cite{cj99-07-04} 
% regarding the supremacy of a very restrictive notion of 
% uniformity called  \emph{Dlogtime-uniformity}.  
       making the case that a very restrictive notion of
       uniformity called  \emph{Dlogtime-uniformity} is the correct
       notion to use.
Briefly, a circuit family $\{C_n\}$ is Dlogtime-uniform
if, given a tuple $(n,g,h)$, a deterministic Turing machine can, in time 
linear in the length of the string $(n,g,h)$,
find whether gate $g$ is connected to gate $h$ in circuit $C_n$
% and also describe the nature of gates $g$ and $h$.  
	and determine if $g$ and $h$ are AND gates, OR gates, 
	input gates, etc.
The length of the input $(n,g,h)$ is logarithmic in
the size of the circuit $C_n$, which is why we call it Dlogtime-uniform.
(Dlogtime-uniformity is essentially equivalent to
what Ruzzo called $U_D$ uniformity in \cite{cj99-07-12}, although he considered
only circuits of fan-in two, and not the unbounded fan-in circuits considered
here and in \cite{cj99-07-04}.)  

Throughout the rest of this paper,
all mention of uniform circuits refers to Dlogtime-uniform circuits.
In addition, ACC$^0(S(n))$
denotes the class of languages with  \emph{uniform} ACC$^0$ 
circuits of size $S(n)$.  ACC$^0$ denotes ACC$^0(n^{O(1)})$.

In contrast to our lack of lower bounds for nonuniform ACC$^0$ circuits
for sets in NTIME($2^{n^{O(1)}}$),
it was shown in \cite{cj99-07-02} that
exponential size (i.e., size at least $2^{n^{\epsilon}}$)
is required to compute the permanent (and other problems complete for \#P)
on  \emph{uniform} ACC$^0$ circuits.  Thus there are 
sets in P$^{\mathrm \#P}$ that require exponential-sized uniform ACC$^0$ circuits.

The complexity class PP is closely related to \#P (for instance, 
P$^{\mathrm \#P}$ = P$^{\mathrm PP}$).  Recall that a set $A$ is in PP if 
there is a nondeterministic polynomial time machine $M$ with the
property that $x \in A$ if and only if the number of accepting paths of
$M$ on input $x$ is greater than the number of rejecting paths.  PP
contains both NP and coNP 
(see \cite{cj99-07-08}).  Another related complexity class
is C$_=$P;  a set $A$ is in C$_=$P if 
there is a nondeterministic polynomial time machine $M$ with the
property that $x \in A$ if and only if the number of accepting paths of
$M$ on input $x$ is  \emph{equal to} the number of rejecting paths.
C$_=$P contains coNP but is not known to contain NP; PP is contained in
NP$^{\mathrm C_=P}$ \cite{cj99-07-15}.

One might expect that similar exponential lower bounds would hold for
PP or C$_=$P as hold for \#P,
but \cite{cj99-07-02} was able to show only that sets complete for these
classes require more than sub-subexponential-size ACC$^0$ circuits,
where the term ``sub-subexponential'' is made precise as follows.

\begin{definition}
A function $t$ is said to be  \emph{sub-subexponential} if, for all $k$, 
\[t(t(n)^k) = 2^{n^{o(1)}}. \]
\end{definition}

In \cite{cj99-07-02}
this term was defined slightly differently; in that paper, $t$
had to satisfy only the condition that $t(t(n)) = 
2^{n^{o(1)}}$.  Note that for all ``natural'' and interesting size bounds $t$,
these conditions are equivalent. Observe that size bounds such
as $2^{\log^k n}$ and $2^{(\log n)^{k\log\log n}}$ are sub-subexponential.

Another class of constant-depth circuits that has attracted interest uses
threshold (or MAJORITY) gates instead of counters.  Let
TC$^0(S(n))$ denote the class of sets accepted by uniform 
constant-depth threshold circuits of size $S(n)$; TC$^0$ denotes
TC$^0(n^{O(1)})$.  TC$^0$ captures the complexity of important natural
computational problems such as sorting, counting, and integer multiplication.
It is also a good complexity-theoretic model for the ``neural net'' model
of computation (see \cite{cj99-07-26}).

It is easy to observe that ACC$^0$ $\subseteq$ TC$^0$ (for example, see
\cite{cj99-07-04}),
and thus we have even fewer lower bounds for the threshold circuit model
than for ACC$^0$ circuits.  Furthermore, since TC$^0(s(n)) \subseteq$
DSPACE$(\log s(n))$ (for $s(n) \geq n$) and since (by an easy consequence 
of the space hierarchy theorem) for any PSPACE-complete set $A$ there
is some $\epsilon > 0$ such that $A \not\in$ DSPACE$(n^\epsilon)$, it
follows
that PSPACE-complete sets require exponential-size uniform TC$^0$ 
circuits.
Yet, there is still no smaller complexity class in PSPACE that is known
to require exponential-size uniform TC$^0$ circuits.

There are well-studied subclasses of PSPACE that correspond in a natural
way to the complexity classes AC$^0$, ACC$^0$, and TC$^0$.  The
relationship between the polynomial hierarchy and AC$^0$ is well known
and was established by \cite{cj99-07-07}.  One way to present this correspondence
is to observe that, when one considers alternating Turing machines that
make only $O(1)$ alternations, a polynomial running time yields the polynomial
hierarchy, while a logarithmic running time yields uniform AC$^0$.  The analogous
subclasses of PSPACE corresponding to ACC$^0$ and TC$^0$ are ModPH and the counting
hierarchy, respectively.

ModPH is in some sense a generalization of the polynomial hierarchy and
of $\oplus$P (formal definitions appear in Section \ref{modsec}).
The counting hierarchy
(defined in \cite{cj99-07-17} and studied by several authors) consists
of the union of the complexity classes PP, PP$^{\mathrm PP}$, 
PP$^{\mathrm PP^{\mathrm PP}}$, \ldots.  (Note that this is equal to the
union of the classes
C$_=$P, C$_=$P$^{\mathrm C_=P}$, C$_=$P$^{\mathrm C_=P^{\mathrm C_=P}}$, 
\ldots.)
In Section \ref{modsec}, we present
models of computation (similar to alternating Turing machines)
such that polynomial time on this model characterizes ModPH (or the
counting hierarchy), while logarithmic time characterizes ACC$^0$
(or TC$^0$, respectively).

\subsection{Statement of the main results}

A recent paper by Caussinus, McKenzie, Th\'{e}rien, and Vollmer \cite{cj99-07-21}
shows that ACC$^0$ is properly contained in ModPH, and TC$^0$ is properly
contained in the counting hierarchy.  The proof given by \cite{cj99-07-21}
uses ``leaf languages'' as a tool and does not explicitly present a lower
bound for any language in ModPH or in the counting hierarchy.  
The present
work began as an attempt to discover if these techniques could be used to
find an explicit lower bound.  This attempt was only partially successful.
For each given language $A$ in ModPH,
it is  \emph{still} an open question whether $A$ has polynomial-size uniform 
ACC$^0$ circuits.  The proof in \cite{cj99-07-21} shows only that there
 \emph{exists} a set in ModPH that requires superpolynomial-size ACC$^0$
circuits; the present work gives a very simple direct proof of this
same separation, but with the improvement that ``superpolynomial'' is 
replaced by ``sub-subexponential.''

In contrast, we  \emph{are} able to give explicit lower bounds on the
uniform threshold circuit size required for many problems in the counting
hierarchy.  Although we are able only to show that some set  \emph{exists} in the
counting hierarchy that requires more than sub-subexponential-size
uniform threshold circuits, we  \emph{can} obtain explicit lower bounds if we 
weaken the size bound only slightly.  

Recall that a function $t$ is 
sub-subexponential if $t^{(2)}(n) = 2^{n^{o(1)}}$, where $t^{(k)}$ 
denotes $t$ composed with itself $k$ times.  
We obtain a smaller class of functions if we impose the
harsher restriction that for  \emph{all} $k$, 
$t^{(k)}(n) = o(2^n)$, but there seem to be no natural
functions of interest that satisfy the former condition but not the
latter.  In particular, functions $t$ such as
$2^{\log^k n}$ and $2^{(\log n)^{k\log\log n}}$ satisfy the condition that
for all $k$, $t^{(k)}(n) = o(2^n)$.

The main result of this paper can now be stated.

\newenvironment{maintheorem}
{\begin{trivlist}\item[\hskip\labelsep{\textbf{Main theorem 
(Theorem \ref{mainthm})}}]\it}
{\end{trivlist}}

\begin{maintheorem}
Let $A$ be hard for {\rm C$_=$P} under $\leq^{\mathrm TC^0}_{\mathrm T}$ 
reducibility, and
let $t$ be a function such that for all $k, t^{(k)}(n) = o(2^n)$.
Then $A \not\in TC^0(t(n))$.
\end{maintheorem}

The notion of $\leq^{\mathrm TC^0}_{\mathrm T}$ reducibility is defined as
follows.  Let $A$ and $B$ be subsets of $\{0,1\}^*$.  Then
$A \leq^{\mathrm TC^0}_{\mathrm T} B$ if there is a uniform family of polynomial-size
constant-depth circuits with MAJORITY gates and oracle gates for $B$, 
accepting
$A$.  (This is a natural adaptation of the notion of AC$^0$ reducibility
studied in \cite{cj99-07-18} and elsewhere.)

In particular, all sets that are currently known to be complete for PP 
require threshold circuits of this size, because all such sets currently
known are in fact complete under many-one reductions computable in uniform
AC$^0$.  

\begin{corollary}
The permanent cannot be computed by uniform constant-depth threshold
circuits of size $t(n)$ if, for all $k, t^{(k)}(n) = o(2^n)$.
\end{corollary}
\begin{proof}
It was shown in
\cite{cj99-07-20} (see also comments in \cite{cj99-07-02}) that the set
$\{(x,i,b)  \mid \mbox{the}\ i\mbox{th bit of PERMANENT}(x)\  
\mbox{is equal
to}\ b\}$ is hard for C$_=$P under AC$^0$ reducibility (with only
one query).  Thus Theorem \ref{mainthm} applies.
\end{proof}

In contrast, some of the functions that are shown to be \#P-complete in
\cite{cj99-07-16} are shown to be complete  \emph{only\/} under 
 \emph{polynomial-time Turing} 
reducibility; for example, we have not checked to see if the problem of
counting the number of (possibly imperfect) matchings in a bipartite 
graph is hard for
C$_=$P under $\leq^{\mathrm TC^0}_{\mathrm T}$
reducibility (although we suspect that this is
the case), and until this is established, the lower bounds of this paper are 
not known to hold for this problem.  Similarly, the functions that are
shown by Toda in \cite{cj99-07-14} to be complete for FP$^{\mathrm \#P}$ are not
immediately known to require large threshold circuits; it first needs
to be established that they are hard for C$_=$P under TC$^0$ reductions.

\section{Machine models}
\label{modsec}

We assume the reader is familiar with nondeterministic oracle Turing
machines.  Given natural number $m$ and oracle $A$, Mod$_m$P$^A$ is
the class of languages $B$ such that, for some nondeterministic polynomial-time
Turing machine $M$, $x$ is in $B$ if and only if the number of accepting
computations of $M^A$ on input $x$ is a multiple of $m$.  Then the
class ModPH is defined to be the smallest class of languages containing
P and with the property that if $A$ is in ModPH, then so are NP$^A$ and
Mod$_m$P$^A$ for every natural $m$.  ModPH has been studied by several authors
(see, for example, \cite{cj99-07-09}). 

It is useful to have a model of computation characterizing
ACC$^0$ and ModPH, in the same way that alternating Turing machines
characterize both AC$^0$ and the polynomial hierarchy.  The appropriate
model of computation was defined in \cite{cj99-07-02} as a variant of
alternating Turing machines.  We refer the reader to \cite{cj99-07-02}
for detailed definitions; for the purposes of this paper it  suffices
for the reader who is familiar with alternating Turing machines to consider
the most natural way of augmenting the usual existential and universal
states of an alternating Turing machine, by adding Mod$_m$ states.  
(Intuitively, a Mod$_m$ configuration $C$ of an alternating Turing machine
is accepting if and only if 
$i$ is a multiple of $m$, where $i$ is the number
of accepting configurations that are reachable from $C$ and are at
the start of the next ``alternation level.'')

Let a  \emph{signature} $\sigma$ be a finite string from 
$\{\forall, \exists, {\mathrm Mod}_2, {\mathrm Mod}_3, {\mathrm Mod}_4,\ldots\}^*$.
For any alternating Turing machine making $O(1)$ alternations, each path
in the alternating tree of the machine on any input $x$ has a signature
given by the sequence of types of states the machine enters.  If $M$ is
an alternating machine such that, on all inputs $x$, all paths have the
same signature $\sigma$, then $M$ is said to be a $\sigma$ machine.
For instance,
the signature of a $\Sigma_2$ machine is $\exists\forall$, and the signature
of a typical machine 
accepting a language in ${\mathrm NP^{\oplus P^{Mod_7 P}}}$ is
$\exists {\mathrm Mod}_2 {\mathrm Mod}_7$.
Let $\sigma$time$(t(n))$ denote the class of languages accepted by
$\sigma$ machines running in time $t(n)$.  The technical lemmas in
\cite{cj99-07-02} essentially prove the following proposition.

\begin{definition}
Let us call a function $f$  \emph{constructible} if $f(n) = 2^{g(n)}$,
where the binary representation of $g(n)$ can be computed from the 
binary representation of $n$ 
in time $O(g(n))$.
\end{definition}

\begin{proposition}
Let $2^{t(n)}$ be a constructible function, $t(n) = \Omega(\log n)$.  Then
uniform {\rm ACC}$^0(2^{O(t(n))})$ =
$\bigcup_{\sigma} \sigma$time$(O(t(n)))$.
\end{proposition}

It turns out to be useful to us to note that a ``tape reduction
theorem'' holds for $\sigma$ machines.  (In some ways, this can
be viewed as a generalization of \cite{cj99-07-29}.)

\begin{proposition} Let $\sigma$ be any nonempty signature.
If a set is accepted in time
$t(n)$ by a $\sigma$ machine with $k$ worktapes, then it is also
accepted in time $O(t(n))$ by a $\sigma$ machine with two worktapes.
\end{proposition}

\begin{proof}
Given a $k$-tape $\sigma$ machine, follow the construction
in \cite{cj99-07-02} and build an ACC$^0$ circuit, such that $\sigma$ is the
sequence of types of gates encountered in a root-to-leaf path.
(Note that if the signature $\sigma$ is in $\{\forall,\exists\}^*$, then
this is actually an AC$^0$ circuit.)
In the construction given in \cite{cj99-07-02}, the deterministic linear-time
machine that checks the uniformity condition needs $k$ tapes.  
Let us briefly explain:
The gates of the circuit are labeled with configurations of the 
$\sigma$-machine
at points in the computation when an alternation is made, 
and the labels also include a sequence
of bits denoting the path in the alternation tree that leads from the first
configuration to the second.  
The output gate of the circuit is labeled with the start configuration of
the $\sigma$ machine.
In order to determine what gates are connected,
the ``uniformity machine'' needs only to simulate the $\sigma$-Turing machine
along
that path; if the $\sigma$-machine has $k$ tapes, then the uniformity
machine  has $k$ tapes, too.

However,
suppose we change 
the naming convention for the gates in the circuit in order
to utilize the original tape-reduction proof
for nondeterministic machines in \cite{cj99-07-06}.  
Then we can make do with
a two-tape deterministic machine checking the uniformity condition.
That is, let $M_1$ be the $k$-tape uniformity machine for the original
circuit family. If the original circuit has gates $g$ and $h$, where there 
is an edge in the circuit from $h$ to $g$---corresponding 
to a computation path 
of the $\sigma$ machine from $g$ to $h$---then 
the new circuit has gates $(g,u)$ and $(h,uv)$,
where $v$ is a string of length $t(n)$ recording the reading
from each of the $k$ heads
of the uniformity machine $M_1$ in the computation of length
$t(n)$, which verifies that $h$ is connected to $g$.  Since there are only
$O(1)$ alternations of the $\sigma$-machine, and hence the circuit has
depth $O(1)$, the label size is still $O(t(n))$ bits, and thus the
circuit size is still $2^{O(t(n))}$.

Now given a uniform $\sigma$-circuit family where the uniformity condition
is checked by a two-tape machine, the construction in \cite{cj99-07-02} yields
a two-tape $\sigma$-machine accepting the original language.
\end{proof}

Similarly, we find it very convenient 
to have a single model of computation
that is sufficient for describing both TC$^0$ and the counting hierarchy.
Such a model is described in \cite{cj99-07-10}.
In the model, which is called a ``threshold Turing machine,''
TC$^0$ corresponds to $O(\log n)$ time and $O(1)$
uses of the threshold operation, and the counting hierarchy
corresponds to polynomial time and $O(1)$ uses of the threshold 
operation.  The characterization of the counting hierarchy in 
terms of threshold Turing machines is given in \cite{cj99-07-10}, but
the corresponding characterization of TC$^0$ is  \emph{not} presented there 
(since
\cite{cj99-07-10} predates the uniformity considerations of
\cite{cj99-07-04}). It does not seem to have been published anywhere
else.  Although \cite{cj99-07-04}  \emph{does} give many equivalent characterizations
of TC$^0$, the threshold Turing machine model is not mentioned in \cite{cj99-07-04}.
Nonetheless, the proof of the following proposition is quite standard and
follows along the lines of related results in \cite{cj99-07-10,cj99-07-04}.

\begin{proposition}\label{mm1}
% Let $t(n)$ be a 
% constructible function, $t(n) = \Omega(\log n)$.  Then
% uniform threshold circuit depth$(O(1))$, size$(2^{O(t(n))})$ =
% threshold Turing machine time$(O(t(n)))$, thresholds$(O(1))$.
    Let $t(n)$ be a constructible function, $t(n) = \Omega(\log n)$.  Then
    the following classes are equal:
    \begin{enumerate}
        \item Uniform threshold circuit depth$(O(1))$, size$(2^{O(t(n))})$
        \item Threshold Turing machine time$(O(t(n)))$, thresholds$(O(1))$.
    \end{enumerate}
\end{proposition}

As is the case with the $\sigma$ machines considered above, the
threshold Turing machines also enjoy a tape-reduction property,
proved in essentially the same way.  If a set is accepted in time
$t(n)$ by a $k$-tape threshold Turing machine,  it is accepted
in time $O(t(n))$ by a two-tape threshold Turing machine.

The lower bounds presented in this paper do not depend on this tape
reduction, but the statement
of Theorem \ref{diag} is 
simplified by taking advantage of the tape reduction.

\section{Diagonalization}

It is important 
to note that the techniques used to prove the nondeterministic
time hierarchy (originally proved 
in \cite{cj99-07-13};  we use the
very simple and general version 
proved by \v{Z}\'{a}k \cite{cj99-07-19}) can be used to
prove analogous hierarchies for 
other computational models defined in terms
of nondeterministic Turing machines (with a fixed bound on the number of
worktapes).  In particular, an essentially word-for-word translation of
the proof in \cite{cj99-07-19} shows the following.

\begin{theorem}\label{diag}
Let $2^T$ be constructible.  Then there is a
set $B$ in $\sigma$time$(T(n))$ such that, for all $t$ with
$t(n+1)=o(T(n))$, $B$ is not in $\sigma$time$(t(n))$.
Also, there is a set $D$ in
threshold Turing machine time$(O(T(n)))$,thresholds$(k)$
such that, for all $t$ with
$t(n+1)=o(T(n))$, $D$ is not in
threshold Turing machine time$(O(t(n)))$,thresholds$(k)$.
\end{theorem}

\begin{proof}
For completeness, we present the main outline of the proof.  Let
$M_1, M_2,\ldots$ be an enumeration of two-tape $\sigma$-machines
(threshold machines, respectively).  Let $f$ be a rapidly growing
function such that time $T(f(i,n,s))$ is enough time for a 
 \emph{deterministic}
machine to compute the function 
\[(i,n,s) \mapsto \left\{ \begin{array}{ll}
          1 & \mbox{if } M_i\ \mbox{accepts}\ 1^n\ \mbox{in}\ \leq s\ \mbox{steps} \\
          0 & \mbox{otherwise. }
        \end{array} \right.  \]
Note that 
letting $f(i,n,s)$ be greater than $T^{-1}(2^{2^{i+n+s}})$ is sufficient;
it is important in our setting to handle  \emph{sublinear} functions
$T$.

Now divide $\Sigma^*$ into regions, so that in region $j=(i,y)$, we diagonalize
against machine $M_i$, thus ensuring that each machine is considered infinitely
often.  The regions are defined by functions  \emph{start}$(j)$ and 
 \emph{end}$(j)$, defined as follows:  \emph{start}(1) = 1,   \emph{start}$(j+1)$
=  \emph{end}$(j)$+1, where
 \emph{end}$(j)$
= $f$($i$,  \emph{start}$(j)$, $T$( \emph{start}$(j)$)) (where $j = (i,y)$).
The important point is that, on input $1^{end(j)}$, a deterministic machine
can, in time $T$, determine whether $M_i$ accepts $1^{start(j)}$ in
less than or equal to $T$( \emph{start}$(j)-1$) steps.
By picking $f$ appropriately easy to invert, we can guarantee that, on input
$1^n$, we can in time $T(n)$ determine which region $j$ contains $n$.

Now it is easy to verify that the following routine can be computed in
time $T(n)$ by a $\sigma$-machine (or a threshold machine, respectively).
In the pseudocode below, $U$ is a ``universal'' $\sigma$-machine 
(or threshold machine) with
four tapes, which is therefore 
able to simulate one step of machine $M_i$ in
about $i^3$ steps.
\begin{enumerate}
\item On input $1^n$, determine which region $j$ contains $n$.  Let $j = (i,y)$.
\item If $n$ =  \emph{end}$(j)$, then accept if and only if 
$M_i$ does  \emph{not} accept
$1^{start(j)}$ in $\leq T$( \emph{start}$(j)-1$) steps.
\item Otherwise, accept if and only if 
$U$ accepts $(i,1^{n+1})$ in $\leq T(n)$ steps.
(Here, it is important that we are talking about $T(n)$ steps of $U$,
which may be only about $T(n)/i^3$ steps of $M_i$.)
\end{enumerate}

Let us call the set defined by the preceding pseudocode $A$.  Clearly,
$A$ is in $\sigma$time$(T(n))$.  We now claim that is is not in
$\sigma$time$(t(n))$.

Assume otherwise, and let $M_i$ be the $\sigma$ machine accepting $A$
in time $t(n)$.  Let $c$ be a constant such that $i^3t(n+1) < T(n)$
for all $n \geq c$.  Let $y$ be a string of length 
greater than or equal to $c$, and consider
stage $j = (i,y)$.  Then for all $n$ such that
 \emph{start}$(j) \leq n <$  \emph{end}$(j)$, we have $1^n \in A$ 
if and only if $1^{n+1}
\in A$.  However, this contradicts the fact that $1^{start(j)} \in A$
if and only if $1^{end(j)} \not\in A$.
\end{proof}

\section{Nonconstructive lower bounds}

Once the definitions are in hand, the proof is now quite straightforward.

\begin{theorem}
Let $t$ be a constructible sub-subexponential function.
Then there exist
sets $A$ in {\rm ModPH} requiring size greater
than $t(n)$ to compute on uniform {\rm ACC}$^0$ circuits.
\end{theorem}

\begin{proof}
Let $t$ be given.  Let $C$ be a set complete for P under Dlogtime-uniform
projections.  A ``projection'' is a function computable by a circuit
with no gates other than NOT gates.  A projection is Dlogtime-uniform
if the circuit satisfies the usual Dlogtime-uniformity conditions.  For
more background and motivation, see \cite{cj99-07-28}.
For instance, the standard complete set \{$(i,x,0^j) : 
M_i$ accepts $x$ in time $j$\} is a good choice for $C$.
The proof consists of two cases:
\begin{itemize}
\item $C$ requires size greater than $t(n)$ to compute on uniform
ACC$^0$ circuits.  In this case, of course there is nothing to prove.
\item $C$ can be 
computed by uniform ACC$^0$ circuits of size $t(n)$.
Since $t$ is constructible, let $g$ be the function such that 
$t(n) = 2^{g(n)}$.
In this case, it must happen that there is some $\sigma$ such that
ACC$^0$ is in $\sigma$time$(g(n^{O(1)}))$, because uniform circuits for
any set reducible to $C$ can easily be constructed from the ACC$^0$ circuits for
$C$.
\end{itemize}

Now standard translational techniques can be used to show that
for any signature $\tau$, $\tau$time($g(n)$) is contained in
$\sigma$time$(g(t(n)^{O(1)}))$.  To see this, consider any language
$A$ in $\tau$time($g(n)$).  Let $A'$ = \{$x10^j : j+|x|+1 = t(|x|)$
and $x \in A$\}.  Our constructibility assumptions on $t$ ensure
that $A'$ is in ACC$^0$ and, hence, 
is in $\sigma$time$(g(n^l))$ for some $l$.
Let $M$ be this $g(n^l)$-time-bounded $\sigma$ machine accepting $A'$.
The $\sigma$ machine $M'$ that, on input $x$, simulates $M$ on input
$x10^{t(|x|)-|x|-1}$ runs in time $g(t(n)^l)$.

Since $t$ is sub-subexponential, 
$2^{n^{\epsilon}} > t(t(n)^l) = 2^{g(t(n)^l)}$ and thus
$g(t(n)^l) = o(n)$.  Thus
it follows from Theorem \ref{diag}
that there is a set $B$ in $\sigma$time$(n)$ (and hence in ModPH) such that,
for all $l$, $B$ is not in $\sigma$time$(g(t(n)^l))$ Therefore,  $B$ is
not in $\tau$time$(g(n))$ and does not have uniform ACC$^0$ circuits
of size $t(n)$.
\end{proof}

It is important to note that, because of the nonconstructive nature of the
proof of this theorem, the proof offers no clue as to  \emph{what} set in
ModPH has large ACC$^0$ circuits.
An essentially identical proof yields the following theorem.

\begin{theorem}\label{subsubthm}
Let $t$ be a constructible sub-subexponential function.
Then there exist
sets $A$ in the counting hierarchy requiring size greater
than $t(n)$ to compute on uniform threshold circuits.
\end{theorem}

\section{Main result}

\begin{theorem}
\label{mainthm}
Let $t$ be a constructible function such that for all $k$,
$t^{(k)}(n) = o(2^{n})$.  Let $A$ be any set that is hard for C$_=$P
under $\leq^{\mathrm TC^0}_{\mathrm T}$ reductions.  
Then $A$ cannot be computed by
uniform constant-depth threshold circuits of size $t(n)$.
\end{theorem}
\begin{proof}
Assume otherwise.  Then we can show that for every set $B$ in the
counting hierarchy, there is some $k$ such that $B$ has uniform
constant-depth threshold circuits of size $T(n) = t^{(k)}(n)$.
But since $T(T(n)) = 2^{n^{o(1)}}$, 
this contradicts Theorem \ref{subsubthm}.
For the purposes of this proof, define CH$_1$ to be C$_=$P, and for
$i > 1$, define CH$_i$ to be C$_=$P$^{{\mathrm CH}_{i-1}}$.

First note that, under the assumption, C$_=$P has circuits of size
$$
t(n^{O(1)})n^{O(1)} = O(t(t(t(n)))).
$$
The circuit consists of a polysize TC$^0$ reduction
{from} the C$_=$P set to $A$, where 
the oracle gates are replaced by circuits
for $A$.  Here, we assume without loss of generality that $t(n) 
\geq n^{\log n}$.  Otherwise, we can take $t'$ to be the maximum of $t(n)$
and $n^{\log n}$.

Now assume that all sets in CH$_i$ have uniform constant-depth circuits
of size $O(t^{(4i)}(n))$, and consider a 
set $A \in$ CH$_{i+1}$.  Thus there
is some nondeterministic machine $M$ and a set $D \in$ CH$_i$ such that
$M^D$ has exactly as many accepting paths as rejecting paths on input
$x$ if and only if $x \in A$.  The set \{$(x,C)$ : $M$ has exactly as many
accepting paths as rejecting paths on input $x$, when all oracle queries are
answered according to the circuit $C$\} is in C$_=$P and, by the basis case,
has circuits of size $t(t(t(|(x,C)|)))$.  When we replace $C$ by the
circuit for $A$ that exists by inductive hypotheses, we obtain a circuit
of size less than or equal to $t^{(3)}(n + t^{(4i)}(n))
\leq t^{(4(i+1))}(n)$.
\end{proof}

We do not know how to prove an explicit lower bound for any problem in
ModPH that would be analogous to Theorem \ref{mainthm}.  It is easy to
observe, by the same proof techniques, the existence of a set that is
complete either for NP or for Mod$_p$P for some prime $p$
that requires large ACC$^0$ circuits.  Thus, in order to find a set that
is not in ACC$^0$, 
one need not consider anything beyond one of the ``bottom''
levels of ModPH.  However, 
unlike the counting hierarchy, there are infinitely
many such bottom levels in ModPH.

\section{More separations}

{From} the foregoing, we know that TC$^0$ is properly contained in
C$_=$P (and hence is properly contained in PP).  Note, however, that
C$_=$P is not known (or expected) to have circuits of less than
exponential size.  It is natural to ask if exponential size is
necessary in order to find a language that is not in TC$^0$.  In this
section we show that it is not necessary; smaller size is sufficient in
order to define languages that are not in TC$^0$.  (On the other hand,
merely having superpolynomial size is  \emph{not} known to be sufficient.)
First we make a simple observation.

\begin{proposition}
For all $\epsilon > 0$, {\rm ACC}$^0$ is properly contained in
\begin{center}
$($DTIME$(n^{\epsilon}) \cup \bigcup_\sigma \sigma$time$(\log n \log^* n))$.
\end{center}
\end{proposition}
\begin{proof}
By standard padding methods, it is easy to construct a set $A \in$
DTIME$(n^\epsilon)$ that is complete for P under projections.  This set $A$ is 
thus also hard for ACC$^0$ under projections.  If $A$ is not in
ACC$^0$, then this yields the desired conclusion.

Otherwise, $A$ is in ACC$^0$ and is therefore in $\sigma$time$(O(\log n))$
for some $\sigma$.  Since $\sigma$time$(O(\log n))$ is closed under projections,
it follows that ACC$^0$ is equal to $\sigma$time$(O(\log n))$.  By 
diagonalization, we obtain that ACC$^0$ is properly contained in
$\sigma$time$(O(\log n \log^* n))$.
\end{proof}

An identical proof yields the following.
\begin{proposition}
For all $\epsilon > 0$, {\rm TC}$^0$ is properly contained in
\begin{center}
$($DTIME$(n^{\epsilon}) \cup {\mathrm TC}^0(n^{O(\log^* n)})$.
\end{center}
\end{proposition}

(By essentially the same argument, we obtain that {\rm TC}$^0$ is properly
contained in {\rm NC}$^1$ $\cup {\mathrm TC}^0(n^{O(\log^* n)})$.)
We immediately get the following corollaries, which seem only marginally
better than the results of \cite{cj99-07-21} showing proper inclusion in
ModPH and the counting hierarchy.

\begin{corollary}
Let $\epsilon$ be greater than $0$.  Then
\begin{center}
{\rm ACC}$^0$ is properly contained in {\rm ACC}$^0(2^{n^\epsilon})$.\\
{\rm TC}$^0$ is properly contained in {\rm TC}$^0(2^{n^\epsilon})$.
\end{center}
\end{corollary}

But now we  use the technique of \cite{cj99-07-01} to get a better
separation.

\begin{lemma}
Let $S$ be a constructible function such that 
$S(n) \geq n$.
If {\rm ACC}$^0$ = {\rm ACC}$^0(S(n))$, 
then {\rm ACC}$^0$ = {\rm ACC}$^0(S(S(n)))$.
\end{lemma}

\begin{proof}
Let $A$ be any set in $\sigma$time$(O(\log S(S(n))))$.  Since a 
constructible
function $S(n)$ is of the form $2^{g(n)}$, this means that $A$ is in 
$\sigma$time$(O(g(S(n))))$.  Let $A'$ be the padded version
$\{x10^{S(|x|)-|x|-1} : x \in A\}$.  Our assumption implies that
$A'$ is in ACC$^0$ and, thus, is in $\sigma'$time$(O(\log n))$ for some
$\sigma'$.  This in turn implies 
that $A$ is in $\sigma'$time$(O(\log(S(n))))$
and, thus, by assumption, is in ACC$^0$.
\end{proof}

\begin{corollary}
Let $T$ be a constructible 
function such that, for some $k$ and all large $n$, $T^{(k)}(n) > 2^n$, where
$T^{(k)}$ is $T$ composed with itself $k$ times.  Then
\begin{center}
{\rm ACC}$^0$ is properly contained in {\rm ACC}$^0(T(n))$.
\end{center}
\end{corollary}

\begin{corollary}
Let $T$ be a constructible 
function such that, for some $k$ and all large $n$, $T^{(k)}(n) > 2^n$.  Then
\begin{center}
{\rm TC}$^0$ is properly contained in {\rm TC}$^0(T(n))$.
\end{center}
\end{corollary}

\section{Conclusions and open problems}

It is often harder to ask the right question than to answer that
question.  In \cite{cj99-07-02} we presented lower bounds on the
uniform circuit complexity of certain problems in PSPACE, and we did
not see any way to prove lower bounds on the ACC$^0$ circuit complexity
of any given problem in ModPH.  Given the inspiration of \cite{cj99-07-21},
it is easy to give a direct proof showing that there  \emph{exist} sets
in ModPH having large ACC$^0$ circuit complexity, without giving lower
bounds on any specific set in ModPH.

This same technique, when taken one step further, provides explicit lower bounds
for many specific problems in the counting hierarchy, including 
the complete
sets for C$_=$P, PP, and several functions complete for \#P.

An obvious question is whether the sub-subexponential lower bounds given
here and in \cite{cj99-07-02} can be improved to exponential lower bounds.
The lower bounds presented here for C$_=$P, PP, and the permanent
are incomparable with the bounds presented
in \cite{cj99-07-02}; the bounds presented here are for more powerful circuits
(threshold circuits as opposed to ACC$^0$ circuits), but the size bounds
presented here are not as large as in \cite{cj99-07-02}.  It seems unlikely that
the bounds presented here are optimal; probably exponential size is required
for all of these problems.

Of course, an even more desirable step would be to prove directly that
MAJORITY requires exponential size for ACC$^0$ circuits.  The 
so-called natural proofs
framework of \cite{cj99-07-22} indicates that many lower bound proofs may
be quite difficult to obtain.  However, since ACC$^0$ is a
very limited class in many respects (and, in particular, it is not clear
that one should expect pseudo\-random generators to be computable in
ACC$^0$), it is not clear that lower bounds for ACC$^0$ should be hard to
obtain.

\section*{Acknowledgments}
I thank the authors of \cite{cj99-07-21} for making their manuscript available
to me.  I thank Dieter van Melkebeek for helpful discussions and
Ken Regan for his suggestions.

\subsection*{Acknowledgment of support}
The author was supported in part by NSF grants CCR-9509603 and CCR-9734918.


\addcontentsline{toc}{section}{Bibliography}

\bibliographystyle{alpha}
\bibliography{cj99-07}

\end{document}