#

### CJTCS Special Issue: CATS 2009

### Selected papers from

# 2009 Computing: The Australasian Theory Symposium

### held in Wellington, New Zealand January 20-23 2009.

## Guest Editors: Rod Downey, Victoria University of Wellington, NZ

## Prabhu Manyem, Shanghai University, PRC.

#### Article 2010-12

Published by Dept. CS U. Chicago. Copyright 2010 CJTCS and the author.

####
Distributing Frequency-Dependent Data Stream Computations

Sumit Ganguly

Indian Institute of Technology, Kanpur,

India

E-mail: `sganguly@cse.iitk.ac.in`

*June 22, 2010*
##### Abstract

The data streaming model of computation processes a sequence of
continuously arriving data in a single-pass over the input using
sub-linear space. For efficiency purposes, it is often desirable
to perform the computations in a highly distributed fashion, as
are currently done in internet applications and sensor networks.
The distributed computation is typically performed using a
tree-topology, where the nodes are processing elements and the
data resides in the leaves of the tree. A large class of interesting
and practical functions are symmetric functions (i.e., invariant of the
permutation of data at the leaves) of the inputs or their approximations.
Flexible distributed processing
refers to the class of distributed tree computations
whose output remains within a pre-specified
tolerance limit regardless of the permutation of the data on the leaves
or the topology of the computation tree.

We consider the question: Do efficient flexible distributed computations
exist corresponding to data streaming algorithms for computing the same
function/approximation?
The work by Jan Feldman, S. Muthukrishnan, Anastasios Sidiropoulos, Cli Stein,
and Zoya Svitkina. *On distributing symmetric streaming computa-
tions* (SODA 2008) shows that corresponding to
any deterministic algorithm
that computes a total symmetric function of an input stream,
there exists a flexible distributed algorithm for computing the same function
of the concatenation of the input streams at the leaves of the tree from left to
right. Further, they present resource bounds of the distributed algorithm in
terms of the resource bounds of the streaming algorithm. In this paper, we
show the existence of efficient flexible distributed algorithms corresponding to
a given
data stream algorithm that computes an approximation to the
function of the frequency
vector of the stream. As opposed to the work of Feldman et al., the data stream algorithm need
not compute a symmetric function of the input stream.

- The article: PDF (161,642 bytes)
- Source materials:
**ZIP** (50,458 bytes)
- Self citation in
**BIBTeX** (205 bytes)
- Original version (pre-June 2010): PDF (130,460 bytes)

Special Issue, CATS 2009, Article 8
2010 Article 13

Special Issue
Published Articles