Introduction to Distributed Computing. [AW, Chapters 1, 2 and 3]: message passing model; broadcast algorithm; spanning tree algorithm; an "eventually holding invariant" technique for proving correctness of asynchronous algorithms; the leader election problem; impossibility of solving the problem by a deterministic anonymous algorithm on a ring; how to solve leader election with identifiers -- the Bully algorithm with O(n^2) message complexity on asynchronous rings; improved Bully algorithm with O(n log n) message complexity; a lower bound of Omega(n log n) on message complexity of leader election on asynchronous rings -- a technique for proving lower bounds by "reapplying schedules".
Search Engines. [BP98,PBMW99,Hav99,Kle99,DL03,HN99] The PageRank algorithm for calculating ranks of websites based on link structure and predefined "authority" of certain websites. A simplified version with a proof of convergence assuming that the web links form an undirected graph and that the largest eigenvalue is separated from others. The "real" version with dampening factor, its reformulation as a random walk on a graph, and a proof of convergence using the Fundamental Theorem of Markov Chains. The HITS heuristic algorithm for finding "hubs" and "authorities" related to the query topic. Proofs of convergence of the algorithm under various assumptions. An Omega(|Web|) lower bound on the size of index that allows to locate in O(1) time a website that contains a given word. A design of a scalable web crawler with emphasis on external memory data structures.
Peer-to-Peer Systems. [RFHKS01,SML+03,PRR99] A distributed hash table, called Content-Addressable Network, "embedded" in [0,1]d, expected time of insertions, searches and removals, algorithms for a node joining and leaving a CAN that preserve partitioning of [0,1]d. The Chord implementation of distributed hash table embedded in a ring, logarithmic search time assuming up-to-date finger tables, algorithm for joining that may create "branches", correctness of search despite outdated fingers and branches, and algorithm for "stabilizing" the system to "collapse" the branches onto the ring. A randomized protocol for ensuring that a nearby replica is accessed.
Internet Computing. [Ros03,RY04,Coo74,GM01,SLO03,DJMM04,Mer87] A notion of server availability. Scheduling computation of a 2-dimensional mesh DAG to achieve optimal server availability -- upper and lower bounds. A notion of memory cost. A lower bound on memory cost of any schedule for a "reduction mesh". A scheduling algorithm that maximizes server availability, and minimizes memory cost. A method for encouraging any worker to compute, on average, at least 1-1/n fraction of the tasks assigned to the worker using "implanted ringers", as long as workers are profit-driven and tasks are "one-way". A method for detecting, with given probability, if a worker has not computed at least a constant fraction of tasks assigned to the worker, for arbitrary tasks (not necessarily "one-way"), using Merkle Tree to generate succinct and quickly verifiable proofs of computation of any task.
Web communities. [FLG00,FLGC02,IK03,PPR96] Definition of web community as a subset of nodes with at least as many "internal" as "external" edges, for any node of the subset; connection to min cut in the graph; application of the Max-flow Min-cut theorem to find a community; a heuristic that "prioritizes" edges by assigning possibly different capacities to the edges. Alternative definition of web community as a collection of pages that have high "energy", when "energy" is iteratively begin injected into a graph and propagated. Different interpretations of the graph: textual similarity, browsing trajectories, hyperlink structure. Proof that as long as "energy" injected in each iteration is bounded, and other assumptions hold, the total "energy" accumulated in the graph is bounded by an absolute constant.
| student name | paper presented | date | file |
| Crutcher Dunnavant | Cho J., Garcia-Molina, H., Page, L.: Efficient Crawling Through URL Ordering. Computer Networks and ISDN Systems, Vol. 30(1-7), (1998) 161--172 | 4/08/04 | [pps] |
| Qunwei Zheng | Dwork, C., Naor, M.: Pricing via Processing or Combating Junk Mail. 12th Annual International Cryptology Conference (CRYPTO), (1992) 16--20 | 4/08/04 | [pps] |
| Olga Gavrylyako | Hader, S., Hamprecht, F.A.: Efficient density clustering using basin spanning trees. 26th Annual Conference of the Gesellschaft für Klassifikation (GfK1), (2002) 39--48 | 4/13/04 | [pps] |
| Jun Liu | Li, Q., Aslam, J., Rus, D.: Online Power-aware Routing in Wireless Ad-hoc Networks. 7th Annual International Conference on Mobile Computing and Networking (MOBICOM), (2001) 97--107 | 4/13/04 | [pps] |
| [AW] | Attiya, H., Welch, J.: Distributed Computing, Fundamentals, Simulations, and Advanced Topics. McGraw-Hill Publishing Company, UK (1998) |
|
[BP98] |
Brin, S., Page, L.: The anatomy of a large-scale hypertextual Web search engine. Computer Networks, Vol. 30(1-7) (1998) 107--117 |
| [Coo74] | Cook, S.A.: An observation on time-storage trade off. Journal of Computer and System Sciences, Vol. 9(3) (1974) 308--316 |
|
[DL03] |
Demaine, E.D., López-Ortiz, A.: A linear lower bound on index size for text retrieval. Journal of Algorithms, Vol. 48(1) (2003) 2--15 |
| [DJMM04] | Du, W., Jia, J., Mangal, M., Murugesan, M.: Uncheatable Grid Computing. 24th International Conference on Distributed Computing Systems (ICDCS), (2004) |
| [FLG00] | Flake. G.W., Lawrence, S., Giles, C.L.: Efficient Identification of Web Communities. 6th International Conference on Knowledge Discovery and Data Mining (KDD), (2000) 150--160 |
| [FLGC02] | Flake, G.W., Lawrence, S., Giles, C.L., Coetzee, F.M.: Self-Organization and Identification of Web Communities. Computer, Vol. 35(3) (2002) 66--70 |
| [GM01] | Golle, P., Mironov, I.: Uncheatable Distributed Computations. RSA Conference -- Topics in Cryptography, (2001) 425--440 |
|
[Hav99] |
Haveliwala, T.H.: Efficient computation of PageRank. Stanford University Technical Report (1999) |
| [HN99] |
Heydon, A., Najork, M.: Mercator: A Scalable, Extensible Web Crawler. World Wide Web, Vol. 2(4) (1999) 219--229 |
| [IK03] | Imafuji, N.: Kitsuregawa, M.: Finding a Web Community by Maximum Flow Algorithm with HITS Score Based Capacity. 8th International Conference on Database Systems for Advanced Applications (DASFAA) (2003) 101--106 |
|
[Kle99] |
Kleinberg, J.: Authoritative sources in a hyperlinked environment. Journal of the ACM, Vol. 46(5) (1999) 604--632 |
| [Mer87] | Merkle, R.C.: A Digital Signature Based on a Conventional Encryption Function. 7th Annual International Cryptology Conference (CRYPTO), (1987) 369--378 |
|
[PBMW99] |
Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank Citation Ranking: Bringing Order to the Web. Stanford Digital Libraries (1999) |
| [PPR96] | Pirolli, P., Pitkow, J.E., Rao, R.: Silk from a Sow's Ear: Extracting Usable Structures from the Web. Conference on Human Factors and Computing Systems (CHI), (1996) 118--125 |
| [PRR99] | Plaxton, C.G., Rajaraman, R., Richa, A.W.: Accessing Nearby Copies of Replicated Objects in a Distributed Environment. Theory of Computing Systems, Vol. 32(3) (1999) 241--280 |
| [RFHKS01] | Ratnasamy, S., Francis, P., Handley, M., Karp, R.M., Schenker, S.: A Scalable Content-Addressable Network. ACM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM) (2001) 161--172 |
| [Ros03] | Rosenberg, A.L.: On Scheduling Mesh-Structured Computations on the Internet. Preliminary version: IEEE International Parallel and Distributed Processing Symposium (IPDPS) (2003) |
| [RY04] | Rosenberg, A.L., Yurkewych, M.: Optimal Schedules for Some Common Computation-Dags on the Internet. Manuscript. University of Massachusetts, Amherst (2004) |
| [SML+03] | Stoica, I., Morris, R., Liben-Nowell, D., Karger, D., Kaashoek, M.F., Dabek, F., Balakrishnan, H.: Chord: A Scalable Peer-to-Peer Lookup Protocol for Internet Applications. IEEE/ACM Transactions on Networking, Vol. 11(1) (2003) 17--32 |
| [SLO03] | Szajda, D., Lawson, B., Owen, J.: Hardening Functions for Large Scale Distributed Computations. IEEE Symposium on Security and Privacy, (2003) 216--224 |