
July 20, 2007

Things to read while the simulator runs; part 5

Continuing the ever-popular series of things to read while the simulator runs, here's a collection of papers that I've either read this month or added to my never-vanishing stack of papers to read.

S. Redner, "Random Multiplicative Processes: An Elementary Tutorial." Am. J. Phys. 58, 267 (1990).

An elementary discussion of the statistical properties of the product of N independent random variables is given. The motivation is to emphasize the essential differences between the asymptotic behavior of the random product and the asymptotic behavior of a sum of random variables -- a random additive process. For this latter process, it is widely appreciated that the asymptotic behavior of the sum and its distribution is provided by the central limit theorem. However, no such universal principle exists for a random multiplicative process. [Ed: Emphasis added.] ...
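To make the contrast in that abstract concrete, here is a minimal sketch of a random multiplicative process. It assumes each factor is 2 or 1/2 with equal probability; those values, the number of factors, and the function names are my own illustrative choices, not taken from the paper. The point is that the mean of the product is driven by rare, very large outcomes, while a typical outcome (the median) sits near the exponentiated average of the logarithms.

```python
import math
import random
import statistics


def exact_mean(z1, z2, p, n):
    """Mean of a product of n independent factors (z1 w.p. p, else z2)."""
    return (p * z1 + (1 - p) * z2) ** n


def typical_value(z1, z2, p, n):
    """exp(n * E[log z]): the scale of a typical (not average) product."""
    return math.exp(n * (p * math.log(z1) + (1 - p) * math.log(z2)))


def sample_products(z1, z2, p, n, trials, seed=0):
    """Draw `trials` independent products of n random factors."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    products = []
    for _ in range(trials):
        prod = 1.0
        for _ in range(n):
            prod *= z1 if rng.random() < p else z2
        products.append(prod)
    return products


# Illustrative parameters (assumptions): factors 2 and 1/2, 20 factors.
products = sample_products(2.0, 0.5, 0.5, n=20, trials=10001)
print("exact mean    :", exact_mean(2.0, 0.5, 0.5, 20))
print("typical value :", typical_value(2.0, 0.5, 0.5, 20))
print("sample median :", statistics.median(products))
print("sample mean   :", statistics.fmean(products))
```

The exact mean grows like 1.25^N even though the median stays at 1, and the sample mean fluctuates wildly from seed to seed because a handful of large products dominate it. For a sum of the same variables, the mean and the median would track each other closely.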

A. Csikasz-Nagy, D. Battogtokh, K.C. Chen, B. Novak and J.J. Tyson, "Analysis of a generic model of eukaryotic cell-cycle regulation." Biophysical Journal 90, 4361-4379 (2006).

We propose a protein interaction network for the regulation of DNA synthesis and mitosis that emphasizes the universality of the regulatory system among eukaryotic cells. The idiosyncrasies of cell cycle regulation in particular organisms can be attributed, we claim, to specific settings of rate constants in the dynamic network of chemical reactions. The values of these rate constants are determined ultimately by the genetic makeup of an organism. To support these claims, we convert the reaction mechanism into a set of governing kinetic equations and provide parameter values (specific to budding yeast, fission yeast, frog eggs, and mammalian cells) that account for many curious features of cell cycle regulation in these organisms...

E.F. Keller, "Revisiting 'scale-free' networks." BioEssays 27, 1060-1068 (2005).

Recent observations of power-law distributions in the connectivity of complex networks came as a big surprise to researchers steeped in the tradition of random networks. Even more surprising was the discovery that power-law distributions also characterize many biological and social networks. Many attributed a deep significance to this fact, inferring a 'universal architecture' of complex systems. Closer examination, however, challenges the assumptions that (1) such distributions are special and (2) they signify a common architecture, independent of the system's specifics. The real surprise, if any, is that power-law distributions are easy to generate, and by a variety of mechanisms. The architecture that results is not universal, but particular; it is determined by the actual constraints on the system in question.

N. Tishby, F.C. Pereira and W. Bialek, "The information bottleneck method." In Proc. 37th Ann. Allerton Conf. on Comm., Control and Computing, B Hajek & RS Sreenivas, eds, 368-377 (1999).

We define the relevant information in a signal x \in X as being the information that this signal provides about another signal y \in Y. Examples include the information that face images provide about the names of the people portrayed, or the information that speech sounds provide about the words spoken. Understanding the signal x requires more than just predicting y, it also requires specifying which features of X play a role in the prediction. We formalize this problem as that of finding a short code for X that preserves the maximum information about Y. That is, we squeeze the information that X provides about Y through a 'bottleneck' formed by a limited set of codewords X-bar. ... Our variational principle provides a surprisingly rich framework for discussing a variety of problems in signal processing and learning...

Update 21 July: Cosma points me to a very nice article related to the information bottleneck method: C.R. Shalizi and J.P. Crutchfield, "Information Bottlenecks, Causal States, and Statistical Relevance Bases: How to Represent Relevant Information in Memoryless Transduction." Advances in Complex Systems, 5, 91-95 (2002).

R.E. Schapire, "The strength of weak learnability." Machine Learning 5, 197-227 (1990).

... A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent...

Update 22 July: I should also add the following.

P.W. Anderson, "More is Different." Science 177, 393-396 (1972).

The reductionist hypothesis may still be a topic for controversy among philosophers, but among the great majority of active scientists I think it is accepted without question. The workings of our minds and bodies, and of all the animate or inanimate matter of which we have any detailed knowledge, are assumed to be controlled by the same set of fundamental laws, which except under certain extreme conditions we feel we know pretty well. ... The main fallacy in [thinking that the only research of any value is on the fundamental laws of nature] is that the reductionist hypothesis does not by any means imply a "constructivist" one: The ability to reduce everything to simple fundamental laws does not imply the ability to start from those laws and reconstruct the universe. ... The behavior of large and complex aggregates of elementary particles, it turns out, is not to be understood in terms of a simple extrapolation of the properties of a few particles. Instead, at each level of complexity entirely new properties appear, and the understanding of the new behaviors requires research which I think is as fundamental in its nature as any other. ...

Naturally, Philip Anderson has long been associated with the Santa Fe Institute.

posted July 20, 2007 11:12 PM in Things to Read

July 05, 2007

Boulder School on Biophysics

Blogging this month will be light, as I'm attending the Boulder School on Biophysics. (Ostensibly the Boulder School is about condensed matter, but the main topic changes each year, and this year it's squishy, or soft, condensed matter.) The talks so far remind me of how much easier good modeling is when you're constrained to three spatial dimensions and basically know what kinds of forces you'll need to deal with in order to get relatively realistic behavior (at a variety of levels of realism). In the world of networks, we don't have these kinds of constraints, and so coming up with reasonable mechanisms is significantly harder. As with biology, the data analysis for networks has advanced much more quickly than the theory has, precisely, I imagine, for this reason. Later in the school, there will be some presentations on microbiological networks, which I'm looking forward to. These systems seem like more reasonable objects for good modeling than things like the Internet, since we can actually do controlled experiments on many of them to see how well the theories hold up.

A few things I've learned so far.

Chromosomes seem to behave a lot like jointed chains wiggling around in solution, and yet they also tend to maintain their position inside the nucleus, suggesting some sort of tethering behavior -- a behavior that has been implicated as a mechanical gene-regulation mechanism. Similar tethering behavior seems to appear in prokaryotic cells as well. There's the suggestion that the spatial location of genes plays an important role in their expression levels and general regulation (a result known to experimental microbiologists, but not well understood theoretically).

Histones (those bits of protein that DNA wraps itself around at regular intervals over the entire length of the chromosome) are also implicated in regulation, basically by making it more difficult to transcribe a gene when its start codon is in the middle of one of the turns around the histone. Histones also seem to have a preferred sequence of DNA for binding; the suggestion is that the copious amounts of repeated sequences in the genome may be related to binding these objects in a regular fashion.

Motor proteins are fascinating squishy things. Your muscles are composed of large quantities of myosin motor proteins that crawl along actin filaments, each exerting a tiny force of a few tens of piconewtons (it would take tens of billions of these to match the force that the Earth exerts on an apple).
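The apple comparison above is easy to check with back-of-the-envelope arithmetic. The sketch below assumes a 100 g apple and takes "a few tens" to mean 30 piconewtons per motor; both numbers are my own illustrative assumptions, not measured values.

```python
# Rough check of the motor-protein estimate. All inputs are assumptions.
APPLE_MASS_KG = 0.1       # assumed mass of a smallish apple
G_M_PER_S2 = 9.81         # gravitational acceleration at Earth's surface
MOTOR_FORCE_N = 30e-12    # assumed force per myosin motor: 30 piconewtons


def motors_needed(total_force_n, force_per_motor_n):
    """Number of motors whose combined force equals total_force_n."""
    return total_force_n / force_per_motor_n


apple_weight_n = APPLE_MASS_KG * G_M_PER_S2          # about one newton
count = motors_needed(apple_weight_n, MOTOR_FORCE_N)
print(f"apple weight: {apple_weight_n:.2f} N")
print(f"motors needed: {count:.1e}")
```

With these inputs the count lands in the tens of billions; any per-motor force between roughly 10 and 100 piconewtons keeps it within an order of magnitude or so of that figure.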

posted July 5, 2007 07:37 PM in Things that go squish