Open Problem: Decomposition of a Complete Graph Into A Bayesian Network

Screen shot 2010-02-19 at 10.39.48 AM.pngThis post is an open-ended thought experiment regarding my ongoing research in random graph models of social networks.

On Wednesday I attending my first NYC Machine Learning Meetup, where the topic was graphical models. The talk was given by Max Khesin (slides and video), and while I am casually familiar with the subject of probabilistic graphic models it was an excellent introduction. The representation of probabilities as a directed graph is very elegant and powerful, and as such the talk motivated me to think about how to apply it to my own work.

One application that immediately came to mind was defining the probability distribution used in my work on the structurally induced random graph model (SIRG). Under the current model, this probability distribution is defined by counting the number of subgraph isomorphism (SGI) of some number of graphs within the base structure. One criticism I have received from many readers is the obvious inefficiency of using SGI as a initial condition of the model. The SGI problem is NP-complete, which severely limits the scalability of the SIRG model both in terms of analyzing the base structure and the number of subgraphs in the initial probability distribution.

One potential solution to this problem, and alternative specification of the SIRG, would be to define a Bayesian network decomposition of the complete graph of  K_{\tau} where \tau is the number of nodes from which to define the set of complete subgraphs for the probability distribution for generating structure in the SIRG. In this case, each node would represent some single component subgraph of the complete graph K_{\tau} with corresponding conditional probabilities. from the This; however, presents the interesting—and equally difficult—problem of first defining the method of decomposition and then proving its existence for any value of \tau.

If it is possible to create such a well-defined decomposition it will overcome the limitation of using SGI because in a Bayesian network we need only to provide priors to the top parent node of the directed acyclic graph (DAG), which in this case would be the dyad. We can easily define this prior as the density of the base structure in the SIRG, i.e., the prior probability of a dyad will be the observed number of dyads in the base structure, which is defined as the graph density.

As an example, suppose \tau=4, then one possible decomposition of K_{4} is the following.

ex1_crop.png

In the above case we restrict connection between each level of the DAG to contain those single component subgraphs of some discrete number of vertices \le \tau. The first parent node is the dyad, and its probability is given by the density of the base structure H. This is a tree graph with no intra-level connections, and a conveniently the triple and triangle at level 3 have no mutual ties. To achieve this, we also require that if any graph represented by a node at the previous level is a factor of a graph at the current level then it is the only parent node; hence, the triple node is a parent of only two children, while the triangle is a parent of two. The triangle is a factor of three graphs at level four, leaving only two children from the triple.

One could easily; however, devise several other methods for decomposition. For example, if we relax the restrictions from the above example to any factorizable parent graph than the triple would gain additional children and we would add intra-level edges. What’s important; however, is that sparsity of edges is preferred for efficient Bayesian networks, and as such it is unlikely that this lessening of restriction would be beneficial.

Thus, we are left with two very difficult problems. First, two define a function that produces all of the single component subgraphs of some number of vertices greater than two. More difficult; however, is to prove that a unique and efficient Bayesian network decomposition exists for all values of \tau. Also, there are considerable substantive difference between defining the probability distribution of the SIRG as a large joint probability of several conditional probabilities, rather than a one shot distribution.


Automatically Generated Related posts:

  1. Structurally Induced Random Graph Model of Social Networks
  2. UPDATED: Network Decomposition Visualization
  3. Rumsfeld as a Bayesian?
  4. Inital Forays into Bayesian Analysis for the Intelligence Community
  5. Open thread: network prediction and quantifying future wars

Leave a Reply

 

 

 

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Technorati Profile