The problem of optimal query processing in distributed database systems has been shown to be NP-hard. However, for a special class of queries called star queries, we have developed a polynomial-time optimal algorithm. Semijoin tactics are applied for query processing, and an execution graph is introduced to represent the semijoin programs associated with the distributed processing of the queries. We then identify optimality properties of semijoin programs for star queries and use these properties to derive the optimal semijoin program. We show that the optimal semijoin program can be found from serial semijoin strategies, defined as serial semijoin programs that include each semijoin associated with the query exactly once. Under certain assumptions on the file sizes and the semijoin selectivities, the optimal semijoin program can be obtained from these strategies in polynomial time. Our assumption on selectivities is consistent in the sense that the selectivity of a semijoin is based on the current database state, i.e., it takes into consideration the reduction effects of all prior semijoins.
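To make the notion of a state-dependent selectivity concrete, the following minimal Python sketch may help. It is only an illustration and not the algorithm of the paper: the relation names (`hub`, `sat_a`, `sat_b`), the toy data, the helper functions, and the reading of a star query as one central relation joined to several satellite relations are all assumptions introduced here.

```python
def semijoin(target, source, attr):
    """Keep the rows of `target` whose `attr` value also occurs in `source`."""
    keys = {row[attr] for row in source}
    return [row for row in target if row[attr] in keys]


def selectivity(target, source, attr):
    """Fraction of the current `target` rows that survive the semijoin."""
    if not target:
        return 1.0  # convention chosen for this sketch: nothing left to reduce
    return len(semijoin(target, source, attr)) / len(target)


# Hypothetical toy data: one central relation and two satellite relations.
hub = [
    {"a": 1, "b": 10},
    {"a": 2, "b": 10},
    {"a": 3, "b": 20},
    {"a": 4, "b": 30},
]
sat_a = [{"a": 1}, {"a": 2}, {"a": 3}]
sat_b = [{"b": 10}]

# Selectivity of the semijoin with sat_b, measured on the initial state.
sel_b_initial = selectivity(hub, sat_b, "b")

# Apply the semijoin with sat_a first, then measure the same semijoin again
# on the reduced state of the central relation.
hub_after_a = semijoin(hub, sat_a, "a")
sel_b_after_a = selectivity(hub_after_a, sat_b, "b")

# A serial strategy in the sense used here applies each semijoin exactly once.
hub_final = semijoin(hub_after_a, sat_b, "b")
```

In this toy instance the selectivity of the second semijoin differs depending on whether it is measured before or after the first one, which is the effect a consistent selectivity assumption is meant to capture.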
IEEE Transactions on Software Engineering (SCI, EI), 1097-1107