FDA reviewers need a rapid means of predicting organ-specific carcinogenicity to aid in evaluating new chemicals submitted for approval. This research addressed the construction of a database for developing a predictive model for that purpose based on structure–activity relationships (SAR). The Internet availability of the Carcinogenic Potency Database (CPDB) provided a solid foundation for such a model, and the addition of molecular structures to the CPDB supplied the extra ingredient necessary for SAR analyses. However, the CPDB first had to be compressed from a database with multiple records per chemical to one with a single record per chemical: the multiple records representing each gender, species, route of administration, and organ-specific toxicity were summarized into a single record for each study, and multiple studies on a single chemical were then reduced further according to a hierarchical scheme. Structural cleanup involved removing all chemicals that would impede the accurate generation of SAR-type descriptors by commercial software programs; that is, inorganic chemicals, mixtures, and organometallics were removed. Counterions (such as Na, K, and sulfates), waters of hydration, and other salt components were also removed for structural consistency. Structural modification sometimes produced duplicate records, which were likewise reduced to a single record using the hierarchical scheme. The modified database, containing 999 chemicals, was evaluated for liver-specific carcinogenicity using a variety of analysis techniques. These preliminary analyses all yielded approximately the same results: an overall predictability of about 63%, with a sensitivity of about 30% and a specificity of about 77%.
Journal of Toxicology and Environmental Health, Part A, 67(17), 1363–1389
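
The three performance figures quoted in the abstract are linked by simple arithmetic, and a short sketch may make that link concrete. The Python below is illustrative only: the function names and the confusion-matrix counts are hypothetical and are not taken from the study. It assumes that "overall predictability" means the fraction of all chemicals classified correctly, which the abstract does not state explicitly.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, overall) from confusion-matrix counts.

    tp/fn: carcinogens predicted positive/negative.
    tn/fp: non-carcinogens predicted negative/positive.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative chemical")
    sensitivity = tp / positives          # fraction of carcinogens detected
    specificity = tn / negatives          # fraction of non-carcinogens cleared
    overall = (tp + tn) / (positives + negatives)  # fraction classified correctly
    return sensitivity, specificity, overall


def implied_positive_fraction(overall, sensitivity, specificity):
    """Solve overall = p * sensitivity + (1 - p) * specificity for p."""
    if sensitivity == specificity:
        raise ValueError("positive fraction is undetermined when the rates are equal")
    return (specificity - overall) / (specificity - sensitivity)


if __name__ == "__main__":
    # Hypothetical counts chosen only to give a sensitivity of 0.30
    # and a specificity of 0.77; they are NOT data from the paper.
    sens, spec, acc = classification_rates(tp=30, fn=70, tn=154, fp=46)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} overall={acc:.2f}")

    # Fraction of positives implied by the rounded figures in the abstract.
    p = implied_positive_fraction(overall=0.63, sensitivity=0.30, specificity=0.77)
    print(f"implied positive fraction={p:.2f}")
```

Under that assumption, the rounded figures of 63%, 30%, and 77% would be mutually consistent if roughly 30% of the evaluated chemicals were liver carcinogens. That fraction is an arithmetic implication of the rounded values, not a number reported in the abstract.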