
2018 CCIB Annual Retreat

Thursday, December 13, 2018
9:00 a.m. – 5:00 p.m.
Camden Nursing and Science Building

(530 Federal Street, Camden, NJ 08103)

5:00 p.m. – 6:00 p.m.

(200 Federal Street, Camden, NJ 08103)

6 p.m. – 7 p.m.
The Amazing Escape Room

(2050 Springdale Rd Suite 200, Cherry Hill, New Jersey 08003)








Provost, Rutgers-Camden – Dr. Michael Palis


Associate Dean for Science, Mathematics, Technology, and Health Sciences, Rutgers-Camden – Dr. Joseph Martin


Director, CCIB – Dr. Nir Yakoby

 (Abstracts for all speakers can be found below)


Speaker: Nawaf Bou-Rabee

Title: Sticky Diffusions & their Numerical Solution



Speaker:  Lingyu Guan

Title: What can we learn from odd-shaped bacteria?







Keynote Speaker:  Dr. Monica Driscoll

Distinguished Professor of Molecular Biology & Biochemistry, School of Arts & Sciences, Rutgers, The State University of New Jersey

Title: Addressing the Complexities of Neurodegeneration and Aging, One Worm at a Time.



Poster Session & Lunch provided by CCIB



Speaker:  CCIB – Dr. David Salas de la Cruz

Title: Leech Cocoons and Polychaete Tubes: an Annelid Adventure



Speaker:  CCIB Student – Sung Won Oh

Title: Relative affinities of general anesthetics for pseudo-symmetric intersubunit binding sites of heteromeric GABA(A) receptors


Speaker:  CCIB – Dr. Julianne Griepenburg







Speaker: CCIB Best Student Paper 2017 – Ruchi Lohia

Title: Age-driven modulation of tRNA-derived fragments in Drosophila and their potential targets



Speaker:  CCIB – Dr. Sarah Allred




Speaker:  CCIB Student – Cody Stevens




Closing Remarks & Annual Retreat Group Photo





Leave for the CCIB Social Event – Amazing Escape Room




KEYNOTE SPEAKER – DR. SARAH TISHKOFF – David and Lyn Silfen University Professor, Departments of Genetics and Biology, Perelman School of Medicine, University of Pennsylvania




DR. ERIC KLEIN – Department of Biology, Rutgers-Camden

Title: What can we learn from odd-shaped bacteria?

Abstract:  The diversity of cell shapes found in the bacterial kingdom emphasizes the evolutionary pressures for achieving physiologically important morphologies. While extensive efforts have been made to understand the regulation of prototypical cell morphologies, such as rod-shaped E. coli, essentially nothing is known about the vast majority of unique cell shapes. A number of phylogenetically diverse species form thin cellular appendages called stalks or prosthecae. For organisms in the Caulobacteraceae family, it is well established that stalk synthesis is directly tied to their dimorphic life cycle in which non-dividing motile swarmer cells differentiate into replicating sessile stalked cells. Under phosphate limiting conditions, stalks are dramatically elongated. What is not known is the molecular composition of the stalk or the mechanism(s) required for stalk biosynthesis.

         The stalk contains all of the layers of a typical Gram-negative envelope including the peptidoglycan (PG) cell wall. We have recently observed that the stalk PG, unlike the cell body PG, is both lysozyme-resistant and a poor ligand for a bacteriophage PG-binding protein. To our knowledge, this is the first evidence of a bacterial cell with differentially regulated domains containing unique peptidoglycan structures. Since the mechanism of stalk elongation is unknown, we have used both targeted approaches and unbiased genetic screens to identify a suite of genes required for stalk synthesis including the cytoskeletal protein mreB, a novel transpeptidase cc2105, and the mannose-6-phosphate isomerase manA. Ongoing work is aimed at determining the role of each of these pathways in stalk biogenesis.



Title: The role of time scales in the spatiotemporal dynamics of EGFR activation

Abstract:  Tissue patterning and cell fates are regulated by cell signaling pathways. A highly conserved signaling pathway, the epidermal growth factor receptor (EGFR) pathway, controls both the posterior-anterior and the dorsal-ventral axes during Drosophila melanogaster oogenesis. The TGF-alpha-like ligand Gurken (GRK) is secreted from around the oocyte nucleus and activates EGFR in the overlying follicular epithelium. Complexity is found in the dynamic localization of the nucleus, and hence the source of GRK. Early, the nucleus is present at the posterior end. Later, the nucleus is situated at the dorsal anterior side of the oocyte. Thus, EGFR activation is dynamic. Furthermore, the oocyte continuously grows during oogenesis. Current models consider solely GRK diffusion from a static location. Here, based on experimental data, we built a mathematical model using reaction-diffusion PDEs to recapitulate the spatiotemporal dynamic activation of EGFR. Our model reveals the crucial role of various time scales, including GRK’s diffusion in the perivitelline space, the nuclear movement in the oocyte, and the growing size of the oocyte. Simulations show the importance of the interplay among the three actions to control the spatiotemporal dynamics of GRK and EGFR activation.


DR. DANIEL SHAIN – Chair, Department of Biology, Rutgers-Camden

Title:  Leech Cocoons and Polychaete Tubes: an Annelid Adventure

Abstract:   Segmented worms (annelids) secrete membranous structures for protection, movement and embryonic development. Here, the physical and ultrastructural properties of representative cocoons (e.g., from leeches) and tubes (e.g., from polychaetes) will be explored. Aside from the remarkable biomaterial aspects of these structures, their collective physicochemical properties support an evolutionary link that may explain the transition from marine to terrestrial environments.


MS. SRUTHI MURLIDARAN – CCIB Ph.D. Student, Rutgers-Camden

Title: Relative affinities of general anesthetics for pseudo-symmetric intersubunit binding sites of heteromeric GABA(A) receptors

Abstract: GABA(A), a pentameric ligand-gated ion channel, is critical for regulating neuronal excitability. These inhibitory receptors, gated by γ-aminobutyric acid (GABA), can be potentiated and also directly activated by intravenous and inhalational anesthetics. Although the receptor is considered to be an important target for general anesthetics, the mechanism of receptor modulation remains unclear. These receptors are predominantly found in 2α:2β:1γ stoichiometry, with four unique inter-subunit interfaces. This pseudosymmetric nature of the receptor has led experimental studies to show ambiguous predictions in identifying the binding site. Here we use thermodynamically rigorous free energy perturbation (AFEP) techniques and molecular dynamics simulations to rank the different inter-subunit sites by affinity. AFEP calculations predicted selective propofol binding to interfacial sites, with higher affinities for ?/? than ?-containing interfaces. The simulations revealed the key interactions leading to selective propofol binding within GABA(A) receptor subunit interfaces, with stable hydrogen bonds observed between propofol and ?/? cavity residues and a lower tendency to form hydrogen bonds with the more hydrophilic +?/?- interfacial cavity. Sevoflurane, a comparatively less potent anesthetic, shows a preference for ?/? interfaces as binding sites but does not display micromolar affinity for any particular site. Flooding simulations further the AFEP predictions, identifying sevoflurane binding sites in the ?+/?- and +?/?- interfacial cavities and providing insights into the sevoflurane-binding pathway, ligand-protein interactions and receptor occupancy.


MR. SPYROS KARAISKOS – CCIB Ph.D. Student, Rutgers-Camden

Recipient of the 2016 CCIB Best Paper Award

Title: Age-driven modulation of tRNA-derived fragments in Drosophila and their potential targets

Abstract:   Development of sequencing technologies and supporting computation enable discovery of small RNA molecules that previously escaped detection or were ignored due to low count numbers. While the focus in the analysis of small RNA libraries has been primarily on microRNAs (miRNAs), recent studies have reported findings of fragments of transfer RNAs (tRFs) across a range of organisms.

         We describe Drosophila melanogaster tRFs, which appear to have a number of structural and functional features similar to those of miRNAs but are less abundant. As is the case with miRNAs, (i) tRFs seem to have distinct isoforms preferentially originating from the 5’ or 3’ end of a precursor molecule (in this case, tRNA); (ii) ends of tRFs appear to contain short “seed” sequences matching conserved regions across 12 Drosophila genomes, preferentially in 3’ UTRs but also in introns and exons; (iii) tRFs display specific isoform loading into Ago1 and Ago2 and thus likely function in RISC complexes; (iv) levels of loading in Ago1 and Ago2 differ considerably; and (v) both tRF expression and loading appear to be age-dependent, indicating potential regulatory changes from young to adult organisms.

         We found that Drosophila tRF reads mapped to both nuclear and mitochondrial tRNA genes for all 20 amino acids, while previous studies have usually reported fragments from only a few tRNAs. These tRFs show a number of similarities with miRNAs, including seed sequences. Based on complementarity with conserved Drosophila regions we identified such seed sequences and their possible targets with matches in the 3’UTR regions. Strikingly, the potential target genes of the most abundant tRFs show significant Gene Ontology enrichment in development and neuronal function. The latter suggests that involvement of tRFs in the RNA interfering pathway may play a role in brain activity or brain changes with age.


DR. ANDREY GRIGORIEV – Biology Department, Rutgers-Camden

Title: Finding variants in genomes

Abstract:  Next-generation sequencing technology has become an essential tool for life science research. It is used to address many fundamental questions in various fields of biology, allowing one to find exact genomic variants as differences between samples (for instance, related species) via comparison to a reference genome. However, precise and reliable detection of these variants still presents a significant problem. This talk will present our progress in developing variant-finding algorithms and examples of their application.


MS. WENYI WANG – CCIB Ph.D. Student, Rutgers-Camden

Title: Profiling and evaluating environmental chemicals that induce oral acute toxicity using mitochondrial membrane disruption assay, big data and new read-across strategy.

Abstract:  Assessment of chemical animal toxicity using in vitro assays, especially High-Throughput Screening (HTS) assays, or in silico models is of great interest as an alternative to traditional animal models. However, due to the complexity of animal and human toxicity, neither in vitro assays nor in silico models have acceptable predictive accuracy to evaluate new compounds for most animal toxicity endpoints. Aiming at building a mechanism-driven in silico approach for animal acute toxicity, we start from a dataset of mitochondrial membrane potential disruption (MMP) quantitative High-Throughput Screening (qHTS) assay, which we combine with all publicly available toxicity assay data. First, we use the in-house rat oral toxicity database of 4,647 compounds and the MMP qHTS dataset of 7,320 compounds to automatically search PubChem, the largest public data sharing portal, for all in vitro assay data to establish bioprofiles for compounds found in both databases. Next, we use fragmental descriptors to cluster all the compounds into subsets and optimize the bioprofiles for compounds in each subset. Using a novel biosimilarity search tool developed in our lab, we are able to evaluate animal toxicants by both the most statistically significant chemical descriptors and the relevant assay testing results, to read-across and cross-validate MMP qHTS data as well as acute toxicity outcomes. This new read-across strategy will not only fill data gaps for non-tested new compounds but also be able to indicate potential toxicity mechanisms by linking the chemical and biological profiles to cellular responses and the respective hazard. The established automatic chemical in vitro-in vivo profiling workflow will be applicable to develop predictive models for other toxicity endpoints.


DR. LUCA LARINI – Physics Department, Rutgers-Camden 

Title: Aggregation of the microtubule associated protein tau

Abstract: Tau is an intrinsically disordered protein found in the central nervous system. This protein is involved in regulating the microtubule stability and is essential for the proper development of the brain. As a consequence, mutations and misregulation of tau are often associated with dementia. In particular, aggregated tau is found in people affected by Alzheimer’s disease. In this talk, we will discuss the origin of this increased aggregation propensity in Alzheimer’s disease as well as other forms of dementia.



Ms. Catherine Guay – CCIB, Rutgers-Camden           

Title:   High-throughput spatial cis-regulatory analysis by turning chaos into order

Abstract:  Cis-regulatory modules (CRMs) are essential for cell-type specific gene expression patterns and contain abundant gene regulatory information in their DNA sequences. The biggest roadblock for efficient utilization of gene regulatory information contained within CRMs is the lack of a high-throughput method for cis-regulatory analysis. To address this critical challenge, we have developed a novel high-throughput, quantitative method for spatial cis-regulatory analysis using sea urchin embryos as a test bed. The new method takes advantage of i) stochastic and mosaic incorporation of reporter constructs in early embryos upon transgenesis and ii) a novel method for high-throughput, single embryo-resolution measurement of the copy numbers of expressed and incorporated reporter constructs in many mosaic embryos. Because the level of reporter expression in an embryo is determined by a combination of the intrinsic activity of a given CRM and the cells that harbor the reporter construct at the time of measurement, we hypothesized that the profile of reporter expressions measured at single-embryo resolution in a sufficiently large number of mosaic embryos is determined solely by spatial activity of a given CRM. Our proof-of-principle experiment showed that the new method can rapidly classify ?100 CRM::reporter constructs based on their spatial activities without relying on imaging tools. The new method has the potential to increase the throughput of spatial cis-regulatory analysis at least three orders of magnitude compared to traditional imaging-based analyses. We anticipate that the new method can vastly accelerate discovery of cell-type specific CRMs, identification of gene regulatory networks, and evolutionary comparison of CRM functions.


Ms. Ruchi Lohia – CCIB, Rutgers-Camden           

Title:  Mechanism underlying conformational effects of a disease-associated hydrophobic-to-hydrophobic substitution on an intrinsically disordered region 

Abstract: Disease-associated Single Nucleotide Polymorphisms (SNPs) are common in the disordered regions of proteins. Most mutational studies of IDPs consider loss or gain of a charged residue. In this study, we explore the local and global effect of a charge-neutral mutation between two hydrophobic residues, with known effects on function: the Val66Met SNP in the 100-residue disordered prodomain of Brain-Derived Neurotrophic Factor (BDNF). Val66Met is the most frequently found SNP in the BDNF disordered domain, and is associated with various neurological and psychiatric disorders such as bipolar disorder and Alzheimer’s disease. Previously, NMR studies demonstrated that the prodomain is disordered with different secondary structure preferences for Val and Met at 275K (Anastasia et al., 2013). We used large-scale, fully atomistic temperature replica exchange molecular dynamics simulations of both the Val and Met forms of the BDNF prodomain. MD simulations identify similar regions of residual secondary structure compared to NMR studies. Interestingly, we observe reversed temperature dependence of the secondary structure around the SNP for Val and Met. With increasing temperature, Val66 is less likely to assume helical secondary structure, while Met66 is more likely, consistent with established reduction of the valine side-chain entropy upon helix formation. At room temperature, we also observe an increase in the radius of gyration of the Met66 prodomain relative to the Val66 prodomain, which can be reliably attributed to differential hydrogen bonding preferences of SNP-adjacent residues, affecting their likelihood of hydrogen bonding with distant residues. These results indicate the neutral substitution may exert its effects by critically adjusting the entropic cost of local secondary-structure elements, which, in turn, affects the conformational ensemble via differential long-range beta bridging patterns. Furthermore, although the SNP is neutral, it alters the exposure of charged residues around the SNP, which can potentially affect the binding characteristics of the prodomain.


Mr. Sean McQuade – CCIB, Rutgers-Camden           

Title: Modeling Human Lipid Metabolism for Multiple Classes of Virtual Patient

Abstract:  We present Linear-In-Flux-Expressions, a new method for improving the simulation capability of Quantitative Systems Pharmacology (QSP) models.  Using the relationship among model parameters at steady state, we may generate virtual patient parameterizations that envelop the range of clinical data.  The method is demonstrated using a cholesterol metabolism QSP model.  This new approach provides insight into the biological systems under study, as well as the pharmacological effects on the system.


Mr. Steven Moffett – CCIB, Rutgers-Camden           

Title: Effects of putative and computationally-predicted ligands on the beta-3 homopentamer construct of the GABAA receptor.

Abstract:   The GABAA receptor, the most prevalent mediator of inhibitory synaptic activity in the brain, is a pentameric ligand-gated ion channel. The wide range of subunits and subunit variations that can contribute to the pentamer allow for many different receptor constructs. One construct of the GABAA receptor is a pentamer comprised solely of beta-3 subunits. The beta-3 homopentamer has been expressed in Xenopus laevis oocytes, and, recently, a three-dimensional crystal structure of the beta-3 homopentamer has become available. Substances such as pentobarbital and topiramate have been shown to modulate the beta-3 construct. Others, such as histamine and the anesthetic propofol, have been implicated as possible beta-3 homopentamer modulators based on computational docking studies. We aim to express the beta-3 homopentamer in Xenopus oocytes, to verify the actions of putative receptor ligands, and to investigate the effect of computationally-predicted binding molecules on the beta-3 homopentamer in vitro.


Ms. Sruthi Murlidaran – CCIB, Rutgers-Camden           

Title: Relative affinities of general anesthetics for pseudo-symmetric intersubunit binding sites of heteromeric GABA(A) receptors

Abstract: GABA(A), a pentameric ligand-gated ion channel, is critical for regulating neuronal excitability. These inhibitory receptors, gated by γ-aminobutyric acid (GABA), can be potentiated and also directly activated by intravenous and inhalational anesthetics. Although the receptor is considered to be an important target for general anesthetics, the mechanism of receptor modulation remains unclear. These receptors are predominantly found in 2α:2β:1γ stoichiometry, with four unique inter-subunit interfaces. This pseudosymmetric nature of the receptor has led experimental studies to show ambiguous predictions in identifying the binding site. Here we use thermodynamically rigorous free energy perturbation (AFEP) techniques and molecular dynamics simulations to rank the different inter-subunit sites by affinity. AFEP calculations predicted selective propofol binding to interfacial sites, with higher affinities for ?/? than ?-containing interfaces. The simulations revealed the key interactions leading to selective propofol binding within GABA(A) receptor subunit interfaces, with stable hydrogen bonds observed between propofol and ?/? cavity residues and a lower tendency to form hydrogen bonds with the more hydrophilic +?/?- interfacial cavity. Sevoflurane, a comparatively less potent anesthetic, shows a preference for ?/? interfaces as binding sites but does not display micromolar affinity for any particular site. Flooding simulations further the AFEP predictions, identifying sevoflurane binding sites in the ?+/?- and +?/?- interfacial cavities and providing insights into the sevoflurane-binding pathway, ligand-protein interactions and receptor occupancy.


Mrs. Nicole Revaitis – CCIB, Rutgers-Camden           

Title:  Discrete regulatory domains control the expression of simple patterns during Drosophila oogenesis

Abstract:   Patterning of the Drosophila melanogaster eggshell has been extensively studied; however, the cis-regulatory modules (CRMs) that control the spatiotemporal expression of genes are mostly unknown. This study takes advantage of the FlyLight GAL4 collection, composed of over 7000 lines generated from intronic and intergenic regions of DNA, and cross-lists the ~1200 genes that are contained within this collection against the 84 genes that are known to be expressed during oogenesis. There are 22 genes, or 281 lines, common between the two lists. Of these, 61 lines recapitulate the full or partial patterns of their endogenous gene. Using RNA-seq to identify isoforms expressed during oogenesis, we mapped the distribution of the 61 fragments to the corresponding gene model and found an enrichment of fragments active in the first intron. In addition, we demonstrate the use of different anteriorly active FlyLight lines as tools to disrupt eggshell patterning in a targeted manner. Interestingly, while the average fragment is roughly 3 kb, there is usually only one component of the endogenous gene’s pattern in each fragment. This is consistent with our hypothesis that complex gene patterns are assembled by individual CRMs.


Ms. Sreelekha Revur – CCIB, Rutgers-Camden           

Title:  Identification of novel stalk-specific penicillin binding proteins in Caulobacter crescentus 

Abstract:  Caulobacter crescentus is a Gram-negative bacterium with a polar stalk. The stalk synthesized during an asymmetric cell division is significantly elongated in response to phosphate starvation. The stalk has been characterized as an extension of the cell pole, containing all three layers of cell envelope. However, the mechanism(s) involved in stalk elongation are not well characterized. The stalk envelope contains peptidoglycan, thus identifying the penicillin binding proteins (PBPs) involved in the process would shed light on the mechanism of stalk elongation. Koyasu et al. (Microbiology 1983) identified two inner membrane PBPs, PBP-X and -Y which appear to be stalk-specific, although the genes encoding these two proteins are unknown. Using the fluorescent penicillin analogue, Bocillin FL, we have replicated these experiments and identified a novel PBP whose overexpression results in ectopic stalk formation. We are currently expanding this work to identify additional PBPs involved in Caulobacter stalk peptidoglycan synthesis using biotinylated ampicillin-derivatives.


Mr. Daniel Russo – CCIB, Rutgers-Camden           

Title: CIIProCluster: Developing Read-Across Predictive Toxicity Models Using Big Data

Abstract:   Accurate predictive computational models for complex toxicity endpoints, e.g. oral acute toxicity and hepatotoxicity, have remained elusive. The difficulties in model development for these endpoints can be attributed to the complex mechanisms relevant to the toxicity phenomena.  Recent work has shown that incorporating biological data into model development has resulted in better predictivity and allowed for intuition on mechanisms of action for toxicants. However, in the current big data era, finding and characterizing relevant biological data to evaluate the chemical toxicity of interest is a major challenge.  The Chemical In-vitro, In-vivo Profiling (CIIPro) portal was created to use big data sources for the prediction of new compounds. As a major advancement of the CIIPro project, we present CIIProCluster, a new read-across approach for creating predictive toxicity models based on the available bioassay data for chemicals of interest. Briefly, all available biological data for the target compounds are automatically extracted. Chemical features relevant to each bioassay are identified by using Fisher’s Exact Test (p < 0.05) to rank chemical fragments existing in the target compounds of each bioassay dataset. The available bioassays can be clustered by the chemical fragment features that contribute to their activation.  In this study, PubChem bioassays were prioritized and clustered based upon the chemical-in vitro relationships described above for rat oral acute toxicity. Several clusters of PubChem bioassays not only are able to predict acute toxicity for new compounds but also show toxicity mechanisms for toxicants containing the relevant chemical features. This new read-across approach can be easily applied to generate predictive models for other animal toxicity endpoints.
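
For illustration only, the fragment-ranking step mentioned in the abstract (Fisher's Exact Test with p < 0.05) could be sketched as below. This is not the CIIProCluster implementation: the function names, the input layout (one set of fragment labels plus an active/inactive flag per compound), and the choice of a one-sided test are all assumptions made for this sketch.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    a: active compounds containing the fragment
    b: inactive compounds containing the fragment
    c: active compounds lacking the fragment
    d: inactive compounds lacking the fragment
    Returns the probability of seeing at least `a` actives among the
    fragment-containing compounds if fragment and activity were unrelated.
    """
    with_frag = a + b
    actives = a + c
    total = a + b + c + d
    upper = min(with_frag, actives)
    tail = sum(
        comb(with_frag, k) * comb(total - with_frag, actives - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, actives)


def rank_fragments(compounds, alpha=0.05):
    """Return (fragment, p_value) pairs with p < alpha, most significant first.

    `compounds` is a list of (fragments, is_active) pairs, where `fragments`
    is a set of fragment labels (hypothetical input format).
    """
    n_total = len(compounds)
    n_active = sum(1 for _, active in compounds if active)
    all_fragments = set().union(*(frags for frags, _ in compounds))
    ranked = []
    for frag in all_fragments:
        a = sum(1 for frags, active in compounds if active and frag in frags)
        b = sum(1 for frags, active in compounds if not active and frag in frags)
        c = n_active - a
        d = (n_total - n_active) - b
        p = fisher_exact_greater(a, b, c, d)
        if p < alpha:
            ranked.append((frag, p))
    # Sort by p-value, then by name so ties are ordered deterministically.
    return sorted(ranked, key=lambda item: (item[1], item[0]))


# Toy data (invented for this sketch): "nitro" occurs only in actives,
# while "methyl" occurs in every compound and so carries no signal.
toy = [({"nitro", "methyl"}, True)] * 4 + [({"methyl"}, False)] * 4
print(rank_fragments(toy))
```

In this toy set only "nitro" passes the threshold (p = 1/70); a fragment present in every compound cannot separate actives from inactives and is dropped.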

CCIB Retreat Poster – Russo

Mr. Liam Sharp – CCIB, Rutgers-Camden           

Title: Coarse-Grained Simulations of Nicotinic Acetylcholine Receptors in Complex Mixed Membranes: Embedded Lipids and Domain Partitioning

Abstract:   Nicotinic acetylcholine receptors (nAChRs) are pentameric ligand-gated ion channels that are critical to signaling across synapses and the neuromuscular junction; such signaling is facilitated by high densities of nAChRs in the post-synaptic membrane. Organization of nAChRs, including partitioning behavior in membranes containing distinct lipid domains, is poorly characterized. Numerous experimental studies have shown nAChR gain-of-function likely caused by direct interactions with cholesterol, but a significant role for lipid domains has been suggested by nAChR gain-of-function upon bulk cholesterol depletion. Furthermore, the opportunity for cholesterol to have direct interactions will likely have a complex dependence on the extent of domain formation and lipid species in the membrane, which has not been previously addressed. In the present research, we use molecular dynamics simulations with coarse-grained resolution via the MARTINI model to investigate concentrations of cholesterol and other lipids local to nAChRs embedded in complex model membranes with a range of head groups and degrees of unsaturation. Cholesterol and unsaturated lipids are observed binding in deep non-annular sites in the nAChR bundle (based on the 2BG9 cryo-EM structure), consistent with our previous predictions. nAChR partitions, however, into cholesterol-poor phases, resulting in dynamic exchange between cholesterol and unsaturated phospholipids.

CCIB Retreat Poster – Sharp


Ms. Sung Won Oh – CCIB, Rutgers-Camden           

Title:  Logic-Gated Catalytic Circuits for Biosensing

Abstract:   In biochemical pathways, many enzyme functions are regulated by inhibition byproducts, or product feedback inhibition. In this project, artificial swinging arms are designed to channel the transfer of intermediates in multienzyme reactions. Swinging arms play an important role in multi-step catalytic transformations in multienzyme complexes. DNA logic-AND-gated circuits will be implemented on swinging arms to control their release and activation and, in turn, the transport of intermediates in enzyme reactions. Logic gate circuits are composed of single-stranded DNA molecules that regulate pathway activities and specificities. In order to regulate the multienzyme pathway, logic gate circuits will be implemented on DNA nanostructures for controlling and switching pathway activities to produce different final products depending on specific inputs. An ‘AND’ logic-gated swinging arm is designed, and native polyacrylamide gel electrophoresis (PAGE) is then used to characterize the opening and closing of the logic-gate circuits that release the swinging arm. Currently, DNA-based logic-gated circuits for controlling swinging arms have been created with a toehold design for DNA strand displacement, resulting in release of the swinging arm. Once the logic-gated circuits can be controlled, they will be characterized using fluorescence resonance energy transfer (FRET) microscopy to evaluate the release of the swinging arm and the activation of the enzyme pathway by input of two key strands. Eventually, the logic-gated NAD-swinging arm will be applied to detecting biotargets with signal amplification in a small test tube and a visible color change.


Ms. Linlin Zhao – CCIB, Rutgers-Camden           

Title:   Experimental errors in QSAR modeling sets: what we can do and what we cannot do

Abstract:   Numerous data sources have become available for quantitative structure–activity relationship (QSAR) modeling studies. However, the quality of various data sources may differ depending on the nature of the experimental protocols. In this study, we explored the relationship between the ratio of questionable data in the modeling sets, which was obtained by simulating experimental errors, and the QSAR modeling performance. To this end, we used eight datasets (four continuous endpoints and four binary endpoints) that have been extensively curated in our lab to create over 1,800 various QSAR models. Each dataset was duplicated to seven new modeling sets with different ratios of simulated experimental errors (i.e. randomizing the activities of part of the compounds) in the modeling process. A five-fold cross-validation process was used to show the model performance, which becomes worse as the ratio of experimental errors increases. All the resulting models were also used to predict external sets of new compounds, which were excluded at the beginning of the modeling process. The modeling results showed that the compounds with relatively large prediction errors in the cross-validation process are more likely to be those with simulated experimental errors. However, after removing a certain number of compounds with large prediction errors in the cross-validation process, the external predictions of new compounds did not improve. Our conclusion is that the QSAR predictions, especially consensus predictions, are able to indicate those compounds with potential experimental errors. But removing those compounds will not result in better model performance due to overfitting. Apparently, extra experimental testing is necessary for those compounds found to be questionable by QSAR predictions.
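
As a rough illustration of the error-simulation step described in the abstract (randomizing the activities of part of the compounds), the sketch below corrupts a chosen fraction of a binary endpoint. The function name, the fixed random seed, and the decision to flip each selected label (rather than redraw it at random) are assumptions of this sketch, not details taken from the study.

```python
import random


def inject_label_noise(activities, ratio, seed=0):
    """Flip a fraction `ratio` of binary activity labels (0/1).

    Returns the noisy copy and the sorted indices that were altered, so the
    simulated errors can later be compared with compounds that show large
    cross-validation prediction errors. The input list is left unchanged.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    rng = random.Random(seed)  # local generator keeps the result reproducible
    n_noisy = round(ratio * len(activities))
    noisy_idx = sorted(rng.sample(range(len(activities)), n_noisy))
    noisy = list(activities)
    for i in noisy_idx:
        noisy[i] = 1 - noisy[i]
    return noisy, noisy_idx


# Example with invented labels: corrupt 25% of an eight-compound set.
labels = [0, 1, 0, 1, 1, 0, 1, 0]
noisy_labels, altered = inject_label_noise(labels, ratio=0.25)
print(len(altered), "of", len(labels), "labels altered")
```

Keeping the altered indices makes it possible to check afterwards whether the compounds flagged by cross-validation are the ones that were deliberately corrupted.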


2017 CCIB Annual Retreat

2016 CCIB Annual Retreat

2015 CCIB Annual Retreat