Chancellery Research - Research Publications

Search Results

  • Item
    Boolean versus ranked querying for biomedical systematic reviews
    Karimi, S ; Pohl, S ; Scholer, F ; Cavedon, L ; Zobel, J (BMC, 2010-10-12)
    BACKGROUND: The process of constructing a systematic review, a document that compiles the published evidence pertaining to a specified medical topic, is intensely time-consuming, often taking a team of researchers over a year, with the identification of relevant published research comprising a substantial portion of the effort. The standard paradigm for this information-seeking task is to use Boolean search; however, this leaves the user(s) the requirement of examining every returned result. Further, our experience is that effective Boolean queries for this specific task are extremely difficult to formulate and typically require multiple iterations of refinement before being finalized. METHODS: We explore the effectiveness of using ranked retrieval as compared to Boolean querying for the purpose of constructing a systematic review. We conduct a series of experiments involving ranked retrieval, using queries defined methodologically, in an effort to understand the practicalities of incorporating ranked retrieval into the systematic search task. RESULTS: Our results show that ranked retrieval by itself is not viable for this search task requiring high recall. However, we describe a refinement of the standard Boolean search process and show that ranking within a Boolean result set can improve the overall search performance by providing early indication of the quality of the results, thereby speeding up the iterative query-refinement process. CONCLUSIONS: Outcomes of experiments suggest that an interactive query-development process using a hybrid ranked and Boolean retrieval system has the potential for significant time-savings over the current search process in systematic reviewing.
  • Item
    Prediction of breast cancer prognosis using gene set statistics provides signature stability and biological context
    Abraham, G ; Kowalczyk, A ; Loi, S ; Haviv, I ; Zobel, J (BMC, 2010-05-25)
    BACKGROUND: Different microarray studies have compiled gene lists for predicting outcomes of a range of treatments and diseases. These have produced gene lists that have little overlap, indicating that the results from any one study are unstable. It has been suggested that the underlying pathways are essentially identical, and that the expression of gene sets, rather than that of individual genes, may be more informative with respect to prognosis and understanding of the underlying biological process. RESULTS: We sought to examine the stability of prognostic signatures based on gene sets rather than individual genes. We classified breast cancer cases from five microarray studies according to the risk of metastasis, using features derived from predefined gene sets. The expression levels of genes in the sets are aggregated, using what we call a set statistic. The resulting prognostic gene sets were as predictive as the lists of individual genes, but displayed more consistent rankings via bootstrap replications within datasets, produced more stable classifiers across different datasets, and are potentially more interpretable in the biological context since they examine gene expression in the context of their neighbouring genes in the pathway. In addition, we performed this analysis in each breast cancer molecular subtype, based on ER/HER2 status. The prognostic gene sets found in each subtype were consistent with the biology based on previous analysis of individual genes. CONCLUSIONS: To date, most analyses of gene expression data have focused at the level of the individual genes. We show that a complementary approach of examining the data using predefined gene sets can reduce the noise and could provide increased insight into the underlying biological pathways.
  • Item
    Document Compaction for Efficient Query Biased Snippet Generation
    Tsegay, Y ; Puglisi, SJ ; Turpin, A ; Zobel, J ; Boughanem, M ; Berrut, C ; Mothe, J ; SouleDupuy, C (SPRINGER-VERLAG BERLIN, 2009)
  • Item
    Cache-conscious collision resolution in string hash tables
    Askitis, N ; Zobel, J ; Consens, M ; Navarro, G (SPRINGER-VERLAG BERLIN, 2005)
  • Item
    Searchable words on the Web
    Williams, HE ; Zobel, J (SPRINGER, 2005-04)
  • Item
    Efficient plagiarism detection for large code repositories
    Burrows, S ; Tahaghoghi, SMM ; Zobel, J (WILEY, 2007-02)
  • Item
    Accurate discovery of co-derivative documents via duplicate text detection
    Bernstein, Y ; Zobel, J (PERGAMON-ELSEVIER SCIENCE LTD, 2006-11)
  • Item
    Redundant documents and search effectiveness
    Bernstein, Y ; Zobel, J (ACM, 2005-12-01)
  • Item
    Using query logs to establish vocabularies in distributed information retrieval
    Shokouhi, M ; Zobel, J ; Tahaghoghi, S ; Scholer, F (ELSEVIER SCI LTD, 2007-01)