Computing and Information Systems - Research Publications

Permanent URI for this collection

http://hdl.handle.net/11343/350

Nye tray vs sieve tray: A comparison based on computational fluid dynamics and tray efficiency

Abbasnia, S ; Nasri, Z ; Shafieyoun, V ; Golzarijalal, M (Wiley, 2021-10)

Nye and sieve trays were hydrodynamically simulated and compared. The simulations were performed in an Eulerian‐Eulerian framework under unsteady (transient) conditions at industrial scale. Conducted on an air‐water system, the simulations included three dimensions and two phases. The velocity distribution across the tray, the height of clear liquid, the froth height, and the pressure drop were investigated and compared with experimental data. The Péclet number was calculated using hydrodynamic and geometric parameters. The tray efficiencies were also predicted to further compare the two trays. The results showed that the liquid flow was steadier on the Nye tray than on the sieve tray, possibly because of the special structure of the liquid and gas inlets of the Nye tray.
Computational Fluid Dynamics versus Experiment: An Investigation on Liquid Weeping of Nye Trays

Abbasnia, S ; Shafieyoun, V ; Golzarijalal, M ; Nasri, Z (Wiley, 2021-01)

The weeping phenomenon was investigated using experimental tests and a numerical model. The tests were performed within a 1.22‐m‐diameter pilot‐scale column including two chimney trays and two Nye test trays with an air‐water system. The rates of weeping were measured in the Nye trays with two weir heights and a hole area of 5 %. Moreover, the weeping rates in the outlet and inlet halves of the Nye tray and the total weeping rate were calculated. In the next step, an Eulerian‐Eulerian computational fluid dynamics (CFD) technique was used. The results show good agreement between the CFD findings and the experimental data.
Benchmarks for measurement of duplicate detection methods in nucleotide databases

Chen, Q ; Zobel, J ; Verspoor, K (OXFORD UNIV PRESS, 2023-12-18)

Duplication of information in databases is a major data quality challenge. The presence of duplicates, implying either redundancy or inconsistency, can have a range of impacts on the quality of analyses that use the data. To provide a sound basis for research on this issue in databases of nucleotide sequences, we have developed new, large-scale validated collections of duplicates, which can be used to test the effectiveness of duplicate detection methods. Previous collections were either designed primarily to test efficiency, or contained only a limited number of duplicates of limited kinds. To date, duplicate detection methods have been evaluated on separate, inconsistent benchmarks, leading to results that cannot be compared and, due to limitations of the benchmarks, of questionable generality. In this study, we present three nucleotide sequence database benchmarks, based on information drawn from a range of resources, including information derived from mapping to two data sections within the UniProt Knowledgebase (UniProtKB), UniProtKB/Swiss-Prot and UniProtKB/TrEMBL. Each benchmark has distinct characteristics. We quantify these characteristics and argue for their complementary value in evaluation. The benchmarks collectively contain a vast number of validated biological duplicates; the largest has nearly half a billion duplicate pairs (although this is probably only a tiny fraction of the total that is present). They are also the first benchmarks targeting the primary nucleotide databases. The records include the 21 most heavily studied organisms in molecular biology research. Our quantitative analysis shows that duplicates in the different benchmarks, and in different organisms, have different characteristics. It is thus unreliable to evaluate duplicate detection methods against any single benchmark. For example, the benchmark derived from UniProtKB/Swiss-Prot mappings identifies more diverse types of duplicates, showing the importance of expert curation, but is limited to coding sequences. Overall, these benchmarks form a resource that we believe will be of great value for development and evaluation of the duplicate detection or record linkage methods that are required to help maintain these essential resources. DATABASE URL: https://bitbucket.org/biodbqual/benchmarks.
Directive Explanations for Actionable Explainability in Machine Learning Applications

Singh, R ; Miller, T ; Lyons, H ; Sonenberg, L ; Velloso, E ; Vetere, F ; Howe, P ; Dourish, P (ASSOC COMPUTING MACHINERY, 2023-12)

In this article, we show that explanations of decisions made by machine learning systems can be improved by not only explaining why a decision was made but also explaining how an individual could obtain their desired outcome. We formally define the concept of directive explanations (those that offer specific actions an individual could take to achieve their desired outcome), introduce two forms of directive explanations (directive-specific and directive-generic), and describe how these can be generated computationally. We investigate people’s preference for and perception toward directive explanations through two online studies, one quantitative and the other qualitative, each covering two domains (the credit scoring domain and the employee satisfaction domain). We find a significant preference for both forms of directive explanations compared to non-directive counterfactual explanations. However, we also find that preferences are affected by many aspects, including individual preferences and social factors. We conclude that deciding what type of explanation to provide requires information about the recipients and other contextual information. This reinforces the need for a human-centered and context-specific approach to explainable AI.
Scalable Approximate Butterfly and Bi-triangle Counting for Large Bipartite Networks

Zhang, F ; Chen, D ; Wang, S ; Yang, Y ; Gan, J (Association for Computing Machinery (ACM), 2023-12-08)

A bipartite graph is a graph that consists of two disjoint sets of vertices and only edges between vertices from different vertex sets. In this paper, we study the counting problems of two common types of motifs in bipartite graphs: (i) butterflies (2x2 bicliques) and (ii) bi-triangles (length-6 cycles). Unlike most of the existing algorithms that aim to obtain exact counts, our goal is to obtain sufficiently precise estimations of these counts in bipartite graphs, as such estimations are already sufficient and highly useful in various applications. While there exist approximate algorithms for butterfly counting, these algorithms are mainly based on the techniques designed for general graphs, and hence, they are less effective on bipartite graphs. Moreover, there is still a lack of study on approximate bi-triangle counting. Motivated by this, we first propose a novel butterfly counting algorithm, called one-sided weighted sampling, which is tailored for bipartite graphs. The basic idea of this algorithm is to estimate the total butterfly count with the number of butterflies containing two randomly sampled vertices from the same side of the two vertex sets. We prove that our estimation is unbiased, and our technique can be further extended (non-trivially) for bi-triangle count estimation. Theoretical analyses under a power-law random bipartite graph model and extensive experiments on multiple large real datasets demonstrate that our proposed approximate counting algorithms can reach high accuracy, yet achieve up to three orders (resp. four orders) of magnitude speed-up over the state-of-the-art exact butterfly (resp. bi-triangle) counting algorithms. Additionally, we present an approximate clustering coefficient estimation framework for bipartite graphs, which shows a similar speed-up over the exact solutions with less than 1% relative error.
TransCP: A Transformer Pointer Network for Generic Entity Description Generation With Explicit Content-Planning

Trisedya, BD ; Qi, J ; Zheng, H ; Salim, FD ; Zhang, R (IEEE COMPUTER SOC, 2023-12-01)
Focused Contrastive Loss for Classification With Pre-Trained Language Models

He, J ; Li, Y ; Zhai, Z ; Fang, B ; Thorne, C ; Druckenbrodt, C ; Akhondi, S ; Verspoor, K (Institute of Electrical and Electronics Engineers (IEEE), 2023-01-01)
Special issue on efficient management of microservice-based systems and applications

Xu, M ; Dustdar, S ; Villari, M ; Buyya, R (Wiley, 2023)
Proactive digital workplace transformation: Unpacking identity change mechanisms in remote-first organisations

Bruenker, F ; Marx, J ; Mirbabaie, M ; Stieglitz, S (SAGE PUBLICATIONS LTD, 2023-01-01)

Digital transformation fundamentally changes the way individuals conduct work in organisations. In accordance with this statement, prevalent literature understands digital workplace transformation as a second-order effect of implementing new information technology to increase organisational effectiveness or reach other strategic goals. This paper, in contrast, provides empirical evidence from two remote-first organisations that undergo a proactive rather than reactive digital workplace transformation. The analysis of these cases suggests that new ways of working can be the consequence of an identity change that is a precondition for introducing new information technology rather than its outcome. The resulting process model contributes a competing argument to the existing debate in digital transformation literature. Instead of issuing digital workplace transformation as a deliverable of technological progress and strategic goals, this paper supports a notion of digital workplace transformation that serves a desired identity based on work preferences.
Developing a deep learning natural language processing algorithm for automated reporting of adverse drug reactions

McMaster, C ; Chan, J ; Liew, DFL ; Su, E ; Frauman, AG ; Chapman, WW ; Pires, DEV (ACADEMIC PRESS INC ELSEVIER SCIENCE, 2023-01)

The detection of adverse drug reactions (ADRs) is critical to our understanding of the safety and risk-benefit profile of medications. With an incidence that has not changed over the last 30 years, ADRs are a significant source of patient morbidity, responsible for 5%-10% of acute care hospital admissions worldwide. Spontaneous reporting of ADRs has long been the standard method of reporting; however, this approach is known to have high rates of under-reporting, a problem that limits pharmacovigilance efforts. Automated ADR reporting presents an alternative pathway to increase reporting rates, although this may be limited by over-reporting of other drug-related adverse events. We developed a deep learning natural language processing algorithm to identify ADRs in discharge summaries at a single academic hospital centre. Our model was developed in two stages: first, a pre-trained model (DeBERTa) was further pre-trained on 1.1 million unlabelled clinical documents; second, this model was fine-tuned to detect ADR mentions in a corpus of 861 annotated discharge summaries. This model was compared to a version without the pre-training step, and a previously published RoBERTa model pre-trained on MIMIC III, which has demonstrated strong performance on other pharmacovigilance tasks. To ensure that our algorithm could differentiate ADRs from other drug-related adverse events, the annotated corpus was enriched for both validated ADR reports and confounding drug-related adverse events using. The final model demonstrated good performance with a ROC-AUC of 0.955 (95% CI 0.933 - 0.978) for the task of identifying discharge summaries containing ADR mentions, significantly outperforming the two comparator models.
