Specialization: Computing
Supervisor
FLORES GALLEGO, MARIA JULIA
Description and Objectives
Static Bayesian networks (BNs) represent causal processes which may be repeated, but which do not change upon repetition. While simple in concept, they are nevertheless powerful representations for dealing with systems showing significant uncertainty about their outcomes, and they are widely employed in health care, ecological and environmental sciences, manufacturing industries, and government (see www.norsys.com/clients.htm). A great deal of effort has been put into automating the learning of Bayesian networks from data, including at Monash, yielding our CaMML program (Causal discovery via MML). One case study with CaMML has been on a medical dataset (Flores et al., 2011).
Dynamic Bayesian networks (DBNs) represent causal systems which change over time, yielding time series data from their observation. Examples include all kinds of economic markets, such as stock markets and currencies, weather and climate systems, and human disease processes. More recently, CaMML has been extended to learn DBNs from observational data.
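The difference between the two model classes can be written as a factorization of the joint distribution. The following is a standard textbook formulation, not taken from CaMML itself, and it assumes a first-order Markov DBN in which the parents of a variable in time slice t lie either in slice t or in slice t-1. The symbols n (number of variables), T (number of time slices) and Pa (parent set) are introduced here for illustration.

```latex
% Static BN over variables X_1, ..., X_n:
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)

% DBN unrolled over T time slices (assumption: first-order Markov),
% with \mathrm{Pa}(X_i^{t}) \subseteq X^{t-1} \cup X^{t} for t \ge 2:
P(X^{1:T}) = \prod_{i=1}^{n} P\bigl(X_i^{1} \mid \mathrm{Pa}(X_i^{1})\bigr)
             \prod_{t=2}^{T} \prod_{i=1}^{n} P\bigl(X_i^{t} \mid \mathrm{Pa}(X_i^{t})\bigr)
```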
This project will look at applying the new CaMML DBN learning to one or more medical datasets. This may involve working with medical collaborators at The Macfarlane Burnet Institute for Medical Research and Public Health, or Melbourne Hospital (depending on which datasets become available).
IMPORTANT: This project will also be co-supervised by Prof. Ann Nicholson (Monash University, Australia). Consequently, all communication, code comments, and the written report must be in ENGLISH.
Methodology and Competencies
Stage 1: Dataset collection. A medical dataset will be chosen from among the problems available to us.
Stage 2: Study of the problem and identification of the target.
Stage 3: Preprocessing of the dataset (attribute selection, attribute construction, missing values, discretization, etc.). [The student must know DATA MINING or MACHINE LEARNING techniques before choosing this project.]
Stage 4: Learning of models, which may involve returning to previous stages if necessary.
Stage 5: Collection of results and evaluation of the models. Comparison. Tables, graphs, confusion matrices, use of statistical tests, etc.
Stage 6: Writing the corresponding report.
Note that the defense of the project will also be in English.
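As a rough illustration of two of the steps named in Stages 3 and 5, the following minimal Java sketch shows equal-width discretization of a numeric attribute and the construction of a confusion matrix with its accuracy. It is only a sketch under stated assumptions: the class name StagesSketch, its method names, and the sample values are hypothetical, and it does not use the CaMML or Weka APIs, which the actual project would rely on.

```java
import java.util.Arrays;

/** Illustrative sketch only: names are hypothetical, not part of CaMML or Weka. */
final class StagesSketch {

    /** Equal-width discretization: maps each value to a bin index in [0, bins - 1]. */
    static int[] equalWidthBins(double[] values, int bins) {
        if (values.length == 0 || bins < 1) {
            throw new IllegalArgumentException("need at least one value and one bin");
        }
        double min = Arrays.stream(values).min().getAsDouble();
        double max = Arrays.stream(values).max().getAsDouble();
        double width = (max - min) / bins;
        int[] result = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            // A constant attribute (width == 0) falls entirely into bin 0.
            int bin = width == 0.0 ? 0 : (int) ((values[i] - min) / width);
            // The maximum value would land in bin 'bins', so clamp it to the last bin.
            result[i] = Math.min(bin, bins - 1);
        }
        return result;
    }

    /** Confusion matrix: rows are actual classes, columns are predicted classes. */
    static int[][] confusionMatrix(int[] actual, int[] predicted, int numClasses) {
        if (actual.length != predicted.length) {
            throw new IllegalArgumentException("label arrays must have equal length");
        }
        int[][] matrix = new int[numClasses][numClasses];
        for (int i = 0; i < actual.length; i++) {
            matrix[actual[i]][predicted[i]]++;
        }
        return matrix;
    }

    /** Accuracy: fraction of instances on the diagonal of the confusion matrix. */
    static double accuracy(int[][] matrix) {
        int correct = 0;
        int total = 0;
        for (int r = 0; r < matrix.length; r++) {
            for (int c = 0; c < matrix[r].length; c++) {
                total += matrix[r][c];
                if (r == c) {
                    correct += matrix[r][c];
                }
            }
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }

    public static void main(String[] args) {
        double[] age = {21.0, 35.0, 48.0, 62.0, 80.0}; // made-up attribute values
        System.out.println(Arrays.toString(equalWidthBins(age, 3))); // [0, 0, 1, 2, 2]

        int[] actual = {0, 0, 1, 1, 1, 0};    // made-up class labels
        int[] predicted = {0, 1, 1, 1, 0, 0}; // made-up model predictions
        int[][] matrix = confusionMatrix(actual, predicted, 2);
        System.out.println(Arrays.deepToString(matrix)); // [[2, 1], [1, 2]]
        System.out.println(accuracy(matrix));
    }
}
```

Equal-width binning is used here only because it is the simplest discretization scheme to show; the choice of discretization method for a real medical dataset is part of the project work in Stage 3.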
Resources to Be Used
In this project the student will have to use CaMML and Weka, and implement his/her own code in Java, combining the two tools with new developments. Knowledge of data mining is a prerequisite.
The student must have a good level of English (written, reading, and spoken).
Bibliography
Korb & Nicholson (2011). Bayesian Artificial Intelligence, Chapters 1 and 2 (BN basics), Section 4.8 (DBNs), and Chapter 8 (esp. CaMML sections).
Flores, M.J., Nicholson, A.E., Brunskill, A., Korb, K.B., Mascaro (2011). Incorporating expert knowledge when learning Bayesian network structure: A medical case study. Artificial Intelligence in Medicine, vol. 53, issue 3, Elsevier Science, Amsterdam, Netherlands, pp. 181-204. ERA Rank A – Impact factor 1.583.
Assignment
The final-year project (Trabajo Fin de Grado) has been assigned to Fernando Rubio Perona.