MVE441 - Statistical learning for big data
Statistik för stora datamängder
 
Syllabus adopted 2020-02-05 by Head of Programme (or corresponding)
Owner: MPENM
7.5 credits
Grading: TH - Pass with distinction (5), Pass with credit (4), Pass (3), Fail
Education cycle: Second-cycle
Major subject: Mathematics
Department: 11 - MATHEMATICAL SCIENCES


Teaching language: English
Application code: 20150
Open for exchange students: Yes

Module                       Credits   Grading
0120 Project                 1.5 c     UG
0220 Take-home examination   6.0 c     TH

In programmes

MPDSC DATA SCIENCE AND AI, MSC PROGR, Year 1 (compulsory elective)
MPCAS COMPLEX ADAPTIVE SYSTEMS, MSC PROGR, Year 1 (compulsory elective)
MPCAS COMPLEX ADAPTIVE SYSTEMS, MSC PROGR, Year 2 (elective)
MPENM ENGINEERING MATHEMATICS AND COMPUTATIONAL SCIENCE, MSC PROGR, Year 1 (compulsory elective)

Examiner:

Rebecka Jörnsten


Eligibility

General entry requirements for Master's level (second cycle)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Specific entry requirements

English 6 (or equivalent proficiency demonstrated by other approved means)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Course specific prerequisites

The prerequisites for the course are a basic course in statistical inference and MVE190 Linear Statistical Models. Students who do not meet these prerequisites can contact the course instructor for permission to take the course.

Aim

The course provides an understanding of, and training in, techniques for the statistical analysis of large data sets.

Learning outcomes (after completion of the course the student should be able to)

  • demonstrate understanding of the key concepts and ideas concerning classification, clustering and dimension reduction.
  • solve high-dimensional data analysis exercises and interpret the results of such analyses.

Content

  • Overview of high-dimensional data analysis.
  • Classification: Bayes rule, discriminant analysis methods, nearest neighbor classifier, classification and regression trees.
  • Cost functions, greedy searches, gradient descent, cross-validation.
  • Logistic regression.
  • Regularization methods: sparse logistic regression, sparse discriminant analysis.
  • Ensemble methods: bagging, random projections, random forests.
  • Clustering: k-means, hierarchical clustering, model-based clustering, spectral methods.
  • Dimension reduction: PCA, canonical correlation, multi-dimensional scaling.
  • Special topics (subset of the following): networks and graphical models, sparse covariance estimation, network clustering and community detection, neural networks, matrix completion, collaborative filtering.
  • Large-scale learning: stochastic searches, batch methods, online learning.

Organisation

Teaching consists of lectures, discussions, and reading assignments.

Literature

To be announced.

Examination including compulsory elements

Oral and/or written examination.

Published: Mon 28 Nov 2016.