Course syllabus - Machine Learning With Big Data 7.5 credits

Maskininlärning med Big Data

Course code: DVA453
Valid from: Autumn semester 17, Autumn semester 18, Autumn semester 19
Level of education: Second cycle
Subject: Informatics/Computer and Systems Scie...
Main Field(s) of Study: Computer Science
In-Depth Level: A1N (Second cycle, has only first-cycle course/s as entry requirements)
School: IDT
Ratification date: 2017-01-31
Change date: 2018-02-01


The rapid development of digital technologies and advances in communications mean that enormous amounts of data with complex structures, known as "big data", are produced every day, and the volume is growing exponentially. The aim of this course is to give the student insight into the fundamental concepts of machine learning with big data, as well as recent research trends in the domain. The student will learn about problems and industrial challenges through domain-based case studies. Furthermore, the student will learn to use tools to develop systems that apply machine-learning algorithms to big data.

Learning outcomes

After completing the course, the student shall be able to:

1. describe the basic principles of machine learning and big data
2. demonstrate the ability to identify key challenges in using big data with machine learning
3. show the ability to select suitable machine learning algorithms to solve a given problem for big data
4. demonstrate the ability to use tools for big data analytics and present the results of the analysis

Course content

Module 1. Introduction and background: reviews machine learning (ML) and big data processing techniques and related subtopics, with a focus on the underlying themes.
Module 2. Case studies: presents case studies from different application domains and discusses key technical issues in developing such systems, e.g., noise handling, feature extraction, feature selection, and learning algorithms.
Module 3. Machine learning techniques in big data analytics: provides a basic understanding of learning theory, cluster analysis, deep learning, and other classification techniques appropriate for development work, as well as issues in constructing systems that use big data.
Module 4. Data analytics with tools: presents open-source tools, e.g., KNIME and Spark, with examples that guide the student through basic analysis of big data.

Teaching methods

Online video-based lectures, problem-based learning, assigned readings of scientific articles (reading, searching the web), chat rooms/discussion forum.

Specific entry requirements

100 credits, of which at least 70 credits are in Computer Science or equivalent, including at least 30 credits in programming or software development. In addition, at least 12 months of documented work experience in software development or related areas is required, as are Swedish course B/Swedish course 3 and English course A/English course 6. For courses given entirely in English, an exemption is made from the requirement for Swedish course B/Swedish course 3.


Examination

Written assignment (INL1), (Module 1), 1.0 credit, (examines learning outcome 1), marks Fail (U) or Pass (G)
Written assignment (INL2), (Module 2), 1.5 credits, (examines learning outcome 2), marks Fail (U) or Pass (G)
Written assignment (INL3), (Module 3), 2.0 credits, (examines learning outcome 3), marks Fail (U) or Pass (G)
Project (PRO1), (Module 4), 3.0 credits, (examines learning outcome 4), marks Fail (U) or Pass (G)


A student who has a certificate from MDH regarding a disability has the opportunity to submit a request for supportive measures during written examinations or other forms of examination, in accordance with the Rules and Regulations for Examinations at First-cycle and Second-cycle Level at Mälardalen University (2016/0601). It is the examiner who takes decisions on any supportive measures, based on what kind of certificate is issued, and in that case which measures are to be applied.

Suspicions of attempting to deceive in examinations (cheating) are reported to the Vice-Chancellor, in accordance with the Higher Education Ordinance, and are examined by the University’s Disciplinary Board. If the Disciplinary Board considers the student to be guilty of a disciplinary offence, the Board will take a decision on disciplinary action, which will be a warning or suspension.

Rules and regulations for examinations


Two-grade scale

Course literature is preliminary until 3 weeks before the course starts. Literature may be valid over several terms.

Valid from: Autumn semester 18

Decision date: 2018-07-04

Last update: 2018-07-04


Shalev-Shwartz, Shai; Ben-David, Shai;

Understanding machine learning : from theory to algorithms

ISBN: 978-1-107-05713-5 LIBRIS-ID: 16717946

xvi, 397 p.

Abu-Mostafa;

Learning From Data

ISBN: 9781600490064 LIBRIS-ID: 13914157

James, Gareth.; Witten, Daniela.; Hastie, Trevor.; Tibshirani, Robert.;

An Introduction to Statistical Learning : with Applications in R [electronic resource]

ISBN: 978-1-4614-7138-7 LIBRIS-ID: 14557777

XIV, 426 p. 150 illus., 146 illus. in color.

Richter, Michael M.;

Case-Based Reasoning : A Textbook [electronic resource]

LIBRIS-ID: 21159926

Guyon, Isabelle;

Feature extraction : foundations and applications / Isabelle Guyon ... [et al.] (eds.) [electronic resource]

LIBRIS-ID: 22022307

Goodfellow, Ian; Bengio, Yoshua; Courville, Aaron;

Deep learning

ISBN: 978-0-262-03561-3 LIBRIS-ID: 19973915

xxii, 775 pages

Yu, Shui.; Guo, Song.;

Big Data Concepts, Theories, and Applications

ISBN: 978-3-319-27763-9 LIBRIS-ID: 19419072

VIII, 437 p. 97 illus., 17 illus. in color.

Karau, Holden;

Learning Spark : Lightning-Fast Big Data Analytics [electronic resource]

LIBRIS-ID: 21399966

Guido, Sarah;

Introduction to Machine Learning with Python

ISBN: 9781449369415 LIBRIS-ID: 17554477

400 p.

Raschka, Sebastian;

Python machine learning : unlock deeper insights into machine learning with this vital guide to cutting-edge predictive analysis

ISBN: 1-78355-513-0 LIBRIS-ID: 18646485

425 p.

Provost, Foster;

Data Science for Business : What You Need to Know about Data Mining and Data-analytic Thinking [electronic resource]

LIBRIS-ID: 21842654


Tsai, C. W.; Lai, C.F.; Chao, H.C.; Vasilakos, A.V.;

Big data analytics: a survey

Landset, S.; Khoshgoftaar, T.M.; Richter, A.N.; Hasanin, T.;

A survey of open source tools for machine learning with big data in the Hadoop ecosystem

Fan, J.; Han, F.; Liu, H.;

Challenges of Big Data analysis

Blum, A.L.; Langley, P.;

Selection of relevant features and examples in machine learning

Dash, M.; Liu, H.;

Feature selection for classification

Aamodt, A.; Plaza, E.;

Case-based reasoning: foundational issues, methodological variations, and system approaches

Lopez De Mantaras, R.; et al.;

Retrieval, reuse, revision and retention in case-based reasoning

Meng, X.; et al.;

MLlib: machine learning in Apache Spark