DS800: Introduction to Data Processing

Study Board of Science

Teaching language: Danish or English depending on the teacher
EKA: N340041102
Assessment: Second examiner: External
Grading: 7-point grading scale
Offered in: Odense
Offered in: Autumn
Level: Master

STADS ID (UVA): N340041101
ECTS value: 10

Date of Approval: 13-01-2021


Duration: 1 semester

Version: Archive

Comment

DISCONTINUED - Autumn 2021 - DS800 has been replaced by another course (DS831: Programming for Data Science, 10 ECTS credits).
See the Danish version for further information.

Entry requirements

The course cannot be taken by students enrolled in the master programme in Computer Science.
The course cannot be taken by students who have taken DM561 or DM562.

Academic preconditions

None

Course introduction

The aim of the course is to enable the student to solve data analysis tasks for a variety of problems from different research areas. Besides algorithmic thinking, data analysis workflows include activities such as data modeling, gathering, cleaning and processing, as well as visualizing selected attributes in basic plots. This is important for the remainder of the Data Science education, as it provides the basis for carrying out data analysis projects.

The course will provide students with knowledge and competencies in methods from linear algebra, such as matrices and matrix calculations, allowing for a mathematical description of a data science assignment. In addition, the course will provide the students with skills in writing small computer programs to perform scientific calculations that arise in linear algebra or in the methods that will be introduced later in the degree programme.

The course provides an academic basis for studying the topics Data Mining and Machine Learning, Applied Machine Learning, Visualization and Deep Learning, which are part of the degree.

In relation to the competence profile of the degree it is the explicit focus of the course to:
  • Give the competence to develop solutions for data analysis tasks
  • Give the competence to apply and integrate existing modules for data processing
  • Give knowledge and understanding of the principles of programming
  • Give knowledge and understanding of algorithmically processing large amounts of data within different subject areas
  • Give skills in software development 
  • Give skills in data collection, cleaning, validation, integration and visualization

Among others, students taking the course will in particular acquire the following 21st Century Skills:

  • The ability to integrate and assess information
  • Competently find, utilise and assess information
  • Being able to execute and implement
  • Be flexible and adaptable
  • Co-create solutions to existing problems and work effectively in teams

Expected learning outcome

The learning objectives of the course are that the student demonstrates the ability to:
  • Apply learned problem solving strategies to different data processing tasks
  • Adapt existing solutions to related tasks across domains
  • Develop new data analysis strategies
  • Develop Python programs that implement data processing workflows
  • Find, select and utilize existing modules to collect, clean and process data
  • Collaboratively develop data analysis solutions in project teams

Content

The following topics are contained in the course:

1. Basics of computing and algorithmic thinking
2. Python programming
  • basic data types
  • branching
  • loops
  • functions
  • mutable data types
  • modules
  • file I/O, exceptions
  • classes
  • basic data visualization
3. Data analysis workflows with data sets from different domains, e.g.:
  • biography data
  • climate data
  • textual data
  • numerical data

Literature

See Blackboard for syllabus lists and additional literature references.

Examination regulations

Exam element a)

Timing

January

Tests

Written examination.

EKA

N340041102

Assessment

Second examiner: External

Grading

7-point grading scale

Identification

Student Identification Card

Language

Normally, the same as teaching language

Examination aids

Allowed. A more detailed description of the exam rules will be posted in itslearning.

ECTS value

10

Additional information

The examination form for the re-examination may differ from the examination form for the ordinary examination.

Indicative number of lessons

90 hours per semester

Teaching Method

At the Faculty of Science, teaching is organized according to the three-phase model, i.e. intro, training and study phase.
  • Intro phase (lectures, class lessons) – Number of hours: 46
  • Training phase – Number of hours: 44, including examiner hours: 44
The intro phase introduces new material and topics. In the training phase, these are worked through with exercises that are prepared at home and discussed in class to validate the acquired knowledge. The study phase, in the form of practical applications, gives the students the opportunity to apply the knowledge they have acquired.

Study phase activities:
  • Reading from textbooks
  • Solving homework assignments
  • Applying acquired knowledge to practical projects

Teacher responsible

Name E-mail Department
Arthur Zimek zimek@imada.sdu.dk Data Science
Stefan Jänicke stjaenicke@imada.sdu.dk Data Science

Timetable

Administrative Unit

Institut for Matematik og Datalogi (datalogi)

Team at Educational Law & Registration

NAT

Offered in

Odense

Recommended course of study