Learning Outcomes
At the end of this course, students should be able to:
1. explain the principles and best practices of managing data efficiently and
effectively;
2. demonstrate knowledge of SQL and NoSQL;
3. explain data warehouse concepts, methodologies and tools; and
4. explain data mining architecture and applications.
Course Contents
Relational Databases: Mapping conceptual schema to relational schema; Database Query
Languages (SQL) and NoSQL; concept of functional dependencies and multi-valued
dependencies. Transaction processing; distributed databases, XML and semantic Web. Data
warehousing. Introduction to data science. Introduction to Data Warehouse; OLTP Systems;
Differences between OLTP Systems and Data Warehouse; Characteristics of Data Warehouse;
Functionality of Data Warehouse; Advantages and Applications of Data Warehouse;
Top-Down and Bottom-Up Development Methodology; Tools for
Data Warehouse Development; Data Warehouse Types. Introduction to Data Mining: Scope of Data Mining;
What is Data Mining; How Data Mining Works; Predictive Modelling; Data Mining and Data
Warehousing; Architecture for Data Mining; Profitable Applications; Data Mining Tools.
Lab work: Practical exercises on basic R commands and data structures for manipulating
data; reading data in multiple formats into R and writing it back out; and using loops, conditional
statements, and functions to automate common data management tasks. Exercises on how
to clean and manage multiple complex datasets, manipulate textual data, and apply basic web scraping
techniques to both standard web pages and the Twitter API. Work on techniques and
hardware necessary to manage large datasets efficiently. Practical exercises on managing
multiple datasets by example; working with text data; converting between long- and wide-format data;
and dealing with messy data. R Programming Fundamentals for data I/O and packages,
looping and conditional statements, and functions.