
| Course Code | CSE424 |
| Course Type | Area Elective |
| Course Group | First Cycle (Bachelor's Degree) |
| Education Language | English |
| Work Placement | N/A |
| Theory | 2 |
| Practice | 2 |
| Credit | 3 |
| Lab | 0 |
| ECTS | 6 |
The recent explosion of social media and the computerization of every aspect of economic activity have created large volumes of mostly unstructured data: web logs, videos, speech recordings, photographs, e-mails, tweets, and similar. In parallel, computers keep getting more powerful and storage ever cheaper. Today we can reliably and cheaply store huge volumes of data, analyze them efficiently, and extract business and socially relevant information. This course introduces several key IT technologies that you will be able to use to manipulate, store, and analyze big data.
The course provides in-depth coverage of special topics in big data, from data generation, storage, management, and transfer to analytics, with a focus on the state-of-the-art technologies, tools, architectures, and systems that constitute big data computing solutions in high-performance networks. Real-life big data applications in various domains (particularly in the sciences) are introduced as use cases to illustrate the development, deployment, and testing of a wide spectrum of emerging big data solutions. The course also covers data mining and machine learning algorithms for analyzing very large amounts of data.
The course material will be drawn from textbooks as well as recent research literature. The following topics will be covered this year: Hadoop, MapReduce, association rules, large-scale supervised machine learning, data streams, clustering, NoSQL and related Hadoop-ecosystem systems (Cassandra, Pig, Hive), and applications including recommendation systems, the Web, and security.
| Lec. Hüseyin ABACI |
| 1. | By providing a balanced view of "theory" and "practice," the course should allow the student to understand, use, and build practical big data analytics and management systems. The course is intended to provide a basic understanding of the issues and problems involved in massive online repository systems, a knowledge of currently practical techniques for satisfying the needs of such systems, and an indication of the current research approaches that are likely to provide a basis for tomorrow's solutions. |
| 2. | Learning big data concepts, terminology, data analytics characteristics, and types of big data, such as the 5 Vs and structured, unstructured, and semi-structured data and metadata. |
| 3. | Comprehension of data analysis techniques and topics such as quantitative and qualitative analysis, data mining, statistical analysis, A/B testing, correlation, and regression analysis. |
| 4. | Having comprehensive knowledge of storage concepts such as clusters, distributed file systems, RDBMS, NoSQL, and in-memory storage, and of big data processing concepts such as parallel, distributed, and batch data processing. |
| 5. | Having comprehensive knowledge of parallel processing and other design patterns for big data processing: the Cloudera virtual machine, HDFS (Hadoop Distributed File System), YARN (Yet Another Resource Negotiator), and Hue. |
| 1. | Big Data Fundamentals: Concepts, Drivers & Techniques (1st ed.). Thomas Erl, Wajid Khattak, and Paul Buhler. Prentice Hall Press, Upper Saddle River, NJ, USA. 2016. |
| 2. | Big Data: Principles and Best Practices of Scalable Realtime Data Systems. Nathan Marz and James Warren. Manning Publications. 2015. |
| 3. | Hadoop: The Definitive Guide. Tom White. O’Reilly. 2015. |
| Type of Assessment | Count | Percent |
|---|---|---|
| Midterm Examination | 1 | 20% |
| Final Examination | 1 | 34% |
| Practice | 10 | 20% |
| Quiz | 3 | 6% |
| Assignment | 1 | 20% |
| Activities | Count | Preparation | Time | Total Work Load (hours) |
|---|---|---|---|---|
| Lecture - Theory | 14 | 0 | 2 | 28 |
| Lecture - Practice | 14 | 0 | 2 | 28 |
| Assignment | 5 | 0 | 2 | 10 |
| Term Project | 1 | 8 | 7 | 15 |
| Quiz | 4 | 5 | 1 | 26 |
| Midterm Examination | 1 | 16 | 2 | 18 |
| Final Examination | 1 | 20 | 2 | 22 |
| TOTAL WORKLOAD (hours) | | | | 147 |
| | PO-1 | PO-2 | PO-3 | PO-4 | PO-5 | PO-6 | PO-7 | PO-8 | PO-9 | PO-10 | PO-11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| LO-1 | 5 | 5 | 4 | 4 | 4 | 4 | 4 | | | | |
| LO-2 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 4 | | | |
| LO-3 | 5 | 5 | 4 | 4 | 4 | 5 | 4 | | | | |
| LO-4 | 5 | 5 | 5 | 4 | 5 | 4 | 4 | | | | |
| LO-5 | 5 | 4 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 5 | 4 |