Parallel and Distributed Computing II, Spring 2025
Information about the course
This is a one-term course designed for third-year undergraduate students pursuing a Bachelor's degree in Computer Science. It introduces the principles and practices of distributed computing, with a specific focus on Big Data analytics and engineering.
Prerequisites for this course include foundational knowledge of Computer Science, proficiency in Python programming, familiarity with Linux environments, basic bash command usage, and experience with Git.
Throughout the term, students will study key distributed computing frameworks and technologies such as HDFS (Hadoop Distributed File System), MapReduce, Hive, Apache Spark, and Spark Streaming. The course is structured to provide a balance of theoretical knowledge and practical application, with students gaining direct experience by working on the university's dedicated Hadoop cluster. To access the cluster for course-related projects and hands-on learning, students are required to fill out a form to obtain an account.
Upon completion, students will be prepared to tackle complex data processing tasks and to work effectively in the Big Data analytics and engineering industry.
Grading
Links
Contacts
Teacher: Julia Ivanova (Telegram: @lajulienn)