Welcome to the Moodle page of Big Data Computing! For any further information, students must refer to the official documentation available on the Sapienza website.

Attending Classes in Person: Room G50 - Building G, Viale Regina Elena

Students who wish to attend classes in person must submit their request through the Infostud Lab App or the Prodigit Sapienza online booking system, according to the established rules (please see here).

Once the booking is confirmed, and according to the class schedule above, students must go to Room G50, which is located on the 3rd floor of Building G in Viale Regina Elena.

This opens up a number of challenges on how to deal with those data, as traditional computing paradigms were not conceived to operate at such a scale.

In addition to addressing foundational computer science problems, such as searching and sorting, big data computing mainly focuses on extracting knowledge - thereby value - from large-scale data sets using advanced data analysis techniques, such as machine learning. This course is intended to provide graduate-level students with a deep understanding of programming models and tools that are suitable for the large-scale analysis of data distributed across clusters of computers.

Prerequisites

The course assumes that students are familiar with the basics of data analysis and machine learning, properly supported by a strong knowledge of foundational concepts of calculus, linear algebra, and probability and statistics.

In addition, students must have non-trivial computer programming skills, preferably in Python. Previous experience with Hadoop, Spark, or distributed computing is not required.

Exams

Students must demonstrate their understanding of the subject by developing a software project that leverages the methodologies and tools introduced during classes.

Projects must of course refer to typical Big Data tasks: e.

The topic of the project must in any case be agreed with the professor in advance; references from which to select interesting projects will, however, be suggested throughout the course e. Projects can be done either individually or in groups of at most 2 students, and they should be accompanied by a brief presentation written in English e.

Finally, there will be an oral exam in which submitted projects will be discussed in English; other questions on any topic addressed during the course may also be asked, but those can be answered in either English or Italian, as the student prefers.

