BigData

Intro To Big Data

CS 567

COURSE INFORMATION

  • Class Time: MW 12:00-1:15 PM
  • Building and Room: CEC B146B
  • Prerequisites: Fluency in at least one of the following programming languages: Python, Java, or Scala
  • Background in Data Mining, Machine Learning, or Statistics
  • UNM Learn: CS-567 (Fall 2016)

Instructor

  • Trilce Estrada, Assistant Professor
  • Email: estrada@cs.unm.edu
  • Office: CARC 2004A
  • Office hours: M 9:00-12:00

Course description:

The field of computer science is experiencing a transition from computation-intensive to data-intensive problems, wherein data is produced in massive amounts by large sensor networks, new data acquisition techniques, simulations, and social networks. Efficiently extracting, interpreting, and learning from very large datasets requires a new generation of scalable algorithms as well as new data management technologies.

In this course we explore key data analysis and management techniques which, when applied to massive datasets, are the cornerstone of real-time decision making in distributed environments, business intelligence on the Web, and scientific discovery at large scale. In particular, we examine the map-reduce parallel computing paradigm and associated technologies such as distributed file systems, NoSQL databases, and stream computing engines. Additionally, we review machine learning methods that make possible the efficient analysis of large volumes of data in near real time.

This course is highly interactive and follows a problem-based learning philosophy; students are expected to use these technologies to design highly scalable systems that can process and analyze Big Data for a variety of scientific, social, and environmental challenges.

Core topics:

  • Large databases and their evolution.
  • Big Data technology and trends, with special attention to the Map-Reduce paradigm.
  • Searching, indexing, and their implications for memory management.
  • Information extraction and feature selection.
  • Supervised learning, unsupervised learning, and stream mining.
  • Introduction to cloud computing and Amazon EC2.

Course objectives:

By the end of this course, the student will be familiar with the fundamental concepts of Big Data management and analytics; will be competent in recognizing the challenges faced by applications that deal with very large volumes of data, and in proposing scalable solutions to them; and will understand how Big Data impacts business intelligence, scientific discovery, and our day-to-day life.


For more information, see the Syllabus.

Register!


Hortonworks Academic Partner

Supported by AWS in Education Grant award


Archive

Poster session BigData 2013