2019-2020 2020-2021 2021-2022 2022-2023 2023-2024
Browse
by subject...
    Schedule
view...
 

1 - 3 of 3 results for: CS246

CS 246: Mining Massive Data Sets

The availability of massive datasets is revolutionizing science and industry. This course discusses data mining and machine learning algorithms for analyzing very large amounts of data. Topics include: Big data systems (Hadoop, Spark); Link Analysis (PageRank, spam detection); Similarity search (locality-sensitive hashing, shingling, min-hashing); Stream data processing; Recommender Systems; Analysis of social-network graphs; Association rules; Dimensionality reduction (UV, SVD, and CUR decompositions); Algorithms for large-scale mining (clustering, nearest-neighbor search); Large-scale machine learning (decision tree ensembles); Multi-armed bandit; Computational advertising. Prerequisites: At least one of CS107 or CS145.
Terms: Win | Units: 3-4 | UG Reqs: WAY-FR

CS 246H: Mining Massive Data Sets Hadoop Lab

Supplement to CS 246 providing additional material on the Apache Hadoop family of technologies. Students will learn how to implement data mining algorithms using Hadoop and Apache Spark, how to implement and debug complex data mining and data transformations, and how to use two of the most popular big data SQL tools. Topics: data mining, machine learning, data ingest, and data transformations using Hadoop, Spark, Apache Impala, Apache Hive, Apache Kafka, Apache Sqoop, Apache Flume, Apache Avro, and Apache Parquet. Prerequisite: CS 107 or equivalent.
Last offered: Winter 2020

CS 341: Project in Mining Massive Data Sets

Students work in teams of three to solve a problem involving the analysis of a massive dataset. A proposal, early in March is required. There will be an information session (announced in CS246) explaining the datasets available in early March and this information will also be on the CS341 course website in late February. Each accepted team will be assigned a mentor who will work with them regularly throughout the quarter. Teams will also be provided access to significant computing resources on a commercial public cloud.
Last offered: Spring 2020
Filter Results:
term offered
updating results...
teaching presence
updating results...
number of units
updating results...
time offered
updating results...
days
updating results...
UG Requirements (GERs)
updating results...
component
updating results...
career
updating results...
© Stanford University | Terms of Use | Copyright Complaints