Introduction to Data Science: Building Recommender Systems
Course Summary
This hands-on course is suitable for software engineers, data analysts and statisticians. It is problem-driven and focuses on helping participants understand what a data scientist does, the problems they typically solve and their approach to doing so. By taking a practical approach to the subject, including multiple hands-on exercises, participants will leave the course with skills they can immediately apply to real-world problems.
Download the full agenda for Cloudera's Introduction to Data Science.
Read the blog post: Training a New Generation of Data Scientists.
Data Science Webinar
Watch the on-demand webinar, Training a New Generation of Data Scientists, to learn what data scientists do, how they think about problems, how data science relates to Hadoop, and how Cloudera training can help you join this growing and increasingly important profession. The webinar is followed by a Q&A with Josh Wills, Cloudera Senior Director of Data Science.
Duration
3 days.
You Will Learn To
- Describe the role and responsibilities of a data scientist
- Explain several ways in which data scientists create value for organizations across many industries
- Locate and acquire data from diverse sources
- Use transformation and normalization techniques to produce accurate, useful data sets
- Determine the most appropriate type of analysis to perform for a given problem
- Implement an automated recommendation system
- Develop, evaluate and refine scoring systems for recommenders
- Understand the considerations involved in working at scale
- Identify meaningful, actionable and business-oriented results from the analysis
Prerequisites
This course is suitable for software engineers, data analysts and statisticians with basic knowledge of Apache Hadoop, including HDFS, MapReduce, Hadoop Streaming and Apache Hive. Students should be proficient in a scripting language: Python is strongly preferred, but familiarity with Perl or Ruby is sufficient.
Certification Exam
Following successful completion of the training class, attendees will be given a voucher for one attempt at the written certification exam. This voucher is non-transferable and is given only to individuals who successfully complete the entire training class. Participants are also encouraged to prepare for and take the Data Scientist hands-on lab exam. Both exams will be available soon.
Outline
- Introduction
- Data Science Overview
- Use Cases
- Project Lifecycle
- Data Acquisition
- Evaluating Input Data
- Data Transformation
- Data Analysis and Statistical Methods
- Fundamentals of Machine Learning
- Recommender Overview
- Introduction to Apache Mahout
- Implementing Recommenders with Apache Mahout
- Experimentation and Evaluation
- Production Deployment and Beyond
- Conclusion
- Appendix A: Hadoop Overview
- Appendix B: Mathematical Formulas
- Appendix C: Language and Tool Reference
Training Schedule
| United States | Jun 2013 | Jul 2013 | Aug 2013 | Sep 2013 |
|---|---|---|---|---|
| Charlotte, NC | | Jul 16 - Jul 18 | | |
| Dallas, TX | | | Aug 21 - Aug 23 | |
| Denver, CO | | | Aug 14 - Aug 16 | |
| Los Angeles, CA | | Jul 24 - Jul 26 | | |
| San Francisco Bay Area, CA | | Jul 10 - Jul 12 | Aug 7 - Aug 9 | Sep 18 - Sep 20 |
| Washington, DC Metro Area | Jun 26 - Jun 28 | Jul 17 - Jul 19 | | Sep 11 - Sep 13 |

| International | Jun 2013 | Jul 2013 | Aug 2013 | Sep 2013 |
|---|---|---|---|---|
| London, United Kingdom | | Jul 29 - Jul 31 | | |
| Tokyo, Japan | Jun 24 - Jun 26 | | | |