Fall 2021 Syllabus

A course unlike any other data science course...?

Course Name: Applied Data Science with Venture Applications: Data-X

Course Number: 

  • INDENG 135 (undergraduate students)
  • INDENG 235 (graduate students)

Units: 3

Semester: Fall 2021

Faculty and Graduate Student Instructors (GSIs):

Role Name and Email Office Hours
Faculty Ikhlaq Sidhu, sidhu@berkeley.edu By appointment, and Fridays after class ends

For required project meetings: Mondays 1-2pm and Thursdays 4-5pm PST

Please contact Melissa Glass (Manager, New Initiatives), m.glass@berkeley.edu 

Faculty Derek S. Chan, derekschan@berkeley.edu By appointment, and Fridays after class ends

For required project meetings: Mon-Thu 12-1 or 5-6pm PST, or evening/weekend options via Zoom

GSI Mahan Tajrobehkar, mahan_tajrobehkar@berkeley.edu  Wednesdays, 4:10-5pm PST, via Zoom (Password: 800392) 
GSI Ruiqi Guo, ruiqiguo@berkeley.edu   Mondays, 2-3 PM PST, via Zoom

*At office hours and on Slack Data-X (INDENG 135 / 235), for specific questions on details about algorithms, coding, and math, please ask the GSIs, Mahan and Ruiqi. As students, please also ask and help each other.

Meeting Day/Time: Fridays, 2:10-5:00pm (usually ~2.5 hours) from 8/27 to 12/3/2021. No class Friday 11/26 due to the Thanksgiving holiday 11/25-11/26.

Meeting Location: Room 0105, North Gate

Course Website: You are at the course website, https://datax.berkeley.edu/berkeley-course/ 🙂

Course Prerequisites: Due to the technical nature of this class, students are recommended to have the ability to write code in Python, and have taken a probability or statistics course. At the same time, students from all majors are welcome.

  • A good way to apply Python further is to partner on a team to build real-world projects through a proven framework like Innovation Engineering. Data-X incorporates that framework for AI, data, and systems projects, for example, with students' storytelling, customer validation through low-tech designs / demos, and executing while learning technically.
  • Though stronger programming is correlated with stronger team projects, student diversity adds value. Plus the course not only includes Python coding but is scheduled to introduce additional “low code” (limited coding) tech tools for building real-world projects.

Course Description: Today, the world is literally reinventing itself with Data and AI. However, learning a set of ‘related theories’ and being able to ‘make it work’ are not the same. And, in areas as important as Artificial Intelligence, Data Science, and Machine Learning, if we collectively cannot actually implement and create, then we'll reduce our competitive advantage, economic strength, and even national/global security.

The Data-X framework is designed to bridge the gap between theory and practice as well as academia and industry, by exposing students to state-of-the-art implementation techniques and mindsets.

This highly-applied course surveys a variety of key concepts and tools that are useful for designing and building data science, AI, and Machine Learning applications and systems. The course introduces modern, open source computer programming tools, libraries, and code samples that can be used to implement data applications. The mathematical concepts highlighted in this course include filtering, prediction, classification, decision-making, LTI systems, spectral analysis, and frameworks for learning from data. Each math concept is linked to implementation using Python libraries like NumPy for math array functions, Pandas for manipulation of tables, scikit-learn for machine learning modeling, TensorFlow and Keras for deep learning, and many other topics related to NLP, Neural Networks, Recommender Systems, etc.

Almost every week, the course covers three tracks, each intended to guide student teams' project-based work: 1) broader insight from inductive learning games and industry guests; 2) code and theory; and 3) teams and projects (e.g., live team demos and feedback). Together, the tracks help you apply the Innovation Engineering framework, which includes story development, execution while learning, innovation behaviors, and leadership.

Course Objectives

You will learn

  • To define and execute what, why, and how to build real world AI, data, and systems applications for users – working on a project in a team
  • Computer science tools for data science
  • Relevant theory, critical thinking, and insights on AI, data, and systems

Textbook/Resources

Course Communication

Announcements will be made via Slack Data-X (INDENG 135 / 235). As students, much of your learning will be from each other. Slack can facilitate class conversations and team building, and is used in industry. Slack will also be an option for asking questions during live class, in case you prefer to ask via text rather than aloud.

Attendance/Participation Policy

A student shouldn't attend class if sick. Otherwise, because the course is an applied, team-based project course, attendance and participation are important. If a student has noticeable absences from activities in and out of class, please connect with the instructors to explain the rationale. Students may be dropped from the class due to absences. All classes are scheduled to be automatically recorded (e.g., to support exception cases such as sickness) and can be found at bCourses -> Fall 2021 Data-X (INDENG 135 / 235) -> Media Gallery.

In class, students must adhere to current campus directives related to COVID-19 and refusal to do so may result in the student being asked to leave.

Weekly Schedule and Assignments (subject to change)

The weekly schedule and assignments are meant to provide an outline of the course material and structure. However, it is not set in stone and may be modified as the semester unfolds. If substantive updates occur on the syllabus, instructors will communicate via Slack Data-X (INDENG 135 / 235) and in class so you are aware this webpage is updated.

Classes run ~2.5 hours and include one or more 10-minute breaks.

Acronyms below: IS=Ikhlaq Sidhu, DSC=Derek S. Chan

Week Date Broader insight (mostly) Code and theory Teams and projects Due by subsequent class Recommended by subsequent class
1 8/27 (30 mins) IS & DSC: Why the Data-X course is important for students, set course expectations and grading, and cover the history of Data-X. (30-45 mins) IS & DSC Lecture: Intro to ML insights (45 mins) IS & DSC: Cover project definition and your questions (reference module 020) * Read Innovation Engineering, Chapter 4: A Step-by-Step Guide for Innovative Projects (23 pages). For the encryption password, please see Slack Data-X (INDENG 135 / 235), pinned in the #general channel.

* Sheet: Everyone brainstorms, writes down 3 ideas for a project (also quantify potential problem or impact), and a one-paragraph introduction of who you are and what skills and background you bring to the class.

Set up your Jupyter environment (modules 030A, 030B). Use these modules if needed. 030: Installation Instructions. Review basic Python code from cookbook module 030C.

Search and evaluate potential project datasets

2 9/3 Venture track

* (30 mins) Guest Lecture 2:10-2:40pm PST: "Common pitfalls of entrepreneurship" by Shuo Chen (General Partner at IOVC | Faculty at UC Berkeley | CEO at Shinect)

(30 mins) IS: Lecture: Review of Python Data Handling Tools.

Recommended: Review code from cookbook Numpy and Pandas modules 110 and 120 on your own. Be able to run the notebooks on your own computer. If not familiar with these tools, view code videos.

(30 mins) DSC Interactive Session: Diversity, quantity, and quality of project datasets (slides and code)

(60 mins)

* IS: Explain how teams are formed

* DSC: Review of collaborators/industry experts background and project interests via LinkedIn profiles

* 5-10 students pitch project ideas for 1 min each for 1 extra participation point.

* Note: Industry experts do not decide or direct projects, but can act as sounding boards.

* Everyone completes a survey about their interests, behaviors, and skills. Everyone fills out their top industry expert choices. Due Wednesday, 9/8, 11:59pm PST.

* Read Innovation Engineering, Chapter 6: Common Strategic Errors and Story Narrative Mistakes (13 pages). For the encryption password, please see Slack Data-X (INDENG 135 / 235), pinned in the #general channel.

Review code from cookbook Numpy and Pandas modules 110 and 120 on your own. Be able to run the notebooks on your own computer. If not familiar with these tools, view code videos.

Search and evaluate potential project datasets (continued)

How To Stop Artificial Intelligence From Marginalizing Communities?, by Timnit Gebru 

How I'm fighting bias in algorithms, by Joy Buolamwini 

Artificial Intelligence needs all of us, by Rachel Thomas 

Bias in Data and A.I., by Ruha Benjamin 

3 9/10 Non-Venture track

* (30 mins) Guest Lecture 2:10-2:40pm PST: "AI for Social Good Projects" by Ruth Alcantara (Program Manager, AI for Social Good at Google)

Data science application

* (30-45 mins) IS Lecture: A System's View of Data Science with Prediction

* (30 mins) DSC Interactive Session: Slides and code on application of classification, feature importances, precision vs. recall for defining success

(30 mins) Team assignments announced. Project topics not specified yet.

* IS: Demonstrate Navigator.

* DSC: Review Dataset Slide.

* Select preliminary project idea as a team and fill out best elevator pitch of your team project in NABC (Need, Approach, Benefit, Competition) and Dataset slides 2-3 and 7. Submit to Week 4 subfolder.

* List links / locations of potential dataset(s) for your idea

* Start background research on what is available to you

4 9/17 Low-tech demo

* (40 mins) DSC Inductive Learning Game: Customer validation (incentives for winning teams). Each project team is advised to bring at least 1 laptop.

* (15 mins) IS review Low Tech Demo slides due for next week

(30+ mins) IS low-code and high-code tools - including Anvil and licenses.

* Introduce tools, and demo Berkeley Innovation Index (BII) application code

(45 mins)

* IS: Student team checklist review

* Master Class Format: 3-5 project pitches in NABC format

* Complete first version of low-tech demo slides. Submit to Week 5 subfolder.

* Read Innovation Engineering, specific section (9 pages) withheld on purpose until week 4. For the encryption password, please see Slack Data-X (INDENG 135 / 235), pinned in the #general channel.

5 9/24 * (30 mins) DSC Inductive Learning Game: To evaluate whether companies use or don’t use AI (incentives for top team)

* (30 mins) DSC Interactive Session: H2O.ai (low-code automated ML option on Titanic dataset)

(30 mins) DSC Titanic Notebook (Modules 160A-160D walkthrough of pros, cons, and insights beyond); new Module 161; and results vs. H2O.ai (60 mins) 3-5 sample presentations of low-tech demos and Q&A Due 10/1/2021

* Individual student technical homework assignment. Submit to a Week 6 subfolder.

* Submit team self-review slide(s) on whether you addressed "Common Strategic Errors and Story Narrative Mistakes" (Resource: Innovation Engineering, Chapter 6). Submit to a Week 6 subfolder.

* One-time exception to (re)submit week 4 and/or week 5 to a Week 6 subfolder. Most recent versions in Week 4-6 subfolders will be graded.

6 10/1 (20 mins) IS: Review tech strategy tools on the Innovation Engineering site and slide decks you will create, and review technology strategy template due future week (50 mins)

* Guest Lecture: "Introduction to Neural Networks, then Bidirectional Encoder Representations from Transformers (BERT)" by Mario Filho (Machine Learning Expert | Kaggle Grandmaster | Data Scientist)

* DSC: BERT in industry and within an AI/data system

Code for BERT and Anvil, Colab, and APIs

(60 mins) 3-5 sample presentations of low-tech demos and Q&A Due 10/8/2021 by class

* Submit individual Technology Strategy template (through the trade-offs slide) in a Week 7 folder. Each student on a team picks one different component or platform tool.

Between 10/4 and 10/15

* Every student team schedules time and meets with their assigned instructor group for 30 minutes. Industry mentor attendance is optional there.

* Module 170 ML Algorithm Overview video to cover the meaning of the classifications. Explain in your own words the difference between Logistic Regression, Trees and Neural Networks, and turn in 1 page.
7 10/8 -- (40 mins) IS Lecture: System's view of correlation

(35 mins) DSC: Brief technical topics on project data and system (17 mins). Then team-based execution activity, with option on project data or system (17 mins).

Code example for web scraping

At least 1 student demo technology strategy slides

2 teams demo slides, hold Q&A, and receive feedback

Due 10/15/2021 by class

* Submit brief written reflection on each presenting team via your berkeley.edu email at this Google Form.

Between 10/4 and 10/15

* Every student team schedules time and meets with their assigned instructor group for 30 minutes. Industry mentor attendance is optional there.

8 10/15 (45 mins) Guest Lecture 2:15-3:00pm PST: "AWS SageMaker, AI Services, & MLOps for building an AI/data system end-to-end" by Shelbee Eigenbrode [Principal AI/ML Specialist Solutions Architect at Amazon Web Services (AWS)] (45 mins) DSC Technical Inductive Learning Game: Active learning machine learning

(Learning objective: For your project or company, save time and/or money with strategic iterative data labeling and machine learning.)

(20 mins)

1 team demos slides, holds Q&A, and receives feedback

Due Mon 10/18/2021, 11:59pm PST

* In-class game active learning machine learning

Due 10/22/2021 by class

* Submit brief written reflection on each presenting team via your berkeley.edu email at this Google Form.

* Individual homework assignment: Submit your prototype system to a Week 9 subfolder.

9 10/22 (45 mins) Guest Lecture: "Production ML Systems, Infrastructure, Scalability, and Hidden Costs in Industry" by Michael Mui (Technical Lead at Uber AI) (45 mins) IS Lecture: Measurement for Decisions

Teams' app/code project demos start 10/22 and beyond, reflecting sprint execution

At least 2 teams demo technical app/code project work, more than slides; hold Q&A; and receive feedback

Due Mon 10/25/2021, 11:59pm PST

* Submit individual 4-question, multiple-choice survey (~3 minutes)

Due 10/29/2021 by class

* Submit brief written reflection on each presenting team via your berkeley.edu email at this Google Form.

* Submit individual self-selected homework module related to your project at a Week 10 subfolder. Consult with GSIs, as needed.

* Submit first group video to a Week 10 subfolder to share each member's individual learning goal and team role, your group's definition of end-of-semester success, and a demo of your technical app/code project work so far.

10 10/29 -- (45 mins) DSC Technical Inductive Learning Game: Grid vs. random search; Bayesian optimization; and ensemble selection. (Learning objective: For your project or company, build intuition for systematic ML practices.)

(45 mins) IS Theory Lecture: Spectral Information in Data (similar to Module 250), plus potential use in your projects/systems

At least 2 teams demo technical app/code project work, more than slides; hold Q&A; and receive feedback Due Wed 11/3/2021, 11:59pm PST

* In-class game on systematic ML

* In-class exercise on Fast Fourier transform (FFT), with instructions page 25

Due 11/5/2021 by class

* Submit brief written reflection on each presenting team via your berkeley.edu email at this Google Form.

* Submit another individual self-selected homework module related to your project at a Week 11 subfolder. Consult with GSIs, as needed.

* Submit second group video to a Week 11 subfolder. Demo technical app/code project work and learning the past week. 4 minutes maximum.

11 11/5 (45 mins) Guest Lecture 3:00-3:45pm PST: “10 commandments of startup creation” with stories of success/failure by Oren Etzioni (CEO at AI2; Professor Emeritus at University of Washington; Venture Partner at Madrona Venture Group)

(15 mins) DSC user testing preparation in advance of next week

--

 

(15 mins) Final project grading criteria

(15 mins) Free time for teams

(45 mins) At least 2 teams demo technical app/code project work, more than slides; hold Q&A; and receive feedback

Due 11/12/2021 by class

* Submit brief written reflection on each presenting team via your berkeley.edu email at a Google Form.

* Be prepared to conduct user testing of your app with other students

Between 11/8 and 11/19

* Every student team schedules time and meets with their assigned instructor group for 30 minutes. Industry mentor attendance is optional there.

12 11/12 (60 mins) DSC Interactive Session: Each team sets up a station in class and conducts user tests of their app on individuals from other teams

Learning outcome: Submit individual user testing template by class end, as practice to conduct team user testing of external users for the next homework

(15 mins) IS Lecture: Wrap-up on Fast Fourier Transform (FFT) from your exercise completed 10/29 - 11/3 Final 4 teams demo technical app/code project work, more than slides; hold Q&A; and receive feedback. The teams also share their usability test feedback from the earlier session that day. Due Mon 11/15/2021, 11:59pm PST

* Submit individual user testing template to week 12 subfolder

Due 11/19/2021 by class

* Submit brief written reflection on each presenting team via your berkeley.edu email at a Google Form.

* Submit team user testing assignment on external users to week 13 subfolder

Between 11/8 and 11/19

* Every student team schedules time and meets with their assigned instructor group for 30 minutes. Industry mentor attendance is optional there.

13 11/19 (60 mins) Work opportunities

Guest panel: AI / data roles

(60 mins) DSC: Hands-on-keyboard session on virtual server setup and access to persist scripts for system apps Preparation for final project presentation and submission of required items below

  • App
  • Slide deck
  • Code repository
  • News article
  • 5-minute video
Due Mon 11/22/2021, 11:59pm PST

* Submit individual virtual server setup to week 13 subfolder

Due by next class, 12/3

* Provide final project submission via Week 15 folder, write individual peer reviews, and co-present final project live 12/3

14 11/26 Thanksgiving holiday 11/25-11/26
15 12/3 Location: Sibley Auditorium at Bechtel Engineering Center near usual classroom North Gate 0105

Time: 1:10-5:30pm, but students only need to attend either the first half or second half to learn from and support other students at final project presentations. Student teams selected their respective presentation times in advance.

First half

* 1:10-1:25pm: Michelle Lee (Academic Program Manager)

* 1:25-3:25pm: First 9 teams, with ~12 minutes per team

Second half

* 3:30-3:45pm: Michelle Lee (Academic Program Manager)

* 3:45-5:30pm: Last 8 teams, with ~12 minutes per team

To avoid Zoom setup issues across teams, one laptop will be set up with Zoom video/audio, then each presenting team can join that Zoom with their own laptop(s) to share screen on stage in the auditorium.

 

Grading

Grade Adjusted Range*
A 86-110%
B 70-85%
C 61-69%
D 56-60%

*Adjusted Range: The top 25 students’ average (e.g., 90%) will be adjusted to 100%. An example illustration is below.

Top 25 students (average) = 90%, so 90% / 90% = 100% (Adjusted)
Student 1 = 99%, so 99% / 90% = 110% (Adjusted)
Student 25 = 87%, so 87% / 90% =   97% (Adjusted)
Student X = 70%, so 70% / 90% =   78% (Adjusted)
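
The adjustment illustrated above can be sketched in a few lines of Python. This is a minimal sketch, not the instructors' grading code: the function names are hypothetical, the top band (86-110%) is assumed to be an A, rounding to the nearest whole percent is assumed from the examples, and scores below the listed bands are left unlabeled because the syllabus does not name them.

```python
def adjusted_percent(raw_percent, top25_average):
    """Scale a raw percentage so that the top-25 average maps to 100%."""
    return raw_percent / top25_average * 100


def letter_grade(adjusted):
    """Map an adjusted percentage to the bands in the grade table.

    Assumption: the adjusted value is rounded to a whole percent first,
    and the 86-110% band is an A.
    """
    score = round(adjusted)
    if score >= 86:
        return "A"
    if score >= 70:
        return "B"
    if score >= 61:
        return "C"
    if score >= 56:
        return "D"
    return None  # below the listed ranges; not named in the syllabus


top25_average = 90  # example value taken from the illustration
for raw in (99, 87):
    adjusted = adjusted_percent(raw, top25_average)
    print(raw, round(adjusted), letter_grade(adjusted))
# 99 110 A
# 87 97 A
```

Dividing by the top-25 average rather than by the single highest score means that a few students can land above 100%, which is why the top band extends to 110%.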

At week 11, each student can confidentially request their interim grade.

The two tables below were updated and reviewed at class 11/5/2021.

Week(s) Deliverable Points
All Total 123.00
Rough passing grade in points (before curve) 71.00
2, 3 Complete brief surveys on your interests, behaviors, and skills 2.00
4 Team selection of project idea and submission of initial slides 5.00
5 Team low-tech demos 5.00
6 Individual technical homework 10.00
6 Team self-review on "Common Strategic Errors and Story Narrative Mistakes" 5.00
6 Exception to (re)submit weeks 4 and/or 5 for up to 10% of final grade. Most recent versions in Week 4-6 subfolders will be graded.
7 Individual tech strategy templates (one template per person) 5.00
7, 8 Team’s 1st required 30-min meeting (office hours) with instructor group to support
9 Individual homework on building a prototype system (model, UI, and connection between) 8.00
10 Individual project work for team related to a relevant self-selected module 5.00
10 Group project demo video 1 5.00
11 Individual project work for team related to a relevant self-selected module 5.00
11 Group project demo video 2 3.00
11, 12 Team’s 2nd required 30-minute meeting (office hours) with instructor group to support
13 Team user testing of external users with app 5.00
15 Team final project 25.00
15 Individual peer reviews 5.00
7-15 Participation (rows below)
7-12 6 reflections 12.00
8 In-class Kaggle game 1 2.50
9 Mentor and team statuses survey (4-question multiple choice) 1.00
10 In-class Kaggle game 2 2.50
10 In-class Fast Fourier transform exercise (FFT) 2.50
10-15 Attendance (1 point per week, no class week 14) 5.00
12 In-class individual user testing 2.00
13 In-class virtual server exercise 2.50

 

While 75 points are listed below, that will be rescaled to 25 points of final course grade. E.g., 65 / 75 = 21.7 / 25 = 86.7% on final project
Area Points Possible Grading Criteria
Total 75 Please see below
Validate problem with data if possible / don't solve the wrong problem 5 5=Reasonable
3=Semi-reasonable
1=Entry but no data
Clear value proposition 5 5=Reasonable
3=Semi-reasonable
1=Entry but no data
Emotion, not all logic 5 5=Clear attempt
3=Semi-clear attempt
0=No attempt
Test if your system helps or doesn't help a user's in-app decision to their problem:
Share summary insights from testing 5 or more users on your app
Try to have users be as close as possible to your stakeholders
5 5=Insights are/were actionable
3=Insights aren't/weren't actionable
0=No insights
Your system runs end to end during live demo, with your assigned GSI as your user
User interaction, inputs, outputs, data storage of inputs and outputs are shown. (Imagine multiple teams reach out to their assigned GSI with 3 days or less left. That will be too late to accommodate all teams. Better to start earlier.)
15 15=System runs end to end
7.5=System runs partially
0=System doesn't run
Final system architecture 5 5=Reasonable
3=Semi-reasonable
0=No entry
Dataset diversity, quality, and quantity, with quantity depending on the system you are building
(e.g., if you chose to create labels, what is your measure of quality on the labels)
5 5=Code output clear
3=Code output vague
0=No code output
2 or more quality practices implemented accurately. Any of the below.

* Clear combination of data, system, decision-making, simulation, and/or UI
* Techniques like measurement for decisions; simulation; FFT, etc.
* If ML / NLP, example practices include
-Feature engineering such as target encoding
-Transfer learning and retraining
-Bayesian optimization or random search
-Ensemble methods

10 10=Implemented 2+ quality practices accurately
5=Implemented 1 quality practice accurately
0=Implemented no quality practices
Final user experience 10 10=App easy to use and interpret
5=App hard to use or interpret (e.g., one needs to ask questions to use)
0=App unusable and uninterpretable (e.g., one can't figure out)
Originality (e.g., create new insights not in problem space before; create uncommon solution; create common solution in a problem space it hasn't been) 5 5=Clear originality
3=Some originality (e.g., online research shows 3rd-party has done some)
1=No originality shown
News story and 5-minute demo video

5 Criteria
* Quantify need
* Clear value proposition that involves emotion, not all logic
* Show demo that runs end to end
* Illustrate user feedback
* Show 2 or more technical system practices (ok to show)

5 For each bullet point, 1 point if attempted
Area

(Below is NOT counted in course project grade, only potential tiebreakers to select one team with option to represent Data-X at Collider Cup)

Points Possible Criteria
Number of users, or high impact for fewer users 5 5=10+ users
3=5 users
0=0 users
A 3rd-party (other than your mentor) agrees to be cited as a published reference 5 5=2+ references other than your mentor
3=1 reference other than your mentor
0=0 references other than your mentor
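
The rescaling noted at the top of the final project rubric (75 rubric points scaled to 25 points of the course grade) can be sketched as follows. This is an illustrative sketch only; the constant and function names are hypothetical, and the tiebreaker rows above are not part of the calculation.

```python
RUBRIC_POINTS = 75  # total points listed in the final project rubric
COURSE_POINTS = 25  # weight of the final project in the course total


def rescale_final_project(rubric_score):
    """Convert a rubric score (out of 75) to course points (out of 25)."""
    return rubric_score / RUBRIC_POINTS * COURSE_POINTS


score = rescale_final_project(65)
print(f"{score:.1f} / {COURSE_POINTS}")  # 21.7 / 25
print(f"{score / COURSE_POINTS:.1%}")    # 86.7%
```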

 

Course Evaluations

At the middle and end of the term, students will be asked to fill out an evaluation to give feedback about the course. SCET values and appreciates student responses, which are used to better understand and improve our courses. Students are strongly encouraged to submit the evaluations. 

Focus Groups

Two optional focus group events (perhaps with pizza served) will be held during the semester outside class hours for your feedback aloud on the course.

Scheduling Conflicts

Please notify us in writing as soon as possible about any known or potential extracurricular conflicts. We will try our best to help you with making accommodations, but cannot guarantee them in all cases.

Student Code of Conduct & Academic Integrity

Berkeley honor code: Everyone in this class is expected to adhere to this code: “As a member of the UC Berkeley community, I act with honesty, integrity, and respect for others.”

Student Conduct: Ethical conduct is of utmost importance in your education and career. The instructors, the College of Engineering, and U.C. Berkeley are responsible for supporting you by enforcing all students’ compliance with the Code of Student Conduct and the policies listed in the CoE Student Guide. The Center for Student Conduct is set up to support you when you have been affected by actions that may violate these community rules. This includes an organized and transparent process, student participation in the process, mechanisms for appeals, and other mechanisms to protect fairness (https://sa.berkeley.edu/conduct).

Academic Integrity: Any assignment submitted by you and that bears your name is presumed to be your own original work that has not previously been submitted for credit in another course unless you obtain prior written approval to do so from your instructor. In all of your assignments, you may use words or ideas written by other individuals, but only with proper attribution. To copy text or ideas from another source without appropriate reference is plagiarism and will result in a failing grade for your assignment and usually further disciplinary action. For additional information on plagiarism, self-plagiarism, and how to avoid it, see the Berkeley Library website.

If you are not clear about the expectations for completing an assignment or taking a test or examination, be sure to seek clarification from your instructor beforehand. Anyone caught committing academic misconduct will be reported to the University Office of Student Conduct. Potential consequences of cheating and academic dishonesty may include a formal discipline file, probation, dismissal from the University, or other disciplinary actions. 

Inclusion: We are committed to creating a learning environment welcoming of all students. To do so, we intend to support a diversity of perspectives and experiences and respect each others’ identities and backgrounds (including race/ethnicity, nationality, gender identity, socioeconomic class, sexual orientation, language, religion, ability, etc.). To help accomplish this:

  • If you have a name and/or set of pronouns that differ from your legal name, designate a preferred name for use in the classroom at: https://registrar.berkeley.edu/academic-records/your-name-records-rosters.
  • If you feel like your performance in the class is being impacted by your experiences outside of class (e.g., family matters, current events), please don’t hesitate to come and talk with the instructor(s).  We want to be resources for you.
  • We are all in the process of learning how to respect and include diverse perspectives and identities. Please take care of yourself and those around you as we work through the challenging but important learning process.
  • As a participant in this class, recognize that you can be proactive about making other students feel included and respected.  

Student Accommodations

We honor and respect the different learning needs of our students, and are committed to ensuring you have the resources you need to succeed in our class.  If you need accommodations for any reason (e.g. religious observance, health concerns, insufficient resources, etc.) please discuss with your instructor or academic advisor how to best support you.  We will respect your privacy under state and Federal laws, and you will not be asked to share more than you are comfortable sharing.  The disabled student program is a related resource, listed below. UC Berkeley is committed to creating a learning environment that meets the needs of its diverse student body. If you anticipate or experience any barriers to learning in this course, please feel welcome to discuss your concerns with me.

If you have a disability, or think you may have a disability, you can work with the Disabled Students' Program (DSP) to request an official accommodation. The Disabled Students' Program (DSP) is the campus office responsible for authorizing disability-related academic accommodations, in cooperation with the students themselves and their instructors. You can find more information about DSP, including contact information and the application process here: dsp.berkeley.edu. If you have already been approved for accommodations through DSP, please meet with me so we can develop an implementation plan together.

Students who need academic accommodations or have questions about their accommodations should contact DSP, located at 260 César Chávez Student Center. Students may call 642-0518 (voice), 642-6376 (TTY), or e-mail dsp@berkeley.edu.

Prevention of Harassment and Discrimination

The University is committed to creating and maintaining a community dedicated to the advancement, application and transmission of knowledge and creative endeavors through academic excellence, where all individuals who participate in University programs and activities can work and learn together in an atmosphere free of discrimination, harassment, exploitation, or intimidation. For more information on related policies, resources and how to report an incident, see the Office for the Prevention of Harassment and Discrimination (OPHD) website.

Safety and Emergency Preparedness/Evacuation Procedures

As class activities may keep you on campus at night, check out the Cal’s Night Safety Services website for details on the University’s comprehensive free night safety services. See the Office of Emergency Management website for details on Emergency Preparedness/Evacuation Procedures. The UC Berkeley Police Department website also has information regarding safety on campus. Dial 510-642-3333 or use a Blue Light emergency phone if you need help.

Grievances

If you have a problem with this class, you should seek to resolve the grievance concerning a grade or academic practice by speaking first with the instructor. Then, if necessary, take your case to the SCET Chief Learning Officer, SCET Faculty Director, IEOR Department Chair, and to the College of Engineering Dean, in that order. Additional resources can be found on the Student Advocate’s Office website and the Ombuds Office for Students website.

SCET Certificate in Entrepreneurship & Technology

This class can be used towards requirements to earn the SCET Certificate in Entrepreneurship & Technology. For details on the certificate requirements and other opportunities to engage with the Center, see the SCET website.

Support during Remote Learning

We understand that your specific situation may present challenges to class participation. Please contact the instructors if you would like to discuss these and co-develop strategies for engaging with the course. 

The Student Technology Equity Program (STEP) is available to help access a laptop, Wi-Fi hotspot, and other peripherals (https://technology.berkeley.edu/STEP).

Additional Resources

See the Student Affairs website for more information on campus and community resources.

Center for Access to Engineering Excellence (CAEE)

The Center for Access to Engineering Excellence (227 Bechtel Engineering Center; https://engineering.berkeley.edu/student-services/academic-support) is an inclusive center that offers study spaces, nutritious snacks, and tutoring in >50 courses for Berkeley engineers and other majors across campus. The Center also offers a wide range of professional development, leadership, and wellness programs, and loans iClickers, laptops, and professional attire for interviews.

Counseling and Psychological Services       

University Health Services Counseling and Psychological Services staff are available to you at the Tang Center (http://uhs.berkeley.edu; 2222 Bancroft Way; 510-642-9494) and in the College of Engineering (https://engineering.berkeley.edu/students/advising-counseling/counseling/; 241 Bechtel Engineering Center). They provide confidential assistance to students managing problems that can emerge from illness, such as financial, academic, legal, and family concerns. Long wait times at the Tang Center in the past led to a significant expansion that includes a 24/7 counseling line at (855) 817-5667. This line will connect you with help in a very short time frame. Short-term help is also available from the Alameda County Crisis hotline: 800-309-2131. If you or someone you know is experiencing an emergency that puts their health at risk, please call 911.

The Care Line (PATH to Care Center)

The Care Line (510-643-2005; https://care.berkeley.edu/care-line/) is a 24/7, confidential, free, campus-based resource for urgent support around sexual assault, sexual harassment, interpersonal violence, stalking, and invasion of sexual privacy. The Care Line will connect you with a confidential advocate for trauma-informed crisis support including time-sensitive information, securing urgent safety resources, and accompaniment to medical care or reporting.

Ombudsperson for Students

The Ombudsperson for Students (102 Sproul Hall; 642-5754; http://students.berkeley.edu/Ombuds) provides a confidential service for students involved in a University-related problem (academic or administrative), acting as a neutral complaint resolver and not as an advocate for any of the parties involved in a dispute. The Ombudsperson can provide information on policies and procedures affecting students, facilitate students' contact with services able to assist in resolving the problem, and assist students in complaints concerning improper application of University policies or procedures. All matters referred to this office are held in strict confidence. The only exceptions, at the sole discretion of the Ombudsperson, are cases where there appears to be imminent threat of serious harm.

UC Berkeley Food Pantry

The UC Berkeley Food Pantry (#68 Martin Luther King Student Union; https://pantry.berkeley.edu) aims to reduce food insecurity among students and staff at UC Berkeley, especially the lack of nutritious food. Students and staff can visit the pantry as many times as they need and take as much as they need while being mindful that it is a shared resource. The pantry operates on a self-assessed need basis; there are no eligibility requirements. The pantry is intended to provide core food support, not supplemental snacks.

Disclaimer: Syllabus/Schedule are subject to change.

References

Innovation Engineering Textbook. Data-X students can access required sections for free. The encryption password is on Slack Data-X (INDENG 135 / 235), pinned in the #general channel.

Navigator Tool template at Innovation Engineering website -> Google Slides to reinforce inductive learning

Low Tech Demo template at Innovation Engineering website -> Google Slides

 

Example dataset sources and links:

Kaggle: https://www.kaggle.com/datasets and https://www.kaggle.com/datasets?sort=votes&datasetsOnly=true

AWS (e.g., Data Exchange): https://aws.amazon.com/data-exchange/ and https://registry.opendata.aws/

Google Dataset Search: https://datasetsearch.research.google.com/

Hugging Face: https://huggingface.co/datasets/

Towards AI (article of dataset links): https://pub.towardsai.net/best-datasets-for-machine-learning-data-science-computer-vision-nlp-ai-c9541058cf4f

Ubuntu Pit (article of dataset links): https://www.ubuntupit.com/best-machine-learning-datasets-for-practicing-applied-ml/

 

At the end of the semester, one Data-X team and project can qualify to compete for the Collider Cup, SCET's all-star showcase.

Directory of Advisors and Industry Experts for Data-X

The Data-X course and project bring together students, technical experts, start-up companies, and executives. Each brings a different perspective to data, algorithms, and scale. Please refer to the People webpage.