Please fill out feedback forms

Feedback #1

https://forms.gle/jp2pEmYbPVXZDgLN6

Feedback #2

https://forms.office.com/Pages/ResponsePage.aspx?id=s_BgbwZfCU6XFZiduozH2NtV2sAf02dAjo6J9c3l1U9UNVhaSDY1Qk1ZSExWSUlQTlNXWEdVSU5GNS4u

When is the final?

  • Wed., Dec 16, 10:30am–12:45pm

  • 135 minutes

Structure of the final

  • Multiple-choice questions, 45 minutes, 40-45 questions, 4 options each
  • Jupyter-Notebook type questions (programming), 90 minutes, 4-5 questions
    • 2 programming problems
      • Emphasis on ch. 9, 10, 11 in Intro to Python
      • Objects
      • Files/file I/O
      • Recursion
      • Exceptions
      • Earlier chapters may be part of problems
    • 2-3 statistics/machine learning/pandas problems
      • Emphasis on chapter 15 in Intro to Python
      • Emphasis on Chapters 10-17 in Statistical and Inferential Thinking
      • We will use pandas rather than the datascience library
      • Will use numpy, sklearn, pandas, matplotlib, seaborn
      • You should know how to make plots, load data, extract columns
      • Hypothesis testing
      • Permutation Test
      • Bootstrapping
      • Regression
      • Classification
      • Clustering
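
The "load data, extract columns" skills listed above can be sketched in pandas as follows. This is only an illustration: the CSV text and its column names are made up for the example (a real problem would read a file such as a Titanic CSV by path), and the variable names are hypothetical.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = """name,age,fare,survived
Ann,29,72.5,1
Bob,41,8.05,0
Cara,35,53.1,1
Dan,22,7.25,0
"""

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

ages = df["age"]                     # one column -> Series
subset = df[["age", "fare"]]         # list of columns -> DataFrame
survivors = df[df["survived"] == 1]  # boolean mask selects rows

# Group rows by a column and summarize another column per group.
mean_fare_by_group = df.groupby("survived")["fare"].mean()
```

With matplotlib installed, a quick plot of two columns could then be made with something like `df.plot.scatter(x="age", y="fare")`.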

Studying Tips

  • Write questions
  • Study backwards
  • Be able to quickly navigate documentation for numpy, sklearn, pandas, matplotlib, seaborn
  • Be able to go through examples (e.g., in the books)
  • A good exercise to make sure you know pandas is to translate Inferential Thinking examples into pandas

Some themes from the MC (concepts part)

Some ML/Stats concept questions

  • What is the difference between supervised and unsupervised ML?
  • What is the difference between classification and regression problems?
  • What is the difference between clustering and dimension reduction problems?
  • Name a classification algorithm.
  • Name a regression algorithm.
  • Name a clustering algorithm.
  • Name a dimension reduction algorithm.
  • What is the permutation test?
  • When would you use the permutation test?
  • What is the bootstrap?
  • When would you use the bootstrap?
  • What is correlation?
  • Why do you split your data into a "train" subset and "test" subset?
  • Why do we need a validation set?
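
As a concrete reference for the bootstrap questions above, here is a minimal sketch of a bootstrap percentile interval for a mean, using only numpy. The sample values, the number of resamples, and the function name `bootstrap_means` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample; in practice this would be a column of your data.
sample = np.array([12.1, 9.8, 11.4, 10.9, 13.2, 8.7, 10.1, 11.8, 9.5, 12.6])

def bootstrap_means(data, n_resamples, rng):
    """Resample with replacement and return the mean of each resample."""
    n = len(data)
    means = np.empty(n_resamples)
    for i in range(n_resamples):
        resample = rng.choice(data, size=n, replace=True)
        means[i] = resample.mean()
    return means

means = bootstrap_means(sample, 5000, rng)

# Middle 95% of the bootstrap means: an approximate confidence interval.
lower, upper = np.percentile(means, [2.5, 97.5])
```

The key idea is that each resample has the same size as the original sample and is drawn with replacement, so the spread of `means` estimates how much the sample mean would vary from sample to sample.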

Some Example Programming Questions (not in MC form here):

  • What's the difference between a class and an object?
  • What is an attribute?
  • What is a method?
  • What is the difference between a function and an (object) method?
  • What is a property?
  • What are the four aspects of object oriented programming?
  • What are exceptions and what are the (possible) blocks in an exception?
  • How do you write data to a text file?
  • How do you read data from a text file?
  • What are stdout and stderr, and how are they related to a print statement?
  • What is the "with" statement and how does it help with file I/O?
  • What's the difference between a recursive solution and an iterative solution?
  • What is better about recursion? What is better about iteration?
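
Several of the object and exception questions above can be reviewed with one small example. The `Temperature` class and `safe_temperature` function below are hypothetical names invented for this sketch; they are not taken from the course materials.

```python
class Temperature:
    """A class is the blueprint; each Temperature(...) call creates an object."""

    def __init__(self, celsius):
        # Assigning here goes through the property setter below.
        self.celsius = celsius

    @property
    def celsius(self):
        # A property looks like an attribute but runs a method on access.
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = value  # the underlying attribute

    def to_fahrenheit(self):
        # A method is a function defined in a class; it receives the object as self.
        return self._celsius * 9 / 5 + 32


def safe_temperature(value):
    """Show the possible blocks: try, except, else, finally."""
    log = []
    try:
        t = Temperature(value)
    except ValueError as err:
        log.append(f"error: {err}")  # runs only if the try block raised
        t = None
    else:
        log.append("created")        # runs only if no exception was raised
    finally:
        log.append("done")           # always runs
    return t, log
```

For example, `safe_temperature(100)` returns an object whose `to_fahrenheit()` gives 212.0, while `safe_temperature(-300)` returns `None` along with a log that records the error.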

Programming Problems

As stated above, you will need to write objects and/or functions that use loops, if statements, types, strings, etc. to solve some problem.
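
A problem of this kind might combine recursion, loops, strings, and file I/O. The sketch below is one made-up example, not an actual exam question: it computes factorials recursively and iteratively, writes a small table to a text file, and reads it back. The file name and function names are assumptions.

```python
import os
import sys
import tempfile

def factorial_recursive(n):
    # Base case stops the recursion; each call handles one smaller problem.
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)

def factorial_iterative(n):
    # Same result with a loop and an accumulator instead of nested calls.
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

def write_table(path, upto):
    # "with" closes the file automatically, even if an exception is raised.
    with open(path, "w") as f:
        for n in range(upto + 1):
            f.write(f"{n},{factorial_iterative(n)}\n")

def read_table(path):
    table = {}
    with open(path) as f:
        for line in f:
            n, value = line.strip().split(",")
            table[int(n)] = int(value)
    return table

# Hypothetical file location in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "factorials.txt")
write_table(path, 5)
table = read_table(path)

print("rows read:", len(table))     # print writes to sys.stdout by default
print("finished", file=sys.stderr)  # file= redirects the output to stderr
```

The recursive version mirrors the mathematical definition, while the iterative version avoids Python's recursion depth limit for large `n`.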

ML/Data Science problem

Almost always you will have data, either loaded from a library such as sklearn (e.g., the iris data) or from a CSV file (e.g., titanic).

You will need to massage the data into the right form.

Then you might have a stats problem (e.g., is there a statistically significant difference between group A and group B?), which will require you to work with the data (simulation, permutation, etc.), make graphs, and compute results.
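
A group A versus group B question like this can be approached with a permutation test. The sketch below uses numpy only; the two groups are made-up numbers, and the difference in means is just one possible choice of test statistic.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical measurements for two groups.
group_a = np.array([5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.7, 5.2])
group_b = np.array([4.2, 4.6, 4.1, 4.8, 4.4, 4.0, 4.5, 4.3])

def permutation_test(a, b, n_permutations, rng):
    """Two-sided permutation test on the difference in group means."""
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_permutations):
        # Shuffle the pooled values, then split them back into two groups.
        shuffled = rng.permutation(pooled)
        diff = shuffled[:len(a)].mean() - shuffled[len(a):].mean()
        if abs(diff) >= abs(observed):
            count += 1
    # Fraction of shuffles at least as extreme as the observed difference.
    return observed, count / n_permutations

observed, p_value = permutation_test(group_a, group_b, 2000, rng)
```

Under the null hypothesis the group labels are interchangeable, so a small `p_value` suggests the observed difference is unlikely to be due to chance alone. A histogram of the shuffled differences with the observed value marked is a natural plot to add.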

Or you might have an ML problem such as classification, regression, dimension reduction, or clustering. Again, you will need to compute, make plots, and explain your results.
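
For the classification case, a typical workflow looks roughly like the sketch below: load the data, split it into train and test subsets, fit a model, and evaluate on the held-out data. The choice of k-nearest neighbors, `n_neighbors=5`, and the 25% test split are assumptions for illustration, not requirements from the course.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the iris data as pandas objects (features DataFrame, target Series).
iris = load_iris(as_frame=True)
X = iris.data
y = iris.target

# Hold out a test subset so the model is scored on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
```

The same fit and predict pattern applies to other sklearn estimators, so swapping in a regression or clustering model mostly changes the estimator and the evaluation metric.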