We ask fellows to work on a small challenge problem so we can assess problem-solving and coding capabilities:
Select a problem from the list below
Perform your analysis in a well-commented Jupyter notebook and post on Github or Google Colab
Share your notebook with us
Some hints for hacking our challenge:
Ask yourself: why would they have selected this problem for the challenge? What are some gotchas in this domain that I should know about?
What is the highest level of accuracy that others have achieved with this dataset or with similar problems/datasets?
What types of visualizations will help me grasp the nature of the problem / data?
What feature engineering might help improve the signal?
Which modeling techniques are good at capturing the types of relationships I see in this data?
Now that I have a model, how can I be sure that I didn't introduce a bug in the code? If results are too good to be true, they probably are!
What are some of the weaknesses of the model, and how can the model be improved with additional work?
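One cheap sanity check for the "too good to be true" question above is a shuffled-label baseline: refit your model on randomly permuted labels and confirm accuracy collapses to chance. If it does not, you likely have leakage or a bug. A minimal sketch on hypothetical toy data (the data, the nearest-centroid model, and all names here are illustrative, not part of the challenge):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 samples, 5 features; the label depends on feature 0.
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)

def nearest_centroid_accuracy(X, y):
    """Fit class centroids on the first half, score on the second half."""
    X_tr, y_tr = X[:100], y[:100]
    X_te, y_te = X[100:], y[100:]
    centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in (0, 1)])
    dists = np.linalg.norm(X_te[:, None, :] - centroids[None, :, :], axis=2)
    return float((dists.argmin(axis=1) == y_te).mean())

real_acc = nearest_centroid_accuracy(X, y)
shuffled_acc = nearest_centroid_accuracy(X, rng.permutation(y))
# real_acc should be well above chance; shuffled_acc should hover near 0.5.
```

If the shuffled run still scores well, inspect your pipeline for train/test contamination before trusting the real number.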
Select Your Challenge Problem
Omniglot, the “transpose” of MNIST, with 1623 character classes, each with 20 examples.
Report one-shot classification (20-way) results using a meta-learning approach such as MAML.
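Whatever meta-learning method you use, you will need to build 20-way one-shot episodes (one labeled support image per class, then classify query images) and a simple baseline to calibrate your results against. A minimal sketch with a nearest-neighbour baseline on synthetic stand-in data (the stand-in arrays and names are illustrative assumptions; real Omniglot images are 105x105 grayscale and this is not the MAML inner loop):

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for Omniglot: 50 classes, 20 examples each, flattened to 64-dim
# vectors with a class-specific offset so the task is learnable.
n_classes, n_examples, dim = 50, 20, 64
class_means = rng.normal(scale=2.0, size=(n_classes, dim))
data = class_means[:, None, :] + rng.normal(size=(n_classes, n_examples, dim))

def sample_episode(data, n_way=20):
    """One n-way one-shot episode: 1 support + 1 query example per class."""
    classes = rng.choice(len(data), size=n_way, replace=False)
    support, query = [], []
    for c in classes:
        i, j = rng.choice(data.shape[1], size=2, replace=False)
        support.append(data[c, i])
        query.append(data[c, j])
    return np.stack(support), np.stack(query)

def one_shot_accuracy(data, n_episodes=100, n_way=20):
    """Nearest-neighbour baseline: match each query to its closest support."""
    correct = 0
    for _ in range(n_episodes):
        support, query = sample_episode(data, n_way)
        dists = np.linalg.norm(query[:, None, :] - support[None, :, :], axis=2)
        correct += int((dists.argmin(axis=1) == np.arange(n_way)).sum())
    return correct / (n_episodes * n_way)

acc = one_shot_accuracy(data)  # chance level for 20-way is 0.05
```

Reporting your meta-learned model's episode accuracy next to a baseline like this makes the result much easier to interpret.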
Food-101 is a challenging vision problem, but everyone can relate to it. Recent SoTA is ~80% top-1, ~90% top-5; these approaches rely on heavy test-time augmentation (TTA), large networks, and even novel architectures.
Train a decent model (>85% top-1 accuracy on the test set) using a ResNet50 or smaller network with a reasonable set of augmentations.
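Since the target is stated in top-1 and top-5 terms, it is worth being precise about how those metrics are computed from your model's class scores. A minimal sketch with hypothetical scores (the arrays below are made up for illustration):

```python
import numpy as np

def topk_accuracy(logits, labels, k=5):
    """Fraction of samples whose true label is among the k highest scores."""
    topk = np.argsort(logits, axis=1)[:, -k:]  # indices of the k best scores
    return float(np.mean([labels[i] in topk[i] for i in range(len(labels))]))

# Hypothetical scores for 4 samples over 6 classes.
logits = np.array([
    [0.1, 0.9, 0.2, 0.1, 0.0, 0.3],  # true class 1 -> top-1 hit
    [0.5, 0.1, 0.4, 0.2, 0.3, 0.0],  # true class 2 -> top-1 miss, top-5 hit
    [0.0, 0.1, 0.2, 0.3, 0.4, 0.9],  # true class 5 -> top-1 hit
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.0],  # true class 5 -> top-5 miss
])
labels = np.array([1, 2, 5, 5])

top1 = topk_accuracy(logits, labels, k=1)  # 0.5
top5 = topk_accuracy(logits, labels, k=5)  # 0.75
```

Reporting both numbers, computed the same way on the official test split, makes your result directly comparable to published Food-101 figures.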
Challenge Problem Frequently Asked Questions
What are you looking for in challenge submissions?
1. Problem solving ability - did you understand the problem correctly, and did you take logical steps to solve it?
2. Machine learning skills - what sort of models did you use? How rigorous were your exploratory analysis of the data, your choice and fine-tuning of models, and your assessment of results?
3. Coding skills - does your Python look presentable, or do you code like a scientist?
4. Communication skills - is your solution readable and well explained? Messiness and raw code with no explanation does not reflect well on your potential for working well with our business partners during the fellowship.
What are some common mistakes I should avoid?
Skipping exploratory analysis and feature engineering
Do not jump straight into fitting models without demonstrating to us, in your Jupyter notebook, that you have understood and thought about the dataset.
Choosing models with no explanation
Please use the notebook to explain your thought process. We care about this as much as we care about your results.
Make sure to run your notebook before sharing so that we can see the results; we won't be running your code on our machines. On the other hand, please do not print out the entire dataset or endless epochs of training logs.
Overly simplistic final results
Your final results should consist of more than a single number or percentage printout. Explain why you chose the success metrics you chose, and analyze what your output means.
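One easy way to go beyond a single-number printout is a per-class accuracy breakdown, which shows where the model is weak and gives you something concrete to analyze. A minimal sketch on hypothetical predictions (the arrays and function name below are illustrative):

```python
import numpy as np

def per_class_accuracy(y_true, y_pred):
    """Accuracy broken down by true class, to spot weak classes."""
    return {int(c): float((y_pred[y_true == c] == c).mean())
            for c in np.unique(y_true)}

# Hypothetical predictions for a 3-class problem.
y_true = np.array([0, 0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 0, 1, 1, 1, 2, 0, 2])

overall = float((y_true == y_pred).mean())   # 0.75
breakdown = per_class_accuracy(y_true, y_pred)
```

Pairing the overall number with the breakdown (or a confusion matrix) lets you discuss which classes fail and why, which is exactly the kind of analysis we look for.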
When are the challenges due?
All deadlines can be found on our apply page.
What is the next step after submitting my challenge problem?
After we review your challenge, candidates selected to move forward in the application process will receive an email with an invitation to schedule a 45-minute interview with a mentor or former fellow. Be prepared to discuss your challenge: you will likely be asked to explain why you chose the model(s) you used, and to answer questions gauging how deeply you understand them.