Econ 3750B: Machine Learning
Instructor: Douglas Hanley
Lectures: W 3:00-5:30PM, 4940
Office Hours: TBD, 4907
Syllabus: HTML
Course Notebooks (grad_code)

Data Science (data_science)

Valjax Library (valjax)
0. Python & Ecosystem
1. Optimization Methods
2. Solving Models (with jax)
3. Inference (with sklearn)
4. Machine Learning (with torch)
5. Text Analysis Basics
6. Large Language Models
0. Python Exercises
1. Using Estimators
Course Outline
---

## Week 0: Intro to Python

**Description**: Kickstart the course with an introduction to Python programming, a versatile language that has become a cornerstone of data science. We'll explore fundamental Python programming techniques, then delve into numerical and empirical methods, leveraging powerful libraries like `numpy`, `scipy`, `pandas`, and `statsmodels`.

**Homework Assignment**: "Basic Python Programming and Data Manipulation"

**Further Reading**:
- *Python for Data Analysis* by Wes McKinney

---

## Week 1: Optimization and Estimation

**Description**: Dive into the world of optimization, exploring both single- and multi-variate methods. This week, we will also introduce the concept of automatic differentiation using libraries like `jax` and `torch`. We'll wrap up with an overview of structural estimation in Python.

**Homework Assignment**: "Optimizing Functions and Basics of Estimation"

**Further Reading**:
- *Pathways to Solutions, Fixed Points, and Equilibria* by Zangwill & Garcia

---

## Week 2: Inference and Classification

**Description**: Delve into the ecosystem of `sklearn` as we introduce machine learning methodologies. We'll explore powerful algorithms such as Random Forest and XGBoost, and then venture into the domain of neural networks, using `torch` for training and inference.

**Homework Assignment**: "Classification Challenges: From Trees to Neural Networks"

**Further Reading**:
- *Pattern Recognition and Machine Learning* by Christopher M. Bishop

---

## Week 3: Large Language Models

**Description**: Explore the exciting world of Large Language Models (LLMs), beginning with the foundational Transformer architecture. Understand the landscape of available models and their inference challenges, and conclude with hands-on applications of LLMs in research settings.

**Homework Assignment**: "Implementing and Inferring with Transformers"

**Further Reading**:
- *Attention Is All You Need* by Vaswani et al. (the original Transformer paper)
- *nanoGPT* (a minimal implementation of GPT-2 by Karpathy)

---

## Week 4: Text Analysis and Embeddings

**Description**: Dive into the realm of text analysis. Begin with classical methods like Tf-Idf and transition into modern word and sentence embedding techniques. The week will culminate in using these embeddings for various machine learning tasks, such as classification.

**Homework Assignment**: "Text Classification using Tf-Idf and Embeddings"

**Further Reading**:
- *Distributed Representations of Words and Phrases and their Compositionality* by Mikolov et al. (the Word2Vec paper)

---

## Week 5: Geospatial Analysis

**Description**: Venture into the domain of geospatial analysis using the `geopandas` library. Gain insights into the Convolutional Neural Network (CNN) architecture and its relevance to geospatial data. Wrap up by exploring its real-world applications as demonstrated in the literature.

**Homework Assignment**: "Analyzing Geospatial Data and Implementing CNNs"

**Further Reading**:
- *Deep Learning* by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (with emphasis on the CNN chapter)

---

## Week 6: Simulation Frameworks

**Description**: Delve deep into simulation frameworks, starting with the Simulacra paper. Continue by exploring the concepts outlined in the John Horton paper, and wrap up the week by brainstorming and discussing fresh, innovative ideas in this space.

**Homework Assignment**: "Building a Simple Simulation based on the Simulacra Paper"

**Further Reading**:
- *Large Language Models as Simulated Economic Agents* by John Horton
- *Generative Agents: Interactive Simulacra of Human Behavior* by Park et al.

---
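As a taste of the classical text-analysis methods from Week 4, the Tf-Idf weighting can be sketched in a few lines of plain Python. This is a minimal textbook version (raw term counts, `idf = log(N / df)`), assuming whitespace tokenization; `sklearn`'s `TfidfVectorizer` uses smoothed and normalized variants, so its numbers will differ.

```python
import math
from collections import Counter

def tfidf(docs):
    """Textbook Tf-Idf: weight(t, d) = count(t in d) * log(N / df(t))."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # document frequency: how many documents contain each term
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    # per-document weights: term frequency scaled by inverse document frequency
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights

docs = ["the cat sat", "the dog sat", "the cat ran"]
w = tfidf(docs)
print(w[0]["the"])  # 0.0 -- "the" appears in every document, so idf is zero
print(w[1]["dog"])  # highest weight: "dog" is unique to one document
```

Terms that appear everywhere get zero weight, while rare terms are boosted, which is exactly the signal a downstream classifier exploits.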