john schulman openai

joschu.net
http://joschu.net
John Schulman's Homepage
I am currently a researcher at Anthropic, where I’m working on aligning large language models; some of my interests include scalable oversight and developing better written specifications of model behavior (like OpenAI’s Model Spec, Constitutional AI).
joschu.net
http://joschu.net › publications.html
Selected Publications
John Schulman, Jonathan Ho, Cameron Lee, and Pieter Abbeel. International Symposium on Robotics Research (ISRR), 2013. Paper / Videos
joschu.net
http://joschu.net › code.html
Code - joschu.net
John Schulman's Homepage. Code. GitHub profile. Highlighted projects developed by my collaborators and me: Procgen Benchmark (2019): GitHub / blog post. Gym Retro (2018): GitHub / blog post on dataset / contest; OpenAI Baselines (2016): GitHub / original post (DQN) / ACKTR + A2C / PPO; OpenAI Gym (2016): homepage / GitHub / blog post / article ...
joschu.net
http://joschu.net › presentations.html
Presentations
2024 Talk about OpenAI Model Spec at Scale conference; 2023 ICML talk on proxy objectives; 2023 Berkeley talk on truthfulness; Older slides and video presentations: TR35 Award Talk at EmTech 2018 in Cambridge, MA. Video / Slides (PDF) Faster Reinforcement Learning via Transfer, at AIX symposium hosted by SK Telecom in Seoul. Video / Slides (PDF)
joschu.net
http://joschu.net › blog › opinionated-guide-ml-research.html
An Opinionated Guide to ML Research - joschu.net
Jan 24, 2020 · I originally wrote this guide back in December 2017 for the OpenAI Fellows program. In this essay, I provide some advice to up-and-coming researchers in machine learning (ML), based on my experience doing research and advising others. The advice covers how to choose problems and organize your time.
joschu.net
http://joschu.net › docs › nuts-and-bolts.pdf
[PDF]
The Nuts and Bolts of Deep RL Research - joschu.net
New Algorithm? Use Small Test Problems • Run experiments quickly • Do hyperparameter search • Interpret and visualize learning process: state visitation, value function, etc. • Counterpoint: don’t overfit algorithm to contrived problem • Useful to have medium-sized problems that you’re intimately familiar with (Hopper, Atari Pong)
joschu.net
http://joschu.net › blog.html
Blog Index - joschu.net
Jan 24, 2020 · John Schulman's Homepage. Blog Index. Sending Samples Without Bits-Back (2020/03/08) Approximating KL Divergence (2020/03/07) An Opinionated Guide to ML Research (2020/01/24)
joschu.net
http://joschu.net › docs › thesis.pdf
[PDF]
Optimizing Expectations: From Deep Rei …
John Schulman, Summer 2016. A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley. Committee: Pieter Abbeel, Chair …
joschu.net
http://joschu.net › docs
[PDF]
Machine Learning vs Human Learning - joschu.net
John Schulman Gym Retro • Over 1000 games integrated • Uses emulators for classic game systems, e.g. SEGA Genesis • Open source: github.com/openai/gym-retro • Public contest: contest.openai.com • 250 different teams submitted solutions • …
joschu.net
http://joschu.net › docs
[PDF]
Faster Reinforcement Learning via Transfer - joschu.net
John Schulman, 2018.09.06. Overview • Policy Gradients • Success stories • Limitations • Meta Reinforcement Learning • Gym Retro. Terminology ... contest.openai.com • April 5 to June 5, 2018 • Hired level designers to create 11 custom levels • Also created 5 …