{"componentChunkName":"component---src-templates-post-template-js","path":"/posts/voyager","result":{"data":{"markdownRemark":{"id":"b99898eb-f533-5b04-9e6c-bea4b165a14e","html":"<p>「<a href=\"https://arxiv.org/abs/2305.16291\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Voyager: An Open-Ended Embodied Agent with Large Language Models</a>」を読んだメモです。</p>\n<h2 id=\"abstract\" style=\"position:relative;\"><a href=\"#abstract\" aria-label=\"abstract permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Abstract</h2>\n<blockquote>\n<p>1) an automatic curriculum that maximizes exploration, 2) an ever-growing skill library of executable code for storing and retrieving complex behaviors, and 3) a new iterative prompting mechanism that incorporates environment feedback, execution errors, and self-verification for program improvement.</p>\n</blockquote>\n<p>とくに 2) の動作をスキルライブラリとして保存することや、3) の検証まわりが気になった。</p>\n<h2 id=\"2-method\" style=\"position:relative;\"><a href=\"#2-method\" aria-label=\"2 method permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 
13 6z\"></path></svg></a>2 Method</h2>\n<h3 id=\"22-skill-library\" style=\"position:relative;\"><a href=\"#22-skill-library\" aria-label=\"22 skill library permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>2.2 Skill Library</h3>\n<blockquote>\n<p>Adding a new skill. Each time GPT-4 generates and verifies a new skill, we add it to the skill library, represented by a vector database.</p>\n</blockquote>\n<p>スキルを生成して検証したら、ベクトルデータベースに保存するとのこと。</p>\n<h3 id=\"23-iterative-prompting-mechanism\" style=\"position:relative;\"><a href=\"#23-iterative-prompting-mechanism\" aria-label=\"23 iterative prompting mechanism permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>2.3 Iterative Prompting Mechanism</h3>\n<p>3 種類のフィードバックをするとのこと。</p>\n<ol>\n<li>環境（マインクラフト）からのフィードバック</li>\n<li>実行エラー</li>\n<li>タスクの成功の検証</li>\n</ol>\n<blockquote>\n<p>Instead of manually coding success checkers for each new task proposed by the automatic curriculum, we instantiate another GPT-4 agent for self-verification.</p>\n</blockquote>\n<p>別の GPT-4 
インスタンスに検証させるとのこと。</p>\n<h2 id=\"3-experiments\" style=\"position:relative;\"><a href=\"#3-experiments\" aria-label=\"3 experiments permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>3 Experiments</h2>\n<h3 id=\"32-baselines\" style=\"position:relative;\"><a href=\"#32-baselines\" aria-label=\"32 baselines permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>3.2 Baselines</h3>\n<p>ベースラインとして ReAct、Reflexion、AutoGPT と比較するとのこと。</p>\n<blockquote>\n<p>therefore we have to re-interpret them to be executable in MineDojo and compatible with our experimental setting:</p>\n</blockquote>\n<p>「MineDojo」でプレイ可能なように実装し直した？\nMineDojo は Minecraft の API のようなもの？</p>\n<ul>\n<li><a href=\"https://github.com/MineDojo/MineDojo\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">https://github.com/MineDojo/MineDojo</a></li>\n</ul>\n<p>Voyager は Mineflayer を使っているとのこと。\nMineflayer は、Minecraft の JS/Python API。</p>\n<ul>\n<li><a href=\"https://github.com/PrismarineJS/mineflayer\" target=\"_blank\" rel=\"nofollow noopener 
noreferrer\">https://github.com/PrismarineJS/mineflayer</a></li>\n</ul>\n<h3 id=\"35-multimodal-feedback-from-humans\" style=\"position:relative;\"><a href=\"#35-multimodal-feedback-from-humans\" aria-label=\"35 multimodal feedback from humans permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>3.5 Multimodal Feedback from Humans</h3>\n<p>この時点では GPT-4 は Vision がなかったとのこと。</p>\n<blockquote>\n<p>We demonstrate that given human feedback, Voyager is able to construct complex 3D structures in Minecraft, such as a Nether Portal and a house (Fig. 10). 
There are two ways to integrate human feedback:</p>\n</blockquote>\n<p>Given human feedback, Voyager was able to build complex 3D structures.</p>\n<h2 id=\"感想\" style=\"position:relative;\"><a href=\"#%E6%84%9F%E6%83%B3\" aria-label=\"感想 permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Impressions</h2>\n<p>I expected much of this to be Minecraft-specific, but was surprised at how many of the components are actually general-purpose.</p>\n<p>Verifying skills and storing them was especially interesting, since I had originally wanted to build an agent that works that way.</p>\n<p>The paper predates GPT-4V, so I am also curious how things would change with visual information available.</p>","fields":{"slug":"/posts/voyager","tagSlugs":["/tag/llm/","/tag/agent/"],"autoRecommendPosts":["llm-based-agents-survey","llm-patterns","a-survey-of-agents","mrkl-systems"]},"frontmatter":{"date":"2024-01-29T12:34:25.821Z","description":"Notes from reading \"Voyager: An Open-Ended Embodied Agent with Large Language Models\".","tags":["llm","agent"],"title":"Notes on \"Voyager: An Open-Ended Embodied Agent with Large Language Models\"","socialImage":null,"recommendPosts":null}}},"pageContext":{"slug":"/posts/voyager"}},"staticQueryHashes":["251939775","3942705351","401334301"]}