Topic | Advancements in Embodied

Author: 与阳光共进早餐 | Published: 2023-12-02 05:21

1. Preface

A brief look at recent progress in LLM-based embodied AI.

2. paper: Embodied Task Planning with Large Language Models (arXiv 2023)

2.1 basic info

task: embodied task planning
model: the paper proposes the TaPA (TAsk Planning Agent) framework.
main idea: aligns large language models (LLMs) with visual perception models to generate executable plans in physical environments.

2.2 main contribution

Multimodal Dataset Construction

A dataset of <indoor scene, instruction, action plan> triplets.

Grounded Plan Tuning

Finetuning pre-trained LLMs for grounded planning, considering the physical constraints of the scene.

Extending Open-Vocabulary Object Detection

Open-vocabulary detection is extended to multi-view RGB images, which is crucial for understanding scene context.
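To make the multi-view idea concrete, here is a minimal sketch of how labels detected in several RGB views could be merged into one scene object list. It is an illustration only, not the paper's pipeline: the function name `merge_view_detections`, the `min_views` threshold, and the sample labels are all assumptions, and the detector itself is left out.

```python
from collections import Counter


def merge_view_detections(per_view_labels, min_views=1):
    """Merge object labels detected in several RGB views into one object list.

    per_view_labels: one list of label strings per view (hypothetical detector output).
    min_views: keep a label only if it shows up in at least this many views.
    """
    counts = Counter()
    for labels in per_view_labels:
        # Count each label at most once per view, ignoring case and stray spaces.
        counts.update({label.strip().lower() for label in labels})
    return sorted(label for label, n in counts.items() if n >= min_views)


# Made-up detections from three views of the same room.
views = [
    ["Sofa", "table", "table"],
    ["table", "lamp"],
    ["sofa ", "Mug"],
]
scene_objects = merge_view_detections(views)
print(scene_objects)
```

Raising `min_views` trades recall for robustness: a label seen in a single view is more likely to be a false positive, but it may also be an object visible from only one angle.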

2.3 main idea

The TaPA framework integrates LLMs with visual information from open-vocabulary object detectors. It takes a human instruction together with the list of objects available in the scene and generates a feasible action plan for navigation and manipulation tasks.
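The input side of this idea can be sketched as a prompt that pairs the instruction with the detected object list, plus a simple check that a returned plan only mentions objects in the scene. Everything here is an assumption made for illustration: the prompt wording, the `(action, object)` plan format, and the helper names `build_planning_prompt` and `ungrounded_objects` do not come from the paper, and no LLM is actually called.

```python
def build_planning_prompt(instruction, scene_objects):
    """Combine the instruction and the scene object list into one text prompt."""
    object_list = ", ".join(scene_objects)
    return (
        "You are an indoor robot. Plan only with objects that exist in the scene.\n"
        f"Objects in the scene: [{object_list}]\n"
        f"Instruction: {instruction}\n"
        "Plan:"
    )


def ungrounded_objects(plan_steps, scene_objects):
    """Return the objects a plan refers to that are not present in the scene."""
    known = set(scene_objects)
    return sorted({obj for _, obj in plan_steps if obj not in known})


scene = ["cup", "sink", "table"]
prompt = build_planning_prompt("Bring me a cup of water", scene)
print(prompt)

# A hand-written plan standing in for LLM output; "fridge" is not in the scene.
plan = [("navigate_to", "sink"), ("pick_up", "cup"), ("navigate_to", "fridge")]
missing = ungrounded_objects(plan, scene)
print(missing)
```

A check like `ungrounded_objects` is one cheap way to flag plans that refer to objects the detector never saw, which is the failure that grounding on the object list is meant to prevent.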

2.4 results

3. paper: Large Language Models as Generalizable Policies for Embodied Tasks (arXiv 2023)

3.1 basic info

task: visual embodied tasks
model: Large Language model Reinforcement Learning Policy (LLaRP)
main idea: integrates pre-trained LLMs with egocentric visual observations to directly output actions in the environment.

3.2 main contribution

LLaRP Framework

A new framework that combines LLMs with reinforcement learning for embodied AI tasks.

Generalization Capabilities

Demonstrated robustness to paraphrased instructions and ability to generalize to novel tasks.

Language Rearrangement Benchmark

A new benchmark for language-conditioned rearrangement, comprising 150,000 training tasks and 1,000 test tasks.

3.3 main idea

A pre-trained, frozen LLM processes the text instruction and the visual observations;
a small number of additional blocks (highlighted in red in the paper's architecture figure) are trained with reinforcement learning;
the frozen LLM combined with these trained blocks can then generalize to novel tasks.
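The frozen/trainable split can be illustrated with a toy policy in plain Python. This is a sketch under heavy assumptions, not LLaRP itself: a fixed random projection stands in for the frozen LLM, a single linear action head is the only trainable block, the task is a one-step toy problem, and REINFORCE is used only for brevity (these notes do not say which RL algorithm the paper uses). All names are hypothetical.

```python
import math
import random

random.seed(0)
N_FEATURES, N_ACTIONS = 4, 3

# Stand-in for the frozen pre-trained LLM: fixed weights that are never updated.
FROZEN_W = [[random.uniform(-1, 1) for _ in range(N_FEATURES)]
            for _ in range(N_FEATURES)]

# The only trainable block: a linear action head on top of the frozen features.
head = [[0.0] * N_FEATURES for _ in range(N_ACTIONS)]


def frozen_backbone(obs):
    """Map an observation to features using the fixed weights."""
    return [math.tanh(sum(w * x for w, x in zip(row, obs))) for row in FROZEN_W]


def action_probs(features):
    """Softmax over the action head's logits."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in head]
    top = max(logits)
    exps = [math.exp(logit - top) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(obs, reward_fn, lr=0.5):
    """Sample one action and apply a REINFORCE update to the head only."""
    features = frozen_backbone(obs)
    probs = action_probs(features)
    action = random.choices(range(N_ACTIONS), weights=probs)[0]
    reward = reward_fn(action)
    for a in range(N_ACTIONS):
        # Gradient of log pi(action) with respect to logit a.
        grad_logit = (1.0 if a == action else 0.0) - probs[a]
        for j in range(N_FEATURES):
            head[a][j] += lr * reward * grad_logit * features[j]
    return action, reward


obs = [1.0, 0.0, -1.0, 0.5]
frozen_before = [row[:] for row in FROZEN_W]
for _ in range(300):
    # Toy reward: only action 2 counts as success.
    reinforce_step(obs, lambda a: 1.0 if a == 2 else 0.0)

print(action_probs(frozen_backbone(obs)))
```

Only `head` changes during training while `FROZEN_W` stays exactly as initialized, which mirrors the split described above: the large pre-trained model is reused as-is and reinforcement learning adjusts a small number of added parameters.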

4. other papers

GOAT: GO to Any Thing
CLIP-Fields: Weakly Supervised Semantic Fields for Robotic Memory
Discuss Before Moving: Visual Language Navigation via Multi-expert Discussions

:( I'll write up each of these papers in more detail when I get the chance.

Article link: https://www.haomeiwen.com/subject/zklhgdtx.html