{"id":4124,"date":"2025-07-02T04:14:24","date_gmt":"2025-07-02T04:14:24","guid":{"rendered":"https:\/\/techtrendfeed.com\/?p=4124"},"modified":"2025-07-02T04:14:24","modified_gmt":"2025-07-02T04:14:24","slug":"egodex-studying-dexterous-manipulation-from-massive-scale-selfish-video","status":"publish","type":"post","link":"https:\/\/techtrendfeed.com\/?p=4124","title":{"rendered":"EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div>\n<p>Imitation learning for manipulation has a well-known data scarcity problem. Unlike natural language and 2D computer vision, there is no Internet-scale corpus of data for dexterous manipulation. One appealing option is egocentric human video, a passively scalable data source. However, existing large-scale datasets such as Ego4D do not have native hand pose annotations and do not focus on object manipulation. To this end, we use Apple Vision Pro to collect EgoDex: the largest and most diverse dataset of dexterous human manipulation to date. EgoDex has 829 hours of egocentric video with paired 3D hand and finger tracking data collected at the time of recording, where multiple calibrated cameras and on-device SLAM can be used to precisely track the pose of every joint of each hand. The dataset covers a wide range of diverse manipulation behaviors with everyday household objects in 194 different tabletop tasks ranging from tying shoelaces to folding laundry. Furthermore, we train and systematically evaluate imitation learning policies for hand trajectory prediction on the dataset, introducing metrics and benchmarks for measuring progress in this increasingly important area. 
By releasing this large-scale dataset, we hope to push the frontier of robotics, computer vision, and foundation models.<\/p>\n<p>*Equal Contributors<\/p>\n<figure id=\"figure1\" class=\"\" aria-label=\"Figure 1\">\n<div class=\"bg-gray-light text-base rounded\"><a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/mlr.cdn-apple.com\/media\/grid_v4_converted_b19ed4c45e.png\" aria-label=\"EgoDex dataset overview showing egocentric video examples related to dexterous human manipulation.\" tabindex=\"-1\" target=\"_blank\" class=\"mt-0\"><img decoding=\"async\" src=\"https:\/\/mlr.cdn-apple.com\/media\/grid_v4_converted_b19ed4c45e.png\" alt=\"EgoDex dataset overview showing egocentric video examples related to dexterous human manipulation.\" loading=\"lazy\" class=\"bg-gray-light\"\/><\/a><\/div><figcaption class=\"muted\" id=\"figure-figure1-caption\" aria-hidden=\"true\">Figure 1: EgoDex is an open-source, large-scale egocentric video dataset and benchmark for dexterous human manipulation.<\/figcaption><\/figure>\n<\/div>\n\n","protected":false},"excerpt":{"rendered":"<p>Imitation learning for manipulation has a well-known data scarcity problem. Unlike natural language and 2D computer vision, there is no Internet-scale corpus of data for dexterous manipulation. One appealing option is egocentric human video, a passively scalable data source. 
However, existing large-scale datasets such as Ego4D do not have native hand [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":4126,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[55],"tags":[3750,1182,3749,395,136,2722,180],"class_list":["post-4124","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning","tag-dexterous","tag-egocentric","tag-egodex","tag-largescale","tag-learning","tag-manipulation","tag-video"],"_links":{"self":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts\/4124","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4124"}],"version-history":[{"count":1,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts\/4124\/revisions"}],"predecessor-version":[{"id":4125,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts\/4124\/revisions\/4125"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/media\/4126"}],"wp:attachment":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4124"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4124"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4124"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}