Welcome to episode 56 of Lost in Immersion, your weekly 45-minute stream about innovation. As VR and AR veterans, we will discuss the latest news of the immersive industry. Let's go! Seb is not here because he's on holiday. You were on holiday last week, and that's why there was no episode. But we are back, and I guess you have some great news to share with us.

Yeah. Okay, so today I want to talk about this paper that was released by a university in collaboration with NVIDIA. It's actually continuous work that has been evolving over the past years, but I think this new iteration is quite interesting to look at. If we back up a bit into the goal of this kind of technology, the idea is to train robots to do things: to walk, to grab an object, all the stuff that we want robots to do correctly. Instead of programming the robot and then testing that program in the real world, the idea is to use a virtual simulation of the robot, do the training in virtual reality, basically, and then transfer that training onto the real robot. This has been going on for quite some time.

What's interesting here is that there are two main difficult tasks when doing that. The first is designing a good reward function, basically how we measure that the robot actually does what we want it to do. The NVIDIA team used GPT-4 to help the human create and design this reward function, and GPT-4 turns out to be much more efficient than a human at writing these functions. That's one thing. The other place where AI was used to help: the risk when training in a virtual world is that the simulation is too idealized compared to the real world. In the real world you have wind, different kinds of ground, and the ball you're looking at here might have a different pressure. To correct for that, the simulation exposes a lot of parameters that can be adjusted, and GPT-4 was used here as well to randomize those parameters, so the training does not overfit to the virtual world alone. (A rough sketch of what such a reward function and parameter randomization could look like follows below.)

And it seems like they are getting pretty good results. As you can see, the robot is able to walk, a bit weirdly but efficiently, on the yoga ball. They have other examples, a lot of videos, and here you can see different types of training that they experimented with. Here is the virtual world, and as you can see it's a really basic virtual world. This is why having a lot of randomized parameters is so important, because in reality the ground will never be as perfectly flat as it is in the simulation. They also have dexterous manipulation: here the robot hand is manipulating a cube, and in the previous paper they had a robot hand spinning a pen, thanks to this virtual training. You can see here what GPT-4 has generated, and just a simple forward locomotion task. So, yeah, that's what I wanted to discuss today. I think it's really interesting, because it's a very typical case of an LLM, GPT-4, helping another AI, the AI that is building the walking model for the robot, for example. It's a very good example of the way AI can really speed up technology and innovation, by having AI help other AI achieve better results. So I hope that you have enjoyed this presentation.
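As a purely illustrative aside, here is a minimal Python sketch of the two ingredients described above: a simple reward function for keeping the robot balanced, and domain randomization of the simulation parameters. All names, terms, and value ranges are assumptions made for the example; this is not the code from the paper.

```python
# Illustrative sketch only: a balance reward and randomized sim parameters.
import random

def balance_reward(torso_height, target_height, torso_tilt, joint_torques):
    """Score one simulation step: stay at the target height above the ball,
    stay upright, and avoid wasting energy."""
    height_term = -abs(torso_height - target_height)          # stay on the ball
    upright_term = -torso_tilt ** 2                            # penalize tilting
    effort_term = -0.001 * sum(t * t for t in joint_torques)   # penalize effort
    return height_term + upright_term + effort_term

def sample_sim_parameters():
    """Randomize physical parameters so the policy cannot overfit to a single
    idealized virtual world (domain randomization)."""
    return {
        "ground_friction": random.uniform(0.4, 1.2),
        "ball_pressure": random.uniform(0.6, 1.0),        # softer or firmer ball
        "wind_force": random.uniform(0.0, 5.0),           # sideways push, in newtons
        "motor_strength_scale": random.uniform(0.8, 1.1),
    }

# Each training episode would run in a freshly randomized world and be scored
# step by step with the reward above.
print(balance_reward(torso_height=0.62, target_height=0.65,
                     torso_tilt=0.1, joint_torques=[1.2, -0.8, 0.5]))
for episode in range(3):
    print(f"episode {episode}: {sample_sim_parameters()}")
```

The point of the sketch is only to show where an LLM can intervene: proposing and tuning the terms of the reward, and choosing how widely to randomize the physical parameters.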
So, yeah, that's it, basically. Okay, so several things. To comment on your last point: you're completely right, the direction we are taking with AI right now is clearly to mix different agents with different specialties. If we want creation, something truly innovative out of AI instead of just a replay of what it has learned, the agents have to be confronted with each other, to create a debate between different AIs. That is how we are getting interesting results right now. And it's particularly funny, because it's exactly what we do in real life. When you have a brainstorm or you want to create something, you have an expert in one field, an expert in another field, a manager, a project manager, different experts around the table, and they all argue and present their ideas, and ideally they converge on something creative. This is exactly what we are doing here. So, once again, we're just copying what we do in the real world with artificial intelligence. I completely agree with your last comment on that.

My other question is: what is the goal of this? We know that the use of robots and research in the robotics field have reached another level these past weeks, with humanoid or dedicated robots for complex tasks. We know they are not there yet; they need more training, more advanced movements, better motors, and so on. We saw that Boston Dynamics retired their previous humanoid robot and created a new one, still named Atlas. We can see that it still has a human shape, but it doesn't respect the joints that we have; it's more like an augmented humanoid, with lots of movements that we can't do. So, what do you think the goal is behind this: a virtual robot agent, or a super efficient robot in the near future?

Yeah, that's a good question. I don't have a clear answer, actually, but I can think of a lot of applications in industry, to do work that humans cannot do, all those kinds of uses where robots are already deployed. One thing that is interesting here is that it speeds up the learning of new behaviors for robots. Instead of doing very specific algorithmic programming for each robot, they used AI plus training in a virtual environment to speed up the training. I guess this would increase the number of things a robot can do, and maybe the robot can even learn new things after it has been designed.

Yeah, because for this kind of use case, meaning a basically autonomous robot doing very specific or very hard tasks, I guess there are three fields. The first one is industry and manufacturers in general, like we've seen with Mercedes, BMW, and so on, who are trying to replace some manual tasks that are currently done by humans. The second is helping seniors and people in general, like what China is trying to do; they have committed to creating a whole new industry of assistive robots. And the third one is military applications, which is more on the United States side.
I saw that they've already weaponized some of those dogs, those Boston Dynamics dogs, and I don't know if you've seen, but they already succeeded in getting an F-16 completely controlled by AI, so they validated that as well. Yeah, so there are good and maybe less good sides to this, but of course it is getting... I guess there's a curve of progression over the last maybe 15 years for Boston Dynamics, we've seen these robots for quite some time now, but the progress we've seen in the past few months is exponential compared to what they achieved over the previous ten years, so we can hope that their robots will become far more useful than they were just a few months ago. Yeah, I think this is one of the worries around AI: that thanks to AI, progress will arrive exponentially, faster and faster, and we will have trouble adapting to such rapid innovation and technological progress. It's very difficult to say. On one side of the debate you have people who are indeed very worried about that, asking how we will even be able to regulate all these new technologies as they arrive. Like, if we have a hundred years of innovation compressed into five years, how will we be able to handle that? On the other side, there are people saying that ChatGPT is dumb and not able to complete any task correctly. So, it's very difficult to know. I'm a bit in the middle, I think, maybe a bit more on the worried side.

Okay, so I guess we can move on, or do you have any other subject to share? No, that's it for me. Okay, so as Seb is not here, I'll take his place to showcase something new. Sorry, my screen should be shared. Sorry, some sharing issues. You should see it now. Yeah. Okay, perfect. So, as I was saying, since Seb is not here, I'm taking the lead on the Gaussian splatting news. The news is that Temporal Games just announced they are now able to integrate animated Gaussian splatting clips. You can see here the result: a real-time, well, not interactive, but immersive movie. It's very interesting to see that a technology that was announced just a few months back, not more than that, can now be fully integrated into video games. Of course, the capture side is still a bit complicated, because you need some kind of dome rig with dedicated cameras or sensors, but the result is there: you get real-time rendering, and it picks up all the lights and shadows of your immersive scene.

I just wanted to push the debate a bit further than the announcement of the technology, and bring back some old memories. We know that right now, in video games and immersive applications, we have reached some kind of plateau in the realism of NPCs and characters. I don't know, the MetaHumans are an example of great rendering of humans in immersive worlds, but there's still this uncanny effect that we have a lot of difficulty correcting in order to create really realistic humanoid characters. So I was thinking that maybe this kind of technology is the answer for getting those very realistic renderings in video games and applications. And by linking those ideas, I remembered a time when we were already doing this kind of thing. If you remember the Mortal Kombat games and the whole trend of filming actors to integrate them into video games, it was 2D, of course.
But maybe we are on the verge of reliving that era, filming actors in a dedicated space like a dome and bringing them back inside video games. And I can't talk about this without mentioning Toonstruck, one of my personal favorite games, where Christopher Lloyd was integrated into a cartoon video game. A really great game if you don't know it. But yeah, it's been some 14 years apparently. More than that, I guess. What do you think? Do you think this animated Gaussian splatting could be the answer for getting more immersive or realistic avatars and NPCs in our applications?

Yeah, I think so. I have two thoughts on this one. The first is actually linked to what we discussed just before, and I think you said it: the first Gaussian splatting paper is maybe one year old or so. Yeah. So when we hear people saying that AI progress is going slowly, well, it's only been one year, which is nothing in the grand scheme of things. It's impressive to see how fast the progress is going in just one year. I think we already talked about Gaussian splatting in movies, and here, yeah, we can see the opportunity in games. I think so as well. The only thing is, I don't think this can be freely re-animated. So maybe, I don't know, the rig could be a movable rig, so the actor wouldn't have to perform every possible movement; they would only do, I don't know, ten minutes of movements, and some AI would interpolate all the other movements. But yeah, when I see this, I also think about music videos. On the Vision Pro, and actually on the Quest as well, it's available on both platforms, there is an app called AmazeVR, with music videos that were recorded specifically for VR, but from a fixed viewpoint. I really see the next step as being able to actually move around inside the music clip. That would be amazing as well. So, yeah, a lot of opportunities there.

I totally agree. And what do you think? I guess this is an old, ongoing battle between two technologies. Do you think we will have more Gaussian avatars animated through motion capture, or completely captured volumetric video like we've just seen, which is more like the FMV we did in the past? I guess both have their pros and cons, but this is basically the choice: do you want the Apple technology called HUGS, where you scan someone as a Gaussian splat and then animate them through animation clips, or this kind of fully captured technology? (A minimal sketch contrasting the two approaches is below.) And do you think that if this technology becomes the real thing, we'll see bigger rigs, or whole warehouses fully equipped with very high-end cameras, to capture those entire scenes we are talking about?

Yeah, on that last technology you mentioned, I think it was Intel that had a huge one where they could capture more than two actors, maybe a car or something like that, bigger elements in the dome. So that could be a really nice investment for the movie industry. Yeah, because we know they spent a lot of money back in the day on facial motion capture of famous actors, for example. It could be way easier for them to do this facial capture with Gaussian splatting. So yeah, I guess it could completely transform the whole motion capture field, in movies and in video games as well.
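To make that contrast concrete, here is a minimal, hypothetical Python sketch of the two routes being debated: a captured clip that stores a full set of Gaussians per frame (the volumetric-video route), versus a single scanned set of Gaussians re-posed every frame by a skeleton (the HUGS-style route). The data layout and function names are illustrative assumptions, not any vendor's actual format or API.

```python
# Illustrative only: per-frame captured splat clip vs. one rigged splat avatar.
from dataclasses import dataclass
from typing import List

@dataclass
class Splat:
    position: tuple      # (x, y, z) center of the Gaussian
    scale: tuple         # per-axis extent of the Gaussian
    color: tuple         # RGB
    opacity: float

# Route 1: volumetric video. The capture rig produces a full set of splats for
# every frame; playback is just indexing into the recorded sequence.
def clip_frame(clip: List[List[Splat]], frame_index: int) -> List[Splat]:
    return clip[frame_index % len(clip)]

# Route 2: one scanned avatar, animated afterwards. Each splat is bound to a
# skeleton bone; a pose (one translation per bone here, rotations omitted for
# brevity) moves the splat centers every frame.
def pose_avatar(avatar: List[Splat], bone_of: List[int], pose: List[tuple]) -> List[Splat]:
    posed = []
    for splat, bone in zip(avatar, bone_of):
        dx, dy, dz = pose[bone]
        x, y, z = splat.position
        posed.append(Splat((x + dx, y + dy, z + dz), splat.scale, splat.color, splat.opacity))
    return posed

# Usage: route 1 replays exactly what was filmed; route 2 can play any
# animation clip, at the cost of a rigging and skinning step.
avatar = [Splat((0.0, 1.0, 0.0), (0.01, 0.01, 0.01), (255, 200, 180), 1.0)]
print(pose_avatar(avatar, bone_of=[0], pose=[(0.0, 0.05, 0.0)]))
```

The trade-off in the discussion maps directly onto these two shapes of data: the captured clip keeps all the photographic detail but only the performance that was filmed, while the rigged avatar is reusable but depends on how well the skinning holds up.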
Very interesting to see that maybe actors will be brought back to the front of the stage thanks to this technology. And it's interesting that we keep making progress by coming back to ideas we had in the past, like 20 or 25 years ago with video capture for video games. It was a way to get more realistic results, and we are doing exactly the same thing right now, but in 3D and backed by AI. It's funny to see that we are going in circles but moving forward at the same time.

Yeah, so if I understand correctly, the question is: will Gaussian splatting and all of this reach the level of quality that MetaHumans and virtual humans cannot reach? Which one will win? Yeah, we've already seen the spatial memories that Seb shared, I guess it was two weeks ago. When you see the quality of that, once again, we don't know how much time they spent cleaning up the data, but on the result side it was awesome. We didn't have the ability to zoom in to see the skin and all its flaws, but I guess we are on track to get to that point, because I don't think pure 3D can compete with very high-definition capture. When you have twenty 8K cameras picking up all the very small details of a person, getting the same kind of result from modeled 3D, or from AI-generated 3D, seems very, very complicated. I tested InstantMesh last week, and of course you can get a 3D asset very quickly and bring it into a virtual application very fast, but it's still very rough right now. The step between what we got with Midjourney two years ago and what we can have now is basically the same step, but it tends to move more slowly than in 2D, because 3D is just far more complicated. But we can hope. I guess this competition will last for a long time, or maybe both can live in the same space: very high-definition Gaussian splatting on one side, and very high-definition AI-generated 3D on the other. We'll see which one gets the trophy at the end. It's pure speculation right now, but it's very interesting to see those two technologies moving forward at roughly the same pace.

Yeah, I mean, when we see the Persona on the Vision Pro, and the avatars that Meta demoed a couple of months ago, we don't know when they will release them, but I think they did a podcast with... With the Mark Zuckerberg interview, yeah. Yeah, that was using these avatars. So I guess it's a blend of these technologies that will maybe succeed. The Persona on the Vision Pro is pretty cool, and I guess it's getting better; it will get better in the coming months because they are still improving it.

So, just to finish, I will give you the mic for a reflection you had, I guess, last week, because once again the mainstream media are trying to kill the Apple Vision Pro, like they did with the metaverse. Their tendency to trash-talk technology is becoming something we are used to now. So I guess you had some things to say about this. Yeah. As you said, I think the headline was: Apple is killing the Vision Pro, they are not selling it. Slashing the sales, or something like that. Yeah, a very violent headline. And it seemed strange to me when I heard the news.
So, I've looked at the numbers. A bit less than one year ago, when the Vision Pro was announced, I think the estimates were, I don't have the exact figures, they're in the article that I wrote, but something like 400k. Then, in January, it was maybe 600k or so, and now they are back at around 400k worldwide sales for 2024. So I think there was a lot of speculation in January, just before the release, and now it's more of a return to more nuanced estimates. What it looks like to me is just Apple learning from their sales numbers. And of course, it's such an expensive device that it was expected, maybe it's easy to say in hindsight, that a lot of people would buy it at first, people like us, developers and tech-savvy people, and then, once all of those people have purchased it, sales would go down.

It's funny, because at the beginning the articles were saying it's way too expensive, nobody will buy it, and then, oh, people are buying it after all. And now that, as you mentioned, the curve is flattening, it's suddenly the end of the Apple Vision Pro, because the growth is no longer exponential. But it can't stay exponential, especially with that price tag. What I really like in your article is that when you do the math between the sales of the Apple Vision Pro and those made by Meta, the revenues are basically the same, or Apple is even making more money than Meta at this point (a rough back-of-envelope version of that math is sketched below). So it can't be called a failure; Apple is making money with this, whatever the case. And when they tell us there won't be a new Apple Vision Pro in 2025, maybe they're just adjusting to the technology, because it's evolving as well, and to the price and scarcity of components. We know it's really hard to get some graphics cards and components right now, so maybe they're just adjusting their plans. We also know there probably won't be the front screen anymore, because it makes the headset heavier and the effect is not that good. We already know all of this. And yeah, it's really a pity that all those articles are just copy-pasting each other. If you search for news about the Apple Vision Pro, it's the same article over and over again, saying they are slashing sales and deliveries and that maybe there won't be any more Apple Vision Pro. But we know the strategy here: they spent billions of dollars, it's ten years of research, they won't throw that in the trash, especially after the Meta announcement two weeks ago that they want to be the king of VR. Of course there will be a response from Apple on that field.

Yeah. I'm really curious. I saw just today an interview with... sorry, his name is slipping my mind, the guy whose company was bought by Apple to create the Vision Pro. Oh, Bertrand Nepveu? Yes. I saw he released an interview just a couple of days ago. I haven't had the time to watch it yet, but I'm really curious to know what he thinks, and I hope the interviewer asked him about it. Yeah, he worked at Apple and he still has inside information, which he drops here and there during his interviews, so you have to follow him to know exactly where things are going. But yeah, there's some very interesting information about what Apple is doing and will do in the coming months. So, we'll have a look at that.
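For what it's worth, that revenue comparison is easy to reproduce as a back-of-envelope calculation. This sketch uses the rough 400k Vision Pro estimate quoted above and the known launch prices ($3,499 for the Vision Pro, $499 for the Quest 3); the Quest 3 unit count is a purely hypothetical figure for illustration, since Meta does not publish one. The only point is that far fewer units at a much higher price can match or exceed Meta's headset revenue.

```python
# Back-of-envelope only; the Quest 3 volume below is an assumed, illustrative number.
vision_pro_units = 400_000          # rough 2024 estimate mentioned above
vision_pro_price = 3_499            # USD, Vision Pro launch price

quest3_units = 2_500_000            # ASSUMPTION for illustration, not a reported figure
quest3_price = 499                  # USD, Quest 3 launch price

apple_revenue = vision_pro_units * vision_pro_price
meta_revenue = quest3_units * quest3_price

print(f"Apple: ~${apple_revenue / 1e9:.2f}B")   # ~ $1.40B
print(f"Meta:  ~${meta_revenue / 1e9:.2f}B")    # ~ $1.25B with the assumed volume
```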
Oh, and speaking about price, I also saw that Meta has dropped the Quest 2 to $199, I think. A very, very low price, well, for that device. So, I don't know, maybe they just want to... Yeah, I guess they just want to get rid of the last bit of stock they have. And we know there were leaks about a low-cost Meta Quest 3, the Quest 3S, I can't remember the exact name. But yeah, I guess they are clearing out Quest 2 stock and trying to sell as many headsets as possible, and then they'll release the new Quest 3. I still have one. Yeah. They are still releasing updates for it, so I guess it's still something. We just hope they won't do the same as with the Quest 1, which became a brick just a few months after this kind of strategy. It would be very frustrating for those buying their Quest 2 at $200 right now if, just six months later, well, too bad, it doesn't work anymore. We'll see what they do. But as we discussed a few months ago, what was very funny was that when they had new innovations, they were showcasing them on the Quest 2 and not the Quest 3. So it's interesting to see that they are still working with it. We'll see the future of the Quest 2, I guess, in a few months as well.

Yeah, I was talking with some people in the VR field a couple of weeks ago, and a few of them said they are still using the Quest 2 for testing. I probably have a selection bias because they were people in VR, but yeah, it's a device that is still in use. Yeah, I'm still working with it as well, a very cool device, very easy to use. Okay, so I guess this is it. Anything more to add? No, we're good. So we'll see you next week with Seb, hopefully. Have a nice week, and see you for our next episode of Lost in Immersion. See you. Thanks. Bye.