Welcome to episode 57 of Lost in Immersion, your weekly 45-minute stream about innovation. As VR and AR veterans, we will discuss the latest news of the immersive industry. We are welcoming back Seb from his holidays and we'll start with Fabien, as usual. Cool, thanks. It's been a long time since we discussed AI, I think. And there was a very interesting unfolding yesterday. So, today or tomorrow, depending on where you are in the world, is the Google I/O conference, which we guessed a few weeks back would include some VR or mixed reality announcements, but it seems to be more directed at AI. And they released this teaser yesterday, which shows what we think is Gemini, the AI, having a real-time visual understanding of what's happening in the camera view, and describing what it is seeing. So, that sounds great, but we guess, and we usually know, that with Google there is a bit of doubt about the actual push to production of what they are showcasing. So, it's interesting, but let's see how they release it. And then, the same day, OpenAI said: stop, Google, hello, here is the new GPT-4o model. Which, to me, is really a revolution. It's just an O, not a number, but it's really an amazing update. Basically, with GPT-4, when you talked to it, there was a speech-to-text step, then a model that looks at the text, and then the text is transcribed back to speech. So it takes a lot of time, there is a lot of latency, and you cannot interrupt it. All of this is gone with O, which stands for Omni, like everything. The model understands audio, vision and text in real time. And so, you can have a look at all the demo videos; it's pretty amazing. In this one, someone is using GPT-4o to ask about their styling. GPT-4o says, oh, you should arrange your hair, and they put on a hat and the AI answers, oh yeah, it's cool with a hat like that. They do translation.
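To make the difference concrete: the old cascaded setup chains three separate calls (speech-to-text, a text model, then text-to-speech), while an omni model takes the modalities in one request. A minimal sketch of what a single multimodal request body looks like, following the message shape documented for OpenAI's chat completions image input; the helper function and the placeholder image bytes are illustrative assumptions, not part of any SDK.

```python
import base64


def build_omni_request(text: str, image_bytes: bytes, model: str = "gpt-4o") -> dict:
    """Build one multimodal chat request carrying text and an image together.

    With a cascaded pipeline you would instead hop through three calls
    (audio -> transcription -> text model -> TTS), paying latency at each
    step; an omni model accepts both modalities in a single request.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": text},
                    {
                        # Image is inlined as a base64 data URL in the message.
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{b64}"},
                    },
                ],
            }
        ],
    }


# Example: one request carrying both the question and a camera frame
# (fake PNG bytes here, just to show the structure).
req = build_omni_request("What is happening in this frame?", b"\x89PNG fake frame")
print(req["model"])  # gpt-4o
```

The same dict could then be passed to a chat-completions endpoint; the point is that the camera frame and the question travel in one round trip instead of three.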
They have two AIs discussing with one another at the same time. There are new apps that will be released, including a macOS app with screen capture capabilities. So, here, you are seeing GPT-4o helping a student solve a math problem. They also show a demo of transcription and summary of an online meeting. Oh, and GPT-4o has a better memory of the previous conversation than GPT-4. So, yeah, let me look at my notes. Oh, yeah, you can interrupt it, and it's faster and cheaper than the previous model. Further down the page, they show some visual capabilities as well. It seems like what we judge as creativity has increased: the ability to generate images, or to write poetry combined with images. It seems to be pretty amazing. Here, you see a transformation of a picture into a caricature. So, yeah, I think there are a lot of things to say, but I will stop here for now. And I'm curious to hear what you think. I don't know if you have had the time to look at the videos because it's pretty recent, but I'm really curious to have a try at this new model. So, yeah, Seb, what do you think? Yeah, I saw the news yesterday and saw the videos too, including a lot of the other videos that you are showing that have been released. And what shocked me is the responsiveness, the reactivity, the speed at which it responds to your comment or whatever you're asking it to do. And the fact that the AI is using natural language, making fun and having a mood, which changed a lot compared to what I saw previously, where it was more like a robot answering you. Here, it feels really like a human being answering your comment and having fun with whatever you are saying. Yeah, and the capability of showing something with your phone in real time and having the AI react to it, guiding you on how to answer a math problem or stuff like that.
Yeah, that sounds very usable at this point, compared to what was before, and enjoyable to use. At least that's what I saw. What about you, Guillaume? Yeah, sorry, just one more thing that I forgot to mention that was pretty impressive: in, I think, one of the first videos, the model is answering a question that the guy asked, and someone comes up behind him and goes out while the model is still answering. And the guy interrupts the model and asks, oh, did something strange happen just now? And the model is able to answer correctly. That was like, wow. Yeah, I was going to talk about that too. I had the same feeling: the fact that there is a memory of what happens and that it's able to answer you about it. It's like you could be wearing something, put your keys somewhere, move somewhere else, and it would remember where you put them. So, yeah, I'm very keen to use it and test it too. I guess there are a lot of things to say. First, I guess it's a bit sad that now we are very suspicious of what Google can showcase and that we are way more trusting of OpenAI. It seems like all the different communication in the past did some damage. But in short, I guess OpenAI is here delivering what all the other companies just announced or tried to deliver in the past few weeks. They are simply creating the virtual assistant that the AI Pin or the Rabbit R1 tried to deliver. They just did it on the cell phone, and it's basically working the way those devices were showcased, or the way their makers would have liked them to work. So, yeah, very interesting to see that. Once again, we can see there is a real-time battle between the big players of the AI world. They are trying to cut each other off every time someone is about to make an announcement; they just announce things very quickly. So, I don't know, once again, if it was scheduled for this day, like Meta did with Horizon OS, for example.
We know that Apple is on the verge of announcing something as well, with the M4 chips and also an AI assistant. So, this is a very entertaining war between them, and at the same time the technology is advancing very, very fast. Why I'm telling you that maybe it was not supposed to be announced this quickly is that it's just one week or two after the announcement of the ChatGPT memory feature and its ability to remember your conversations and so on. It's a very fast pace of announcing new stuff. Maybe it was not supposed to happen this fast, and this is maybe why, even though we know that when OpenAI announces something it's usually available the day after or in the following hours, that's apparently not the case here. As you tried it, and some of my colleagues as well, we can have access to GPT-4o, but it's not the real one behind it yet. So, they are just plugging stuff in in the background right now. So, we'll see: whether it takes a few days or a few weeks before we get the full version of GPT-4o will answer our question of whether it was supposed to happen this fast. And lastly, because they announced other stuff as well in the past days: the ability to have your conversations public. They want to let your exchanges with the AI be posted publicly, meaning that you can search them, like on Google, and find the conversation again. So, we have this kind of assistant that people will be using like 24/7, I guess, depending on your phone's battery life, but this is an AI that will scan your everyday life in very precise detail. So, of course, it's revolutionary. It's a great new technology, but in some way, it's the complete end of your privacy at this point, depending on what it analyzes, what it sends back to the server, and what it can remember as well.
So, it's very strange, because on the one hand, it's exactly what we want a virtual assistant to be doing, meaning that it must know you exactly, know your environment, know what you're doing, remember where you put stuff, for example, like Seb said. But on the other hand, if it goes public, everybody can see and know who you are and what you're doing. Yeah, that was the topic that I wanted to discuss as well: security and privacy with this kind of assistant and the human-like interaction that it seems to show. Okay, I will make a quick guess, and let me know if it's crazy, but we are not far from having real people falling in love with ChatGPT. Or maybe I'm a bit extreme in what I think, but at least if ChatGPT is showing emotions, then we will start to kind of build a relationship with it. And, you know, I don't have enough knowledge of that to be really sure, but I also see some issues happening there, maybe. Yeah, your question is answered very fast, because there are already some companies trying to sell this virtual boyfriend/girlfriend experience, and it would be better with this kind of assistant, of course. So, yeah, for some, it's an answer to loneliness. And I don't know what it can do to the human mind at this point. We talked also about the fact that you can keep a deceased person alive, quote-unquote, because of this AI, by fine-tuning your AI with all the pictures and the different memories you have of this person. So, a whole new world is just opening before our eyes. Very, very curious to know what it will become in the next months. And I guess, for the AI, remembering all your actions, all your ways of speaking, all your language, and mimicking that, mimicking yourself afterwards, should be pretty easy with all the information that it gathers about you. So, yeah, it's going to be another scary part.
Yeah, I think we were mentioning last week or two weeks ago that some people say, oh, AI progress has kind of plateaued. I was like, not really. But what I wanted to discuss as well is, here, I really see the usage with glasses. Because here you see he's holding a smartphone in front of the dog, but if you have glasses, it's a seamless integration. Yeah, we are encountering the same issue with AR and tablets, meaning that the device is not the right one for this kind of use case. And, of course, the smart glasses are making sense now. Yeah, definitely. Because the reactivity is good enough and the quality of the voice that responds to you is fun, and you can appreciate it, really, like another human being. So, yeah, it's really starting to be usable, like I said. Okay, great. So, we'll see if we can access it in the next few hours. And maybe we'll be doing some feedback next week about this, I guess. But, yeah, maybe Google will announce something else as well. Once again, though, we are very doubtful of what they can do now compared to what OpenAI is announcing. So, I wouldn't like to be in Google's shoes right now. Okay. So, Seb, if you want to... Sure. So, for me, I wanted to continue on the AI part and talk about Microsoft and the VASA-1 model, which allows you to take a picture, either generated by AI or one of yourself, and put it in the model, and either use AI to generate the voice or directly record your own. And it will directly generate a video of yourself, or of the picture you uploaded, talking with quite a lot of emotion and quite realistic behavior. So, it's called VASA-1 and, yeah, it seems pretty amazing. There are a lot of samples with uploaded pictures and different voices uploaded along with them.
And, yeah, the emotion: you can really look at the face and see all the emotion, depending on what you're talking about or the prompt that you add to say which emotion you want the character to have. And it is very, very realistic. It's quite amazing. There is a whole video of this lady, and all the characteristics of the face gestures are really natural and quite realistic. So, you're meaning that, is this available? Because I saw the paper a few days or weeks back and there was no platform for us to try it. Is this available now? I'm not sure. So, at the bottom of the paper, they have a whole paragraph on the risks and responsible AI considerations, and the last sentence is, I quote, we have no plans to release an online demo, API or product until we are certain that the technology will be used responsibly and in accordance with proper regulations. So, never. If it's not them, it will be someone else, I guess. But, yeah, the progress: if you mix that with GPT-4o, you start to get really realistic persons in front of you. Yeah, well, at this point, it depends on whether we can get this kind of result in real time, but I guess it will be the case. Like always, in a few weeks or a few months, it will be the case. Like with lip-sync or text-to-speech and speech-to-text, you can have it in real time right now. So, I guess this kind of technology will be real-time as well. Okay, Fabien, apart from the privacy and ethics part, anything to add on this? Yeah, I saw lots of articles that were very scared or afraid of this, but it's strange, because deepfakes are already very, very efficient. The technique is different here, meaning that you can do this from a single picture and it generates emotion and so on. But if you have a live stream or a video, you can get basically this kind of result from a real face. So, they're like, oh my God, the deepfakes will be invading our world, but it's already the case.
So, it's just a technique that is different and maybe more efficient in some way. I don't know, but deepfakes are already a reality, and you should not trust everything that you are seeing on the internet, because it could be fake. Yeah, it's just the simplicity: it comes down to one picture. Sorry, what you said is actually really worrying: if they don't release it, someone else will. Yeah, there are already some other papers about this. They are not as efficient in their demonstrations, but yes, of course, other people are working on this. And yeah, it will be released either by Microsoft or somebody else. Yeah. What really impressed me is that, based on one picture, it seems to really be able to mimic the face gestures, based on bones and structure that need to be inferred from it. The way the face reacts and moves seems really realistic, and from one picture, that's kind of amazing. Yeah, it's the amplitude of the movement that is very intriguing, because the face can move a long way from the pose where the picture was taken. We know that this kind of algorithm is based on 3D reconstruction at some point, as when you are doing deepfakes: they are creating the whole face in 3D, mapping the video onto it, and then generating the missing parts. So, very impressive. Yeah. So, no, it's not available. So, maybe they only kept the faces that worked great and left out the ones with huge mouth movements. But yeah, we'll see when we can test that someday. And the other topic I wanted to talk about is the Futuroscope and the Tornado Chaser attraction, which received a world trophy. The whole idea is that people are on a huge platform in front of a huge LED screen. It's a 20K-resolution, curved LED screen which, if it were laid out flat, would be 60 meters long. So, quite a huge screen. And they are simulating wind and smoke. So, they send smoke at the people.
And the whole platform is moving in 6DOF according to the movie that is displayed in front of the user. So, the experience is quite amazing. And I didn't know that it was available. So, now I want to go to Futuroscope to test it and see what it looks like. But yeah, it's amazing the amount of technology they put in it. It's very, very impressive. So, yeah, that's it for the Futuroscope. And the platform is rotating at 30 kilometers per hour, so even the reactivity of the whole platform is impressive. I don't know if you saw the news and if you have any comments on that. Fabien? No, I was like... Yeah, go on, go on. No, the first question that comes to mind is: why didn't they use VR headsets? It would have been maybe more immersive. Because, once again, when you see the images right here, you can see that you don't have the ceiling. And when you are seeing these kinds of images with tornadoes, of course, the ceiling is one of the main parts. So, I guess they are missing some stuff here. It would have been better, of course, with VR. I guess they didn't want to manage all the headsets and so on. So, of course, the big screens are always an answer to this. But I would be curious to know why they didn't do a half dome here, just to cover the whole thing. It's like they did the whole experience, but stopped a few feet from the finish line to get the real result. Because we know now, with, for example, the Sphere in Las Vegas, that you can do these kinds of sphere shapes with very small LED panels, like they used here. So, I guess it's just a budget issue, and they were missing the few hundred K to finish the whole experience. But yeah, it's very impressive and it's a massive platform. I'm curious also about the maintenance on this kind of attraction, because it must cost a lot to keep this working all day long for a long period of time. So, yeah, very impressive.
Very interesting to see that these theme parks are still innovating and trying out new concepts and immersive applications. I think, to answer your question about why they did not do a dome: it's because everything is moving with the screen. So, the screen is moving, I think, with the whole system. And they are putting smoke in from above, like you saw in the video. So, there are a lot of systems on top that are blowing smoke and air inside the theater. Yeah, go ahead Fab, sorry. Yeah, it looks great indeed. And so, is the screen fixed and the platform moving, or are both moving? I think only the platform is moving, right? Yes. Well, it's hard to tell, because the platform is also moving in height, so it's simulating that kind of movement too. So, I guess the screen must be attached to it so they move together. Okay, yeah, that was strange to me. Usually, when I see these kinds of screens, like the Sphere in Las Vegas, well, this one is obviously much, much smaller. But it's also about content. Like, okay, wow, we have a large screen now, but the content production also must be really thought through for this type of media. So, I don't know if they have multiple movies that they plan to release. I'm really curious to know if they have some kind of capability to port existing movies to it by adding movements. I don't know. But I think content is also a big part of this kind of project, so as not to end up saying, oh, we have a big screen, but we have nothing to put on it. So, yeah, I stand corrected: I'm just looking at the other video, and the screen stays fixed; only the platform is moving. We can see it here from above. So, they could have done the dome. Okay, anything else? No, that's it for me. Okay, final topic. So, this is about one of the features that was presented by Apple with the release of the Apple Vision Pro.
Yeah, I think it's very interesting. So, they are doing planes first, I think, because plane movements are very steady; in cars or trains there are actually a lot of pretty big movements, so it's much, much more difficult. And I don't know if the current hardware can actually handle that. Yeah, basically, it's always the same question: will people use it? There needs to be a very good reason. One that I can see is to watch movies, but as you said, it should be in VR. And hopefully we'll have the ability to download everything beforehand, because they mention in the article, oh, if you have complimentary Wi-Fi... it's like, yeah, you usually need to pay a lot for Wi-Fi. So hopefully it will work offline. I don't know. Yeah, I think it's a mandatory update; they should support this kind of experience. And I see they have a partnership as well with, I'm sorry, I forgot which airline. Lufthansa. Thanks. So I think it makes sense for these companies to propose more varied entertainment onboard and maybe charge more. What do you think, Seb? Like you said, it was a mandatory update, I think. But then, I don't see myself putting on the headset, except maybe for watching a horror movie with my son next to me in the plane, so he doesn't get scared by it. I don't see myself playing games inside the plane and moving around with my controllers. If so, it means it's content that I don't want other people to see. Or it's for working: seeing my computer without having to deploy it, and being able to type stuff and work directly in a comfortable environment.
But then, right now, with the Quest 3 or the Quest 2, the quality of the hand tracking is not good enough, and connecting to your computer or your Mac is not that easy to do. So the complexity of doing all that makes it not usable for me. It should be: you open it, you have direct access to your computer, and then you can work from there. Without all those steps being simplified, I don't see myself using it in the plane. Which I don't take that often; I try to avoid it as much as possible for environmental reasons. If it worked in trains, there would be more users, like in the metro or commuter trains. One usage that I can see myself doing is actually watching movies, because the screen is so small on the airplane that having a huge screen could be nice. I would like to have noise-cancelling headphones as well. It's a must, I think, in an airplane, with the noise. Actually, I'm just thinking about it: how does the Apple Vision Pro connect to AirPods? I don't know. I need to check that. As you said, if we start to play a game like the one we reviewed a couple of weeks ago, Asgard's Wrath 2, with a bow and arrow... No, but they are showing very simple use cases, like Tetris. Very simple games. Otherwise, you're going to get into real fights. If you are completely immersed, I wonder how you catch your stop. Also, if you're in a train, you have to be really careful about your stop and about when you want to remove the headset to catch your exit. That's where mixed reality makes sense. Or with an AI assistant that says, hey, your stop is coming up. Stop playing, dude. Okay, guys, I guess this is it for today. I think we lost Guillaume, right? I don't know. I'm not seeing him anymore. Me neither. Okay. Yeah. So thank you, everyone. I don't know how it will look, because we lost our recorder, Guillaume. But see you next week.
See you next week, guys. Bye.