Welcome to episode 48 of Lost in Immersion, your weekly 45-minute stream about innovation. As VR and AR veterans, we discuss the latest news from the immersive industry. Hello, guys. Let's jump right in. Fabien?

Hello. This week I want to talk about a fairly broad topic. We've touched on it in past episodes, and I thought it deserved a longer discussion. Over the past year or so, we've seen a lot of devices jump onto the generative AI hype: the Frame glasses we spoke about last week, the Humane AI Pin that you're supposed to wear on your chest, ChatGPT's voice mode on the mobile app, and the Rabbit R1, the small square device, all very similar in their intended usage. And last week I noticed that D-ID, the company that made quite a buzz last year as one of the first to do generative video from just a picture (you send a text, and they generate a video of the face speaking it), changed their homepage with a new tagline: "Interfaces evolved." What they mean is that, from their perspective, in the future we won't use a mouse or keyboard; we'll use natural user interfaces, meaning eyes, voice, and ears, the way we interact with other humans, but applied to technology.

First, I find it very interesting to see them pivot from generative AI video to a kind of user-interface company, which I don't really understand and which isn't well explained on their website. That's why I listed all those devices. I'm curious to get your input: is this a real trend or just hype? Or is it the natural evolution of interfaces, such that we ultimately won't need the keyboard and mouse we all have? Or maybe, in who knows how many years, we'll have brain implants and won't even need to talk. That's the topic I'd like to discuss today. Seb, I'm curious what you think.

Yeah, I think it's a new way of interacting with a device that we're not at all used to yet, so it takes some training. But it seems like the same story as Siri on the iPhone: there was an adoption period, and now I see more and more young users treating voice as a standard way of interacting with their phone. I'm not keen on it myself, and some people will never want to talk to their device. So, as you said, maybe brain implants, or something that directly drives the device's focus, will still be needed as another input path. But having voice as one of the ways to interact would be nice.

Now, having a realistic avatar that interacts with you and reacts to your gestures, to the way you talk to it and the way you behave: there seems to be a long way to go before we're there. I don't know whether they've shared a video of what they have, but it seems not, so we'll see how uncanny it is. And if it lags, if it takes time to react to what you're asking the character, I suspect it won't be well adopted until it feels completely natural.

Yeah, I can imagine the difference between saying, "Hey, can you search for a restaurant that has this and this and this?"
versus just typing the type of restaurant into the search field. Typing is maybe faster, but less personalized, because when talking we can state all our preferences. I don't know. What do you think, Yom?

Well, I think this company is just jumping on the spatial computing train, meaning their original business case probably wasn't as efficient or as successful as they hoped, so this is a complete change of business. They're doing what every new tech startup does right now, piling on buzzwords like spatial computing and new interfaces. That's the first thing.

As for the viability of speech, eye tracking, and hands for our daily work: as Seb said, it's a long road, and it's far from usable. The example I always give for this kind of interaction is, how would you build an Excel sheet with it? Granted, this is a personal view, and I already find building an Excel sheet awful, but imagine making someone edit your spreadsheet through speech and eye tracking: "Yes, please, can you change cell IG4?" It would be a mess. It would take hours.

And we've already talked about how we all dream of the Minority Report interface while knowing that holding our arms in the air for long periods is incompatible with our biomechanics; we get tired very quickly. Apple Vision Pro users are experiencing this too, and Apple did it on purpose: the cameras point downward so you can keep your hands resting on the table or next to your legs for support. So this can be a complementary tool to what we have now. Of course, the mouse and keyboard need to be improved at some point, since we've been using them for decades, and I'm sure we can do better, especially with other interaction modalities, but not by erasing everything, the way spatial computing would like you to. Again, with the Apple Vision Pro keyboard you see people typing in mid-air with two fingers, while the people actually working in the headset carry a portable Bluetooth keyboard in their backpack and type the way they always have. Lots of ideas, but unfortunately it's all far too complicated, or not mature enough, to be applied. So I'm not sure about this company's chances of success. I don't know what products they'll release, but I'm a bit doubtful.
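To make the spreadsheet example concrete, here is a toy Swift sketch of a voice-command handler. The grammar, the cell naming, and the apply function are all hypothetical; the point is that a rigid command like "set cell IG4 to 12.5" parses fine, while the natural corrections people actually speak fall off the happy path and force another round trip.

```swift
import Foundation

// Hypothetical spreadsheet state: cell reference -> value.
var sheet: [String: Double] = [:]

// Accepts only the rigid grammar "set cell <REF> to <NUMBER>".
func apply(_ utterance: String) -> Bool {
    let pattern = #"^set cell ([A-Z]+[0-9]+) to (-?[0-9.]+)$"#
    guard utterance.range(of: pattern, options: .regularExpression) != nil else {
        return false // off-grammar speech means a clarification round trip
    }
    let words = utterance.split(separator: " ")
    sheet[String(words[2])] = Double(words[4])
    return true
}

print(apply("set cell IG4 to 12.5"))    // true: sheet["IG4"] == 12.5
print(apply("no wait, make it twelve")) // false: natural corrections don't parse
```

Real assistants cope far better than a regex, of course, but the cost of grounding references like "it", "that one", or "the cell next to it" is the part that doesn't go away.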
Yeah, I can totally understand. I don't know about you guys, but when I write an email I usually erase something, go back, and change some words. If I were good enough to dictate my email perfectly on the first pass, maybe that would work. But as you said, it turns into "oh, can you change this word for that one; no, that's not what I meant," and that can spiral into a lot of complexity. At the same time, for something like "can you recommend a restaurant nearby that has space for five people?", maybe there are use cases where it's good enough. Guillaume, last time you mentioned some people you know who use ChatGPT like this. What kind of usage is it?

Everyday questions, and work questions as well, because they're typing at the same time they're talking to their ChatGPT instance. They're multitasking rather than relying on the spoken conversation alone. It is a conversation, but they're typing too. So that could be one usage of these kinds of devices.

So it's not a replacement; it's an additional layer.

Yeah. Instead of working alone, whenever he has something to ask or needs help, he simply asks the ChatGPT in his pocket. And, of course, he verifies the information and still does all the work himself.

One usage we've also talked about is accessibility: helping disabled people locate themselves in a place, or remember where they put things when memory fails them. But with an avatar that talks to you, I wonder whether some people will get addicted, or will feel it's really their friend, and end up disconnected from other people. There's some fear there, and I think laboratory testing should be done with this kind of technology to make sure it doesn't cause other problems.

Yeah. I'm quoting this from memory, so I may have it wrong, but I think I saw an article about someone who already has ten AI girlfriends. So it's coming.

Nice. Okay, anything more? No? Then over to you, Seb.

Sure. This week I wanted to talk about a new trend: generating video with OpenAI's new Sora system and then building Gaussian splats from it, to produce a 3D model you can use directly in your 3D engine. This is a video of a first attempt. It shows how fast this is coming, and that it allows you to generate 3D models of environments that don't exist and were entirely generated by AI. Here's another, earlier test; it doesn't work that well except for the museum, a museum of art that does not exist. These really are first experiments. With more deliberate direction of the Sora video, more panoramic shots where you turn around and move closer to the walls and the environment, I'd guess you could generate much better Gaussian splats. That's it on my side. Guillaume, do you want to comment?

Yeah, sure. The first thing is that Sora isn't available yet; we're only looking at the demo videos on their website, and I haven't seen a release date, because this is fresh news. They basically announced Sora on the day of the Gemini launch to preempt Google's communication, and it worked: hardly anyone heard about Gemini, because their text-to-video generation blew everybody's mind. A very good move.

I was impressed at first, and then I asked myself what the use cases could be, beyond generating artificial environments very, very fast. We know Gaussian splatting isn't perfect yet, but this is a very promising step toward massive 3D content generation. Especially if we can handle the transition from Gaussian splats to meshes: that wouldn't be the end of 3D artists, but we might see the same earthquake in 3D that Midjourney and Stable Diffusion caused in 2D.
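For a sense of what this pipeline looks like in practice, here is a minimal sketch that shells out to the open-source nerfstudio toolchain: ns-process-data recovers camera poses from the video frames (COLMAP under the hood), and ns-train splatfacto optimizes a Gaussian-splat scene from them. The file and output paths are placeholders, and there is no official Sora integration; this assumes a generated clip is already on disk.

```swift
import Foundation

// Run a command-line tool via /usr/bin/env and fail loudly on a non-zero exit.
func run(_ tool: String, _ arguments: [String]) throws {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: "/usr/bin/env")
    process.arguments = [tool] + arguments
    try process.run()
    process.waitUntilExit()
    guard process.terminationStatus == 0 else {
        throw NSError(domain: tool, code: Int(process.terminationStatus))
    }
}

// 1. Extract frames and estimate camera poses for the AI-generated clip.
try run("ns-process-data", ["video", "--data", "sora_clip.mp4", "--output-dir", "processed"])
// 2. Optimize a 3D Gaussian splatting scene from the posed frames.
try run("ns-train", ["splatfacto", "--data", "processed"])
// 3. Export the splats (a .ply of Gaussians) for loading into a 3D engine.
try run("ns-export", ["gaussian-splat", "--load-config", "outputs/processed/splatfacto/config.yml", "--output-dir", "exports"])
```

Splats only optimize where the camera actually looked, which is why Seb's point about panoramic, wall-hugging shots matters: better coverage in the source video translates directly into fewer holes in the reconstruction.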
Luma AI, by the way, is clearly well positioned here: they do AI mesh generation and they also do Gaussian splatting. I think they see what's coming and are positioning themselves in this business of end-to-end 3D AI generation, which, as I said, would be a kind of revolution for 3D artists, as it was for classic 2D artists. But if you have innovative use cases, you're welcome to share them.

No, the ones you listed are the ones I have in mind too. Like you said, it's not there yet, but it's moving quickly toward it.

Yeah, I think the key word is "yet." One year ago we said video generation wasn't there yet; remember the Will Smith eating spaghetti video. Now you see 60-second videos that stay quite stable for their whole duration. That's impressive, assuming the actual product can generate similar videos consistently and these aren't just a hand-picked selection of the best results. As a side note, and it's a joke, really: I saw that OpenAI is reportedly looking for something like seven trillion dollars of investment, and someone calculated the cost of GPT-2, then GPT-3, GPT-4, and extrapolated to 5 and 6. It makes sense: if you want to build GPT-10, the costs add up pretty fast.

You mention price, and I think that's the point everybody has missed about AI video generation: how much will it cost to generate a 60-second clip? We know OpenAI isn't free. So when people say this technology can turn the whole movie industry upside down, I'm very curious what a movie made this way would actually cost, because the computing power behind these results must be huge. People are used to very cheap tokens; I suspect this won't be the same, but maybe I'm wrong.

Yeah, and the power consumption of that kind of render farm will be enormous. There will be a bottleneck somewhere.

Right. My next topic: some videos of the Vision Pro's scanning, which seem to show that there isn't much limitation on the size of the scan. You can see people walking out of a restaurant and the mesh picking up windows far down the street from where they started, and the speed at which it scans the environment is quite amazing. You can see it here on the stairs too: it keeps adding to the mesh and seems to stay accurate over time. That reassures me a lot about the headset's capacity to scan a complete environment, move into a different room, and carry a full scenario through. And here, I think the same person showed instantiating realistic 3D pool balls that are occluded by his hand and by the environment, and that collide with the environment quite well (a rough sketch of how an app does this follows below). So, yeah, it sounds promising. What do you think about that, guys? Guillaume?

Yeah. Well, it's funny that it's always the same use case we've been seeing since the HoloLens 1. Of course, the meshing is better, although, whether or not it was the developer who chose this representation, it's hard to judge the real quality or resolution of the mesh when everything is painted green.
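As a rough sketch of what a demo like this does under the hood (assuming the visionOS ARKit scene-reconstruction API, with error handling and app scaffolding omitted): the app streams mesh anchors as the headset scans, and bakes each patch into static collision geometry so that dynamic virtual objects, such as the pool balls, bounce off real surfaces.

```swift
import ARKit
import RealityKit

// Stream the live room mesh and turn each patch into static collision geometry.
@MainActor
func reconstructScene(into root: Entity) async throws {
    let session = ARKitSession()
    let provider = SceneReconstructionProvider()
    try await session.run([provider])

    var patches: [UUID: ModelEntity] = [:]
    for await update in provider.anchorUpdates {
        let anchor = update.anchor
        switch update.event {
        case .added, .updated:
            // Bake the mesh patch into a physics shape; virtual objects with a
            // dynamic PhysicsBodyComponent will now collide with the real room.
            let shape = try await ShapeResource.generateStaticMesh(from: anchor)
            let entity = patches[anchor.id] ?? ModelEntity()
            entity.transform = Transform(matrix: anchor.originFromAnchorTransform)
            entity.collision = CollisionComponent(shapes: [shape])
            entity.physicsBody = PhysicsBodyComponent(mode: .static)
            // An OcclusionMaterial on this entity would hide virtual content
            // behind real geometry; hand occlusion is handled by the system.
            if patches[anchor.id] == nil {
                patches[anchor.id] = entity
                root.addChild(entity)
            }
        case .removed:
            patches[anchor.id]?.removeFromParent()
            patches[anchor.id] = nil
        }
    }
}
```

The green overlay in demos like this is typically just a debug visualization of these mesh anchors; the same data drives collision and occlusion whether or not it is drawn.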
Still, it's clearly better than all the older techniques we've seen, so it's promising. But it's also somewhat frustrating that this technology has been here for years and nobody has really taken advantage of it. We keep seeing the same demonstrations of balls falling around or liquid pouring onto things, which show that the environment is recognized, but we still don't have a truly valuable use case built on it, and I don't know why; only a few apps ever took advantage of it, though Meta is doing some games with it. The other point I'd mention is that the occlusion works great unless he moves too fast. You can see, right there, that when his hand moves quickly the occlusion doesn't hold up as well as when he's static. So there's room for improvement, but the technology is there. We're just waiting for great apps to take advantage of it.

Yeah, I also saw a video of someone placing a point in space, walking into a parking lot (I forget the exact distance, something like 100 meters), and then coming back to where they started. From a distance the point seems to shift, but as soon as they got within about 10 meters, it snapped back into position. So, as you were saying, Guillaume, the technology is there, and having a processor dedicated to this probably helps a lot. Pretty cool.

In terms of technique, they seem to use the same kind of approach that was available on the HoloLens 2, where you could place different anchors in your space and an anchor's position is refined as you get close to it. Things in the environment can also react to your position: lights, ButtKickers for vibration, a door that opens when you stand in front of it and enter a code in mixed reality, that sort of thing. For a use case at home, it's nice that you can position things in your space and they're still in place when you come back. And for training in a work environment, being able to go into a different room and see information about the machinery you're looking at can be valuable.

One thing I've seen, and hopefully we'll be able to confirm it soon, is that guest mode on the Apple Vision Pro seems pretty limited: between users, even if the headset comes off for just two seconds, the setup scan has to run again before the next person can use it, and the owner needs to be present for the initial launch. Hopefully they'll ship upgrades, an event mode or whatever they want to call it (kiosk mode, I think it's called on iOS), where switching from one user to another is far more seamless. So stay tuned on that; hopefully we'll be able to test soon.
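The "still in place when you come back" behavior maps to persisted world anchors. A minimal sketch, assuming the visionOS world-tracking API: WorldAnchors persist across app launches, so an app would store the anchor's UUID alongside its content to re-associate the two in the next session.

```swift
import ARKit
import simd

// Pin content to a real-world pose that survives leaving and re-entering the room.
func pinContent(at transform: simd_float4x4) async throws {
    let session = ARKitSession()
    let worldTracking = WorldTrackingProvider()
    try await session.run([worldTracking])

    let anchor = WorldAnchor(originFromAnchorTransform: transform)
    try await worldTracking.addAnchor(anchor)

    for await update in worldTracking.anchorUpdates {
        // The system keeps refining the pose as tracking improves, which is the
        // "drifts at a distance, snaps back when you get close" behavior above.
        print("anchor \(update.anchor.id) tracked: \(update.anchor.isTracked)")
    }
}
```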
Great. Do you have anything more to add, Seb?

Just a quick update: I received the Vive Ultimate Trackers and did a couple of tests with them yesterday. They're nice. I'm using them with the Vive XR Elite. I need to redo the calibration to make sure everything is working correctly; there was a small issue with the OS, and I had to reset my headset before the trackers were picked up, but now they work correctly and the positioning is very accurate. And having your character really move, and seeing your legs move at the same time, is quite impressive.

I'm also testing the Vive Face Tracker. So far I haven't used the camera to drive expressions on an avatar; that's the next step I want to test. For now I'm only using the eye calibration and the ability to highlight and select a button just by looking at it, and I'm quite impressed. It also motorizes the IPD calibration, so the headset can adjust it dynamically instead of you doing it by hand: if you hand the device to someone else, it detects that the IPD is wrong and starts the measurement automatically. It's a nice add-on to the device, and the way they made it, with this plug, is quite clever, although the magnetic attachment isn't that strong, so it tends to come off when you put the headset on. That's something to improve in the next generation if they keep going with this modular, add-on approach to the headset. Anyway, the Vive Ultimate Trackers are very nice, and I'll dig into how to integrate them into Unity and make some more tests in the coming weeks.

Great. We can't wait to see that.

Yeah, let's do this. I'd like to talk about our friend Mark Zuckerberg, who did his own Apple Vision Pro review. I took some notes on what he said during this three-and-a-half-minute piece. Basically, it's an advertisement for the Meta Quest 3, which is not surprising, but he pushes where it hurts for the Apple Vision Pro, especially on comfort. He emphasizes that the Quest 3 is better in terms of weight, and we know that for sure: it's lighter, the weight distribution is better, and the battery working without being tethered also makes the headset more comfortable to use. He says several times that it's seven times cheaper than the Apple Vision Pro; that's an easy one. And of course they have VR, controllers, games, and a lot of apps.

I'd push back on the apps point, because despite all their games, Meta and the Quest 3 still lack the key apps we're waiting for, the everyday-life apps that would bring people back to their headsets as often as they'd like. The rest of that part, I agree with. Then there's eye tracking: yes, the Apple Vision Pro's eye tracking is good, but they already did it on the Quest Pro, and, a quick aside, it's no longer available on the Quest 3. The notable announcement is that he said it will be back in the next iteration of the headset. It's interesting that competition forced them to bring eye tracking back, and we still don't know why they didn't keep it on the Quest 3. Another claim that isn't quite right is that hand tracking is better on the Meta Quest 3: from Seb's video we can see that occlusion is much better on the Apple Vision Pro, and overall the hand tracking seems better on the Vision Pro too. So it's not an honest review on that point.
And finally, he says the motion blur is quite obvious on the Vision Pro and that the image is better and crisper on the Meta Quest 3, which is very strange, because the distortion we've seen on the Meta Quest 3 passthrough is considerably worse than anything in the Apple Vision Pro reviews. So I'm not so sure about that one either.

One interesting thing is that the video opens by presenting the Meta Quest 3's ability to do what is now called spatial computing, meaning you can put windows all around you and work with them. They're showing they can do that just as well as the Apple Vision Pro. Although "just as well" has to be moderated, because Apple has the ecosystem and all the interconnection between Mac, iPad, and the Apple Vision Pro that the Quest doesn't have. And at the end of the video, Mark Zuckerberg simply trash-talks Apple fans, saying they're just a bunch of crybabies who will scream about what he's saying, but that he's right. It's a bit childish on his part, I'd say. His final point compares Meta to Microsoft: Meta is the new Microsoft, and they will win the war against Apple in the end. Two very strange closing comments.

But it's very interesting, on the communication side, that he puts his own face forward in defense of the Meta Quest 3. As with nearly all their immersive product announcements, he takes full personal responsibility for the failure or success of these devices, which is interesting to see. His commitment to his vision of the immersive industry no longer needs proving. So, what do you think about this?

Well, I'm not a communication expert at all, but it's very interesting to compare this video, where he's sitting on a couch and it's recorded, I think, with a phone...

It was recorded with the Quest 3.

Oh, okay, even better. Compare that with Tim Cook on the cover of a fashion magazine (Vanity Fair, I think) with the Apple Vision Pro on his head. I don't think I need to go deeper into that; you can just see the difference in communication strategy. And the Quest 3 is a great device. I'm using it a lot, actually, mostly for gaming, I have to say, and, as he said, it's much, much cheaper. I don't know if we'll ever see a battle like Blu-ray versus DVD, the kind of format war that happened in the past, but in terms of numbers, Meta is winning on headsets sold, for now. Seb, what do you think?

Yeah, it's funny that they're entering this kind of war when they should mostly be trying to find ways for users to use everything together, not competing against each other. Of course, these are different kinds of headsets at different prices, but both offer the same kinds of interaction. Apple is trying to segment the market with its own naming and its own way of doing things. I think WebXR, or at least WebAR, isn't available on the device right now, only WebVR, and even that seems strange to me when the community isn't that big yet. And, like Guillaume said, everyone is trying to get more people using headsets in their everyday lives.
For that to happen, I think it's important for them to understand that people need to find their apps easily and interact the same way across different kinds of headsets, because the hardware will keep evolving anyway. So I don't think this is a good way to go about it. Now, like you said, the Quest 3 is awesome. I have one here too, and I use it for most of the projects in my everyday work.

And I saw news about the Vision Pro. You were saying Meta is winning the battle in terms of headsets sold right now, and I saw that a number of people are returning their Vision Pros because they don't see an everyday use case for it, and it's too heavy for them to wear every day. So it's interesting that some negative feedback is starting to come in.

Yeah, it was expected, because a lot of people bought the headset just for the buzz or the attention, and I guess the media are enjoying the return stories. We don't have the numbers, and we don't know if it's a lot of users or just a small portion, so, as always in this very emotional market, we have to be cautious about this news. It's not everybody returning their headsets. Some people simply bought the commercial pitch and thought it would be a revolution, and as a first iteration it's never that. Meanwhile, experienced VR and AR people appreciate the technological step Apple has taken with the Vision Pro.

I wonder if they've had this experience with any of their devices before. Maybe the watches...

Yeah, the Apple Watch had returns too, because it may have been oversold at some point, and people weren't pleased with what they got for the price. The price has to be put in the picture here as well: if you can get your cash back, you will, especially if you don't really see the point of using the device. You took your picture, you shot your video, people saw you had the headset, and now you can send it back and get your money back.

Okay, so here's a new business: we buy ten Apple Vision Pros and rent them to people who just want to take pictures with one.

Well, you can just go to an Apple Store, get the demo, take your pictures, and go back home if you want.

Okay, great. So I think this is it for today, and we'll see you next week for another episode of Lost in Immersion. See you, guys. Thanks.

Credits

Podcast hosted by Guillaume Brincin, Fabien Le Guillarm, and Sébastien Spas.