Welcome to episode 72 of Lost in Immersion, your weekly 45-minute stream about innovation. As VR and AR veterans, we will discuss the latest news of the immersive industry. Hey guys, how are you? Hello. Hello. Fine. So, let's start as usual with you, Fabien.

Okay, thanks. First, I want to share a small update about the Vision Pro. We talked about how painful the guest mode is. The Apple Vision Pro is tuned to your own eyes and your own hands, but if you want to let someone try it, you have to activate the guest mode and they have to go through the eye calibration and the hand calibration. And if they take the headset off for even one second, the guest mode gets deactivated, and if they want to continue, they have to do it all again. So it's really a personal device, not made for events or even for sharing. But there is an update on the guest mode. It's not huge, but it now saves the previous guest configuration. So for two people it can work well, assuming you're not giving it to a third person in the meantime: when you activate the guest mode, the guest can choose to reuse the latest guest configuration. It won't solve all the problems, and it won't make this device really easy to use for events or for many users, but still, it's a small improvement.

Yeah, they really targeted family or friends, as they showcased in their video, where you are a group of three or four playing all together. Unless you're very selfish and keep the headset on while the others are just watching, which is not what they are aiming at for this device. So, yeah, it's a natural step. Do you think they could do some kind of average calibration for everyone, meaning a guest profile that would fit more or less anyone? I don't know, because I tried the headset with someone else's latest guest profile and it's really off. Like, it's really difficult to activate anything, actually. So I think, yeah, even if the differences between our eyes are pretty small, they're still significant for eye tracking. Okay. Okay. Seb, anything about this? No, I can't wait for them to open it up even more and make it really, really easy to share with another person. Yeah. Cool.

Okay. My next topic is about OpenAI. A couple of days ago they released a new model, and actually a new series of models. We can see that because they reset the numbering to one: so, you know, GPT-2, 3, 4, and now o1. It's really a different class of models, mostly dedicated to professionals. It's really different from ChatGPT, where you can ask many different questions about almost everything. Here, o1 is really dedicated to scientific problems: solving math, coding, and complex tasks that require reasoning. And the way it works is that they trained the model to think and rethink. So it's not just one cycle; what they are doing happens over multiple steps of what they call a chain of thought. I don't know exactly what happens inside this chain of thought, but it seems to be the key difference from the GPT models. And they give a few performance figures: GPT-4o solves 13% of the problems on a math Olympiad qualifying exam, while o1 solves 83%. They show a lot of different, pretty impressive qualities of this model, it seems, and they showcase it in different areas of scientific research, you know, quantum physics. There is a video about genetics.
I think it's this one, I'll just pull it up, where you see someone using the model to analyze some genetic data. It's pretty impressive. And for coding, they actually recommend using o1-mini, which is faster and cheaper if you use the API, and still gives pretty impressive results on coding tasks.

Something else that I think is interesting is about safety. What they explain is that they purposely hide the chain-of-thought process from the external observer. So someone who tries to jailbreak o1 is not able to see what the actual thinking process is; they only see the end result. And the reason they give goes a bit into, you know, the AI doom kind of framework: if a model had the purpose of lying to a human, and it knew that the human could also look at its chain of thought, it would be much easier for it to lie than if the process is hidden and we can only guess from the answer whether it is actually lying. So they kind of removed one layer of deception that could be done by the LLM. There are a lot of discussions about whether that reasoning really holds, but anyway, I thought it was pretty interesting. So, yeah, that's it: the OpenAI o1 model. What do you think, Seb?

I haven't looked at the video yet, the one you shared on this page. I'll have to look at it to see the results they are getting, but on paper it seems quite impressive. And it's funny how models are now becoming more specific to particular tasks. Like, this one is for physics, research, coding; we know that some other models are more for creativity. They are all starting to be more focused on one task, like the mini for coding and the other one for more complex, PhD-level knowledge. So, yeah, it's still impressive, but now we can foresee that there will be many more models for different tasks, and we'll need to check which one is the best each time we start a project. And a lot of licenses to pay if you want to use their models.

Yeah, actually, I didn't mention it, but the API for this one is much more expensive than the other models. And something I forgot to mention as well is that in terms of practicality of use, it's still a bit behind GPT-4, sorry, GPT-4o, because, for example, you cannot yet upload an image, and o1 has no access to the internet yet. It's something they are planning to add in the future. I think I saw that the training was done until the end of 2023, but that's a good question, I didn't check.

Yes, so I took some notes about your remarks on the global approach of AI. You're completely right. We've discovered that the goal of a general AI is not reachable yet, because just putting more and more data into the training doesn't make the AI smarter. So the best approach we have now is, as you mentioned, AI agents specialized in a certain field, which we can compare with what humans do: we are all experts in our own fields, and we can't be experts in everything. So it's really a human-like approach to AI that is the target now. About the OpenAI model, its codename is Strawberry right now, and their goal is to use this model to train another one, which is called Orion. It's not ChatGPT-5, it's another version, and it should be dedicated to a more general use than just mathematics and logic. About the chain of thought: the new trend right now is to use AI to auto-correct itself. There are several models right now that feed their own outputs back to themselves to check whether the answer is correct or to build a more complete answer. So I guess this is what they are doing; we just don't know how many times they run this loop, or what kind of prompts they add between the output and the next input.
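To make that output-to-input loop idea concrete, here is a minimal sketch of an external self-check loop written against the OpenAI Python SDK. The model name ("o1-mini"), the two-pass count, and the critique prompt are assumptions for illustration only; this is not what OpenAI runs internally, and the real chain of thought stays hidden on their side.

```python
# A rough sketch of a generate -> self-critique -> revise loop.
# Assumptions: the OpenAI Python SDK (pip install openai), an OPENAI_API_KEY
# in the environment, and "o1-mini" as the model name; none of this reflects
# OpenAI's internal chain-of-thought process.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    # o1-series models reportedly take a plain user message (no system prompt).
    response = client.chat.completions.create(
        model="o1-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def solve_with_self_check(task: str, passes: int = 2) -> str:
    answer = ask(task)
    for _ in range(passes):
        # Feed the previous output back in and ask the model to verify and fix it.
        critique_prompt = (
            f"Task: {task}\n\nProposed answer:\n{answer}\n\n"
            "Check this answer for mistakes. If it is wrong or incomplete, "
            "return a corrected answer; otherwise return it unchanged."
        )
        answer = ask(critique_prompt)
    return answer

if __name__ == "__main__":
    print(solve_with_self_check("Write a Python function that returns the nth Fibonacci number."))
```

Again, this is only an external loop around the API; whatever o1 does between your prompt and its answer is not visible to the caller.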
About how smart Strawberry is right now: it is compared to a PhD student. If you remember, the previous versions, especially 4 and 4o, gave answers more like an intern or a junior developer. So now we are stepping up to a PhD student, slowly but surely. Fabien, you said it would be more expensive: they have talked about putting it at $2,000 a month, so just 100 times more than what you are paying now for your access to ChatGPT. So, why $2,000? We will see if it's as powerful as they are advertising and if it's compatible with that kind of budget.

And finally, something I found very interesting is that Sam Altman made a statement based on an AWS research paper saying that right now, 57% of the content published on the internet comes from AI. The problem Sam Altman is highlighting is that, since AI models are trained mostly on what is on the internet, if 57% or 60% of it is now generated by AI, we will soon be in some kind of loop where nothing original is posted on the internet anymore, so basically AI won't be able to learn anything more from it. So, a very, very good question and statement. And I guess we could have thought about this when we first put those AI models out in the wild, because of course people use them to generate content, since it's faster and easier, so of course it becomes the main source of new content. So, what do you think about this?

That's interesting. I think that's why... Yeah, I think that... Sorry, go ahead. Okay. Yeah, no. So, first, about the model checking itself: I noticed this on GPT-4, actually, when I asked for an image and then asked for a correction to that image. It generated a new image, and then immediately afterwards, without me saying anything, it said, oh, it seems like I didn't manage to fix the issue, do you want me to try again? So it's kind of, yeah, that self-check. Which actually, I think, relates to what you were saying about the data on the internet. That's the big question: how can AI learn new things? Is adding new data, as you were saying, just a way of improving the models, or does it simply not work anymore? And using the internet anyway will not make the AI much smarter. Maybe it's something new, different from LLMs, that will take us to the next step of intelligence. We don't know yet. Seb?

Yes, I was going to say that there is also a lot of work on checking whether the content a model is looking at was generated by AI or not. So maybe that's one way to go when you train a new model: check where the content came from and push away whatever is generated by AI. That might be a way to go. Yeah. A lot of new techniques to invent, I guess. Maybe not by us.
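A naive sketch of what "push away whatever is generated by AI" could look like as a pre-training filter is below. The detector function and the 0.5 threshold are entirely hypothetical placeholders; no reliable off-the-shelf detector exists, which is part of the problem being discussed.

```python
# A naive sketch of filtering suspected AI-generated text out of a training corpus.
# The detector is passed in as a function because there is no reliable
# off-the-shelf detector; the 0.5 threshold is an arbitrary illustration.
from typing import Callable, Iterable, List

def filter_training_corpus(
    documents: Iterable[str],
    ai_probability: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep only documents the detector thinks are probably human-written."""
    return [doc for doc in documents if ai_probability(doc) < threshold]

# Usage with a dummy detector that just flags a telltale phrase.
if __name__ == "__main__":
    corpus = [
        "Field notes from the 2023 harvest, typed up by hand.",
        "As a large language model, I cannot provide that information.",
    ]
    dummy_detector = lambda text: 1.0 if "as a large language model" in text.lower() else 0.1
    print(filter_training_corpus(corpus, dummy_detector))
```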
Okay. Anything else to add on o1? No. Okay. And just very quickly before handing it over to you, Seb: quite big news, actually, from 8th Wall, which lowered their prices by a lot. It was 3,000 per month, and now it's 7,000 a year. So, that's great news. It's still expensive, I think, but that's much, much more affordable, and it will make 8th Wall actually viable for a short-term project. If it's an activation that's online for a couple of weeks, that's perfect. So, it's great news for any AR developer. Okay. Thank you.

Okay, Seb? Yes, it's great that they are lowering the pricing, so hopefully we can make more experiences for our clients and sell them more easily. Yeah. Yeah, and to detail it, it's around 700 per month, so about 7,000 a year, a complete year. It's less than 10K a year, so it's clearly something achievable for a small company, or for companies that would like to get started in the AR field. So, great news. I think they heard our other podcast where we gave a great review of 8th Wall but said the main issue was the price, so I guess they corrected it. Yeah, we are famous. Yes. Okay, cool. Seb?

Yes. So, on my side, I want to talk about Unity, which is canceling the runtime fee. So, they changed their mind. For Unity 6, however, they will raise the Unity Pro license a bit, or quite a bit, by 8%, and Unity Enterprise by up to 25%. So, a great move for the people who were making and releasing games; bad news for those who are doing experiences and not selling the app directly, but just developing with the tool. But yeah, I don't have a lot to say about that. It's just funny to see that they changed their mind and lost maybe a year trying to sell that idea, then walked it back because the community was not okay with it and was maybe switching a lot to Unreal. So, yeah. What are your thoughts on that, Guillaume?

No, yeah, it's basically what the community was asking for. They changed their CEO, so, yeah, it took a lot of time to change their mind. And as you said, it confirms that they may have lost a lot of projects and users in favor of Unreal during this period, I guess. It's about a year, so, yeah, it's way too long, to be honest. And maybe they just found another way: they're simply making the licenses more expensive. So, we'll see if it fits the new market. For all the projects I'm following, most of them are under Unreal right now, so we'll see what the future of Unity will be after that. Yeah. Not much to add on this.

Yeah, it's super late. And I will echo what you were saying, Guillaume, I hear around me a lot of people using Unreal, in China as well. I guess the best marker for this is to check dev, real-time 3D, or immersive videos on the internet. A few years back, everyone was using Unity to create very quick mockups and stuff like this, and now all the new YouTubers are using Unreal. And even on the AI side, there are lots of projects that let you use AI building blocks inside Unreal, so that you have a low-code, no-code method to create immersive experiences. So, it's a very interesting approach. And once again, Unity is not on that page yet, because they had some issues with their AI model and people were not very willing to have their projects taken by Unity for training and stuff like this. So, I guess the global approach is way better on the Unreal side now than on the Unity side. And yeah, you just have to check what people are doing, and they are doing Unreal right now. Okay.

All right. So, the next subject is this paper, Dual Gaussian Splatting for Immersive Human-Centric Volumetric Videos, which is basically a way to record someone with a big rig of cameras,
and to generate 3D Gaussian splats of them that you can use in a VR headset with very high-quality output. There are a couple of videos here on this page showing their results, and, as you can see, they can apparently relight the Gaussian splats. All of the characters here have been shot, I guess, in the center of a ring like this one, from different cameras. And their technique, Dual Gaussian Splatting, means that they have one set of Gaussians taking care of the skin and the rendering, and another for the movement and the animation, and they merge the two afterwards. Technically, I'm not completely sure how they do that, but the end result is really impressive. So, I don't know what your thoughts are on that. Guillaume, maybe?

Yeah, I guess the most impressive thing about the rendering you're showcasing right now is the rendering of very, very small parts, like the strings on the instrument, or the blade as well, when you look at the woman here. I guess the precision is at a level that can't be obtained with anything else right now; you can't do this with 3D modeling. And I guess this is exactly what Gaussian Splatting should be targeting: if they want this technology to be used on a large scale, or at a high level, they should be showcasing this kind of stuff, really showing that you can do things that can't be done without Gaussian Splatting. And this is exactly what they are showcasing now. Basically, this is the stage we predicted maybe two years ago, when Gaussian Splatting started, or a year and a half ago: that it's a new medium and maybe a new way of doing movies or interactive content. We'll see where it goes, but this is exactly what we forecast, meaning that you have a scene with several characters and you can move around them as if you were part of the scene. So, yeah, this is going exactly where we would like it to go. Very happy with it.

Yeah, the frame rate is impressive, the rendering is impressive. It opens up a new way of creating experiences and having something really realistic. Quite impressed with the end result. Fab?

Yeah, it's very impressive. Just to come back to 8th Wall, I saw that they are also starting to support some types of splats, so it's really nice to see that this is also going to the web. The quality is impressive. And I guess if we look at what's next: you showcased this huge ring with I don't know how many cameras. How close are we to being able to do that with a much cheaper setup, maybe two or three iPhones? Yeah, to make this more affordable, because then it can really scale up. So, yeah, impressive. And as you said, Guillaume, the rendering of the details is really awesome.

I guess the next step is to make that communicate with AI, so you can shoot with three phones and try to generate the other points of view with AI, and then generate the Gaussian splat at a high quality level. But yeah, that's coming fast, I think.
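For readers curious about the "dual" part, here is a loose structural sketch, in Python with NumPy, of how a representation like the one described above might be organized: a sparse set of "joint" Gaussians carrying motion, and a dense set of "skin" Gaussians carrying appearance, re-posed from the joints each frame. Every name, field, and formula here is an assumption made from the description in the episode, not the paper's actual implementation.

```python
# A loose, illustrative sketch of a dual Gaussian-splat representation:
# a small set of "joint" Gaussians drives the motion of a dense set of
# "skin" Gaussians that carry the appearance. The nearest-joint binding and
# rigid per-joint transforms are assumptions, not the paper's method.
import numpy as np

class DualGaussianFrame:
    def __init__(self, skin_positions, skin_colors, joint_positions):
        self.skin_positions = skin_positions      # (N, 3) dense appearance Gaussians
        self.skin_colors = skin_colors            # (N, 3) per-Gaussian color
        self.joint_positions = joint_positions    # (M, 3) sparse motion Gaussians
        # Bind each skin Gaussian to its nearest joint in the rest pose.
        dists = np.linalg.norm(
            skin_positions[:, None, :] - joint_positions[None, :, :], axis=-1
        )
        self.binding = dists.argmin(axis=1)       # (N,) index of the driving joint

    def pose(self, joint_rotations, joint_translations):
        """Re-pose the skin Gaussians from per-joint rigid transforms.

        joint_rotations: (M, 3, 3), joint_translations: (M, 3)
        """
        rest_joints = self.joint_positions[self.binding]   # (N, 3)
        offsets = self.skin_positions - rest_joints         # (N, 3)
        R = joint_rotations[self.binding]                    # (N, 3, 3)
        t = joint_translations[self.binding]                 # (N, 3)
        posed = np.einsum("nij,nj->ni", R, offsets) + rest_joints + t
        return posed, self.skin_colors

# Tiny usage example with random data.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = DualGaussianFrame(rng.normal(size=(1000, 3)),
                              rng.uniform(size=(1000, 3)),
                              rng.normal(size=(20, 3)))
    identity = np.tile(np.eye(3), (20, 1, 1))
    posed, colors = frame.pose(identity, np.zeros((20, 3)))
    print(posed.shape, colors.shape)
```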
All right. And the last one, if you can just show them. Yeah, good ideas always come from hackathons, and this is one of them. From hackathons and this kind of event, it's always the simplest idea that makes the biggest impact. I don't know if you see use cases, but yeah, it's fun to see that you can copy what you would be doing in Photoshop or GIMP, for example, and just apply it in real life. It's a very fun way of doing this. It could be great if you could directly paint a 3D model that you have, so you can pick a material from your real environment, as it looks in your real environment, and apply it to your model. Fabien, any thoughts? No, not much. Really cool, but yeah. All right, so I guess it's your turn now, Guillaume.

Okay. So, last but not least, I would like to come back to what Qualcomm said a few days back, I guess it was like 10 days ago, maybe 8. They talked about their project that we know should be coming next year. We knew there was a device in preparation for early 2024, but they postponed it following the Apple Vision Pro release. And now we are all asking ourselves what this kind of device could be, and Qualcomm gave us some hints: it should be a pair of small glasses. There were also the hints given to us during Google I/O, for example, where they showcased Gemini AI, the competitor to GPT-4o, and at some point they were not using their smartphone anymore, they were using some kind of glasses, and apparently that is the device they are willing to release. So, here you have a quick shot of what it could look like. It looks like a normal pair of glasses, maybe a bit more bulky, and the idea is to have information and AI as an overlay, for you to be able to use Gemini. Basically, if Google is in the loop, you will have Gemini in it, as it is their main goal to put AI everywhere they can. So, very interesting to see if it will be some kind of Google Glass 3; hopefully it will be better than the previous project. But yes, very interesting to see that maybe their goal is to have this kind of very light device for us to use on a daily basis. So, we'll see what we can do with the current technology. Seb?

Yeah, I can't wait to see it coming and see what they are really doing. I saw some different glasses that were released in Asia, I think at a SIGGRAPH event, and there were a couple of companies that released augmented reality glasses that don't do any tracking, just display a layer in front of the user, as a head-up display, I mean. So, we'll see if they went this way also, because I don't see a lot of cameras on it or much ability to track your position in space. So, yeah.

Now, from the latest photo, or leaked picture, apparently it is more like smart glasses with the AI overlay and the ability to film or take a picture of the external world. So, it's more like Ray-Ban-style glasses with the display on the lenses. I guess that is the only thing we can do right now with the technology, so if you have a better battery and a better CPU and GPU inside it, it could work. And given that Qualcomm is in the loop, we can be quite confident about this. But maybe at this point, having that would be… So, we'll see.

It's indeed interesting to see the trend. If we look back a couple of months, slash, a year ago, there were a lot of devices like the Humane AI Pin, or the Frame, I think they were called Frame, which were just a way to interface with an AI by putting the interface on the glasses, basically. And now we are seeing the giants, so Meta in a couple of years, Qualcomm, Google, Samsung, going into kind of a more advanced device than this simple interface.
So, you know, again, I think the question is: will the mass market find this useful? Are they ready to wear this kind of glasses every day? I mean, from the success that the Ray-Ban, the Meta Ray-Ban, has had, it seems like it's something users are more willing to do than wearing a Vision Pro from 8am to 9pm. So, we'll see. But yeah, I think it's pretty easy to predict that they will sell more of these than headsets.

It depends on the pricing, and whether the AI model is running locally or online, and whether you have a license to pay every month to use that kind of device and its capabilities. Because if it's running online, then I guess there are server costs and things like that to compensate for. So, I don't know what the business model will be. And the impact on the environment, because if everyone is streaming their video and has AI models running on servers to understand what they are doing every day with the glasses, that's going to be a massive increase in video streaming and the like. It's okay, the OpenAI o1, or maybe o2, will solve climate change anyway. That's a joke, of course.

Okay. So, I think we lost Guillaume, but I think we were done anyway. So, thank you all for listening, and we will be back next week for another episode of Lost in Immersion. Next week will be videos, right?

Credits

Podcast hosted by Guillaume Brincin, Fabien Le Guillarm, and Sébastien Spas.