#144 Science Behind Digital Twins
on Thu Jun 22 2023
In this episode, Darren explores the science and use cases behind digital twin technology with the principal architect of Intel’s SceneScape.
Have you ever wondered how robots and machines navigate the physical world around them? It’s all about accurately representing the natural world in a digital form called a “digital twin.” A digital twin has a standard coordinate system enabling different applications to make sense of a real space or environment. It’s like a virtual version of the natural world, allowing more efficient and effective data reuse across systems.
Digital twins may not be a term we commonly use, but the concept is becoming increasingly popular, especially in the manufacturing, retail, and security industries. A digital twin is a virtual replica of a physical object, process, or system that can be used to monitor and control it in real time. One example is Google Earth, a highly latent digital twin of the world. However, as technology advances, there is potential for reducing latency and creating near-real-time digital twins for more efficient control and monitoring.
The applications for digital twins are vast and varied. For example, factories can use digital twins to improve safety and optimize production lines by monitoring where products are and, for security purposes, where people are. Digital twins can also be used in augmented and virtual reality, enabling users to walk through spaces that may be inaccessible or dangerous in the physical world. Even everyday tools like Google Maps use a form of digital twin to provide real-time traffic updates and information on accidents.
Digital twins are becoming increasingly important in the development of machine-based AI. Just like humans need spatial awareness to make sense of the world around us, machines need digital twins to navigate and interact with the physical world. The possibilities are endless for this technology, and it’s exciting to think about how it may shape our future.
The implementation of digital twins requires the integration of multiple sensors and calibrating their data into a common representation or digital twin. This process can be complex and requires standard units to ensure consistency between different industries.
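To illustrate what calibrating a camera into a common representation can look like, here is a minimal sketch that maps a detection's pixel position to ground-plane coordinates in meters using a planar homography. The matrix values, the function names, and the choice of a homography are assumptions made for illustration; the episode does not describe SceneScape's actual calibration method.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]


def bbox_foot_point(left: float, top: float, right: float, bottom: float) -> Tuple[float, float]:
    """Return the bottom-center pixel of a bounding box.

    For a person standing on the floor this is roughly where they touch
    the ground plane, which is the point a ground homography can map.
    """
    return (left + right) / 2.0, bottom


def pixel_to_world(h: Matrix3, u: float, v: float) -> Tuple[float, float]:
    """Map pixel (u, v) to ground-plane (x, y) in meters using homography h."""
    x = h[0][0] * u + h[0][1] * v + h[0][2]
    y = h[1][0] * u + h[1][1] * v + h[1][2]
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to a point at infinity")
    return x / w, y / w


# Hypothetical calibration for one camera: 100 pixels per meter, with the
# image origin sitting at world position (2.0, 1.0). A real matrix would
# come from a calibration step and would include perspective terms.
H_CAMERA_A: Matrix3 = [
    [0.01, 0.0, 2.0],
    [0.0, 0.01, 1.0],
    [0.0, 0.0, 1.0],
]

if __name__ == "__main__":
    u, v = bbox_foot_point(250, 100, 350, 400)
    print(pixel_to_world(H_CAMERA_A, u, v))
```

Once every camera has its own matrix into the same world frame, detections from different cameras can be compared directly in meters rather than in each camera's pixels.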
Interestingly, the gaming industry has inspired the development of digital twins, thanks to its experience creating virtual worlds with complex physics engines. By adopting existing standards used in the gaming industry, such as the Universal Scene Description (USD) format, it is possible to develop a universal representation of physical spaces.
Digital twins also have the potential to enable closed-loop control in various applications, bringing near-real-time control to systems. Perhaps in the future, we will have intelligent scenes similar to those in the Iron Man movie, where one can speak to their home’s smart assistant and control devices through it. The possibilities of using digital twins seem endless, and we will likely see more of them affecting our daily lives.
Intel’s SceneScape grew out of an effort to convert pixel-based camera detections into real-world units. It is meant to help turn sensor data into virtual models of the natural world, known as digital twins, that can be used to drive better outcomes and operational efficiencies in various industries. The technology relies on multi-modal tracking and motion modeling. It can monitor and track people, vehicles, and equipment across use cases including transportation, healthcare, retail, and factories.
One of the exciting aspects of SceneScape is its ability to estimate where someone will go and which camera they should show up in next. This is useful when a large space cannot be fully covered with cameras or sensors. However, there is always an error bar on every measurement, so no two sensors agree exactly on where something of interest is, and overlapping measurements have to be reconciled. SceneScape also uses a motion model to extrapolate movement, which helps keep track of a subject across gaps in coverage.
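To make the error-bar point concrete, here is a minimal sketch of one common way to combine two overlapping position measurements: inverse-variance weighting, where the more precise sensor gets more say. This is an illustrative assumption, not a description of how SceneScape reconciles sensors, and the function names and numbers are hypothetical.

```python
import math
from typing import Tuple


def fuse(pos_a: float, sigma_a: float, pos_b: float, sigma_b: float) -> Tuple[float, float]:
    """Fuse two measurements of the same coordinate (meters).

    Each measurement comes with a standard deviation (its error bar).
    Returns the fused position and the fused standard deviation, assuming
    the two errors are independent and roughly Gaussian.
    """
    if sigma_a <= 0 or sigma_b <= 0:
        raise ValueError("standard deviations must be positive")
    weight_a = 1.0 / sigma_a ** 2
    weight_b = 1.0 / sigma_b ** 2
    fused = (weight_a * pos_a + weight_b * pos_b) / (weight_a + weight_b)
    fused_sigma = math.sqrt(1.0 / (weight_a + weight_b))
    return fused, fused_sigma


def fuse_xy(a: Tuple[float, float], sigma_a: float,
            b: Tuple[float, float], sigma_b: float) -> Tuple[float, float]:
    """Apply the same weighting independently to the x and y coordinates."""
    x, _ = fuse(a[0], sigma_a, b[0], sigma_b)
    y, _ = fuse(a[1], sigma_a, b[1], sigma_b)
    return x, y


if __name__ == "__main__":
    # Camera A says x = 10.0 m (error bar 0.5 m); camera B says x = 12.0 m (1.0 m).
    print(fuse(10.0, 0.5, 12.0, 1.0))
```

The fused error bar is always smaller than either input, which is one reason overlapping coverage is valuable once every sensor reports in the same units and coordinate system.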
Overall, digital twins are a technology still in its infancy, but the potential for their use is enormous. As technology continues to improve, we will likely see more applications for digital twins and more industries leveraging their use to drive better outcomes.
Are you curious about how digital transformation can benefit you and your family? In this episode of Embracing Digital Transformation, Rob discusses the various use cases for digital tools. One exciting example he brings up is using technology to track your children. While this may seem controversial, Rob argues it is a responsible use of digital tools. Monitoring your child’s location and activity can give parents peace of mind and help ensure their safety.
However, this is just one example of countless use cases for digital transformation. Rob encourages listeners to think about how technology can improve outcomes for themselves, their businesses, and their communities. From streamlining processes and increasing efficiency to enhancing communication and delivering better customer experiences, digital tools can provide many benefits.
Hello, this is Darren Pulsipher, chief solution architect of public sector at Intel.
And welcome to Embracing Digital Transformation, where we investigate effective change, leveraging people, process, and technology.
On today's episode: The Science Behind Digital Twins, with special guest Rob Watts, principal architect of Intel's SceneScape.
Rob, welcome to the show.
Thanks, Darren. Good to be here.
I'm glad you came.
You're working on some really cool technology at Intel.
I've seen it in action.
It's super cool.
I've seen it on stage in live demos, super cool stuff.
But we're going to tease everyone and say, you've got to wait.
You've got to wait because we want to hear about Rob.
Tell my audience your background, and then we'll dive into why you're doing what you're doing.
Well, you know, everybody has their story, but I come to things from more of a physics background rather than your traditional I.T. background.
So really, I have a master's in applied physics, and I did some graduate work at Los Alamos lab.
And then I went into selling semiconductors for about five years, and then some entrepreneurship, and then landed at Intel.
So focusing more on, you know, holistic solutions, not just one particular discipline.
How do we bring together connectivity, wireless, IoT, and sensing in order to do something really special?
So you're a physicist.
Well, I have a master's.
That's what you learned in school?
Yeah, I did.
I did study physics, or applied physics.
And at the time it was a concentration in, like, microwave theory and applications of electrodynamics and that.
Oh, don't get me started.
That was my hardest class in college.
Yeah. It's because I was an electrical engineer going into the power option.
And my hardest classes were my electromagnetics classes.
And you probably eat that for lunch.
That's no big deal for you, right?
You know how things go after not using something for a while, but the principles remain, right?
Actually, one of the more challenging classes I took in grad school was plasma physics, which somehow managed to combine the hardest parts of quantum mechanics, fluid mechanics, and electrodynamics into one discipline.
So, yeah, that's.
That sounds horrible.
I'd like to ask the class, but I'm not sure I...
Oh, I was happy I passed my electromagnetics class with a C plus.
Yeah, I was like, That's a passing grade.
So, all right. So, Rob, tell me a little bit about how you got involved in what you're doing now, which is mostly around mimicking the real world in software.
You're creating digital twins, basically, right?
Well, it really comes down to what are we trying to do with sensor data, and coming at it from a physics perspective: how can you make the data as reusable and valuable as possible?
And so what it comes down to for me is how do we get it in the right units and in standard units.
So, like, let me give you an example. In a typical application that uses video, what we do is we run it through a deep learning model that's running some inferencing that draws a little bounding box around a person or a dog or a vehicle or whatever.
So object detection, right? Yeah.
So yeah, detection draws a bounding box, or you could do fancier stuff like semantic segmentation, which actually, you know, connects the dots around the object, or classification, which is saying what's in the whole image.
But ultimately the context for that detection is in the camera.
So the interesting thing is, when you're going to a manufacturing use case or health care or transportation, it's not enough to say I detected something in the camera.
If you don't know where the camera is, the data is essentially useless.
When you say you don't know where the camera is, you mean in the physical space?
In the physical world.
If I don't know the pose of that camera and I can't transpose the data from a bounding box into something that is in real-world coordinates, the data doesn't really mean much.
I never thought of it that way.
I thought, you know, because the camera's really only taking a 2D picture, right?
And the bounding box says, hey, I've got something in there.
I could even say, I have Rob here.
I could say that right?
Yeah. What, what is here?
Where is here and what is here?
Yeah, I guess that matters. Yeah.
So I guess the thing that we're trying to answer is three things.
What, where, and when.
You know, what is it that we saw or detected?
Where was it and when was it?
And, you know, time coordination and precision time stamping is pretty well understood.
You know, you can look up an NTP server and get a timestamp, but doing the same thing for location, you know, if you had GPS, that's helpful.
It gets you within a certain radius of where you think something is, but you don't always have it.
You might be indoors or something like that.
But you also don't get, like, a precise pose.
You don't get the actual direction the camera is pointing, or the object is pointing, shall we say.
And if you have the camera and you have the exact pose of the camera, then that's some additional information. What we're really trying to do is... well, let me back up.
If we do create a digital twin (that's an overloaded term, but a digital twin or a mirror world of the real world), if we have a digital representation of that real world, the real question is how do we project the sensor data onto that virtual world?
So hold on.
I've got to dumb this down so I understand.
Okay, so what you're saying is I've got cameras and sensors out there that are taking pictures.
They're telling me what's at their location, but that's meaningless if I can't superimpose that on the real world or a representation of the real world.
Otherwise I can't see the real interaction between those objects that I'm detecting in the real world.
Yeah, maybe we can think about it in an even simpler way.
In order to make sense of the sensor data, I need to know where the virtual representation of the sensor is.
So essentially, you know, when the sensor...
Right. So, like, I'm going to dumb it down even more. At my house, I've got security cameras because someone stole something off my porch.
Amazon, you know, someone stole something.
So all of a sudden the Ring camera was not enough.
And I put up five more cameras around my house, and I have them labeled right, front, side, back, whatever.
You know what that means?
I know what that means, but the computer doesn't know what that means.
It has no concept of...
Exactly right.
So me, myself, when I'm looking at all the cameras, I can see this person walk through the front, come onto my side yard, into my backyard.
I know that because I know where I put the cameras and I named the cameras front, side, and back.
What if you don't speak English?
Yeah, that's bad, right?
Or, like, my computer.
If I wanted my computer to watch my kids sneaking out of the house at night.
I have three teenagers. Right.
Then I want to know which path they took.
The computer can't tell me that without having some notion of where those cameras are with respect to the house.
That is exactly right.
Okay, so my kids, if you're watching this show, Rob's going to help me catch you.
I'm just saying.
You know, it's not the first time
I've heard that.
Yeah. Oh, so...
So why did you get in this space?
I mean, why does it matter that the computer has an accurate or semi-accurate representation of the real world?
Well, it's really about data reuse.
How do you represent the data in a way that more than just one application can make sense of it?
So the idea is, like, if I put cameras up in a factory for monitoring where a product is, or monitoring something about the space, like where people are for security purposes, typically you put up security cameras over here and then you put up machine vision cameras over there.
And those two systems, like one for tracking a product and another one for tracking people, don't talk to each other.
They don't represent the differences. Yes.
And another one would be, like, I put up a security camera to monitor this entryway here, and I have an AGV that has cameras on it coming down a corridor.
And then, you know, you have a blind corner.
One system knows that the person is about to be run over, but the other system doesn't know it.
There is a digital version of something in a computer somewhere over here, and the robot can't talk to it, doesn't know how to connect the dots.
So the idea is that if you have this digital version that is essentially a common coordinate system for everything to work in, then the robot can slow down because it knows that the person is there, right?
All right. I gotcha.
So this commonality, or this common way of describing the real world, ends up being very important, especially in factories right now, factories where I've got safety systems, I've got security systems, I've got product quality, all these things that are running.
Wouldn't it be great if they could all share the same virtual instance of the world?
Yeah, I'm saying it's essentially trying to bring spatial awareness into the machine.
It's like, if you think about how we as humans make sense of the world, we create this mental model of the space around us.
It's in 3D. It's in 3D plus time, actually.
And we use our senses, audio, touch, vision, in order to build that 3D understanding of the world around us.
And then we can determine how to act. And really, I just don't see how we're going to drive forward machine-based A.I. without having that spatial understanding as one of the core fundamental components of that.
So I really like the premise. I think it's super cool.
Could we leverage this also in the VR and AR world as well? Hey, let's talk about Chernobyl, right?
If I had an accurate model of Chernobyl and sensors currently, I could go in there with a VR headset and a robot and move around and work in that environment.
Without going in the environment.
Do you see that as one of the possibilities here?
It's this notion of being able to... well, I like to say AR for when you're there, VR for when you're not.
So, okay, yeah, I like that.
Yeah, the amazing thing about... you know, I actually think about augmented reality and robotics as being closer together than augmented reality and virtual reality.
Augmented reality and robots, they have to operate within the physical space.
They need to.
They need to.
Yeah, VR doesn't, right. Right, right.
But that makes use of data that is captured from the live scene to create that digital twin of the physical space, to mirror the physical space that you can walk through in virtual reality when you're not there.
So there's a sort of continuum that happens when you're in that space.
But then if you can store and maintain the history of that over time and maybe push up the data in near real time, you can ultimately walk through your factory, walk through Chernobyl, walk through that area where humans can't particularly go for one reason or another, as if you were there in virtual reality.
And ironically, the virtual reality headset is creating a digital twin of the space you're in. Yes, you could in turn go somewhere else and then walk through and, you know, so it all comes together.
So I need the sort of map and twin of the space where I am operating, in both cases, when I'm remote and when I'm on-prem or in the physical scene.
So you've really taken your physics degree kind of to the next level, right?
Applied physics, right.
Because now you're saying, I understand the physical world.
You understand, you know, our understanding of physics in the physical world, and now you're saying, let's see if we can capture that in the virtual world.
Yeah, in a twin of the real world.
What other use cases do you see that we can use with this kind of technology?
It's super cool.
But obviously, I mean, what other things can I use this type of technology for?
Well, I would say that there are use cases out there right now that we use every day that you don't quite realize.
But some things are more emergent, like autonomous driving.
They need to work against a more real-time digital twin.
But every time you use Google Maps, for example, you're using a digital twin.
You're using... Oh, right.
Because it tells me traffic. Yeah, right.
It tells me accidents.
So people are using digital twins already.
They just don't necessarily call it that.
So there's a few things that you really want to think about. One of them is the latency aspect of it.
Google Earth, for example, is a highly latent digital twin of the world.
You know that they probably...
Look at the front of your house, right?
If you look at the front of your house, it's not the same cars that you had.
Yeah, a year ago or whatever, someone was visiting that day.
You're no longer friends with them. Yeah.
You're like, what is their car doing in front of my house?
And that's how I cut that tree down.
And I can't.
I forgot we had that tree there. You know?
So it's sort of a historical thing.
But imagine if we can reduce that latency and get to the point where you're near real time.
Now we can start enabling closed-loop type controls, and maybe the implementation of that is running closer to the scene itself.
And I like to think that maybe in the future we'll actually have truly intelligent scenes, like in that Iron Man movie, like Jarvis, you know, it's in the house.
It's like you talk to him, he knows where everything is and where the robots are and knows who's there.
You can imagine this situation where you can monitor where you are and the lights are just kind of following you and, you know, whatever.
You know, that real-time closed-loop control aspect of it is coming.
So, Rob, have you already set this up in your house?
To some extent.
I mean, you're a true applied physicist, right?
Yeah. It's funny.
One of our lead developers, Chris, he has a setup at his house, and so he can monitor, you know, when there's a delivery truck versus not a delivery truck, or when the mail arrived or whatever.
And he has this notification being sent to his phone, and it gives a certain alert when it's the UPS truck or the Amazon truck versus when it's something else.
And his dog has learned which sound it is.
So you just get in there.
My dog uses SceneScape. My dog uses a digital twin.
Right. I love it.
That's a headline. So it's like, you know, it's...
To make the papers.
They learned that it's a certain sound, so it's doing the dog's job for it. So I think that's just hilarious.
So factories, big, huge use cases in factories: security, inventory management, all that detail.
Yeah, retail. Yeah.
I mean, is Amazon using something similar to this in their, you know, what are they, their no-touch stores?
Like Amazon Go.
Are they using this kind of technology?
I think any case where you're having multiple sensors coming together to try to track where objects and people are and everything, using different modalities.
I don't know of any other way to do it.
You really need to calibrate all of these sensors together into the same coordinate system.
And if you were to introduce that sort of common representation or digital twin or whatever you want to call it, that is a fundamental way to do it, really.
It's just getting it into SI units. Think back to the physics thing, like how do we get it from pixels into meters, or from pixels into geospatial coordinates?
You know, thinking about the units from first principles is really a good way to go.
You know, that reminds me, and you probably know the story better than I do.
There was a probe that we sent to Mars that had the wrong units, right?
It was feet rather than meters or something like that. Yeah.
And because they weren't in standard SI units, what happens? You know, we crashed it right into the ground.
Yeah. Well...
Yeah. Right into the Mars ground, I should say. Right.
And it's because we had people from Europe working with people from America.
And there was just this assumption, you know, of the units.
So it's even worse in the computer industry, right?
As far as the different types of data that's collected and the nonconformity to any kind of spatial units or whatever the case may be.
I guess it depends on the discipline.
But, like, with IoT and the type of sensor data that comes off of IoT, or even things like temperature and humidity and those sorts of things, some things are more mature in terms of the standardization around it, or there's a standard in one industry that's different than a standard in another industry.
So the real idea is to focus on how do I get to SI units and represent that there, but also look outside of your industry. For example, when you're creating a, you know, a digital twin, so-called, maybe there is another industry that's already done this, and an example would be scene graphs for gaming or rendering.
They already have a representation for a physical space or a virtual space.
And if we just say, oh, this is just the way to transport the data, like, here, the Universal Scene Description format, USD.
Now there is a way to transport that data.
That's standards based.
So let's just adopt that and maybe extend it.
Is that what you guys did?
This sounds kind of funny because I know the answer to this, because I've talked to you guys before.
You guys actually went to the gaming industry because there's a lot of people working in gaming and they create incredible worlds, right, that all have physics involved in them.
So you just went there, right, and said, why don't we just use what already exists?
Why reinvent the wheel?
There's decades of development happening there.
And maybe the one thing I like to think about is, like, remember, we used to play these games back in the day, right?
And really pixelated or 2D.
Yeah, yeah, yeah, yeah.
And then the first 3D games came along and they were really low poly, you know, it's like blocky but fun, right?
It's like you got a sense of motion.
I remember playing, this is a 2D game, but playing Prince of Persia back in the day. I don't know if you...
Oh yeah, yeah.
They, like, rotoscoped the capture of the player, and it was so lifelike even though it was just pixels.
And then fast forward to today, where you can look at a game and it looks photorealistic, and then introduce, like, GANs into that and you really have this, like, super, like, indistinguishable from the real world.
I think we're going to start to see the same thing in this mirror world approach, where today the sensor data is not that great.
You know, you can sort of see where things are.
It's low poly, but we can get lots of value out of it, like with autonomous cars or, you know, factories or retail or whatever.
But with Moore's Law, with better sensing and better compute, the fidelity of that twin and the additional use cases that will come along will drive this virtuous cycle of, you know, better sensing, better compute, better twins, more value back to being able to invest.
And it'll be this massive, you know, virtuous cycle that will drive this technology forward.
So do you think that the gaming industry and the standards that they use, are they sufficient to model the real world, or are they missing things in there that you need to add?
Or can you take it just as it is?
Well, that's a really good question.
I think that we should try to use as much as we can, but typically a renderer works on a frame-wise basis.
So it's sort of like, you know, a clock is ticking, slices in time.
You know, it's just rendering forward. With the real world, it's a little messier.
You might have really asynchronous, like some low latency, some high latency, you know, sources of data that are all coming in at different time scales and everything.
And you need to be able to coordinate all of that and sort of bring in the accuracy of these things, or the error bars, and be able to make sense of that, model it, and maybe do some things like say, well, how much time do I have to get the data into my model?
Maybe I can model with less.
I can run my understanding of the world near real time at relatively low fidelity.
It's a little choppy, a little chunky, but I can store all of that data and render it in higher fidelity, you know, for historical analytics. So it's definitely messier.
But we're standing on the shoulders of giants here.
I mean, a lot of the work that's there in the gaming industry can be reused for sure.
Oh, that's awesome.
So my next question is about the product that Intel has. It's called SceneScape, right?
Something you've been working on for some time.
This is not like new stuff.
This is not, oh, Rob did this in his basement with a couple other pals at Intel.
Tell me, I mean, is this a product that people can look at right now?
And then what use cases are you guys using it for today?
Yeah, well, it kind of did start as something that we developed on nights and weekends.
You know, it's like a vision.
It really did come around saying, how do we get from pixel-based units into real-world units with cameras?
And we quickly realized that if we can separate things out into the multi-modal tracker, along with, you know, motion modeling and everything, now we can support other modalities and we can make things better.
But yes, SceneScape, Intel SceneScape, is the result of that work.
And hopefully, you know, with our customers and our partners, we can drive this vision forward to, you know, really improve outcomes for our customers.
But we quickly realized that the tracker use case, the multimodal tracking, is a sort of needed capability across industries.
So it quickly extended into things like health care, for example.
Imagine a scenario where you can track where the instruments are in an operating room to make sure...
Nothing gets left in. So yeah.
If I track something going into the patient, I need to make sure I'm tracking it out.
For example, that's a high-fidelity use case that requires lots of high-resolution cameras and 3D detection and all that.
But really it's the same problem as tracking a person moving in a factory and making sure that I don't turn on a robot when that person is nearby or whatever.
So these are safety use cases, but I think there are some other use cases, like just to improve operational efficiencies, that we can do sooner, you know, like just knowing how many people are in line at the store or knowing if I need to turn the lights on, using a common digital twin rather than special-purpose sensors.
So I find this interesting, because you talked about multiple modalities, and one of the demos that you guys have shown me is you can estimate where someone's going to go and the next camera that they should be showing up in.
And when they don't, then you know, because a lot of times you can't cover your whole space with cameras or sensors, but you know where the gaps are because you have a 3D model of it.
And when someone's walking in, following a path, you know where they are in the factory.
That's the idea.
And it's another really interesting aspect of this, that no two sensors agree on where something of interest is.
It's like there's always some error bars, back to the physics thing.
There's always an error bar on that measurement.
And so if I have cameras that overlap, or other sensor modalities that cover the same area, I need to be able to reconcile the measurements about where somebody is.
And then if I can set up a motion model, I say, well, here's the math, here's the velocity, here's the maximum acceleration that...
So back to physics, right?
All comes down to physics. The maximum acceleration that this thing can handle or perform.
Then I can set up and do some extrapolation.
I can say, yeah, this person is moving in this direction, and then it's probably the same person I saw previously if they show up in the next camera, even if I don't have coverage in that area or a measurement of the track.
So yeah, I mean, the physics definitely comes into it.
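The motion-model idea described here can be sketched as a simple plausibility check: extrapolate a track at constant velocity across a coverage gap, then accept a new detection only if it lies within the distance a bounded acceleration could explain. This is a minimal sketch under stated assumptions, not SceneScape's tracker; the `Track` class, the function names, and the three-sigma allowance for measurement error are hypothetical choices.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Track:
    """Last known state of a tracked subject, in world units."""
    x: float   # meters
    y: float   # meters
    vx: float  # meters per second
    vy: float  # meters per second
    t: float   # seconds


def predict(track: Track, t: float) -> Tuple[float, float]:
    """Extrapolate the track to time t assuming constant velocity."""
    dt = t - track.t
    return track.x + track.vx * dt, track.y + track.vy * dt


def is_plausible(track: Track, x: float, y: float, t: float,
                 max_accel: float, meas_sigma: float) -> bool:
    """Decide whether a detection at (x, y, t) could belong to this track.

    With acceleration bounded by max_accel (m/s^2), the subject can deviate
    from the constant-velocity prediction by at most 0.5 * a * dt^2.
    An allowance of three standard deviations covers measurement error.
    """
    dt = t - track.t
    if dt < 0:
        return False
    px, py = predict(track, t)
    reach = 0.5 * max_accel * dt * dt + 3.0 * meas_sigma
    return math.hypot(x - px, y - py) <= reach


if __name__ == "__main__":
    walker = Track(x=0.0, y=0.0, vx=1.0, vy=0.0, t=0.0)
    # Four seconds later a second camera reports someone near (4.5, 0.2).
    print(is_plausible(walker, 4.5, 0.2, 4.0, max_accel=1.0, meas_sigma=0.1))
```

The longer the gap in coverage, the larger the reachable region becomes, so the check gets weaker over time; a production tracker would typically also compare appearance or other sensor modalities.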
So, you know, a lot of people that think about digital twins, they say, just put cameras up and you'll know everything about...
Rob, thanks for explaining it.
It's a complex problem. It's not easy.
Yeah, and think about the fact that it's a continuum.
It's like you have low-latency digital twins for real-time type stuff and higher-latency ones, like Google Maps, for longer-term historical stuff.
But really, in the end, it's all going to sort of compress into the real time and, you know, really improve outcomes for a lot of...
You know, I hope, like a lot of technologists, that we can use this technology responsibly and really improve outcomes for ourselves and for...
The use cases.
My mind's just going crazy on use cases right now, mostly tracking my children, but that's...
Actually a responsible use.
I'm not sure.
That's totally responsible, right, as a parent.
Yeah, it's still there. It's fun.
I love it.
All right, Rob, hey, thanks for coming on the show today.
Darren, take care.
Thank you for listening to Embracing Digital Transformation today.
If you enjoyed our podcast, give it five stars on your favorite podcasting site or YouTube channel. You can find out more information about Embracing Digital Transformation at embracingdigital.org.
Until next time, go out and do something wonderful.